Unlock near real-time, no-ETL analytics at scale with Azure Synapse Link for Azure Cosmos DB
Co-authored by Sri Chintala and Ramnandan Krishnamurthy, Azure Cosmos DB Team
In today’s environment, adopting a data-driven culture has become critical for enterprises across industries. At the core of this culture is the need to deliver analytics over operational data in near real-time to optimize business processes and unlock new opportunities. Over the years, customers have chosen Azure Cosmos DB as the fully managed, blazingly fast NoSQL database for modern application development at any scale. The growing volume of operational data, and the opportunity to unlock powerful insights from it, brings with it the challenge of handling and analyzing this data at scale.
Traditionally, to ensure there is no performance impact on operational workloads, enterprises have maintained separate online transactional processing (OLTP) and online analytical processing (OLAP) systems. While the OLTP system is optimized for mission-critical transactional workloads, the OLAP system is optimized independently for analytical workloads over larger volumes of data. The friction and time involved in moving data from OLTP systems to OLAP systems can cause delays of hours, days, or even weeks, which results in sub-optimal decisions made on stale data.
Announcing Azure Synapse Link for Azure Cosmos DB for cloud-native HTAP
We’re excited to announce the public preview of Azure Synapse Link for Azure Cosmos DB, a cloud-native hybrid transactional and analytical processing (HTAP) capability that enables near real-time analytics over operational data in Azure Cosmos DB. Azure Synapse Link finally breaks down the barrier that has long existed between OLTP and OLAP systems. Azure Synapse Link “links” Azure Cosmos DB to Azure Synapse Analytics, providing the ability to get immediate insights on your business.
With just a ‘single click’ you can now analyze large volumes of operational data in Azure Cosmos DB in near real-time with no ETL pipelines and no performance impact on transactional workloads.
This capability unlocks new business scenarios: raising alerts based on live trends, building near real-time dashboards, and creating business experiences based on user behavior. Business analysts, data engineers and data scientists can now use Azure Synapse Analytics to run near real-time BI, AI and other big data analytical workloads over operational data in Azure Cosmos DB at scale.
Sneak-peek into the magic of Azure Synapse Link
Azure Synapse Link consists of two main components:
- Azure Cosmos DB analytical store: A fully managed, column-oriented ‘analytical store’ within containers, in addition to the existing row-oriented ‘transactional store’. The analytical store is fully isolated from the transactional store, so queries over the analytical store have no impact on your transactional workloads. Updates to the operational data are automatically synced from the transactional store to the analytical store in near real-time, within minutes. Learn more about Azure Cosmos DB analytical store.
- Azure Synapse Analytics runtime support: Native integration of the Azure Cosmos DB analytical store with the various analytics runtimes supported by Azure Synapse Analytics, which can query the analytical store directly. No further data transformations are required to analyze data from the analytical store. As of today, Azure Synapse Analytics supports Apache Spark and Synapse SQL serverless with Azure Cosmos DB analytical store.
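To make the row-oriented versus column-oriented distinction concrete, here is a toy sketch in plain Python. It is purely illustrative and is not how Azure Cosmos DB implements either store; the `orders` records and the `to_columns` helper are made up for this example, and the helper assumes every record has the same fields.

```python
# Toy illustration only: NOT how Azure Cosmos DB stores data internally.
# Row-oriented layout: each record is kept together (good for point reads/writes).
orders = [
    {"id": "1", "region": "EU", "amount": 20.0},
    {"id": "2", "region": "US", "amount": 35.5},
    {"id": "3", "region": "EU", "amount": 12.5},
]


def to_columns(rows):
    """Pivot row-oriented records into a column-oriented layout.

    Assumes every record has the same set of fields.
    """
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(orders)

# Transactional-style access: fetch one whole record by its id.
order_2 = next(row for row in orders if row["id"] == "2")

# Analytical-style access: an aggregate only needs to scan a single column.
total_amount = sum(columns["amount"])
```

The point of the sketch is that an aggregate such as a sum reads one column instead of every full record, which is the kind of access pattern a column-oriented store is designed for.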
Below you can see some examples of the power of Azure Synapse Link:
- Create Synapse Spark tables over Azure Cosmos DB containers with a simplified programming model. Then leverage the power of Spark’s distributed processing to perform joins and other complex aggregations between Spark tables. This allows you to transform and enrich the operational data in Azure Cosmos DB directly with Synapse Spark.
- Build ML models over operational data in Azure Cosmos DB directly with Azure Synapse Analytics’ native integration with Azure Machine Learning and Apache Spark ML. Then deploy these models for real-time scoring in applications.
- Query operational data from Azure Cosmos DB and build SQL views by leveraging the native integration with Synapse SQL serverless and the full expressiveness of the T-SQL language. Model and publish auto-refreshing BI dashboards over Azure Cosmos DB directly with familiar BI tools such as Power BI.
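As a rough sketch of the first example above, the following Python helpers show how a Synapse Spark notebook might load a container's analytical store into a DataFrame. The `cosmos.olap` format and the two option keys reflect our understanding of the Synapse Spark connector at the time of writing, so please verify them against the current documentation. The helper names and any linked service or container names are hypothetical.

```python
def analytical_store_options(linked_service, container):
    """Build reader options for the Azure Cosmos DB analytical store.

    Option keys are assumed from the Synapse Spark connector; the values
    are whatever linked service and container names exist in your workspace.
    """
    return {
        "spark.synapse.linkedService": linked_service,
        "spark.cosmos.container": container,
    }


def load_container(spark, linked_service, container):
    """Load a container's analytical store as a Spark DataFrame.

    `spark` is the SparkSession that a Synapse notebook provides.
    """
    options = analytical_store_options(linked_service, container)
    return spark.read.format("cosmos.olap").options(**options).load()
```

In a Synapse notebook, a call such as `load_container(spark, "CosmosDbLinkedService", "Orders")` (both names are placeholders) would return a DataFrame that can then be joined and aggregated like any other Spark DataFrame, without touching the transactional store.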
Here are a few common use cases across industry verticals which benefit from Azure Synapse Link:
- Supply Chain Analytics, Forecasting & Reporting
- Real-time personalization
- Predictive Maintenance, Anomaly Detection in IoT scenarios
Read more about Azure Synapse Link use cases.
With Azure Synapse Link for Azure Cosmos DB, enterprises can now focus on unlocking precious insights from their operational data in near real-time at scale. By exploiting near real-time data to power predictive analytics, enterprises can identify emerging marketing opportunities and make business decisions as quickly as the data arrives. This will truly differentiate the leaders from the followers in the ongoing digital transformation journey.
To learn more, check out our documentation on Azure Synapse Link for Azure Cosmos DB. To get started, check out our guide on how to configure Azure Synapse Link and visit our FAQ page for answers to commonly asked questions. You can also check out samples on GitHub to get started.
For any feedback or suggestions to improve our product offering, please reach out to us directly at email@example.com.