Announcing General Availability of Change Data Capture (CDC) on Azure SQL Database
We are excited to announce the general availability (GA) of change data capture (CDC) in Azure SQL Database.
What is Change data capture (CDC)?
CDC provides historical change information for a user table by capturing both the fact that Data Manipulation Language (DML) changes (insert / update / delete) were made and the changed data. Changes are captured by using a capture process that reads changes from the transaction log and places them in corresponding change tables. These change tables provide a historical view of the changes made over time to source tables. CDC functions enable the change data to be consumed easily and systematically.
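To make the capture flow concrete, here is a minimal in-memory sketch of the idea described above: a capture step walks a log of DML changes and writes rows into a change table per tracked table. The `LogRecord` type and the `capture` function are hypothetical names for illustration only and do not reflect how the engine implements capture. The operation codes follow the `__$operation` column of CDC change tables (1 = delete, 2 = insert, 3 = update before image, 4 = update after image).

```python
from dataclasses import dataclass
from typing import Optional

# Operation codes as stored in the __$operation column of a CDC change table.
OP_DELETE = 1
OP_INSERT = 2
OP_UPDATE_BEFORE = 3
OP_UPDATE_AFTER = 4


@dataclass
class LogRecord:
    """Hypothetical, simplified transaction-log entry for one DML change."""
    lsn: int                       # log sequence number (ordering key)
    table: str                     # source table name
    kind: str                      # "insert", "update", or "delete"
    before: Optional[dict] = None  # row image before the change
    after: Optional[dict] = None   # row image after the change


def capture(log, tracked_tables):
    """Scan the log in LSN order and build one change table per tracked table."""
    change_tables = {name: [] for name in tracked_tables}
    for rec in sorted(log, key=lambda r: r.lsn):
        if rec.table not in change_tables:
            continue  # this table is not enabled for CDC, so skip it
        rows = change_tables[rec.table]
        meta = {"__$start_lsn": rec.lsn}
        if rec.kind == "insert":
            rows.append({**meta, "__$operation": OP_INSERT, **rec.after})
        elif rec.kind == "delete":
            rows.append({**meta, "__$operation": OP_DELETE, **rec.before})
        elif rec.kind == "update":
            # An update produces two rows: the old values, then the new values.
            rows.append({**meta, "__$operation": OP_UPDATE_BEFORE, **rec.before})
            rows.append({**meta, "__$operation": OP_UPDATE_AFTER, **rec.after})
    return change_tables


log = [
    LogRecord(1, "dbo.Orders", "insert", after={"id": 1, "qty": 2}),
    LogRecord(2, "dbo.Orders", "update",
              before={"id": 1, "qty": 2}, after={"id": 1, "qty": 5}),
    LogRecord(3, "dbo.Untracked", "insert", after={"id": 9}),
    LogRecord(4, "dbo.Orders", "delete", before={"id": 1, "qty": 5}),
]
changes = capture(log, ["dbo.Orders"])
for row in changes["dbo.Orders"]:
    print(row)
```

The resulting list is the "historical view" the paragraph refers to: every change to `dbo.Orders` in order, with both the kind of change and the affected values.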
Why use Change data capture (CDC)?
Customers use CDC for a wide variety of purposes, including but not limited to:
- Tracking data changes for auditing
- Recording data changes on a source database and integrating with other services (e.g. Azure Data Factory) to stream these changes to other destinations
- Performing analytics on change data
Change data capture (CDC) on Azure SQL database
CDC is now generally available in Azure SQL Database, enabling customers to track insert / update / delete data changes on their Azure SQL Database tables. In Azure SQL Database, CDC offers functionality similar to SQL Server and Azure SQL Managed Instance, and provides a scheduler that automatically runs the change capture and cleanup processes on the change tables. On SQL Server and Azure SQL Managed Instance, these capture and cleanup processes run as SQL Server Agent jobs; in Azure SQL Database, they run automatically through the scheduler. Customers can still run scans and cleanup manually on demand.
CDC on Azure SQL database has been one of the most requested features through Azure feedback mechanisms. Multiple customers have successfully used CDC during public preview, such as Rubrik, a major cloud data management company. After using CDC, they highlighted: “Flawless implementation. An extremely useful addition to Azure SQL Databases that enabled us to track and capture changes from them meticulously.”
Enabling change data capture (CDC) on an Azure SQL database
Customers can use CDC on Azure SQL databases at the S3 (Standard) tier and above, or the vCore equivalent. More specifically, CDC cannot be enabled on sub-core databases (Basic, S0, S1, S2).
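As a quick pre-check before enabling CDC, the tier rule can be expressed as a small helper. This is only a sketch: `supports_cdc` is a hypothetical function, the sub-core tier names (Basic, S0, S1, S2) are taken from the clarification in the comments on this post, and treating every other service objective as eligible is an assumption rather than an official rule.

```python
# DTU service objectives identified in this post's comments as sub-core
# (not eligible for CDC).
SUB_CORE_TIERS = {"BASIC", "S0", "S1", "S2"}


def supports_cdc(service_objective: str) -> bool:
    """Return False for sub-core tiers; assume every other tier is eligible."""
    return service_objective.strip().upper() not in SUB_CORE_TIERS


for tier in ["Basic", "S2", "S3", "P1"]:
    print(tier, supports_cdc(tier))
```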
Enabling CDC on an Azure SQL database is similar to enabling CDC on SQL Server or Azure SQL Managed Instance. Learn more here: Enable CDC
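As a rough illustration of what enabling looks like, the sketch below builds the T-SQL script for one database and one table using the `sys.sp_cdc_enable_db` and `sys.sp_cdc_enable_table` stored procedures. The helper names `quote_literal` and `enable_cdc_script`, and the `dbo.Orders` table, are hypothetical. The script is only generated as text here; running it against a database (for example through a SQL client of your choice) is left out, and the linked documentation remains the reference for the full parameter list.

```python
def quote_literal(value: str) -> str:
    """Render a Python string as a T-SQL Unicode literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def enable_cdc_script(schema: str, table: str, role_name=None) -> str:
    """Build a T-SQL script that enables CDC on the database and on one table.

    role_name=None emits @role_name = NULL, meaning access to the change
    data is not gated by a database role.
    """
    role = "NULL" if role_name is None else quote_literal(role_name)
    return "\n".join([
        "-- Enable CDC for the current database.",
        "EXEC sys.sp_cdc_enable_db;",
        "-- Enable CDC for one source table.",
        "EXEC sys.sp_cdc_enable_table",
        f"    @source_schema = {quote_literal(schema)},",
        f"    @source_name = {quote_literal(table)},",
        f"    @role_name = {role};",
    ])


script = enable_cdc_script("dbo", "Orders")
print(script)
```

Generating the script as text keeps the example independent of any particular database driver; the same two procedure calls can be typed directly into a query window.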
Sending CDC data changes to other destinations
Multiple Microsoft technologies such as Azure Data Factory can be used to move CDC change data to other destinations (e.g. other databases, data warehouses). Other third-party services also offer streaming capabilities for changed data from CDC. For instance, Striim and Qlik offer integration, processing, delivery, analysis, or visualization capabilities for CDC changes.
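To illustrate what a downstream consumer does with change data, here is a minimal sketch that replays change rows onto a target keyed by primary key. The `apply_changes` function, the `id` key column, and the sample rows are hypothetical; real pipelines such as those mentioned above also handle batching, ordering within a transaction, and failures, none of which is shown. The operation codes follow the `__$operation` column of CDC change tables (1 = delete, 2 = insert, 3 = update before image, 4 = update after image).

```python
def apply_changes(target: dict, change_rows: list, key: str = "id") -> dict:
    """Replay CDC change rows onto `target`, a dict of rows keyed by `key`."""
    # sorted() is stable, so rows sharing an LSN keep their original order.
    for row in sorted(change_rows, key=lambda r: r["__$start_lsn"]):
        op = row["__$operation"]
        # Strip CDC metadata columns; keep only the source table's columns.
        data = {k: v for k, v in row.items() if not k.startswith("__$")}
        if op == 1:                    # delete
            target.pop(data[key], None)
        elif op in (2, 4):             # insert, or update (after image)
            target[data[key]] = data
        # op == 3 is the before image of an update; nothing to apply.
    return target


change_rows = [
    {"__$start_lsn": 1, "__$operation": 2, "id": 1, "qty": 2},
    {"__$start_lsn": 2, "__$operation": 3, "id": 1, "qty": 2},
    {"__$start_lsn": 2, "__$operation": 4, "id": 1, "qty": 5},
    {"__$start_lsn": 3, "__$operation": 2, "id": 2, "qty": 7},
    {"__$start_lsn": 4, "__$operation": 1, "id": 2, "qty": 7},
]
replica = apply_changes({}, change_rows)
print(replica)
```

After the replay, the target holds only the rows that still exist in the source, with their latest values.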
“Real-time information is vital to the health of the enterprises,” says Codin Pora, VP of Technology and partnership at Striim. “Striim is excited to support the new change data capture (CDC) capabilities of Azure SQL Database and help companies drive their digital transformation by bringing together data, people, and processes. Striim, through its Azure SQL Database CDC pipelines, provides real-time data for analytics and intelligence workloads, operational reporting, ML/AI implementations and many other use cases, creating value as well as competitive advantage in a digital-first world. Striim builds continuous streaming data pipelines with minimal overhead on the source Azure SQL Database systems, while moving database operations (inserts, updates, and deletes) in real time with security, reliability, and transactional integrity.”
“Joint customers are excited about the potential of leveraging Qlik Data Integration alongside CDC in Azure SQL DB and CDC for SQL MI to securely access more of their valuable data for analytics in the cloud,” said Kathy Hickey, Vice President, Product Management at Qlik. “We are happy to announce that in addition to support for Azure SQL MI as a source, the newly available MS-CDC capabilities will also allow us to support Azure SQL DB sources via our Early Access Program. We look forward to partnering with Microsoft on helping customers leverage these capabilities to confidently create new insights from their Azure managed data sources.”
To learn more about change data capture (CDC), check out the documentation.
Learn module: Data replication on Azure SQL Databases – Learn | Microsoft Docs
Is it possible to connect Apache Kafka to Azure SQL Database? I would like to propagate the changes to downstream Apache Kafka. How can I achieve this?
You can explore third-party tools such as Debezium for this. Refer to this sample: https://github.com/azure-samples/azure-sql-db-change-stream-debezium
I still see CDC listed only as Preview for Azure SQL Database in the Microsoft documentation. Do you know when it will be in “General availability”?
Hello, Falcon. Thank you for bringing this to our attention. We are aware of the feature comparison page and will be updating it soon. Please consider this blog as an official announcement that CDC on Azure SQL DB is now GA.
Thank you for the confirmation.
One more question. Is it available for all locations?
It is mentioned “More specifically, CDC cannot be enabled on subcore databases”. But what are “subcore” databases?
@Sujai – CDC can only be enabled on database tiers S3 and above. Sub-core (Basic, S0, S1, S2) Azure SQL Databases are not supported for CDC.