This article was co-authored by Sudhindra Sheshadrivasan, Staff Product Manager at Confluent.
We’re excited to announce the General Availability (GA) of the Confluent Cloud-managed V2 Kafka Connector for Azure Cosmos DB! This release marks a major milestone in our mission to simplify and streamline real-time data streaming from and into Azure Cosmos DB using Apache Kafka.
The V2 connector is now production-ready and available directly from Confluent Cloud’s connector catalog. This managed connector allows you to seamlessly integrate Cosmos DB with your Kafka-powered event streaming architecture, without worrying about provisioning, scaling, or managing the connector infrastructure.
Whether you’re building a real-time analytics pipeline, syncing operational data between services, or enabling event-driven microservices, this connector empowers you to unlock the full potential of change data capture (CDC) and low-latency writes across your Azure Cosmos DB containers.
Why V2 of the New Connectors?
The release of version 2.0 isn’t just a simple upgrade: it’s a complete architectural overhaul that addresses the limitations of version 1.0 and introduces scalability, performance, and flexibility improvements across both the source and sink connectors. Check out the migration guidance for a detailed comparison between V1 and V2 and what to keep in mind when upgrading.
Key motivations behind V2 include:
- Scalability Bottlenecks: In V1, every container required a separate Kafka task for CDC. V2 eliminates this constraint by allowing multiple containers to be processed by a single Kafka task, dramatically improving task efficiency and lowering operational cost.
- Improved Performance: By adopting the pull-based change feed model, the source connector reduces memory usage and thread contention—leading to more stable and performant stream processing.
- Flexible Write Patterns: The sink connector now supports multiple write strategies, enabling much finer control over how data lands in Cosmos DB, especially in concurrent or distributed write scenarios.
- Better Integration with Kafka Infrastructure: Metadata management now leverages Kafka’s native offset tracking, removing the dependency on Cosmos-specific lease containers and ensuring safer, more reliable checkpointing.
In short, V2 was built from the ground up to be cloud-native, more intelligent, and aligned with Kafka best practices.
Advantages of the New Connectors
The GA version of the Confluent-managed Cosmos DB Kafka connector introduces critical enhancements that make it a powerful tool for enterprise-grade streaming and event-sourcing workloads:
- High-Throughput Change Feed Reads: Thanks to container batching and pull-model support, you can ingest high volumes of changes from many containers without a proportional increase in Kafka tasks.
- Advanced Sink Strategies: Choose from multiple write modes, including:
  - Item Overwrite (default)
  - Create Only
  - Delete
  - Update If ETag Matches
  These modes offer fine-grained control over upserts, deletes, and conditional updates, ideal for microservices architectures and stateful event handling.
- Built-in Throughput Control: Prevent throttling and manage Request Unit (RU) usage more effectively with native throughput control, essential for cost and performance optimization in Cosmos DB.
- Integrated Metrics and Observability: Deep integration with Confluent monitoring and Azure SDK metrics gives you full visibility into connector health, lag, throughput, and errors, which is ideal for SRE teams and production troubleshooting.
- Improved Security and Authentication: Now supports Microsoft Entra ID service principals using client secrets.
- More Reliable Metadata Handling: Eliminates reliance on lease containers by adopting Kafka-native offset management and introduces a low-frequency metadata topic for robust handling of partition splits and merges, ensuring resilience during Cosmos DB’s dynamic scaling events.
With these capabilities, the V2 connector is now fully equipped to support high-scale, production-ready, event-driven solutions in the cloud.
Getting started with the new connectors
The new connectors are easy to set up. Before starting the setup itself, note the prerequisites:
Azure Cosmos DB Source Prerequisites
- Authorized access to a Confluent Cloud cluster on Amazon Web Services (AWS), Microsoft Azure (Azure), or Google Cloud.
- The Confluent CLI installed and configured for the cluster. See Install the Confluent CLI.
- Schema Registry must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf).
- Authorized access to read data in Azure Cosmos DB. For more information, see Secure access to data in Azure Cosmos DB.
- Azure Cosmos DB must be configured to use the NoSQL API.
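To satisfy the CLI prerequisite above, configuring the Confluent CLI against your cluster typically looks like the minimal sketch below. It assumes you already have a Confluent Cloud account, and the environment and cluster IDs shown are placeholders:

```bash
# Log in to Confluent Cloud (prompts for credentials or opens a browser)
confluent login

# List your environments and clusters, then select the ones the connector will use
# (env-abc123 and lkc-xyz789 are placeholder IDs)
confluent environment list
confluent environment use env-abc123

confluent kafka cluster list
confluent kafka cluster use lkc-xyz789
```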
Azure Cosmos DB Sink Prerequisites
- Authorized access to a Confluent Cloud cluster on Amazon Web Services (AWS), Microsoft Azure (Azure), or Google Cloud.
- The Confluent CLI installed and configured for the cluster. See Install the Confluent CLI.
- Schema Registry must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf).
- At least one source Kafka topic must exist in your Confluent Cloud cluster before creating the sink connector.
- The Azure Cosmos DB account and the Kafka cluster must be in the same region.
- Azure Cosmos DB requires an id field in every record.
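As a quick illustration of that last requirement, here is a minimal record value the sink could write to Cosmos DB. Only the id field is required by Cosmos DB; the remaining fields are hypothetical:

```json
{
  "id": "order-1001",
  "customerId": "cust-42",
  "status": "SHIPPED",
  "total": 129.95
}
```

If your topic records do not already carry an id, add one upstream (for example, in the producer or with a Single Message Transform) before the sink attempts to write them.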
Connector Setup
Setting up the new connectors is very similar to the standard Confluent Cloud connector setup experience. Connectors can be set up via the UI, the CLI, or the API.
Steps to set up the Azure Cosmos DB Source V2 connector:
- Log in to Confluent Cloud:
- Create the Connector:
- Navigate to the “Connectors” tab in the left sidebar
- Click “Add connector”
- Search for “Azure Cosmos DB” and select “Azure Cosmos DB Source Connector v2”
- Authenticate with the Kafka cluster:
- Authenticate via a service account or an API key.
- Azure Cosmos DB Connection Settings:
Connect to the DB with the following:
- Cosmos Endpoint: The Azure Cosmos DB endpoint URL.
- Cosmos Database name: The name of your Cosmos database.
- Cosmos Connection Auth Type: MasterKey or ServicePrincipal
- Configuration:
- Choose the appropriate output record value format (AVRO, JSON_SR, PROTOBUF)
- Set up a topic-container map: a comma-delimited list of Kafka topics mapped to Cosmos containers.
- Sizing:
- Specify the number of tasks required for the connector.
- Review and Launch:
- Verify all settings
- Click “Launch” to deploy the connector
Here is a video that walks through the source connector setup process:
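If you prefer the CLI or API path mentioned earlier, the same setup reduces to submitting a JSON configuration. The sketch below is illustrative only: the property names (for example, the endpoint, database, and topic-container map keys) are assumptions mirroring the UI settings above, so check the Azure Cosmos DB Source V2 connector documentation for the exact keys before using it:

```bash
# Illustrative only: property names below are assumptions; consult the
# Azure Cosmos DB Source V2 connector docs for the exact configuration keys.
cat > cosmosdb-source-v2.json <<'EOF'
{
  "name": "CosmosDbSourceV2Connector",
  "connector.class": "CosmosDbSourceV2",
  "kafka.auth.mode": "KAFKA_API_KEY",
  "kafka.api.key": "<kafka-api-key>",
  "kafka.api.secret": "<kafka-api-secret>",
  "azure.cosmos.account.endpoint": "https://<account>.documents.azure.com:443/",
  "azure.cosmos.account.key": "<cosmos-master-key>",
  "azure.cosmos.source.database.name": "<database>",
  "azure.cosmos.source.containers.includedList": "orders",
  "azure.cosmos.source.containers.topicMap": "orders-topic#orders",
  "output.data.format": "JSON_SR",
  "tasks.max": "1"
}
EOF

# Submit the configuration to your Confluent Cloud cluster
confluent connect cluster create --config-file cosmosdb-source-v2.json
```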
Steps to set up the Azure Cosmos DB Sink V2 connector:
- Log in to Confluent Cloud:
- Create the Connector:
- Navigate to the “Connectors” tab in the left sidebar
- Click “Add connector”
- Search for “Azure Cosmos DB” and select “Azure Cosmos DB Sink Connector v2”
- Choose the right topic from the topics list
- Authenticate with the Kafka cluster:
- Authenticate via a service account or an API key.
- Azure Cosmos DB Connection Settings:
Connect to the DB with the following:
- Cosmos Endpoint: The Azure Cosmos DB endpoint URL.
- Cosmos Database name: The name of your Cosmos database.
- Cosmos Connection Auth Type: MasterKey or ServicePrincipal
- Configuration:
- Choose the appropriate input record value format (AVRO, JSON_SR, PROTOBUF)
- Specify a topic-container map: a comma-delimited list of Kafka topics mapped to Cosmos containers.
- Choose the appropriate Cosmos DB item write strategy. Note that this defaults to ItemOverwrite.
- Sizing:
- Specify the number of tasks required for the connector.
- Review and Launch:
- Verify all settings
- Click “Launch” to deploy the connector
Here is a video that walks through the sink connector setup process:
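As with the source connector, the sink can also be created from a JSON configuration via the CLI. The property names in the sketch below are assumptions mirroring the UI settings above (including the item write strategy, ItemOverwrite by default), so verify them against the Azure Cosmos DB Sink V2 connector documentation:

```bash
# Illustrative only: property names below are assumptions; consult the
# Azure Cosmos DB Sink V2 connector docs for the exact configuration keys.
cat > cosmosdb-sink-v2.json <<'EOF'
{
  "name": "CosmosDbSinkV2Connector",
  "connector.class": "CosmosDbSinkV2",
  "topics": "orders-topic",
  "input.data.format": "JSON_SR",
  "kafka.auth.mode": "KAFKA_API_KEY",
  "kafka.api.key": "<kafka-api-key>",
  "kafka.api.secret": "<kafka-api-secret>",
  "azure.cosmos.account.endpoint": "https://<account>.documents.azure.com:443/",
  "azure.cosmos.account.key": "<cosmos-master-key>",
  "azure.cosmos.sink.database.name": "<database>",
  "azure.cosmos.sink.containers.topicMap": "orders-topic#orders",
  "azure.cosmos.sink.write.strategy": "ItemOverwrite",
  "tasks.max": "1"
}
EOF

# Submit the configuration to your Confluent Cloud cluster
confluent connect cluster create --config-file cosmosdb-sink-v2.json
```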
Conclusion
The release of the new fully managed Azure Cosmos DB V2 Source and Sink connectors marks a significant milestone for organizations leveraging both Confluent Kafka and Microsoft’s globally distributed Azure Cosmos DB database service. These source and sink connectors eliminate the complexity previously associated with integrating these powerful technologies, providing a streamlined path for real-time data movement between Azure Cosmos DB and Kafka ecosystems.
By implementing these connectors, data teams can now:
- Capture Azure Cosmos DB change feeds directly into Confluent Kafka topics with minimal latency
- Stream data from Confluent Kafka into Azure Cosmos DB collections without custom code development
- Maintain consistency across distributed systems through configurable transformation options
- Leverage Confluent Cloud fully managed services to lower operational complexity and reduce costs
For organizations building event-driven architectures, these connectors represent a critical building block that simplifies infrastructure while improving reliability. The ability to seamlessly connect Azure Cosmos DB’s capabilities with Kafka’s streaming prowess opens new possibilities for real-time analytics, microservices synchronization, and data replication scenarios.
As cloud-native architectures continue to evolve, having certified, well-maintained integration points between key services becomes increasingly vital. We’re excited to see how these connectors will empower developers to build more responsive, resilient data pipelines that leverage the best of both platforms.
Leave a review
Tell us about your Azure Cosmos DB experience! Leave a review on PeerSpot and we’ll gift you $50. Get started here.
About Azure Cosmos DB
Azure Cosmos DB is a fully managed and serverless NoSQL and vector database for modern app development, including AI applications. With its SLA-backed speed and availability as well as instant dynamic scalability, it is ideal for real-time NoSQL and MongoDB applications that require high performance and distributed computing over massive volumes of NoSQL and vector data.
To stay in the loop on Azure Cosmos DB updates, follow us on X, YouTube, and LinkedIn.