Latest NoSQL Java Ecosystem Updates: June 2024 - June 2025

Welcome to the latest roundup of key updates across the Azure Cosmos DB Java ecosystem!

The largest external customers of Azure Cosmos DB API for NoSQL, running some of the biggest and most mission-critical workloads in Azure, are primarily Java users! From powerful new AI integrations to improvements in the Java SDK, Spring Data, Spark, and Kafka connectors, the past year has been transformative for developers building cloud-native and AI-powered applications. It’s never been easier or more powerful to build modern Java applications on Azure Cosmos DB!

Stay tuned for more updates in the future. Happy coding!

🤖 AI Integrations

In the past 12 months, Azure Cosmos DB has rolled out native support for AI development in Java, including integrations with Spring AI and LangChain4j, two leading frameworks for building AI applications.

Spring AI

Blog: Building multi-agent AI apps in Java with Spring AI and Cosmos DB
Sample: Multi-agent orchestration in Java using Azure Cosmos DB with Spring AI
Build scalable, multi-agent AI systems using Spring Boot and Cosmos DB as persistent memory and knowledge retrieval.
Spring AI now supports Azure Cosmos DB Vector Store, allowing developers to store and retrieve embeddings for tasks such as similarity search and RAG (retrieval-augmented generation).

LangChain4j

Sample: https://github.com/microsoft/AzureDataRetrievalAugmentedGenerationSamples/tree/main/Java/CosmosDB-NoSQL-Langchain
Java developers can now leverage Azure Cosmos DB as a document store or vector database within LangChain4j workflows for chat, summarization, and custom pipelines.

Azure Cosmos DB is now an ideal choice for building AI applications in Java. With native SDK support for vector indexing, full text search, hybrid search, and seamless integration with AI frameworks, developers can create intelligent apps that are fast, scalable, and easy to manage.

Java SDK Enhancements

Hybrid and Full Text Search (PR #42885)

Azure Cosmos DB now supports native Full Text Search (FTS) and Hybrid Search across structured and vector data. You can filter by semantic meaning and rank results by relevance.

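Example queries (the %s in the second query is a format placeholder for the comma-separated values of the query embedding):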

SELECT TOP 50 c.id, c.abstract, c.title
FROM c
WHERE FullTextContainsAll(c.abstract, 'quantum', 'theory')
ORDER BY RANK FullTextScore(c.abstract, ['quantum', 'theory'])

SELECT TOP 50 c.id, c.abstract, c.title
FROM c
ORDER BY RANK RRF(FullTextScore(c.abstract, ['quantum']), VectorDistance(c.Embedding, [%s]))
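The RRF function in the second query fuses the full text ranking and the vector ranking into one ordering. As a rough illustration of the general reciprocal rank fusion technique, the JDK-only sketch below scores each document as the sum of 1 / (k + rank) over the input rankings. The class name, the document ids, and the constant k = 60 are assumptions for illustration; the constants and tie-breaking that the service actually uses are not described here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative reciprocal rank fusion (RRF); not the service's implementation. */
class RrfSketch {

    // Assumption: 60 is a smoothing constant commonly used with RRF.
    static final int K = 60;

    /** Sums 1 / (K + rank) per document id across all rankings (rank starts at 1). */
    static Map<String, Double> fuse(List<List<String>> rankings) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                scores.merge(ranking.get(i), 1.0 / (K + i + 1), Double::sum);
            }
        }
        return scores;
    }

    /** Returns ids ordered by fused score, highest first; ties break by id. */
    static List<String> rank(List<List<String>> rankings) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(fuse(rankings).entrySet());
        entries.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : a.getKey().compareTo(b.getKey());
        });
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Double> entry : entries) {
            ids.add(entry.getKey());
        }
        return ids;
    }

    public static void main(String[] args) {
        // Hypothetical ids, each list ordered best-first by one scoring function.
        List<String> byFullText = List.of("doc-a", "doc-b", "doc-c");
        List<String> byVector = List.of("doc-b", "doc-c", "doc-a");
        System.out.println(rank(List.of(byFullText, byVector)));
    }
}
```

With these inputs, doc-b comes first because it ranks well in both lists, even though doc-a tops the full text list.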

Full Text Indexing Policy (PR #42278)

Azure Cosmos DB containers now support full text indexing natively through the indexing policy. This makes it easy to declare which paths should be searchable.

"fullTextPolicy": {
  "defaultLanguage": "en-US",
  "fullTextPaths": [
    { "path": "/abstract", "language": "en-US" }
  ]
}

Quantized Vector Indexing Enhancements for Flat, QuantizedFlat, and DiskANN (PR #42333)

Two new tuning knobs are now supported across vector index types – including Flat, quantizedFlat, and DiskANN:

quantizationByteSize: controls trade-off between recall and latency.
indexingSearchListSize: size of candidate list during index build.

These settings allow deeper control for advanced vector search scenarios.

For more info, see our documentation on Vector index and query vectors in Azure Cosmos DB for Java.

Dynamic Request Options (PR #40061)

This feature allows developers to modify request options at runtime, such as consistency level, diagnostic thresholds, or throughput control settings. It enables dynamic configuration changes without needing to restart the application.

Use-case: Integrate Cosmos DB with your custom configuration service and tune SDK behavior dynamically.

CosmosAsyncClient client = new CosmosClientBuilder()
    .endpoint("https://your-account.documents.azure.com")
    .key("your-key")
    .addOperationPolicy(cosmosOperationDetails -> {
        Properties config = new Properties();
        try (FileInputStream fis = new FileInputStream("app.config")) {
            config.load(fis);
            CosmosRequestOptions options = new CosmosRequestOptions();
            options.setConsistencyLevel(ConsistencyLevel.valueOf(config.getProperty("consistency")));
            cosmosOperationDetails.setRequestOptions(options);
        } catch (IOException e) {
            // Handle exception
        }
    })
    .buildAsyncClient();
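The policy above opens and parses app.config on every operation. One possible refinement, sketched here with only the JDK, is to cache the loaded value and reload it at most once per interval, so the policy callback only reads a cached field. CachedSetting is a hypothetical helper, not part of the SDK, and the 30-second interval is an arbitrary assumption.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical helper: caches a loaded value and reloads it at most once per interval. */
class CachedSetting<T> {
    private final Supplier<T> loader;
    private final long ttlMillis;
    private final LongSupplier clock;
    private T value;
    private long loadedAtMillis;
    private boolean loaded;

    CachedSetting(Supplier<T> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reloading it first if it is older than the TTL. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAtMillis >= ttlMillis) {
            try {
                value = loader.get();
                loaded = true;
            } catch (RuntimeException e) {
                // Keep serving the previous value if a reload fails;
                // fail loudly only when nothing has ever been loaded.
                if (!loaded) {
                    throw e;
                }
            }
            loadedAtMillis = now;
        }
        return value;
    }

    public static void main(String[] args) {
        // Stand-in loader; a real one might parse app.config as in the example above.
        CachedSetting<String> consistency =
            new CachedSetting<>(() -> "SESSION", 30_000L, System::currentTimeMillis);
        System.out.println(consistency.get());
    }
}
```

Inside the operation policy, the callback could then pass consistency.get() to ConsistencyLevel.valueOf(...) instead of reading the file. The injected clock is there only to make the refresh logic easy to test.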

Extract Sub-Range Continuation Tokens (PR #42156)

This utility allows customers to extract individual continuation tokens from a combined change feed token. This is helpful for breaking a query into multiple sub-ranges and processing them in parallel.

List<String> tokens = CosmosChangeFeedContinuationTokenUtils.extractContinuationTokens(continuationToken);
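Once the tokens are extracted, each one can be handed to its own worker. The JDK-only sketch below shows one way to fan the tokens out over a fixed thread pool and collect the results in token order. SubRangeFanOut and the worker function are hypothetical; in a real application the worker would run a change feed query that starts from the given token, and here the tokens are treated as plain strings, following the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical fan-out helper for processing sub-range tokens in parallel. */
class SubRangeFanOut {

    /** Runs the worker once per token on a fixed pool; results follow token order. */
    static <R> List<R> processInParallel(
            List<String> tokens, Function<String, R> worker, int parallelism) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, parallelism));
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (String token : tokens) {
                Callable<R> task = () -> worker.apply(token);
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get()); // blocks until that sub-range is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for sub-range work", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A sub-range worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Placeholder tokens and worker; a real worker would drain one sub-range.
        List<String> tokens = List.of("token-0", "token-1", "token-2");
        List<Integer> processed = processInParallel(tokens, String::length, 2);
        System.out.println(processed.size() + " sub-ranges processed");
    }
}
```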

Complete Change Feed Queries (PR #42160)

The new flag setCompleteAfterAllCurrentChangesRetrieved(true) lets change feed queries finish automatically once all changes available at that point have been read.

Ideal for batch workloads or event sourcing pipelines.

Read Consistency Strategy (Beta) (PR #45161)

Historically, Azure Cosmos DB has offered five consistency levels (Strong, Bounded Staleness, Session, Consistent Prefix, and Eventual). These consistency levels governed both write durability (RPO) and read freshness (data staleness), and while the write consistency level could be set per account, read consistency was limited to reductions only (e.g., using Eventual reads from an account configured with Strong consistency).

This model simplified configuration, but it has sometimes created confusion and rigidity, especially for customers choosing Bounded Staleness just to achieve quorum reads. In multi-region or multi-region write scenarios, Bounded Staleness has been shown to be misleading and problematic, notably interfering with newer high-availability features like Per-Partition Automatic Failover (PPAF). To address this, the SDK now introduces a new abstraction in beta preview: ReadConsistencyStrategy, enabling more flexible and accurate control over read behavior independently from write durability.

Key Benefits:

Customize read consistency per operation or at the client level.
Bypass misleading or limiting configurations like Bounded Staleness.
Improve compatibility with features like PPAF.
Enable developers to safely use Eventual or Session consistency defaults without sacrificing stronger read guarantees.

New ReadConsistencyStrategy Enum

public enum ReadConsistencyStrategy {
    DEFAULT,           // Honors default consistency level settings
    EVENTUAL,          // Eventually consistent read
    SESSION,           // Session consistency per session token
    LATEST_COMMITTED,  // Latest version committed in preferred region
    GLOBAL_STRONG      // Strong consistency across regions
}

Configure Strategy on Client

CosmosClient client = new CosmosClientBuilder()
    .readConsistencyStrategy(ReadConsistencyStrategy.LATEST_COMMITTED)
    .buildClient();

This overrides any consistency level set at the account or client level unless the strategy is explicitly overridden in request options.

Override Strategy Per Request

CosmosItemRequestOptions options = new CosmosItemRequestOptions()
    .setReadConsistencyStrategy(ReadConsistencyStrategy.GLOBAL_STRONG);

Together, the two snippets above set a default strategy on the client and override it for a single request.

This API allows granular control – e.g., a session-consistent read in an eventual-consistency environment.
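The precedence described above (a request-level strategy wins over the client-level one, which in turn wins over the configured consistency level) can be modeled in a few lines. This is a sketch of the documented ordering only, not the SDK's code: the Strategy enum is a local mirror of the values shown earlier, and null is an assumption used here to mean "not set". How the SDK treats an explicit DEFAULT at the request level is not covered.

```java
/** Illustrative model of the strategy precedence; not the SDK's implementation. */
class ReadStrategyPrecedence {

    /** Local mirror of the enum values shown above, so the sketch compiles without the SDK. */
    enum Strategy { DEFAULT, EVENTUAL, SESSION, LATEST_COMMITTED, GLOBAL_STRONG }

    /**
     * Picks the strategy for one read.
     * Assumption: null means the strategy was not set at that level.
     */
    static Strategy effective(Strategy requestLevel, Strategy clientLevel) {
        if (requestLevel != null) {
            return requestLevel; // per-request override wins
        }
        if (clientLevel != null) {
            return clientLevel;  // otherwise the client-wide strategy applies
        }
        return Strategy.DEFAULT; // otherwise honor the configured consistency level
    }

    public static void main(String[] args) {
        // Client default LATEST_COMMITTED, one request asking for GLOBAL_STRONG.
        System.out.println(effective(Strategy.GLOBAL_STRONG, Strategy.LATEST_COMMITTED));
        // Same client, request without an override.
        System.out.println(effective(null, Strategy.LATEST_COMMITTED));
    }
}
```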

🔁 If you override a request to use SESSION consistency while the client is not configured for it, be sure to enable session token capture explicitly via sessionCapturingOverrideEnabled(true).

Use cases unlocked by this change include:

Ensuring quorum reads in multi-region deployments without needing Bounded Staleness.
Allowing globally strong reads without globally strong writes.
Facilitating hybrid strategies for read-heavy workloads.

This marks a major evolution in Azure Cosmos DB’s consistency model, allowing developers to fine-tune trade-offs between performance, freshness, and availability on a per-operation basis.

⚠️ Note: This feature is in beta (preview) and is currently supported only in direct mode.

Per-Partition Automatic Failover (PR #44099)

Improves availability for single-write, multi-region accounts by automatically failing over reads and writes at the partition level when a region is unavailable.

Ideal for mission-critical apps that require high resilience and lower impact during regional outages.

Explore our full blog for more info on this game-changing feature for balancing high availability and consistency: Announcement blog

Spring Data for Azure Cosmos DB

Improved Exception Handling (PR #42902)

The Spring Data Cosmos module now throws more specific exceptions like CosmosBadRequestException and CosmosUnauthorizedException, instead of a generic CosmosAccessException.

This enables cleaner and more precise exception handling logic.

Improved `findAllByIds()` Performance with `readMany()` (PR #43759)

If the partition key matches the document ID, Spring Data will now automatically optimize findAllByIds() using the readMany() API for better performance.

Apache Spark Connector

CosmosClientBuilderInterceptor (PR #40714)

Spark developers can now inject logic into the CosmosClient creation process to attach custom monitoring or diagnostics.

spark.conf.set("spark.cosmos.account.clientBuilderInterceptors", "com.example.MyInterceptor")

Support for Non-Public Azure Clouds (PR #45310)

Run Spark workloads in government, China, or private Azure environments by configuring custom Entra ID and ARM endpoints.

spark.conf.set("spark.cosmos.account.azureEnvironment", "Custom")
spark.conf.set("spark.cosmos.account.azureEnvironment.management", "https://mygovcloud.management")
spark.conf.set("spark.cosmos.account.azureEnvironment.aad", "https://mygovcloud.aad")

UDFs for Partition Mapping (PR #43092)

Added GetFeedRangesForContainer and GetOverlappingFeedRange UDFs to make it easier to partition Databricks tables based on Cosmos DB feed ranges.

Improves performance and parallelism for distributed joins.

Continuation Token Size Config (PR #44480)

Limit continuation token size during queries to avoid token size overflows and client errors:

spark.conf.set("spark.cosmos.read.responseContinuationTokenLimitInKb", "16")

🎙️ Kafka Connector

Version 2 Now GA!

Fixes, patches, and enhancements

In addition to all of the above features, we have of course made a large number of smaller bug fixes, security patches, enhancements, and improvements. You can track all the changes for each client library, along with the minimum version we recommend you use, by viewing the change logs:

Java SDK change log
Spring Data Client Library change log
OLTP Spark Connector change log
Kafka Connectors change log

Get Started with Java in Azure Cosmos DB

About Azure Cosmos DB

Azure Cosmos DB is a fully managed and serverless distributed database for modern app development, with SLA-backed speed and availability, automatic and instant scalability, and support for open-source PostgreSQL, MongoDB, and Apache Cassandra. To stay in the loop on Azure Cosmos DB updates, follow us on X, YouTube, and LinkedIn.

To easily build your first database, watch our Get Started videos on YouTube and explore ways to dev/test free.
