Improve application performance with Java SDK v4 for Azure Cosmos DB

Ravi

The Java SDK v4 for Azure Cosmos DB includes many improvements and new APIs that can increase the performance of your applications. In this blog post, I will highlight a few of them that reduce the number of network round trips, and therefore latency, when using direct mode in the Java v4 SDK.

Here is what the SDK does behind the scenes for a query:

Image: How the SDK handles queries

 

In direct mode, every time a query is executed via the SDK, the first step is a network call to retrieve the query plan before the query can be executed on the replicas. The query plan, which the SDK uses internally, contains the information required to execute the query on the replicas. Retrieving it adds overhead to every query. With the Java SDK, you now have options to either minimize the query plan calls or eliminate them. Let us look at the options.

 

Query plan caching

For a query scoped to a single partition, the query plan is cached on the client. This eliminates the call to the gateway to retrieve the query plan after the first call. The key for the cached query plan is the SQL query string, so you need to make sure the query is parameterized. If it is not, the query plan cache lookup will often be a cache miss, as the query string is unlikely to be identical across calls. Query plan caching is enabled by default in version 4.20.0 and above. Let us look at a couple of query strings to better understand the concept.

SELECT * FROM o WHERE o.category = @category and o.author= @author

Once cached, the query plan look-up for the above query string will be a cache hit irrespective of the actual parameter (@category and @author) values.

SELECT * FROM o WHERE o.category = "Databases" and o.author = "Ravi"

Once cached, the query plan look-up for the above query string will be a cache hit only for a query string with exactly the same literal values, “Databases” and “Ravi”, because the query is not parameterized.

“category” is the partition key for both queries. As I mentioned previously, query plan caching only works for queries scoped to a single partition. The partition key value must be set on the “CosmosQueryRequestOptions” object for query plan caching to work.
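To make the role of the cache key concrete, here is a small toy model in plain Java. It is only a sketch of the idea described above, not the SDK's implementation: the class QueryPlanCacheSketch, its methods, and the string used as a "plan" are all made up for illustration, and the gateway call is simulated with a counter.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a client-side query plan cache. Hypothetical; not the SDK's code. */
class QueryPlanCacheSketch {

    // The cache is keyed by the exact SQL text, as the post describes.
    private final Map<String, String> planByQueryText = new HashMap<>();
    private int gatewayCalls = 0;

    /** Returns a plan for the query text, "calling the gateway" only on a cache miss. */
    String getPlan(String queryText) {
        return planByQueryText.computeIfAbsent(queryText, text -> {
            gatewayCalls++; // stands in for the network round trip to the gateway
            return "plan-for:" + text; // placeholder; a real plan is an internal structure
        });
    }

    int gatewayCalls() {
        return gatewayCalls;
    }

    /** Runs the given query texts through a fresh cache and counts simulated gateway calls. */
    static int gatewayCallsFor(String... queryTexts) {
        QueryPlanCacheSketch cache = new QueryPlanCacheSketch();
        for (String text : queryTexts) {
            cache.getPlan(text);
        }
        return cache.gatewayCalls();
    }

    public static void main(String[] args) {
        // Parameter values travel separately, so the text is identical on every call.
        String parameterized =
            "SELECT * FROM o WHERE o.category = @category and o.author = @author";
        System.out.println("parameterized: "
            + gatewayCallsFor(parameterized, parameterized, parameterized)); // prints 1

        // Literal values are part of the text, so each distinct value is a new key.
        String prefix = "SELECT * FROM o WHERE o.category = \"Databases\" and o.author = ";
        System.out.println("literals: "
            + gatewayCallsFor(prefix + "\"Ravi\"", prefix + "\"Asha\"", prefix + "\"Lee\"")); // prints 3
    }
}
```

In this model, three executions of the parameterized text cost one simulated gateway call, while three queries that differ only in an embedded literal cost three.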

Every time there is a cache hit for a query plan you will see the following message in the logs.

“Skipping query plan round trip by using the cached plan”

Query plan caching currently works for queries with filters. This feature can also be leveraged from the Spring connector when using method-derived queries. Currently, queries defined with the “@Query” Spring Data annotation do not take advantage of query plan caching.

 

CosmosAsyncContainer#readAllItems 

This API allows you to retrieve all documents with the same partition key value. Behind the scenes, the SDK generates and executes a query on the replicas, but no call is made to the gateway for the query plan. This API is available in SDK version 4.15.0 and above.

// The second argument is a CosmosQueryRequestOptions instance.
container.readAllItems(new PartitionKey(category), queryRequestOptions, Book.class);

 

CosmosAsyncContainer#readMany

The readMany API does something similar for retrieving multiple documents by ID and partition key. Simply put, instead of calling the “CosmosAsyncContainer#readItem” point read API multiple times, you can use this API to retrieve all the documents in parallel. Once again, the SDK generates and executes the query on the replicas behind the scenes without making a call to the gateway for the query plan. This API is available in SDK version 4.5.0 and above.

// Each identity pairs a partition key value with a document ID.
CosmosItemIdentity itemIdentity1 = new CosmosItemIdentity(new PartitionKey("Programming Languages"), "4");
CosmosItemIdentity itemIdentity2 = new CosmosItemIdentity(new PartitionKey("Databases"), "6");
// Arrays.asList (java.util.Arrays) avoids the extra Guava dependency.
List<CosmosItemIdentity> itemIdentities = Arrays.asList(itemIdentity1, itemIdentity2);
container.readMany(itemIdentities, Book.class);

As always, we highly recommend upgrading to the latest version of the SDK for best results.

Next Steps:

Do you have feedback about the Java SDK for Azure Cosmos DB? Share feedback directly with the Azure Cosmos DB engineering team via GitHub issues at azure-sdk-for-java.
