Using Azure Cosmos DB with Integrated Cache

Laura Lund (she/her)


Have you ever had a use case in which an application receives a high volume of repeated requests against a rarely-changing data set? Caching is the obvious choice to prevent your database from being overwhelmed by too many requests at once, but adding an external cache such as Azure Cache for Redis means creating and maintaining an additional dependency. What if you could benefit from caching without having to directly manage your cache?

My team recently had the opportunity to build a project using Azure Cosmos DB with integrated cache. We worked on a customer engagement for a use case that involved heavy reads, low writes, and repeated queries. For our backing datastore we used Azure Cosmos DB, a horizontally-scaled document database. We decided to try out the integrated cache feature because it looked like it might be a good fit for our customer’s use case. We spoke with members of the Azure Cosmos DB product team to gain a better understanding of how the integrated cache operates, and we applied those learnings to our project.

What is Azure Cosmos DB with integrated cache?

The Azure Cosmos DB integrated cache is essentially a turn-key cache that is managed for you. While Azure Cosmos DB does support multiple APIs, the integrated cache is currently only available for the NoSQL API. With the integrated cache you don’t need to set up any additional caching resources or dependencies and your code can simply interact with Cosmos DB without having to explicitly manage the cache.

When you create your Azure Cosmos DB account and enable the dedicated gateway, the integrated cache is configured automatically. As part of the process to provision the dedicated gateway, you select the number of gateway nodes you’d like to create. One node is sufficient for development purposes, but for use in production it is recommended to have a minimum of three. The cost for the dedicated gateway is calculated hourly and is determined by the number of nodes and the region selected. Since queries that hit the integrated cache incur no request unit (RU) charges, the overall cost of the database usage will be primarily driven by this hourly charge.

When is Azure Cosmos DB with Integrated Cache a good fit?

Because the integrated cache is managed for you, it does not give you explicit controls over its operations the way a stand-alone cache would. Therefore it is best suited for applications that:

  • Are read-heavy
  • Have data that changes infrequently
  • Need low latency
  • Have repeated queries that benefit from caching
  • Tolerate eventual or session consistency
  • Can operate without the need for an explicit cache refresh

Let’s take a global e-commerce platform as an example. Such applications are predominantly read-heavy. Users frequently view product listings, but the underlying product information doesn’t change very often. Because these types of queries are recurrent, the low latency offered by Azure Cosmos DB with Integrated Cache can considerably enhance the user experience. The eventual or session consistency models work well in this scenario because users don’t need immediate updates on slight changes such as new product reviews. Moreover, as these applications don’t need explicit control over cache refreshing, the automatic background refreshes provided by Azure Cosmos DB with Integrated Cache can maintain a high-quality user experience even in the face of occasional minor data discrepancies.

Architecture of Azure Cosmos DB with Integrated Cache

When an application sends a request to Azure Cosmos DB using the dedicated gateway, this request first reaches a dedicated gateway node. This node checks its cache for the requested data. If the data isn’t present in the cache (a “cache miss”), the gateway node communicates with the Azure Cosmos DB database to fetch the necessary data. After fetching the data from the database, the gateway node stores a copy of it in its own cache for future requests and then finally sends the data back to the original caller. Each cache miss requires data retrieval from the database node, which results in request unit charges.

Each dedicated gateway node maintains its own cache. This means you can potentially get back different data depending on which node the request is routed through (see the docs and the animation below). The backing data in the database is replicated as usual across all the backend nodes, but caches are not replicated across dedicated gateway nodes. Calls are routed to dedicated gateway nodes randomly, so it is possible to have a cache miss even if the data is cached on another dedicated gateway node.
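
To make the per-node behavior concrete, here is a minimal sketch in plain C# (no SDK calls). It is a toy model and not how the service is implemented: the names nodeCaches and Send are hypothetical, and the node index is passed in explicitly so the outcome is reproducible, whereas real routing is random.

```csharp
using System;
using System.Collections.Generic;

// Toy model, not the real service: each dedicated gateway node owns a private cache.
var nodeCaches = new List<HashSet<string>> { new(), new(), new() };
var databaseReads = 0;

// Sends a request for `key` to one node. Returns true on a cache hit.
bool Send(int nodeIndex, string key)
{
    var cache = nodeCaches[nodeIndex];
    if (cache.Contains(key))
    {
        return true; // served from this node's cache, no database read
    }

    databaseReads++; // cache miss: fetch from the backend (this is what costs RUs)
    cache.Add(key);  // only this node stores the result
    return false;
}

var firstOnNode0 = Send(0, "product-42");  // miss: nothing is cached yet
var secondOnNode0 = Send(0, "product-42"); // hit: node 0 cached it
var firstOnNode1 = Send(1, "product-42");  // miss: node 1 has its own empty cache

Console.WriteLine($"{firstOnNode0} {secondOnNode0} {firstOnNode1} databaseReads={databaseReads}");
// prints "False True False databaseReads=2"
```

The third request misses even though node 0 already holds the data, which is why adding nodes can lower the hit rate for a given key until each node has warmed up.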

Azure Cosmos DB integrated cache has an item cache and a query cache:

  • The item cache services point reads where the key is a composite of the item id and partition key and the value is the data in the associated document. The item cache works like a write-through cache, which means that the data in the cache is updated at the time it is written.
  • The query cache services queries. For example, a query with a WHERE clause will be cached in the query cache even if it returns a single value. The key for an item in the query cache is based on the query text and the value is the result of that query. Even if a particular document has been cached previously as part of the result for another query, it will be cached again if a given query varies from the previous one. For example, a request that returns documents [1..40] will be cached as one query result and a request that returns documents [2..41] will be cached as a distinct query result. Changes to the backing data do not affect cached values in the query cache.
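
Because the query cache is keyed on query text, two requests for the same set of ids only share a cache entry if they produce the same text. The sketch below illustrates that idea in plain C#. BuildQueryText is a hypothetical helper, and the SQL it produces is a simplified stand-in for whatever text the SDK actually sends.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: builds simplified query text for a set of ids.
static string BuildQueryText(IEnumerable<string> refIds, bool sort)
{
    IEnumerable<string> ids = sort ? refIds.OrderBy(id => id, StringComparer.Ordinal) : refIds;
    var quoted = ids.Select(id => $"\"{id}\"");
    return "SELECT * FROM c WHERE c.refId IN (" + string.Join(", ", quoted) + ")";
}

// The same three ids, requested in a different order.
var requestA = new[] { "b", "a", "c" };
var requestB = new[] { "c", "b", "a" };

var unsortedShareKey = BuildQueryText(requestA, sort: false) == BuildQueryText(requestB, sort: false);
var sortedShareKey = BuildQueryText(requestA, sort: true) == BuildQueryText(requestB, sort: true);

Console.WriteLine($"unsorted share a key: {unsortedShareKey}, sorted share a key: {sortedShareKey}");
// prints "unsorted share a key: False, sorted share a key: True"
```

Normalizing inputs (for example, sorting id lists) before building a query is a cheap way to raise the hit rate.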

Cache Refresh

Azure Cosmos DB with integrated cache handles cache refresh differently from other caching technologies. At this time there is no explicit cache invalidation, but that may be included in a future release. If you need to invalidate all cached data on all dedicated gateway nodes, a workaround is to deprovision and re-provision the dedicated gateway.

As a cache on a given dedicated gateway node fills up with data, it evicts the least-recently accessed values. The item cache is automatically “refreshed” when an item is newly created, updated, or deleted. That is not true for values in the query cache.
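
For readers unfamiliar with least-recently-used eviction, here is a small generic sketch in plain C#. It only illustrates the policy and makes no claim about how the dedicated gateway implements it. The capacity of 2 and the names recency, entries, Touch, Get, and Put are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

const int capacity = 2;                  // hypothetical size, chosen only for illustration
var recency = new LinkedList<string>();  // most recently accessed key first
var entries = new Dictionary<string, string>();

// Marks a key as the most recently accessed one.
void Touch(string key)
{
    recency.Remove(key);
    recency.AddFirst(key);
}

// Reads a key; a hit also refreshes its recency.
bool Get(string key)
{
    if (!entries.ContainsKey(key)) return false;
    Touch(key);
    return true;
}

// Stores a value, evicting the least recently accessed key when the cache is full.
void Put(string key, string value)
{
    if (!entries.ContainsKey(key) && entries.Count == capacity)
    {
        var leastRecent = recency.Last.Value;
        recency.RemoveLast();
        entries.Remove(leastRecent);
    }
    entries[key] = value;
    Touch(key);
}

Put("query-A", "result A");
Put("query-B", "result B");
Get("query-A");              // query-A is now more recent than query-B
Put("query-C", "result C");  // cache is full, so query-B is evicted

Console.WriteLine(string.Join(", ", recency)); // prints "query-C, query-A"
```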

The primary mechanism to trigger a query cache refresh is the MaxIntegratedCacheStaleness value that is passed in when an application requests data from Cosmos DB. The calling application essentially tells the dedicated gateway what its tolerance is for stale data. That tolerance may range from mere minutes to several days depending on the application’s use cases for the data. When the amount of time that has passed since a query result was cached exceeds the MaxIntegratedCacheStaleness value, the query is run against the database again and the updated result is stored in the cache on that dedicated gateway node. If an application queries Cosmos DB with a MaxIntegratedCacheStaleness of 0, it will always get back data from the database and the cache on that dedicated gateway node will be refreshed to store the new query results. The caches on any other dedicated gateway nodes will not be refreshed until a request that “expires” the cached values is routed to that node or the values are evicted according to the Least Recently Used (LRU) eviction policy.
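
The staleness rule can be sketched as a small decision function. This is a simplified model in plain C#, not the gateway's actual code: Decide and its return labels are hypothetical, and the strict comparison is an assumption chosen so that a staleness of zero always goes to the database, as described above.

```csharp
using System;

// Simplified model of how one gateway node handles a request.
// cachedAt is null when the node has no cached entry for the request.
static string Decide(DateTimeOffset? cachedAt, DateTimeOffset now, TimeSpan maxStaleness)
{
    if (cachedAt is null)
    {
        return "miss";  // read from the database, then cache the result
    }

    var age = now - cachedAt.Value;
    return age < maxStaleness
        ? "hit"         // cached value is fresh enough for this caller
        : "stale";      // read from the database and refresh this node's cache
}

var currentTime = DateTimeOffset.UtcNow;
var oneHour = TimeSpan.FromHours(1);

Console.WriteLine(Decide(null, currentTime, oneHour));                        // prints "miss"
Console.WriteLine(Decide(currentTime.AddMinutes(-10), currentTime, oneHour)); // prints "hit"
Console.WriteLine(Decide(currentTime.AddHours(-2), currentTime, oneHour));    // prints "stale"
Console.WriteLine(Decide(currentTime, currentTime, TimeSpan.Zero));           // prints "stale"
```

Note that the tolerance belongs to the request, not to the cached entry: two callers with different MaxIntegratedCacheStaleness values can get different outcomes for the same cached result.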

Be aware that queries that return no results are cached. This matters in an edge case where an application queries data that does not yet exist, the data is then inserted into the database, and the same query is run again: until the cached empty result expires or is evicted, the query can keep returning no results. We encountered this case during testing for our project. As a result we decided to skip the cache for certain types of queries in which the backing data changes more frequently. To always hit the database, you can connect to Cosmos DB via direct mode even with an account that has a dedicated gateway.
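
As a sketch of that last point, a second client can be created in direct mode for the queries that should skip the cache. This assumes the Microsoft.Azure.Cosmos .NET SDK and the account's standard connection string. The variable names are placeholders, and the angle-bracket values must be replaced before the code will run.

```csharp
using Microsoft.Azure.Cosmos;

// Assumption: this is the account's standard connection string, not the dedicated gateway one.
var standardConnectionString = "AccountEndpoint=https://<cosmos-db-account-name>.documents.azure.com:443/;AccountKey=<cosmos-db-account-key>;";

// Direct mode connects to the backend replicas, so these requests
// never pass through the dedicated gateway or its integrated cache.
var directCosmosClient = new CosmosClient(standardConnectionString,
    new CosmosClientOptions
    {
        ConnectionMode = ConnectionMode.Direct
    });
```

Requests sent through this client are always charged request units, so it is best reserved for the queries that genuinely need fresh data.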

Azure Cosmos DB with Integrated Cache Request Routing and Cache Refresh

Below is an animation of what happens when a request is sent to the dedicated gateway endpoint.

  • If the data requested is not in the cache, the request is routed to a backend database node. The data is retrieved and then cached on the dedicated gateway node prior to being returned to the caller.
  • If the data is in the cache and the age of the cached data is within the MaxIntegratedCacheStaleness value sent with the request, no call is sent to a backend database node. The cached data is returned.
  • If the data is in the cache but the age of the cached data exceeds the MaxIntegratedCacheStaleness value sent with the request, the data is retrieved from a backend database node, cached on the dedicated gateway node, and then returned.

Animation Showing Azure Cosmos DB Request Routing

Code Examples

You can download and walk through the code for my working sandbox project that uses Azure Cosmos DB with integrated cache. Start with the README to understand how to set up for local development.

Configure CosmosClient to Use Dedicated Gateway

To configure a CosmosClient to use your dedicated gateway, you will need to supply your dedicated gateway connection string as well as CosmosClientOptions that specify the ConnectionMode and ConsistencyLevel for the client to use.

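    // The dedicated gateway endpoint uses the sqlx.cosmos.azure.com host rather than documents.azure.com.
    var dedicatedConnectionString = "AccountEndpoint=https://<cosmos-db-account-name>.sqlx.cosmos.azure.com/;AccountKey=<cosmos-db-account-key>;";
    var dedicatedGatewayCosmosClient = new CosmosClient(dedicatedConnectionString,
        new CosmosClientOptions
        {
            ConnectionMode = ConnectionMode.Gateway, // the integrated cache is only reachable in gateway mode
            ConsistencyLevel = ConsistencyLevel.Eventual
        });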

Hit the Integrated Query Cache

To build a query that hits the query cache, use the GetItemLinqQueryable method on your Container object. In your QueryRequestOptions add the value you wish to use for MaxIntegratedCacheStaleness and set your ConsistencyLevel.

    // ToFeedIterator() is an extension method from the Microsoft.Azure.Cosmos.Linq namespace.
    var sortedRefIds = refIds.OrderBy(id => id).ToList(); // Always sending a request in a specific order will maximize your chances of a cache hit

    IQueryable<TestItem> queryable = containerWithDedicatedConnection.GetItemLinqQueryable<TestItem>(
        requestOptions: new QueryRequestOptions
        {
            DedicatedGatewayRequestOptions = new DedicatedGatewayRequestOptions
            {
                MaxIntegratedCacheStaleness = TimeSpan.FromHours(cacheStaleness)
            },
            ConsistencyLevel = ConsistencyLevel.Eventual
        });

    queryable = queryable.Where(e => sortedRefIds.Contains(e.RefId));

    using (var linqFeed = queryable.ToFeedIterator())
    {
        while (linqFeed.HasMoreResults)
        {
            var feedResponse = await linqFeed.ReadNextAsync();

            // Accumulate across pages. A total of 0 means every page was served from the cache.
            result.RequestUnits += feedResponse.RequestCharge;

            foreach (var item in feedResponse)
            {
                result.TestItems.Add(item); // assumed loop body: collect each document, mirroring the item cache example
            }
        }
    }

Hit the Item Cache

To read from the item cache, send a ReadItemAsync request along with ItemRequestOptions that point the request to use the dedicated gateway.

    ItemResponse<TestItem> response = await containerWithDedicatedConnection.ReadItemAsync<TestItem>(
        id,
        new PartitionKey(partitionKey),
        new ItemRequestOptions
        {
            DedicatedGatewayRequestOptions = new DedicatedGatewayRequestOptions
            {
                MaxIntegratedCacheStaleness = TimeSpan.FromHours(cacheStaleness)
            },
            ConsistencyLevel = ConsistencyLevel.Eventual
        });

    var testItemQueryResult = new TestItemQueryResult();

    testItemQueryResult.TestItems.Add(response.Resource); // the document
    testItemQueryResult.RequestUnits = response.RequestCharge; // the request unit charge

    return testItemQueryResult;

The document will be returned in the Resource property on the ItemResponse object and the number of Request Units consumed will be returned on the RequestCharge property.


The simplification of application architecture is what we liked the most about Azure Cosmos DB with integrated cache. It’s convenient to be able to simply write Cosmos DB queries and know that cache hits and misses are handled automatically. For usage patterns involving a high volume of repeated requests, Azure Cosmos DB with integrated cache can serve thousands of requests per second without a correspondingly high Request Unit charge. Because the dedicated gateway is billed at a predictable hourly rate and cache hits incur no Request Unit charges, overall Cosmos DB charges can be lower than for an Azure Cosmos DB account without the integrated cache. We look forward to additional features the Azure Cosmos DB product team may develop to make this an even more robust product.


I want to give thanks to crew members Bindu Chinnasamy, John Hauppa, Cameron Taylor, and Kanishk Tantia. I also want to thank Justine Cocchi and Madhav Annamraju who provided information and resources to help us understand this feature better.

This article is also shared on Medium.
