Improve your Azure Cosmos DB .NET SDK initialization
When you use the Azure Cosmos DB .NET SDK to run operations and workloads against your databases, you might have noticed that the first operations often have higher latency than the rest. In this post we’ll learn why that is and how you can avoid it with a better client initialization.
What happens on your first request
When you perform an operation, for example, a ReadItemAsync call, it needs to be correctly routed to the backend replicas (more information about replicas is available in the official documentation). To achieve that, the SDK needs to:
- Read your account’s information: This includes the account topology, available regions, consistency configuration, etc.
- Read your container’s information: This informs the SDK of the partition key path configured in the container.
- Read the partition layout information: This allows the SDK to route the operation to the partition or partitions it needs to go to.
These are performed as HTTP requests to the routing Gateway, so if you are tracking HTTP dependencies with logging frameworks like Application Insights, you might have spotted them. They are required to complete the operation and are performed inline with its execution.
Once this information is available, then the SDK proceeds to:
- Open a TCP connection to a replica of the target partition. Depending on the consistency mode, one replica could be enough, or for stronger consistencies, more replicas might be required.
- Execute the desired operation and obtain the response.
If you execute another operation for the same container, the first three pieces of information are already available. They are maintained in internal caches inside the SDK, so this second operation does not pay the latency cost of those HTTP requests.
If the operation targets the same partition as the previous one, it will reuse the TCP connections already established; otherwise, the SDK will open new connections to the new partition.
Now, you can see why the first operations that you ever do after you create a fresh SDK client instance will have a higher latency.
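One way to observe this difference is to time two consecutive reads on a freshly created client. The following is a minimal sketch rather than a benchmark: the FirstRequestLatency class, the database and container names, and the item id and partition key value are all placeholders, and it assumes the item exists (otherwise ReadItemAsync throws a CosmosException).

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class FirstRequestLatency
{
    public static async Task MeasureAsync(string connectionString)
    {
        // A fresh client starts with empty caches and no open TCP connections.
        using CosmosClient client = new CosmosClient(connectionString);

        // Placeholder names: replace with a database and container that exist in your account.
        Container container = client.GetContainer("my-database", "my-container");

        for (int attempt = 1; attempt <= 2; attempt++)
        {
            // Placeholder id and partition key value of an existing item.
            ItemResponse<dynamic> response = await container.ReadItemAsync<dynamic>(
                "my-item-id",
                new PartitionKey("my-partition-key-value"));

            // Client-side elapsed time as recorded by the SDK diagnostics.
            TimeSpan elapsed = response.Diagnostics.GetClientElapsedTime();
            Console.WriteLine($"Read #{attempt}: {elapsed.TotalMilliseconds} ms");
        }
    }
}
```

The first read includes the metadata HTTP requests and the connection setup described above, so it would typically report a noticeably higher elapsed time than the second.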
Where to initialize in your application lifecycle
It is really important to maintain a singleton instance of the client during the lifetime of your application:
- A singleton instance will share these initialized caches among all the operations.
- A singleton instance will share the already established TCP connections among operations.
The common pattern when building an application is to place the client initialization code in your application’s initialization event, before the application signals it’s ready to receive requests. Depending on your architecture and technology stack, this could mean:
- The Application_Start event on ASP.NET applications running on IIS.
- The ConfigureServices step in a .NET Core application where you register services as Singleton.
- The StartAsync event in an IHostedService in .NET Core applications if you are using custom extensions.
- The Configure(IFunctionsHostBuilder builder) step in Azure Functions applications.
In these steps we can create the .NET SDK CosmosClient instance and register it as a static/Singleton instance to be reused. Up until now, this meant just calling the CosmosClient constructor and accepting the known latency spikes on our initial requests.
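As an illustration of that constructor-based pattern, here is a minimal sketch of a singleton registration in the ConfigureServices step of an ASP.NET Core application. The configuration key "CosmosDb:ConnectionString" is a hypothetical name chosen for this example; use whatever key your application already stores the connection string under.

```csharp
using Microsoft.Azure.Cosmos;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public Startup(IConfiguration configuration) => Configuration = configuration;

    public IConfiguration Configuration { get; }

    public void ConfigureServices(IServiceCollection services)
    {
        // One CosmosClient for the whole application lifetime, so the caches
        // and TCP connections are shared by every operation.
        // "CosmosDb:ConnectionString" is a hypothetical configuration key.
        services.AddSingleton<CosmosClient>(_ =>
            new CosmosClient(Configuration["CosmosDb:ConnectionString"]));
    }
}
```

With this registration the client is created lazily the first time it is resolved, and its caches are still empty at that point, so the first operations pay the initialization cost described earlier.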
Now that we know why these first operations experience higher latency, and where to initialize the client, we can explore the new APIs that create the client and reduce this latency overhead at the same time.
CosmosClient.CreateAndInitializeAsync is an API that takes:
- The account endpoint and credentials, in the form of a connection string or endpoint and key.
- A list of database and container pairs identifying the containers your application will commonly interact with during its lifetime. This can be all of the containers in your Cosmos DB account or a subset of them.
- Optionally, configuration options in the form of CosmosClientOptions.
It does three things:
- Creates a CosmosClient instance using the authentication information provided.
- Initializes the required caches for the desired containers, covering the first three steps we previously mentioned.
- Returns the initialized CosmosClient instance for your usage.
If this is done as part of your application initialization, the first operations that arrive once the application is running do not pay the latency cost of those HTTP requests, because the required caches are already populated.
To leverage these APIs during your application initialization, you can do:
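A minimal sketch of that call, using the connection string overload. The CosmosClientFactory helper and the database and container names are placeholders for this example; the containers listed must be ones your application actually uses.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class CosmosClientFactory
{
    public static async Task<CosmosClient> CreateInitializedClientAsync(string connectionString)
    {
        // (database, container) pairs the application will commonly use.
        // Placeholder names: replace them with your own.
        List<(string, string)> containers = new List<(string, string)>
        {
            ("DatabaseName1", "ContainerName1"),
            ("DatabaseName2", "ContainerName2")
        };

        // Creates the client and warms up its caches for the listed containers
        // before returning it.
        return await CosmosClient.CreateAndInitializeAsync(connectionString, containers);
    }
}
```

The returned instance can then be registered as the application's singleton, for example by awaiting this helper during startup and passing the result to the service registration.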
The only caveat is that the containers you pass must already exist; otherwise, the initialization will fail with a 404 error.
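If you want that failure to be easy to diagnose at startup, one option is to catch it and log which containers were requested. This sketch assumes the 404 surfaces as a CosmosException with a NotFound status code, which is how the SDK reports other 404 responses; SafeInitialization is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class SafeInitialization
{
    public static async Task<CosmosClient> CreateAsync(
        string connectionString,
        IReadOnlyList<(string databaseId, string containerId)> containers)
    {
        try
        {
            return await CosmosClient.CreateAndInitializeAsync(connectionString, containers);
        }
        // Assumption: a missing database or container is reported as a
        // CosmosException whose StatusCode is 404 (NotFound).
        catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.NotFound)
        {
            foreach ((string databaseId, string containerId) in containers)
            {
                Console.Error.WriteLine($"Requested container: {databaseId}/{containerId}");
            }

            Console.Error.WriteLine($"Initialization failed with 404: {ex.Message}");

            // Rethrow so the application does not start with a missing dependency.
            throw;
        }
    }
}
```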
Follow up tips
Remember to check these other improvements for the .NET SDK:
- Optimizing bandwidth in the Azure Cosmos DB .NET SDK
- HttpClientFactory in the Azure Cosmos DB .NET SDK
– Can you tell us how to do this for Azure Functions? I’m not sure where we initialize events there.
– Can you also talk about the ApplicationName property? How and where is this used?
Thanks for your comment, I added the point on Functions where you can initialize. We also have a basic example at https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos.Samples/Usage/AzureFunctions/Startup.cs. Functions has an initialization mechanism similar to that of .NET Core applications.
ApplicationName is unrelated to this post but a good practice. Identifying each application that interacts with your account with ApplicationName stamps all requests’ UserAgent with that value, so they are easy to spot if you are using Azure Monitor or Diagnostics Logs.