May 21st, 2024

General Availability: Index Metrics in Azure Cosmos DB for NoSQL

James Codella
Principal Product Manager

You can now use the Azure Cosmos DB index metrics to optimize query performance with Index Advisor! Indexing metrics show two important pieces of information:

  • Utilized indexes – Which paths in the indexing policy were used in your query and utilized by the Azure Cosmos DB query engine
  • Recommended indexes – Which indexes can improve your query speed and reduce RU cost if added to the indexing policy

Enabling indexing metrics:

You can enable indexing metrics for a query by setting the PopulateIndexMetrics property to true. When not specified, PopulateIndexMetrics defaults to false. The indexing metrics are supported in the .NET SDK (version 3.21.0 or later) and Java SDK (version 4.19.0 or later), and the Python SDK (version 4.6.0 or later)

Index impact score:

The index impact score is the likelihood that an indexed path, based on the query shape, has a significant impact on query performance. In other words, the index impact score is the probability that, without that specific indexed path, the query RU charge would have been substantially higher.

There are two possible index impact scores: high and low. If you have multiple potential indexed paths, we recommend focusing on indexed paths with a high impact score.

The only criteria used in the index impact score is the query shape. For example, in the below query, the indexed path /age/? would be assigned a high index impact score:

SELECT * 
FROM c 
WHERE c.age > 35

The actual impact depending on the nature of the data, so not all indexes with high index impact scores will improve query performance. In the above example, the impact of an /age/? indexed path would be larger if fewer items matched the c.age > 30 filter expression.

Examples:

To show example indexing metrics, we’ve created an example container with about 9 million items. Here’s an example item:

{
	"id": "ax92n6231",
	"name": "James Codella",
	"address": "1 Cosmos DB Avenue",
	"company": "Microsoft",
	"job": "Product Manager",
	"_ts": 3141592653589
}

The container has the default indexing policy of all paths indexed with a range index. We’ll show the index metrics, both the utilized indexes and potential single indexes, for several different queries.

Query 1:

SELECT * 
FROM c 
WHERE c.name = “George Washington”
Original RU charge: 2.79 RUs

Index metrics:

Index Utilization Information
  Utilized Single Indexes
    Index Spec: /name/?
    Index Impact Score: High
    ---
  Potential Single Indexes
  Utilized Composite Indexes
  Potential Composite Indexes

Explanation:

In this simple example, the indexing metrics show that the query used the existing /name/? indexed path. Given the high index impact score, there’s a high likelihood that this indexed path was critical for query performance. There aren’t any recommendations for further improving query performance.

Query 2:

SELECT AVG(c._ts)
FROM c
WHERE c.job = “Developer”

Original RU charge: 222.35 RUs

Index metrics:

Index Utilization Information
  Utilized Single Indexes
    Index Spec: /job/?
    Index Impact Score: High
    ---
    Index Spec: /_ts/?
    Index Impact Score: High
    ---
  Potential Single Indexes
  Utilized Composite Indexes
  Potential Composite Indexes
    Index Spec: /job ASC, /_ts ASC
    Index Impact Score: High

Explanation:

This query averages the timestamp employees with the job “Developer”. While the query used the existing indexed paths /job/? and /_ts/?, the indexing metrics show a recommended composite index with a high likelihood of impact. In this case, after adding a composite index on (/job ASC, /_ts ASC), the RU charge of the query is reduced from 222.35 RUs to 137.56 RUs.

Query 3:

SELECT TOP 10 *
FROM c WHERE c.job = “Developer”
AND c._ts > 1589840249

Original RU charge: 31.3 RUs

Index metrics:

Index Utilization Information
  Utilized Single Indexes
    Index Spec: /job/?
    Index Impact Score: High
    ---
    Index Spec: /_ts/?
    Index Impact Score: High
    ---
  Potential Single Indexes
  Utilized Composite Indexes
  Potential Composite Indexes
    Index Spec: /job ASC, /_ts ASC
    Index Impact Score: High

Explanation:

This filters to all developers and sorts by timestamp . While the query used the existing indexed paths /job/? and /_ts/?, the indexing metrics show a recommended composite index with a high likelihood of impact. In this case, after adding a composite index on (/job ASC, /_ts ASC), the RU charge of the query is reduced from RUs 31.3 to 23.7 RUs.

Next steps:

Do you have any feedback about the indexing metrics? Share feedback directly with the Azure Cosmos DB engineering team: cosmosdbindexing [@] microsoft.com

About Azure Cosmos DB

Azure Cosmos DB is a fully managed and serverless distributed database for modern app development, with SLA-backed speed and availability, automatic and instant scalability, and support for open-source PostgreSQL, MongoDB and Apache Cassandra. Try Azure Cosmos DB for free here. To stay in the loop on Azure Cosmos DB updates, follow us on XYouTube, and LinkedIn. To quickly build your first database, watch our Get Started videos on YouTube and explore ways to dev/test free.

Author

James Codella
Principal Product Manager

0 comments

Discussion are closed.