Optimize query performance with Azure Cosmos DB indexing metrics

Tim

You can now use the Azure Cosmos DB indexing metrics, announced as general available at Microsoft Ignite, to optimize query performance! The indexing metrics show two important pieces of information:

  • Utilized indexes
  • Recommended indexes for improving query performance

Enabling indexing metrics:

You can enable indexing metrics for a query by setting the PopulateIndexMetrics property to true. When not specified, PopulateIndexMetrics defaults to false. The indexing metrics are supported in the .NET SDK (version 3.21.0 or later) and Java SDK (version 4.19.0 or later).

You should check the indexing metrics after you have already identified an expensive query, through methods such as using diagnostic logs. We don’t recommend proactively logging the indexing metrics because they won’t change between different executions of the same query with the same indexing policy.

Index impact score:

The index impact score is the likelihood that an indexed path, based on the query shape, has a significant impact on query performance. In other words, the index impact score is the probability that, without that specific indexed path, the query RU charge would have been substantially higher.

There are two possible index impact scores: high and low. If you have multiple potential indexed paths, we recommend focusing on indexed paths with a high impact score.

The only criteria used in the index impact score is the query shape. For example, in the below query, the indexed path /age/? would be assigned a high index impact score:

SELECT *
FROM c
WHERE c.age > 30

The actual impact depending on the nature of the data, so not all indexes with high index impact scores will improve query performance. In the above example, the impact of an /age/? indexed path would be larger if fewer items matched the c.age > 30 filter expression.

Examples:

To show example indexing metrics, we’ve created an example container with about 9 million items. Here’s an example item:

{
    "id": "qd471022kj",
    "name": "Mark Brown",
    "address": "123 Cosmos DB Way"
    "company": "Microsoft",
    "job": "Software Developer",
    "_ts": 1589840249
}

The container has the default indexing policy of all paths indexed with a range index. We’ll show the index metrics, both the utilized indexes and potential single indexes, for several different queries.

Query 1:

SELECT *
FROM c
WHERE c.name = “Tim Sander”

Original RU charge: 2.79 RUs

Index metrics:

Index Utilization Information
  Utilized Single Indexes
    Index Spec: /name/?
    Index Impact Score: High
    ---
  Potential Single Indexes
  Utilized Composite Indexes
  Potential Composite Indexes

Explanation:

In this simple example, the indexing metrics show that the query used the existing /name/? indexed path. Given the high index impact score, there’s a high likelihood that this indexed path was critical for query performance. There aren’t any recommendations for further improving query performance.

Query 2:

SELECT AVG(c._ts)
FROM c
WHERE c.job = “Developer”

Original RU charge: 222.35 RUs

Index metrics:

Index Utilization Information
  Utilized Single Indexes
    Index Spec: /job/?
    Index Impact Score: High
    ---
    Index Spec: /_ts/?
    Index Impact Score: High
    ---
  Potential Single Indexes
  Utilized Composite Indexes
  Potential Composite Indexes
    Index Spec: /job ASC, /_ts ASC
    Index Impact Score: High

Explanation:

This query averages the timestamp employees with the job “Developer”. While the query used the existing indexed paths /job/? and /_ts/?, the indexing metrics show a recommended composite index with a high likelihood of impact. In this case, after adding a composite index on (/job ASC, /_ts ASC), the RU charge of the query is reduced from 222.35 RUs to 137.56 RUs.

 

Query 3:

SELECT TOP 10 *
FROM c
WHERE c.job = “Developer”
AND c._ts > 1589840249

Original RU charge: 31.3 RUs

Index metrics:

Index Utilization Information
  Utilized Single Indexes
    Index Spec: /job/?
    Index Impact Score: High
    ---
    Index Spec: /_ts/?
    Index Impact Score: High
    ---
  Potential Single Indexes
  Utilized Composite Indexes
  Potential Composite Indexes
    Index Spec: /job ASC, /_ts ASC
    Index Impact Score: High

Explanation:

This filters to all developers and sorts by timestamp . While the query used the existing indexed paths /job/? and /_ts/?, the indexing metrics show a recommended composite index with a high likelihood of impact. In this case, after adding a composite index on (/job ASC, /_ts ASC), the RU charge of the query is reduced from RUs 31.3 to 23.7 RUs.

Next steps:

Do you have any feedback about the indexing metrics? Share feedback directly with the Azure Cosmos DB engineering team: cosmosdbindexing [@] microsoft.com

1 comment

Leave a comment

  • iain

    I tried emailing the address provided, but it bounced back on me. Apparently the email address is internal to MS only.

    The feedback in the mail was just that I’d love this to be available in the Azure Portal, as that’s where I generally test out my queries.