October 22nd, 2020

Understanding the difference between point reads and queries in Azure Cosmos DB

Tim Sander
Program Manager

This blog is part one of a series of blogs (read part two here) where we’ll demystify commonly confused concepts for developers learning how to query in the Core (SQL) API in Azure Cosmos DB.

First, we’ll take an in-depth look at the differences between point reads and queries.

There are two ways to read data in Azure Cosmos DB: point reads and queries. Most developers know that you can query data using Cosmos DB’s SQL query language, but not everyone realizes that point reads are an even more efficient way to read data. A point read is a key/value lookup on a single item ID and partition key.

Here are the major differences between point reads and queries:

Point read (assumes 1 KB item) Query
Latency Typically less than 10 ms Variable
RU charge 1 RU At least 2.3 RUs, variable
Number of items returned 1 item Unlimited (if results size is too large, results are split across multiple pages)
Include partition key? Required Recommended

 

In many cases, you can make simple changes in your app to rewrite simple queries as point reads. Most read-heavy workloads on Azure Cosmos DB use a combination of both point reads and SQL queries. If you just need to read a single item, point reads are cheaper and faster than queries. Point reads can read the data directly and don’t require the query engine.

Transforming queries to point reads:

Let’s imagine that Contoso Retail, a fictional company, has a large Cosmos orders container that has customers orders data. Each item in the orders container represents a separate customer’s order and you can identify an order by its id value. The partition key of the orders container is customerId.

Here’s an example item in the orders container:

{
           "id" : "order0103",
           "customerId": "TimS1234"
           "order details" : ….
}

Contoso Retail’s app runs the following query on the orders container very frequently:

SELECT * 
FROM c
WHERE c.id = "order0103" AND c.customerId = "TimS1234"

This query will cost a little under 3 RU’s when executed. Because the query has an equality filter on customerId (the partition key), the query is an in-partition query.

While this query is efficient, it could be written as a point read. Here’s an example using the .NET SDK and an Order entity:

Order order = await container.ReadItemAsync<Order>(id: "order0103", partitionKey: new PartitionKey("TimS1234"));

This point read operation will be slightly faster than the query and only use 1 RU. This equates to an RU cost savings of 66%! If the application ran this read operation frequently enough, this one line code change could translate to significant savings!

 

Utilizing the id property and partition key:

You can only do a point read if you have both the id and partition key of an item. The id and partition key combination are also a globally unique key for items within an Azure Cosmos container. Developers know to take great care in selecting a good partition key choice for their Cosmos container. However, it is not as widely known that assigning meaningful values to the id property are required for doing point reads.

Here’s a data model which simply assigns a random guid as the id value:

{
           "id": "f2ef7v28f9802fa482",
           "customerId" : "TimS1234",
           "orderId" : "order0103",
           "order details" : ….
}

Consider this query, which filters on orderId (instead of id like in the previous example):

SELECT * 
FROM c 
WHERE c.orderId = "order0103" AND c.customerId = "TimS1234"

Unlike the previous query which searched for a specific id and partition key combination, this query includes the orderId instead. In order to rewrite this read as a point read, it is important that the operation include a specific id value. The id property is special, in the sense that no other properties in the item can be used to do point reads. In this case, it would be worthwhile to modify the data model to have the orderId  value stored as the id value. Assigning a meaningful id value makes it much more efficient to read data.

Learn more:

 

Author

Tim Sander
Program Manager

Program Manager on the Azure Cosmos DB engineering team

5 comments

Discussion is closed. Login to edit/delete existing comments.

  • paul van bladel

    In some cosmosdb samples from Microsoft on GitHub, you see that the id field is a compound key delimited by a :.
    So first part the partition key an after the “:”, another unique key.
    Any idea why?

  • paul van bladel

    Interesting.
    In order to be a successful point read, should the id field be globally unique, so unique over partitions?

  • PRASAD NAIR

    Thanks, Tim for the blog. It clarified some of the doubts I had in my mind. However, it created some new ones 🙂

    Why is it that the SDK or the cosmos DB node(coordinator of the request) is not able to treat a query with ‘partition key’ and ‘id’ values supplied equivalent to a point read? Can you please clarify either as reply to this comment or a new blog? 🙂

    • Tim SanderMicrosoft employee Author

      Hi Prasad,

      Queries like this could be easily rewritten as a point read:

      SELECT * FROM c WHERE c.id = "val1" and c.pk = "val2"

      If you ran the above query, it would likely end up getting executed as a point read. However, this still requires some work from the query engine so the RU charge will be higher than that of a point read. In other words, it takes a small amount of RUs just to know...

      Read more
  • Rich King

    Thanks Tim — I started writing my first Cosmos DB stored procedure today and this was super helpful. Looking forward to reading the rest of the series!