Understanding the difference between point reads and queries in Azure Cosmos DB

Avatar

Tim

This blog is part 1 of a series of blogs where we’ll demystify commonly confused concepts for developers learning how to query in the Core (SQL) API in Azure Cosmos DB.

First, we’ll take an in-depth look at the differences between point reads and queries.

There are two ways to read data in Azure Cosmos DB: point reads and queries. Most developers know that you can query data using Cosmos DB’s SQL query language, but not everyone realizes that point reads are an even more efficient way to read data. A point read is a key/value lookup on a single item ID and partition key.

Here are the major differences between point reads and queries:

Point read (assumes 1 KB item)Query
LatencyTypically less than 10 msVariable
RU charge1 RUAt least 2.3 RUs, variable
Number of items returned1 itemUnlimited (if results size is too large, results are split across multiple pages)
Include partition key?RequiredRecommended

 

In many cases, you can make simple changes in your app to rewrite simple queries as point reads. Most read-heavy workloads on Azure Cosmos DB use a combination of both point reads and SQL queries. If you just need to read a single item, point reads are cheaper and faster than queries. Point reads can read the data directly and don’t require the query engine.

Transforming queries to point reads:

Let’s imagine that Contoso Retail, a fictional company, has a large Cosmos orders container that has customers orders data. Each item in the orders container represents a separate customer’s order and you can identify an order by its id value. The partition key of the orders container is customerId.

Here’s an example item in the orders container:

{
           "id" : "order0103",
           "customerId": "TimS1234"
           "order details" : ….
}

Contoso Retail’s app runs the following query on the orders container very frequently:

SELECT * 
FROM c
WHERE c.id = "order0103" AND c.customerId = "TimS1234"

This query will cost a little under 3 RU’s when executed. Because the query has an equality filter on customerId (the partition key), the query is an in-partition query.

While this query is efficient, it could be written as a point read. Here’s an example using the .NET SDK and an Order entity:

Order order = await container.ReadItemAsync<Order>(id: "order0103", partitionKey: new PartitionKey("TimS1234"));

This point read operation will be slightly faster than the query and only use 1 RU. This equates to an RU cost savings of 66%! If the application ran this read operation frequently enough, this one line code change could translate to significant savings!

 

Utilizing the id property and partition key:

You can only do a point read if you have both the id and partition key of an item. The id and partition key combination are also a globally unique key for items within an Azure Cosmos container. Developers know to take great care in selecting a good partition key choice for their Cosmos container. However, it is not as widely known that assigning meaningful values to the id property are required for doing point reads.

Here’s a data model which simply assigns a random guid as the id value:

{
           "id": "f2ef7v28f9802fa482",
           "customerId" : "TimS1234",
           "orderId" : "order0103",
           "order details" : ….
}

Consider this query, which filters on orderId (instead of id like in the previous example):

SELECT * 
FROM c 
WHERE c.orderId = "order0103" AND c.customerId = "TimS1234"

Unlike the previous query which searched for a specific id and partition key combination, this query includes the orderId instead. In order to rewrite this read as a point read, it is important that the operation include a specific id value. The id property is special, in the sense that no other properties in the item can be used to do point reads. In this case, it would be worthwhile to modify the data model to have the orderId  value stored as the id value. Assigning a meaningful id value makes it much more efficient to read data.

Learn more:

 

5 comments

Leave a comment

  • Avatar
    PRASAD NAIR

    Thanks, Tim for the blog. It clarified some of the doubts I had in my mind. However, it created some new ones 🙂

    Why is it that the SDK or the cosmos DB node(coordinator of the request) is not able to treat a query with ‘partition key’ and ‘id’ values supplied equivalent to a point read? Can you please clarify either as reply to this comment or a new blog? 🙂

    • Avatar
      Tim SanderMicrosoft employee

      Hi Prasad,

      Queries like this could be easily rewritten as a point read:

      SELECT * FROM c WHERE c.id = “val1” and c.pk = “val2”

      If you ran the above query, it would likely end up getting executed as a point read. However, this still requires some work from the query engine so the RU charge will be higher than that of a point read. In other words, it takes a small amount of RUs just to know that this query uses id and the partition key. It always better to just issue the point read if you are doing a request with the id and partition key.