Autoscale + serverless: new offers to fit any workload

Deborah Chen

This blog post was co-authored by Deborah Chen and Thomas Weiss, program managers on Azure Cosmos DB. 

Update: Serverless is available in preview for the Core (SQL) API as of August 19, 2020.

Over the years, we’ve heard from many of you that you’d like more flexibility with the Azure Cosmos DB billing model to better balance cost and performance. With our current provisioned throughput model, it’s easy to set the exact throughput you need for your workload, measured in Request Units per second (RU/s), guaranteed by Azure Cosmos DB’s SLAs. For scenarios where your application doesn’t need constant throughput, however, we know choosing the right RU/s can be a challenge. For example, some apps might have variable, unpredictable traffic, while others might not have consistent usage at all, seeing only sporadic spikes of usage.

To make this easier, we’re excited to announce two new ways to pay for your database operations: the general availability of autoscale provisioned throughput, which brings automatic scaling to our provisioned throughput model, and the upcoming preview of serverless. Together, provisioned throughput and serverless ensure that Azure Cosmos DB is, more than ever, a database that delivers the best performance and cost-effectiveness for any kind of workload.

Autoscale provisioned throughput (GA)

With autoscale provisioned throughput (formerly known as “autopilot”), Azure Cosmos DB automatically and instantaneously scales your throughput (RU/s) based on the workload usage within a preset range. This means you can focus on building your application and let Azure Cosmos DB handle the work of capacity and scale management for you. Autoscale is well suited for mission-critical applications that have variable or unpredictable traffic patterns.

It’s backed by all Azure Cosmos DB SLAs, supports all Azure Cosmos DB APIs – Core (SQL), Gremlin, Cassandra, Table, and API for MongoDB – and helps optimize your RU/s usage and cost by scaling down when not in use.

How does autoscale work?

You set the maximum throughput (RU/s), T_max, that you want your database or container to scale to. Azure Cosmos DB scales the RU/s based on usage, so that it’s always between 10% of T_max and T_max. For example, if you set a maximum throughput of 10,000 RU/s, the throughput will scale between 1,000 and 10,000 RU/s. Billing is done on a per-hour basis, for the highest RU/s the system scaled to within the hour.

Autoscale automatically adjusts the provisioned throughput to your traffic
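To make the scaling range and hourly billing rule concrete, here is a minimal Python sketch. It is an illustration only, not how the service is implemented: the function name `billed_rus_per_hour` and the sample usage figures are hypothetical, and it assumes the RU/s the system scaled to in an hour can be approximated by that hour's peak usage clamped to the range from 10% of T_max to T_max.

```python
def billed_rus_per_hour(usage_by_hour, t_max):
    """Return the RU/s billed for each hour under the rule described above.

    usage_by_hour: list of hours, each a list of observed RU/s samples
                   (hypothetical input format for this sketch).
    t_max:         the maximum throughput (RU/s) configured for autoscale.
    """
    t_min = t_max / 10  # autoscale never drops below 10% of T_max
    billed = []
    for samples in usage_by_hour:
        peak = max(samples)
        # Clamp the hour's peak to the autoscale range [t_min, t_max].
        billed.append(min(max(peak, t_min), t_max))
    return billed


# Three hypothetical hours for a container with T_max = 10,000 RU/s:
# a quiet hour, a busy hour, and an hour that hits the ceiling.
usage = [
    [200, 350, 500],
    [3000, 7200, 6500],
    [9000, 12000, 8000],
]
print(billed_rus_per_hour(usage, 10_000))
```

In this sketch the quiet hour is billed at the 1,000 RU/s floor, the busy hour at its 7,200 RU/s peak, and the last hour at the 10,000 RU/s maximum.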

Based on your feedback from the preview, with GA, we are introducing several new features to make autoscale easier to use:

Custom values are now supported for the maximum throughput (RU/s), which replaces the set of values available in the preview. This gives you more flexibility to set the right value based on the needs of your workload. Any resources created during the preview will automatically be compatible with the new model.

Create new autoscale container in Azure portal

Autoscale can now be enabled on existing databases and containers. This makes it easy to take advantage of autoscale without having to migrate your data or create new resources. If you have an existing workload with a variable or unpredictable traffic pattern, and don’t currently do manual scaling yourself, enabling autoscale can help optimize your RU/s usage and cost, as it will scale down to the minimum of the RU/s range when not in use.

Finally, we now have programmatic support in the latest versions of the Azure Cosmos DB SDKs for .NET and Java, Resource Manager, and commands for Cassandra API and API for MongoDB. Support for PowerShell and Azure CLI will be available in an upcoming release.

How you can save with autoscale provisioned throughput

Imagine a workload with certain characteristics:

  • Sustained traffic that varies over time, with no predictable pattern
  • Peaks of 50,000 RU/s throughout the month
  • Peaks occur no more than 25% of the time

Example of a workload with variable traffic between 5,000 and 50,000 RU/s

The workload’s peak is easily identified – 50,000 RU/s – but the throughput needs for the rest of the time keep changing. If you use standard provisioned throughput for this workload, you may decide to simply set the throughput capacity to 50,000 RU/s in order to manage your peaks over the course of the month. However, doing this would mean that you’d be paying for the maximum all month – despite only needing that capacity 25% of the time.

With autoscale provisioned throughput, you would set 50,000 RU/s as your maximum and then pay for the throughput your workload uses, starting at 10% of your maximum. As long as you don’t use your maximum more than 1/3 of the time, you can expect considerable savings. Learn more about how to choose between standard and autoscale throughput.
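The comparison above can be roughed out in a few lines of Python. This is a simplified sketch under stated assumptions, not a pricing calculator: it assumes every hour is billed either at the maximum or at the 10% floor (the example workload actually varies in between), and `rate_multiplier` is a hypothetical parameter standing in for any difference in unit price between autoscale and standard RU/s, which this post does not specify. Check the Azure Cosmos DB pricing page for actual rates.

```python
def relative_autoscale_cost(fraction_at_peak, rate_multiplier, floor_fraction=0.1):
    """Autoscale cost as a fraction of paying for T_max all month.

    fraction_at_peak: share of hours billed at the maximum (0.0 to 1.0).
    rate_multiplier:  hypothetical ratio of the autoscale unit price to the
                      standard unit price (an assumption, not from the post).
    floor_fraction:   lowest billable share of T_max (10% for autoscale).
    """
    if not 0.0 <= fraction_at_peak <= 1.0:
        raise ValueError("fraction_at_peak must be between 0 and 1")
    # Average billed throughput across the month, as a share of T_max.
    avg_billed = fraction_at_peak + (1.0 - fraction_at_peak) * floor_fraction
    return rate_multiplier * avg_billed


# The workload above peaks no more than 25% of the time.
# A multiplier of 1.0 (same unit price) is used purely for illustration.
ratio = relative_autoscale_cost(0.25, rate_multiplier=1.0)
print(f"Autoscale cost relative to standard at 50,000 RU/s: {ratio:.1%}")
```

A result below 100% means autoscale comes out cheaper than provisioning the peak all month under these assumptions; rerun it with the multiplier and peak fraction that match your own workload and current pricing.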

Announcing serverless on Azure Cosmos DB

Autoscale is a great fit for any situation where you need guaranteed throughput and performance, and your traffic isn’t predictable enough to scale your throughput manually. But what if your workload doesn’t require sustained throughput?

In some scenarios, you may expect your Azure Cosmos DB database to sit idle most of the time, only processing requests occasionally. This is typically the case when you get started with Azure Cosmos DB, build a prototype, or even run small, non-critical applications. Provisioning throughput isn’t required here; instead, you just need a cost-effective way to pay for the individual database requests you are sending.

A spiky workload with sporadic requests

To best serve these kinds of use cases, we are extremely excited to announce the upcoming preview of Azure Cosmos DB serverless, a purely consumption-based offer. With serverless, you will only pay for:

  • the request units (RU) consumed by your database operations,
  • the storage consumed by your data.

Only RUs consumed by your requests get billed in serverless mode

Because serverless is a true pay-per-request billing model, it will further lower the entry price for anyone who wants to start using Azure Cosmos DB or run small applications with light traffic.
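As a rough illustration of the two billing components listed above, a serverless bill can be sketched as consumed RUs plus stored data. The function name, the per-million-RU and per-GB pricing units, and the prices in the example are all hypothetical placeholders, not actual Azure prices.

```python
def serverless_bill(request_units_consumed, storage_gb,
                    price_per_million_rus, price_per_gb_month):
    """Estimate a monthly serverless bill from its two components.

    All prices are supplied by the caller; this sketch assumes RUs are
    priced per million consumed and storage per GB per month.
    """
    ru_charge = request_units_consumed / 1_000_000 * price_per_million_rus
    storage_charge = storage_gb * price_per_gb_month
    return ru_charge + storage_charge


# Hypothetical month: 5 million RUs consumed and 2 GB stored,
# with placeholder prices of 1.0 per million RUs and 1.0 per GB.
print(serverless_bill(5_000_000, 2, 1.0, 1.0))
```

Note that nothing is charged for hours in which no requests are sent, which is what distinguishes this model from provisioned throughput.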

Azure Cosmos DB serverless will launch in public preview in the next couple of months and will be available for all Azure Cosmos DB APIs.

A tour of Azure Cosmos DB pricing models in video

Watch this video to better understand how autoscale provisioned throughput and serverless make Azure Cosmos DB a cost-effective solution for any kind of workload:

Get started

To get started with autoscale, check out our guide on how to determine if you should use autoscale for your workload. Learn more about how autoscale works and how to enable autoscale on a new or existing workload.