April 19th, 2024

Azure Cosmos DB design patterns – Part 9: Schema versioning

Jay Gordon
Senior Program Manager

Welcome to Part 9 of our Azure Cosmos DB Design Patterns series, focusing on Schema Versioning. This edition is particularly useful for those new to NoSQL databases or looking to understand Azure Cosmos DB’s unique capabilities. Here, we will explore how schema versioning can help manage and evolve your database schema efficiently, drawing on real-world applications and specific patterns. By the end of this post, you’ll have a clear roadmap for implementing schema versioning in your projects, ensuring your database keeps pace with application development without downtime or data corruption.

Azure Samples / cosmos-db-design-patterns

We have expanded the reach of our Azure Cosmos DB Design Patterns, once shared only on a case-by-case basis, by launching a dedicated GitHub repository. This repository highlights a variety of examples that demonstrate the implementation of specific design patterns, tailored to help you tackle design challenges in Azure Cosmos DB projects. Each entry in our blog series delves into one pattern, complete with a sample application featured in the repository. Our goal is to make these powerful insights more accessible, helping you enhance your projects efficiently. Enjoy exploring these patterns, and we hope you find the series both informative and useful.

Azure Cosmos DB design pattern: Schema versioning

What is Schema Versioning?

Schema versioning is the process of tracking changes in the database schema over time. In the context of NoSQL databases like Azure Cosmos DB, schema versioning involves adding a field, typically named SchemaVersion, to each document. This field indicates the version of the schema that the document conforms to. If a document lacks this field, it is treated as conforming to the original schema version.
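The rule for documents without the field can be sketched in a few lines. This is a minimal illustration rather than code from the sample repository: it assumes an integer SchemaVersion, treats the original schema as version 1, and uses System.Text.Json. The class and method names are hypothetical.

```csharp
using System.Text.Json;

public static class SchemaVersionReader
{
    // Assumption: the original, unversioned schema counts as version 1.
    public const int OriginalVersion = 1;

    // Returns the document's SchemaVersion, or OriginalVersion when the
    // field is absent or is not an integer.
    public static int GetSchemaVersion(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);

        if (doc.RootElement.TryGetProperty("SchemaVersion", out JsonElement version)
            && version.ValueKind == JsonValueKind.Number
            && version.TryGetInt32(out int value))
        {
            return value;
        }

        return OriginalVersion;
    }
}
```

Resolving the version in one place keeps the "missing means original" rule out of the rest of the application logic.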

Schema versioning helps in:

  • Minimizing Disruptions: Changes can be made incrementally without affecting existing operations.
  • Preserving Data Integrity: Ensures that data conforms to expected schema definitions, preventing errors and inconsistencies.
  • Facilitating Evolution: Allows the database to evolve alongside application changes smoothly.

Implementing Schema Versioning in Azure Cosmos DB

Step 1: Define Your Versioning Strategy

Before implementing schema versioning, decide on the versioning format and how granular the versions need to be. A simple numerical increment (1, 2, 3…) or semantic versioning (1.0, 1.1, 2.0…) could be used depending on the complexity of the changes.
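One practical consequence of the format choice is how versions are compared. If versions are stored as strings such as "1.9" and "1.10", a plain string comparison orders them incorrectly. The sketch below assumes dotted major.minor strings and uses System.Version to compare them; the VersionComparison helper is a hypothetical name.

```csharp
using System;

public static class VersionComparison
{
    // True when the document's version is older than the application's
    // current version. Both arguments must be dotted versions such as "1.10".
    public static bool IsOlder(string documentVersion, string currentVersion)
    {
        return Version.Parse(documentVersion) < Version.Parse(currentVersion);
    }
}
```

With a simple integer version, the ordinary `<` operator is enough and no parsing is needed, which is one reason to prefer integers when the changes are not complex.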

Step 2: Modify Your Application Logic

Update your application logic to include the SchemaVersion field in each new document written to Cosmos DB. Ensure that every write operation either specifies the current schema version or relies on default values defined in your database model.

var document = new {
    id = "unique-id",
    SchemaVersion = "1.0",
    data = new { /* your data here */ }
};

// Assumes container is an initialized Microsoft.Azure.Cosmos.Container
await container.CreateItemAsync(document);

Step 3: Handling Data Reads

When reading data, your application should check the SchemaVersion of each document and process it accordingly. Implement logic to handle documents according to their schema version or convert them to the latest version format as needed.

if (document.SchemaVersion == "1.0") {
    // Handle version 1.0 specific logic
} else if (document.SchemaVersion == "2.0") {
    // Handle version 2.0 specific logic
}

Step 4: Data Migration

For documents that are in older versions, you might need to perform migrations to update them to the latest schema. This can be done lazily (on access) or as a batch process, depending on your application needs and the volume of data.

// Example of a lazy migration approach
if (document.SchemaVersion == "1.0") {
    document = MigrateToVersion2(document);
    document.SchemaVersion = "2.0";
    await container.UpsertItemAsync(document);
}

private static dynamic MigrateToVersion2(dynamic doc) {
    // Migration logic here
    return doc;
}

These C# examples demonstrate how to manage schema versioning within an Azure Cosmos DB application using the .NET SDK. Be sure to replace placeholders and types as appropriate for your specific application context.
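When more than two versions exist, the migration can be written as a chain of single-version steps that runs until the document reaches the current version. The following is a sketch under stated assumptions, not the sample repository's implementation: it works on an in-memory JsonObject (System.Text.Json.Nodes, .NET 6 or later), uses the string versions "1.0" and "2.0" from the snippets above, and the DocumentMigrator class and the transactionHistory field are hypothetical. Reading from and writing back to Azure Cosmos DB is deliberately left out.

```csharp
using System;
using System.Text.Json.Nodes;

public static class DocumentMigrator
{
    // Assumption: a missing SchemaVersion means the original schema.
    public const string OriginalVersion = "1.0";
    public const string CurrentVersion = "2.0";

    // Applies one upgrade step at a time until the document is current.
    public static JsonObject MigrateToCurrent(JsonObject doc)
    {
        string version = doc["SchemaVersion"]?.GetValue<string>() ?? OriginalVersion;

        while (version != CurrentVersion)
        {
            switch (version)
            {
                case "1.0":
                    UpgradeV1ToV2(doc);
                    version = "2.0";
                    break;
                default:
                    throw new InvalidOperationException(
                        $"No migration step defined for schema version {version}.");
            }

            // Stamp the new version after each successful step.
            doc["SchemaVersion"] = version;
        }

        return doc;
    }

    // Hypothetical 1.0 -> 2.0 change: add an empty transactionHistory array.
    private static void UpgradeV1ToV2(JsonObject doc)
    {
        if (!doc.ContainsKey("transactionHistory"))
        {
            doc["transactionHistory"] = new JsonArray();
        }
    }
}
```

Because the function has no database dependency, the same code could back either approach: call it on read for lazy migration, or from a batch job that walks the container.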

The Scenario:

As organizations grow and adapt, so do their data structures. In this section, we will discuss how schema versioning plays a crucial role in managing these changes efficiently, particularly using Azure Cosmos DB.

The Need for Schema Versioning

Consider a data-intensive application that must evolve to include more detailed user data, transaction histories, and interaction logs. As new features are introduced, the document schema in the database must be updated to reflect these new data types and relationships.

Implementing Schema Versioning

With Azure Cosmos DB, implementing schema versioning involves adding a SchemaVersion field to each document, which denotes the version of the schema it adheres to. This approach allows applications to handle documents differently based on their version, facilitating a smooth transition as schemas evolve.

Structuring Schema Changes

  1. Initial Schema (Version 1.0):
    • Contains essential fields such as userID, transactionDate, amount.
  2. Revised Schema (Version 2.0):
    • Adds new fields like transactionHistory, userInteractionLogs.
    • Modifies userID to support new authentication methods.

Each schema change is well-documented, and the transition between them is managed through version checks in the application logic.

Documentation and Release Notes

For every schema update, detailed release notes are maintained. These notes typically include:

  • Changes Made: Description of new fields added or modified.
  • Rationale: Reasons behind these changes, such as accommodating new features or improving performance.
  • Impact Analysis: How the changes affect existing operations and data, and any necessary actions to adapt to the new schema.

Sample Release Note Entry

## Version 2.0 Release Notes - March 20XX

### Changes Made:
- New fields: `transactionHistory`, `userInteractionLogs` added.
- Updated `userID` to support multiple authentication types.

### Rationale:
- To provide a more comprehensive view of user activities and enhance security measures.

### Impact:
- Documents created with the new schema will utilize these fields. Older documents remain compatible but will operate under Version 1.0.

Sample Implementation:

Case Study:

Wide World Importers operates an online store with its data stored in Azure Cosmos DB for NoSQL. Initially, their cart object was structured simply to accommodate straightforward product orders. Below is an example of the initial Cart and CartItem class definitions and how they were represented in Azure Cosmos DB:

Initial Cart Class:

public class Cart
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    public string SessionId { get; set; } = Guid.NewGuid().ToString();
    public int CustomerId { get; set; }
    public List<CartItem>? Items { get; set;}
}

public class CartItem {
    public string ProductName { get; set; } = "";
    public int Quantity { get; set; }
}

Corresponding JSON Document in Azure Cosmos DB:

{
  "id": "194d7453-d9db-496b-834b-7b2db408e4be",
  "SessionId": "98f5621e-b1af-44f1-815c-f4aac728c4d4",
  "CustomerId": 741,
  "Items": [
    {"ProductName": "Product 23", "Quantity": 4},
    {"ProductName": "Product 16", "Quantity": 3}
  ]
}

Updating the Schema

Feedback indicated a need to track special order details for products. To avoid updating all cart items unnecessarily, a SchemaVersion field was added to the cart object to manage changes efficiently.

Updated CartWithVersion Class:

public class CartWithVersion
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    public string SessionId { get; set; } = Guid.NewGuid().ToString();
    public long CustomerId { get; set; }
    public List<CartItemWithSpecialOrder>? Items { get; set;}
    public int SchemaVersion { get; set; } = 2;
}

public class CartItemWithSpecialOrder : CartItem {
    public bool IsSpecialOrder { get; set; } = false;
    public string? SpecialOrderNotes { get; set; }
}

Updated JSON Document in Cosmos DB:

{
  "SchemaVersion": 2,
  "id": "9baf08d2-e119-46a1-92d7-d94ee59d7270",
  "SessionId": "39306d1b-d8d8-424a-aa8b-800df123cb3c",
  "CustomerId": 827,
  "Items": [
    {
      "ProductName": "Product 4",
      "Quantity": 2,
      "IsSpecialOrder": false,
      "SpecialOrderNotes": null
    },
    {
      "ProductName": "Product 22",
      "Quantity": 2,
      "IsSpecialOrder": true,
      "SpecialOrderNotes": "Special Order Details for Product 22"
    },
    {
      "ProductName": "Product 15",
      "Quantity": 3,
      "IsSpecialOrder": true,
      "SpecialOrderNotes": "Special Order Details for Product 15"
    }
  ]
}

Managing Schema Versions

It is beneficial to document schema updates comprehensively. Here is a hypothetical schema.md document to track these changes:

Filename: schema.md

## Schema Updates:

| Version | Notes                                         |
|---------|-----------------------------------------------|
| 2       | Added special order details to cart items     |
| 1       | Original release                              |

Implementing in the Application

On the application side, handling the schema version allows developers to render UI components conditionally based on whether the cart includes special orders:

Snippet from the Application Code:

public class Cart
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    public string SessionId { get; set; } = Guid.NewGuid().ToString();
    public long CustomerId { get; set; }
    public List<CartItemWithSpecialOrder>? Items { get; set;}
    public int? SchemaVersion { get; set; }

    public bool HasSpecialOrders() {
        // Items is nullable, so treat a missing list as "no special orders"
        return Items?.Any(x => x.IsSpecialOrder) ?? false;
    }
}
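To see why a single class can read both generations of cart documents, consider the following self-contained sketch. It is an illustration under assumptions rather than the sample's code: it uses System.Text.Json instead of the Newtonsoft.Json attributes shown above, and the CartDoc, CartItemDoc, and CartReader names are hypothetical. Fields missing from a version 1 document simply keep their defaults (null SchemaVersion, false IsSpecialOrder).

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Text.Json.Serialization;

public class CartItemDoc
{
    public string ProductName { get; set; } = "";
    public int Quantity { get; set; }
    public bool IsSpecialOrder { get; set; }
    public string? SpecialOrderNotes { get; set; }
}

public class CartDoc
{
    [JsonPropertyName("id")]
    public string Id { get; set; } = "";
    public long CustomerId { get; set; }
    public List<CartItemDoc>? Items { get; set; }

    // Null for documents written before the SchemaVersion field existed.
    public int? SchemaVersion { get; set; }

    public bool HasSpecialOrders() => Items?.Any(x => x.IsSpecialOrder) ?? false;
}

public static class CartReader
{
    // Deserializes a cart of either schema version into the same type.
    public static CartDoc Read(string json) =>
        JsonSerializer.Deserialize<CartDoc>(json)
        ?? throw new JsonException("Cart document was null.");
}
```

Reading the version 1 document shown earlier yields a cart whose SchemaVersion is null and whose HasSpecialOrders() is false, while a version 2 document with a special-order item reports SchemaVersion 2 and true.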

Rendering Conditional UI Based on Schema Version:

@foreach (Cart cart in Model.Carts){
    <section data-id="@cart.Id">
        <p><strong>Customer: </strong>@cart.CustomerId</p>
        <table>
            <thead>
                <tr>
                    @if(cart.SchemaVersion != null){
                        <th>Schema Version</th>
                    }
                    <th>Product Name</th>
                    <th>Quantity</th>
                    @if (cart.HasSpecialOrders()){
                        <th>Special Order Notes</th>
                    }
                </tr>
            </thead>
            <tbody>
            @foreach (var item in cart.Items)
            {
                <tr>
                    @if(cart.SchemaVersion != null){
                        <td>@cart.SchemaVersion</td>
                    }
                    <td>@item.ProductName</td>
                    <td>@item.Quantity</td>
                    @if (cart.HasSpecialOrders()){
                        <td>@(item.IsSpecialOrder ? item.SpecialOrderNotes : "")</td>
                    }
                </tr>
            }
            </tbody>
        </table>
    </section>
}

Adopting a schema versioning pattern is essential for effectively managing database schema changes. By embedding a SchemaVersion field within each document, applications can support multiple schema versions seamlessly, ensuring smooth transitions and minimizing disruptions during schema evolution. This approach not only maintains backward compatibility and simplifies maintenance but also enhances data integrity and facilitates easier debugging. Implementing structured versioning rules, maintaining detailed documentation for each schema iteration, and adapting application logic to accommodate these changes are crucial strategies for achieving robust and scalable database architecture.

Getting Started with Azure Cosmos DB Design Patterns

You can review the sample code by visiting the Schema versioning design pattern on GitHub. You can also try it yourself by cloning or forking the Azure Cosmos DB Design Patterns GitHub repo, then running the sample locally or from GitHub Codespaces. If you are new to Azure Cosmos DB, we have you covered with a free Azure Cosmos DB account for 30 days, no credit card needed. If you want more time, you can extend the free period or upgrade the account.

Sign up for your free Azure Cosmos DB account at aka.ms/trycosmosdb.

Explore this and the other design patterns and see how Azure Cosmos DB can enhance your application development and data modeling efforts. Whether you are an experienced developer or just getting started, the free trial allows you to discover the benefits firsthand.

To get started with Azure Cosmos DB Design Patterns, follow these steps:

  1. Visit the GitHub repository and explore the various design patterns and best practices provided.
  2. Clone or download the repository to access the sample code and documentation.
  3. Review the README files and documentation for each design pattern to understand when and how to apply them to your Azure Cosmos DB projects.
  4. Experiment with the sample code and adapt it to your specific use cases.

About Azure Cosmos DB

Azure Cosmos DB is a fully managed and serverless distributed database for modern app development, with SLA-backed speed and availability, automatic and instant scalability, and support for open-source PostgreSQL, MongoDB, and Apache Cassandra. Try Azure Cosmos DB for free here. To stay in the loop on Azure Cosmos DB updates, follow us on X, YouTube, and LinkedIn.

Author

Jay Gordon
Senior Program Manager

Jay Gordon is a Senior Program Manager with Azure Cosmos DB focused on reaching developer communities. Jay is located in Brooklyn, NY.
