September 8th, 2021

Taking the EF Core Azure Cosmos DB Provider for a Test Drive

Jeremy Likness
Principal Program Manager - .NET Web Frameworks

The release of EF Core 6.0 is right on the horizon (as I write this). The team has been hard at work adding features. One area of focus is the Azure Cosmos DB experience. We received feedback that many developers would prefer to use the provider for Cosmos DB but are waiting for certain key features.

Planetary docs

I built a reference app that uses Azure Cosmos DB with EF Core on Blazor Server. It includes search capability, cross-referenced entities, and an interface to create, read, and update. I recently upgraded to the latest EF Core 6.0 version and was able to simplify and remove quite a bit of code!

Screenshot of Planetary Docs

Feature overview

Here are some of the requested features that we added to the EF Core 6.0 Azure Cosmos DB provider.

Implicit ownership

EF Core was built as an object relational mapper. In relational databases, complex relationships are expressed by storing related entities in separate tables and referencing them with foreign keys. EF Core assumes non-primitive entity types encountered in a parent are expressed as foreign key relationships. The relationships are configured using HasMany or HasOne and the instances are assumed to exist independently with a configured relationship.

In document databases, the default behavior for entity types is to assume they are embedded documents owned by the parent. In other words, the complex type’s data exists within the context of the parent. In previous versions of EF Core, this behavior had to be configured explicitly for it to work with the Azure Cosmos DB provider. In EF Core 6.0, ownership is implicit. This saves configuration and ensures the behavior is consistent with NoSQL approaches from other providers.

For example, in Planetary Docs there are authors and tags. These entities “own” a list of summaries that point to the URLs and titles of related documents. This way, when a user asks “What documents have tag X?” I only need one document loaded to answer the question (I load tag X, then iterate its owned collection of titles). Using EF Core 5, I had to explicitly claim ownership:

tagModel.OwnsMany(t => t.Documents);
authorModel.OwnsMany(t => t.Documents);

In EF Core 6, the ownership is implicit so there is no need to configure the entities except to specify partition keys.
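
As a rough sketch of what that leaves behind, the configuration below maps two containers and sets only keys and partition keys. The entity shapes and names (DocumentSummary, TagName, Alias, DocsContext) are hypothetical, trimmed-down stand-ins rather than the actual Planetary Docs classes, and the code assumes the Microsoft.EntityFrameworkCore.Cosmos package is referenced.

```csharp
using System.Collections.Generic;
using Microsoft.EntityFrameworkCore;

// Hypothetical summary type: reachable only through a parent entity.
public class DocumentSummary
{
    public string Uid { get; set; }
    public string Title { get; set; }
}

// Hypothetical tag entity that embeds its document summaries.
public class Tag
{
    public string TagName { get; set; }
    public List<DocumentSummary> Documents { get; set; } = new List<DocumentSummary>();
}

// Hypothetical author entity that embeds its document summaries.
public class Author
{
    public string Alias { get; set; }
    public List<DocumentSummary> Documents { get; set; } = new List<DocumentSummary>();
}

public class DocsContext : DbContext
{
    public DocsContext(DbContextOptions<DocsContext> options) : base(options) { }

    public DbSet<Tag> Tags { get; set; }
    public DbSet<Author> Authors { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // No OwnsMany calls here: the summaries are expected to be picked up
        // as owned (embedded) types by the EF Core 6 Cosmos provider.
        modelBuilder.Entity<Tag>().HasKey(t => t.TagName);
        modelBuilder.Entity<Tag>().HasPartitionKey(t => t.TagName);

        modelBuilder.Entity<Author>().HasKey(a => a.Alias);
        modelBuilder.Entity<Author>().HasPartitionKey(a => a.Alias);
    }
}
```

Using the key property as the partition key is just one plausible choice for this sketch; a real app would pick partition keys based on its own query patterns.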

Support for primitive collections

In relational databases, primitive collections are often modeled by either promoting them to complex types or converting them to a serialized artifact to store in a single column. Consider a blog post that can have a list of tags. One common approach would be to create an entity that represents a tag:

public class Tag 
{
    public int Id { get; set; }
    public string Text { get; set; }
}

The tag is then referenced:

public ICollection<Tag> Tags { get; set; }

The primitive is promoted to a complex type and stored in a separate table. An alternative is to collapse the tags into a single field that contains a comma-delimited list. This approach requires a value converter to marshal the list into the field for updates and to decompose the field back into the list for reads. It also makes it difficult and expensive to answer questions like, “How many posts are tagged X?”

Using EF Core 5, I chose the single-column approach. I serialized the list to JSON when writing and deserialized it when reading. This is the serialization code:

// Requires: using System.Text.Json;
private static string ToJson<T>(T item) => JsonSerializer.Serialize(item);
private static T FromJson<T>(string json) => JsonSerializer.Deserialize<T>(json);

I configured EF Core to make the conversions:

docModel.Property(d => d.Tags)
    .HasConversion(
        t => ToJson(t),
        t => FromJson<List<string>>(t));

And the resulting document looked like this:

{
    "tags" : "[\"one\", \"two\", \"three\"]"
}

With EF Core 6.0, I simply deleted the conversion code to take advantage of the built-in handling of primitive collections. This results in a document like this:

{
    "tags" : [ 
        "one",
        "two",
        "three"
    ]
}

The new shape is a schema change that Azure Cosmos DB has no problem handling. The C# code, on the other hand, will throw when a current model using tags as an array encounters a legacy record that stored tags as a single string field. How do we handle this when EF Core doesn’t have the concept of NoSQL migrations?
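
To make the mismatch concrete, here is a small sketch that reads both document shapes into an array-based model. It uses System.Text.Json as a stand-in, not the provider's own serialization pipeline, and DocModel and LegacyShapeDemo are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical stand-in for the current model, where tags are an array.
public class DocModel
{
    [JsonPropertyName("tags")]
    public List<string> Tags { get; set; } = new List<string>();
}

public static class LegacyShapeDemo
{
    // Returns true when the JSON can be read into the array-based model.
    public static bool CanRead(string json)
    {
        try
        {
            JsonSerializer.Deserialize<DocModel>(json);
            return true;
        }
        catch (JsonException)
        {
            // A string value where an array is expected cannot be converted.
            return false;
        }
    }
}
```

Passing the array-shaped document shown earlier to CanRead should succeed, while the legacy document whose "tags" value is one serialized string should fail, which mirrors the exception described above.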

Raw SQL

A popular request is to allow developers to write their own SQL for data access. This is exactly the feature I needed to handle my code migration. For the raw SQL to work, it must project to an existing model. It is exposed as an extension method on the DbSet<T> for the entity.

In my case, it enabled an in-place migration. After I updated the code, attempting to load a legacy document would fail: the document had a single string property for “tags” but the C# model expects an array, so the JSON serializer would throw an exception. To remedy this, I used STRINGTOARRAY, a built-in Azure Cosmos DB function that parses a string into an array. Using a query, I project the entity to a document that matches the current schema and then save it back. This is the migration code:

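// Project legacy documents into the current model; STRINGTOARRAY turns the
// serialized tags string into a real array.
var docs = await Documents.FromSqlRaw(
    "select c.id, c.Uid, c.AuthorAlias, c.Description, c.Html, c.Markdown, c.PublishDate, c.Title, STRINGTOARRAY(c.Tags) as Tags from c").ToListAsync();
foreach (var doc in docs)
{
    // Mark each entity as modified so it is written back in the new shape.
    Entry(doc).State = EntityState.Modified;
}
// Assumption: the changes are then persisted with a save call such as this one.
await SaveChangesAsync();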

This feature empowers developers to craft complex queries that may not be supported by the LINQ provider.
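
For readers who want to see what the conversion amounts to, the following sketch performs the same string-to-array step on the client with System.Text.Json. This is a different technique from the server-side query above, shown only to illustrate the transformation. TagMigration and ReadTags are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class TagMigration
{
    // Accepts the raw "tags" value in either shape and returns a list.
    public static List<string> ReadTags(JsonElement tags)
    {
        switch (tags.ValueKind)
        {
            case JsonValueKind.Array:
                // Current shape: already a JSON array of strings.
                return JsonSerializer.Deserialize<List<string>>(tags.GetRawText());
            case JsonValueKind.String:
                // Legacy shape: the array was serialized into one string,
                // so parse the string's contents as JSON.
                return JsonSerializer.Deserialize<List<string>>(tags.GetString());
            default:
                // Missing or unexpected values fall back to an empty list.
                return new List<string>();
        }
    }
}
```

Doing the conversion in the query, as the migration code does, has the advantage that the projected result already matches the model before it ever reaches the C# code.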

Additional enhancements

In addition to what I already covered, these enhancements also made it in.

Summary

I’m excited about the changes coming and hope that you are, too. Are you using the Cosmos DB provider? Are you considering it now that we’ve added these features? Is there something critical you need that we missed? Let me know in the comments below. Thank you!

Author

Jeremy is a Principal Program Manager for .NET Web Frameworks at Microsoft. Jeremy wrote his first program in 1982, was recognized in the "who's who in Quake" list for programming the first implementation of "Midnight Capture the Flag" in Quake C, and has been developing enterprise applications for 25 years with a primary focus on web-based delivery of line-of-business applications. Jeremy is the author of four technology books, a former 8-year Microsoft MVP for Developer Tools and Technologies, and an international and keynote speaker, and he writes regularly on cloud and container development. Jeremy follows a 100% plant-based diet and spends most of his free time running, hiking, camping, and playing 9-ball and one pocket.

8 comments

  • Chris DaMour

    any updates on support for heterogenous/polymorphic support for collections? hoping primitive is just a stepping stone

  • Nirmal Prabhu

    Does EF Core 6 offer support for continuation token with Cosmos DB for pagination? We are lacking this functionality with EF Core 5.

  • Midnight · Edited

    Looks like a typo near

    authorModel.OwnsMany(t => t.Documents);
    • Jeremy Likness (Microsoft employee, Author)

      Thanks! This has been fixed. 🙂

  • Joris Kommeren · Edited

    Sounds good! Nice article.

    I’m curious though, would someone like to tell me why they’re choosing cosmos as a provider over a sql database for EF? Genuinely interested in use cases!

  • Eric Blankenburg · Edited

    Thank you for supporting arrays of primitive types. It’s critical for what we need.

    Question — Cosmos has a private preview of hierarchical partitioning keys. This is another thing we need. When can we expect that in Entity Framework?

    • Jeremy Likness (Microsoft employee, Author)

      Our pleasure!

      We will support features that are out of preview when EF Core is out of preview. Do you know if the Cosmos team has an ETA? I can connect with them on my end but am asking in case you know already based on your interactions.

      The best way to get it on the roadmap is to file an issue on the EFCore repository.

      https://github.com/dotnet/efcore/issues/new/choose

      We typically prioritize issues with the most upvotes as signal for reach...
