January 13th, 2025

Cosmos DB Embeddings Generator Sample

Mark Brown
Principal PM Manager

Ever since we first announced the preview of vector indexing and DiskANN support, and again when these features reached GA, customers have been asking us to make it easier to generate Azure OpenAI embeddings on their data in Azure Cosmos DB.

So we did just that and created the Azure Cosmos DB Embeddings Generator sample application and hosted it on GitHub.

This sample shows how to use an Azure Cosmos DB Trigger and Output Binding in Azure Functions to automatically generate Azure OpenAI embeddings for a new or updated item, then save them back to the same item in Azure Cosmos DB.
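The embedding step of this flow can be sketched in Python as follows. This is a minimal sketch, not the sample's actual code: the function names `make_client` and `embed_text`, the `endpoint` parameter, and the `api_version` default are assumptions made for illustration. It uses `DefaultAzureCredential` so that no API key is needed, in line with the Entra ID approach the sample takes.

```python
from typing import Any, List

# Entra ID scope used when requesting tokens for Azure OpenAI.
COGNITIVE_SERVICES_SCOPE = "https://cognitiveservices.azure.com/.default"


def make_client(endpoint: str, api_version: str = "2024-02-01") -> Any:
    """Build an Azure OpenAI client that authenticates with Entra ID.

    `endpoint` and the `api_version` default are assumptions; use the
    values from your own deployment.
    """
    # Imported lazily so embed_text() can be exercised without these packages.
    from azure.identity import DefaultAzureCredential, get_bearer_token_provider
    from openai import AzureOpenAI

    token_provider = get_bearer_token_provider(
        DefaultAzureCredential(), COGNITIVE_SERVICES_SCOPE
    )
    return AzureOpenAI(
        azure_endpoint=endpoint,
        azure_ad_token_provider=token_provider,
        api_version=api_version,
    )


def embed_text(client: Any, deployment: str, text: str) -> List[float]:
    """Request a single embedding vector for `text`.

    `deployment` is the name of your embeddings deployment, which is
    assumed here to be a text-embedding-3-small deployment.
    """
    response = client.embeddings.create(model=deployment, input=text)
    return response.data[0].embedding
```

With the `openai` and `azure-identity` packages installed, usage would look something like `embed_text(make_client("https://<your-resource>.openai.azure.com/"), "text-embedding-3-small", "some text")`, where the deployment name is whatever you chose when deploying the model.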

There are both C# and Python versions of this sample. You can choose your own adventure on which you want to use.

Image embedding generator code

What’s in this sample

Here are the specifics of what you can see and learn from this sample.

  • How to create Azure Cosmos DB Trigger and Output Bindings for Azure Functions.
    • The Functions Trigger and Output Binding are completely configuration-driven. There are no hard-coded values in this sample.
    • Even the input value for the Functions Trigger is cast as a dynamic type to make it easier to take and use as-is.
  • How to prevent endless loops in Functions Triggers (and change feed) caused by in-place document updates, a common ask from customers. In our sample, we compare hashes on the document to determine if it has changed.
  • How to generate embeddings using the Azure OpenAI SDK with the text-embedding-3-small embedding model.
  • How to do Entra ID authentication and RBAC for Azure Functions, Azure Cosmos DB, and Azure OpenAI.
  • How to deploy this sample to Azure Functions, Azure Cosmos DB, and Azure OpenAI with managed identities and RBAC.
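To make the loop-prevention idea concrete, here is a minimal Python sketch of a hash comparison. It is an illustration under stated assumptions rather than the sample's actual implementation: the property names `hash`, `vectors`, and `text`, the choice of SHA-256, and the `process_changes` helper are all hypothetical. The idea is that writing the embedding back to the item fires the trigger again, so the function stores a hash of the item's content and skips any item whose stored hash still matches.

```python
import hashlib
import json
from typing import Callable, Dict, List

# Hypothetical property names; the real sample may use different ones.
HASH_FIELD = "hash"
VECTOR_FIELD = "vectors"
TEXT_FIELD = "text"


def content_hash(item: Dict) -> str:
    """Hash the item's own content.

    Excludes the stored hash, the stored vector, and Cosmos DB system
    properties (such as _ts and _etag), which change on every write.
    """
    payload = {
        key: value
        for key, value in item.items()
        if not key.startswith("_") and key not in (HASH_FIELD, VECTOR_FIELD)
    }
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def process_changes(
    items: List[Dict], embed: Callable[[str], List[float]]
) -> List[Dict]:
    """Return only the items that need a new embedding written back."""
    to_write = []
    for item in items:
        new_hash = content_hash(item)
        if item.get(HASH_FIELD) == new_hash:
            # This change came from our own write-back, so stop the loop here.
            continue
        updated = dict(item)
        updated[VECTOR_FIELD] = embed(str(item.get(TEXT_FIELD, "")))
        updated[HASH_FIELD] = new_hash
        to_write.append(updated)
    return to_write
```

In a Functions app, `items` would be the batch delivered by the trigger and the returned list would be handed to the output binding. Passing `embed` in as a callable keeps the comparison logic easy to test without calling Azure OpenAI.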

Deploy and get started

To get the sample, just fork or clone the Azure Cosmos DB Embeddings Generator sample application from GitHub.

The sample deploys easily using Azure Developer CLI (AZD). After a few minutes you can navigate to the Azure Portal and create a new document to see it in action. When you deploy, AZD also writes a local.settings.json file so you can easily step through the C# or Python code to see how it works. Then take the sample and extend it for your own data.
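When you run a Functions app locally, the entries under `Values` in local.settings.json are made available as environment variables. If you want to read the same settings from a standalone script while experimenting, a small helper along these lines may be useful. This is a sketch only: the helper names `load_local_settings` and `get_setting` are hypothetical, and the setting names in your file depend on what AZD generated for your deployment.

```python
import json
import os
from pathlib import Path
from typing import Dict


def load_local_settings(path: str = "local.settings.json") -> Dict[str, str]:
    """Return the "Values" section of a local.settings.json file as a dict.

    Returns an empty dict if the file does not exist.
    """
    settings_file = Path(path)
    if not settings_file.exists():
        return {}
    with settings_file.open(encoding="utf-8") as handle:
        document = json.load(handle)
    return {key: str(value) for key, value in document.get("Values", {}).items()}


def get_setting(name: str, local: Dict[str, str]) -> str:
    """Look up a setting, preferring a real environment variable over the file."""
    value = os.environ.get(name, local.get(name))
    if value is None:
        raise KeyError(f"Missing required setting: {name}")
    return value
```

Because nothing is hard coded, the same code path works locally and in Azure: only the source of the configuration values changes.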

Image sample embeddings

We hope you enjoy this new sample. Feel free to leave comments in the issues in GitHub.

Author

Mark is a Principal Program Manager on the Azure Cosmos DB team and is focused on making sure Azure Cosmos DB is the most developer-friendly NoSQL database in the cloud. Mark is passionate about web development, cloud computing, and growing the developer community around Azure and Cosmos DB.
