Azure Cosmos DB for AI Engineers

Rodrigo Souza

In this “Azure Cosmos DB for AI Engineers” blog post, you will learn how AI Engineers can use Azure Cosmos DB to support their AI solutions, with a focus on storing and analyzing unstructured or semi-structured data.

AI Engineers design and implement intelligent apps and agents that simulate human perception using cognitive services, machine learning, and knowledge mining. Typical scenarios include anomaly detection, language understanding, text mining, and search. Let’s see why Azure Cosmos DB is well suited to AI architectures on Azure.


Azure Cosmos DB and Azure Cognitive Services

Cognitive Services bring AI within reach of every developer—without requiring machine-learning expertise. All it takes is an API call to embed the ability to see, hear, speak, search, understand, and accelerate decision-making into your apps.

All those APIs return JSON documents, a native format for Azure Cosmos DB. AI applications can store those results in raw format by just adding a unique ID, which is required for every document in Azure Cosmos DB. There are three main reasons to save the results from Cognitive Services:

  • Reuse, which avoids the cost and latency of processing the same data over and over. An example is sentiment analysis: you don’t need to submit the same review to the Text Analytics API more than once.
  • History, for any kind of compliance requirement around data lineage.
  • Advanced analytics, now supported by the Azure Cosmos DB analytical store through Azure Synapse Link. An example would be an IoT scenario where you use the Anomaly Detector API and save the anomalies for reporting and analytics.
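The reuse case above can be sketched in a few lines of Python. This is a minimal illustration, not a production pattern: the `store` dictionary stands in for an Azure Cosmos DB container, `fake_sentiment_api` is a hypothetical placeholder for a Cognitive Services call, and deriving the document `id` from a hash of the input text is an assumption made here so that the same review always maps to the same document.

```python
import hashlib

# Stand-in for an Azure Cosmos DB container: maps document id -> document.
# In a real application these dictionary operations would be replaced by
# reads and upserts against a container.
store = {}

# Counts how often the (fake) API is actually called.
api_calls = {"count": 0}


def fake_sentiment_api(text):
    """Hypothetical placeholder for a Cognitive Services call.

    Returns a JSON-like dict, as the real APIs do.
    """
    api_calls["count"] += 1
    label = "positive" if "good" in text.lower() else "neutral"
    return {"sentiment": label}


def document_id(text):
    """Derive a stable unique ID from the input text (an assumption of this sketch)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def analyze_with_cache(text):
    """Return the stored result if present; otherwise call the API and store it."""
    doc_id = document_id(text)
    cached = store.get(doc_id)
    if cached is not None:
        return cached  # reuse: no second API call for the same input

    result = fake_sentiment_api(text)
    # Keep the raw result and add only the required unique ID plus the input.
    document = {"id": doc_id, "input": text, "result": result}
    store[doc_id] = document
    return document
```

Calling `analyze_with_cache` twice with the same review triggers only one call to the API; the second call is served from the stored document.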

Check out our video on IoT Anomaly Detection with Jupyter Notebooks support, using Python and Cognitive Services.


Azure Cognitive Search is a search-as-a-service cloud solution that gives developers APIs and tools for adding a rich search experience over private, heterogeneous content in web, mobile, and enterprise applications. It’s the Microsoft service for knowledge mining, which uses AI to detect entities and read text in images, among other human-like capabilities. The metadata created with AI is always loaded into a searchable index that allows full-text search with natural language processing. Azure Cosmos DB plays three roles:

  • Data Source, allowing data enrichment on top of the Azure Cosmos DB data.
  • Data store for the insights found within the enrichment pipeline, for uses other than search. This is accomplished with a Custom Skill, which lets you customize, filter, and store the data you need. This is a real-time, granular approach to saving Azure Cognitive Search data in Azure Cosmos DB.
  • Data store for enriched data, since the pipeline can export all metadata to a storage account in table or JSON format. Unlike the previous option, here all data is exported, and you will have to batch load it into Azure Cosmos DB using a tool like Azure Data Factory.
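The Custom Skill option can be sketched as a plain Python function. A custom skill receives a JSON body with a `values` array, where each record has a `recordId` and a `data` object, and it must return one result per `recordId`. Everything else here is an assumption for illustration: the `entities` input name depends on how the skillset maps its inputs, the `store` dictionary stands in for an Azure Cosmos DB container, and the web endpoint that would host this function is omitted.

```python
def run_custom_skill(request_body, store):
    """Filter each enriched record, store what is needed, and answer the skill call.

    `store` is a dict standing in for an Azure Cosmos DB container.
    """
    results = []
    for record in request_body.get("values", []):
        record_id = record["recordId"]
        data = record.get("data", {})

        # Hypothetical input name; the real one comes from the skillset definition.
        entities = data.get("entities", [])

        if entities:
            # Keep only the fields this application cares about.
            document = {
                "id": f"insight-{record_id}",
                "sourceRecordId": record_id,
                "entities": entities,
            }
            store[document["id"]] = document

        # The response must echo every recordId it received.
        results.append(
            {
                "recordId": record_id,
                "data": {"stored": bool(entities)},
                "errors": [],
                "warnings": [],
            }
        )
    return {"values": results}
```

Because the skill runs inside the enrichment pipeline, each record is saved as it is processed, which is what makes this approach granular compared with a batch export.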


Azure Cosmos DB and Bots

Azure Bot Service enables you to build intelligent, enterprise-grade bots with ownership and control of your data. From a simple Q&A bot to a sophisticated virtual assistant, the open source SDK connects your bot to popular channels and devices. You can also give your bot the ability to speak, listen, and understand your users, thanks to its native integration with Cognitive Services. Azure Cosmos DB plays three roles:

  • State management. While bots are stateless applications, conversations must be stateful to avoid losing context and forcing the user to repeat information already provided. Bot SDK v4 and later offer Azure Cosmos DB as a built-in option for this state management.
  • Data store for master data: login, password, previous transactions, current balance, etc.
  • Data store for new information: orders, user behavior logging, feedback, etc. A bot is a web application running your code, and you may want to use Cognitive Services to create an intelligent agent. Here you can apply the same suggestion from the first topic and save the results of the API calls in Azure Cosmos DB.
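The state management role can be illustrated with a small sketch. It does not use the Bot SDK or its built-in Cosmos DB storage; `InMemoryStateStore` and `handle_turn` are hypothetical names, and the in-memory dictionary stands in for a container that holds one state document per conversation. The point is only that the bot code itself keeps nothing between turns: it reads the state, acts on it, and writes it back.

```python
import copy


class InMemoryStateStore:
    """Stand-in for a Cosmos DB-backed store: one state document per conversation."""

    def __init__(self):
        self._items = {}

    def read(self, key):
        # Return a copy so callers cannot mutate stored state by accident.
        default = {"id": key, "state": {}}
        return copy.deepcopy(self._items.get(key, default))

    def write(self, key, document):
        self._items[key] = copy.deepcopy(document)


def handle_turn(store, conversation_id, user_message):
    """Process one message; all context comes from the store, not from the bot."""
    document = store.read(conversation_id)
    state = document["state"]

    if "name" in state:
        reply = f"Welcome back, {state['name']}. You said: {user_message}"
    elif state.get("asked_name"):
        # The previous turn asked for the name, so this message is the answer.
        state["name"] = user_message.strip()
        reply = f"Nice to meet you, {state['name']}."
    else:
        state["asked_name"] = True
        reply = "What is your name?"

    store.write(conversation_id, document)
    return reply
```

Since every turn starts from the stored document, any instance of the bot can serve the next message in a conversation without asking the user to repeat what they already said.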


Azure Cosmos DB in AI Architectures

AI applications can leverage key Azure Cosmos DB capabilities such as geographic distribution, multi-master writes, and performance SLAs to reduce costs and latency by keeping the data close to your users. Because of its performance, your solution does not need a separate cache service. In the reference AI architecture diagram below, you can see a geographically distributed intelligent bot using Azure Cosmos DB, Cognitive Services, and Azure Cognitive Search. For cloud-native HTAP (Hybrid Transactional and Analytical Processing) capabilities, the architecture also includes Azure Synapse Link for near real-time analytics with SQL serverless and Apache Spark for Azure Synapse Analytics.


Figure: Distributed Intelligent Bot Reference Architecture using Cognitive Services and Azure Cosmos DB.


Azure Cosmos DB is an important tool for AI engineers working on AI applications that ingest, store, and process semi-structured data. You can use the AI Engineer Learning Path and the Azure Cosmos DB Workshops to learn more about these technologies.

