June 2nd, 2026
0 reactions

Announcing the Public Preview of Semantic Reranker in Azure Cosmos DB for NoSQL

Principal Product Manager

Today we’re thrilled to announce the public preview of Semantic Reranker in Azure Cosmos DB for NoSQL,a new AI-powered capability that improves the relevancy of your search results with just a few lines of code. If you’ve ever run a vector, full-text, or hybrid search and wished the most relevant documents bubbled to the very top, this one’s for you.

Semantic Reranker uses an AI model to score and reorder the results of any query based on how well each document matches the user’s intent. It’s built right into the Azure Cosmos DB SDKs (Python, .NET, and Java), so you can reorder results from any container with minimal code changes. Whether you’re building a search experience or grounding an agent with Retrieval Augmented Generation (RAG), better ranking means better answers.

 

Infographic explaining Azure Cosmos DB Semantic Reranker, an AI-powered feature that improves the relevance of search results. The diagram shows how vector search, full-text search, hybrid search, and other query types produce initial search results that are then reranked by an AI semantic reranker model. The reranked results receive updated relevance scores between 0 and 1 based on semantic context. The infographic highlights benefits including improved result quality, SDK integration for .NET, Python, and JavaScript, minimal code changes, and support for flexible application scenarios. The design uses Azure Cosmos DB branding with blue and purple icons, workflow arrows, and AI-themed visuals.

What is it?

Semantic Reranker takes the results from a query you’ve already run (e.g., vector search, full-text search, hybrid search, or any other query) and uses an AI model to rescore and reorder them based on relevancy to a user’s search phrase or context. Under the hood it uses the Microsoft AI Semantic Ranker model.

Each reranked result comes back with a few useful pieces of information:

  • Original documents: the same documents from your query, reordered by semantic relevance. Content and metadata are unchanged.
  • Relevance score: a score from 0 to 1 for each document,the higher the score, the more relevant the result. Great for filtering or thresholding.
  • Inference latency: the time the service spent on the rerank request.
  • Token usage: the number of tokens consumed by the reranking request.

Why is this important?

When you evaluate a search or retrieval system, a few metrics shape the end-user experience: accuracy/recall (how closely results match the ground truth), latency (how fast results come back), and relevancy (how well results match the user’s actual intent).

Vector and hybrid search are great at recall, they find the candidate documents. But the order of those candidates doesn’t always reflect what the user really wants. Consider a search for “best surf spots in Hawaii”: a curated surf guide is far more relevant than a general Hawaii travel article, even if both score similarly on raw similarity. That’s exactly the gap Semantic Reranker closes.

Why developers will love it:

  • Enhanced result quality: AI-powered semantic understanding surfaces higher-quality results near the top, improving the end-user experience and enriching agents with more relevant context for RAG.
  • Seamless SDK integration: works with your existing vector, full-text, and hybrid search queries.
  • Minimal code changes: a single call on the container you’re already using.
  • Flexible application: rerank results from any query against any container.

A note on tradeoffs: reranking adds another network call plus model inference time, so it increases overall search latency. It also won’t help every workload equally. Our advice: measure relevance before and after enabling the reranker to confirm it delivers a measurable win for your scenario.

How does it work?

Set up Semantic Reranker

Getting started takes just a few steps in the Azure portal:

  1. Go to your Azure Cosmos DB account in the Azure portal and find the Semantic Reranker setup experience in the menu under AI Capabilities.

Screenshot of the Azure portal showing the Semantic Reranker feature page for an Azure Cosmos DB account. The interface displays a toggle switch for enabling or disabling Semantic Reranker, which refines search results from full-text, feed, vector, or standard queries using semantic similarity. The feature is currently marked as disabled and inactive. The left navigation menu includes Azure Cosmos DB sections such as Data Explorer, Container Copy, AI Capabilities (Preview), and Monitoring. Pricing information for reranker calls is also displayed in the main panel.

  1. Review the preview information and enable the feature for your resource.
  2. Configure the reranker settings for your account and workload.

Screenshot of the Azure portal showing the Semantic Reranker feature enabled for an Azure Cosmos DB account. The interface displays an active semantic reranking endpoint URL and a toggle switch turned on. The feature description explains that Semantic Reranker refines full-text, feed, vector, and standard query results using semantic similarity. Below, a “Reranker Permissions” section allows administrators to grant or revoke reranker access for identities with query permissions. A list of user identities is shown, along with buttons for managing permissions. The left navigation menu includes Azure Cosmos DB sections such as Data Explorer, Container Copy, AI Capabilities (Preview), and Monitoring.

  1. Review your configuration and save your changes.

Heads up: Semantic Reranker is in preview. After you enable it in the portal, allow up to 1 hour for activation.

Next, assign the appropriate roles to the identities that configure or call the feature. Semantic Reranker uses account-level RBAC roles, grant each identity only what it needs:

Role What it allows
Inference Account Operator Enables and disables Semantic Reranker on an account, but doesn’t allow runtime reranker calls.
Inference Account Owner Enables and disables Semantic Reranker on an account and allows runtime reranker calls.
Semantic Reranker User Allows an application, managed identity, service principal, or user to run Semantic Reranker queries against an account.

Using it from the SDK

The pattern is simple: run your query as you normally would, then pass the results plus a reranking context (the user’s question or task) to the reranker. The context represents user intent; the documents are the candidate results from vector, full-text, hybrid, or any other search.

Here’s a Python example. It runs a full-text search, then reranks the results against a context string:

import json
import os
from azure.cosmos import CosmosClient
from azure.identity import DefaultAzureCredential

os.environ["AZURE_COSMOS_SEMANTIC_RERANKER_INFERENCE_ENDPOINT"] = "https://mytestaccount.eastus2.dbinference.azure.com"

endpoint = "https://mytestaccount.documents.azure.com:443/"
database_name = "testdatabase"
container_name = "testcontainer"
credential = DefaultAzureCredential()

client = CosmosClient(endpoint, credential=credential)
database = client.get_database_client(database_name)
container = database.get_container_client(container_name)

query = """

SELECT TOP 15 c.id, c.Title, c.Studio, c.Description, c.YearReleased
FROM c
WHERE FullTextContainsAny(c.Description, "Sandra Bullock", "Johnny Depp", "comedy")
ORDER BY RANK FullTextScore(c.Description, "comedy")

"""

documents = list(container.query_items(
query=query, enable_cross_partition_query=True))

reranking_context = "audience rated PG-13 and public sentiment strong and positive"

reranked_results = container.semantic_rerank(context=reranking_context, documents=[json.dumps(document) for document in documents],

options={
"return_documents": True,
"top_k": 5,
"sort": True,
"document_type": "json",
"target_paths": "Description",
})

for score in reranked_results["Scores"]:
     print(f"index: {score['index']}, score: {score['score']}, document: {score['document']}")

The reranker accepts a context string and a list of candidate documents, along with a few optional knobs to shape the output:

  • return_documents – include the original documents in the response (default true).
  • top_k – limit the number of reranked results returned.
  • sort – sort the response by reranking score.
  • document_type – identify the document format, such as json.
  • target_paths – identify the document field(s) to consider for reranking.

The same conceptual inputs are available in the .NET and Java SDKs with language-specific parameter names, so you can drop reranking into whichever stack you’re already building on.

Limitations

A few things to keep in mind while Semantic Reranker is in preview:

  • Semantic Reranker supports a maximum of 50 documents per rerank call.
  • A single context-document pair should be at most 2,048 tokens.
  • Reranking adds a network call and model inference time, so it increases overall search latency,always measure relevance before and after to confirm the benefit.
  • Semantic Reranker is $1 USD per 1,000 rerank calls. Regional and cloud pricing may vary.
  • Supported SDKs are: .NET (preview), Python, and Java.

Get started

Ready to make your search results more relevant? Enable Semantic Reranker on your Azure Cosmos DB account, update to the latest SDK, and add a single rerank call to your existing query pipeline. Then measure the lift,we think you’ll like what you see.

Dive into the docs to learn more:

About Azure Cosmos DB

Azure Cosmos DB is a fully managed and serverless NoSQL and vector database for modern app development, including AI applications. With its SLA-backed speed and availability as well as instant dynamic scalability, it is ideal for real-time NoSQL and MongoDB applications that require high performance and distributed computing over massive volumes of NoSQL and vector data.

To stay in the loop on Azure Cosmos DB updates, follow us on XYouTube, and LinkedIn.  Join the discussion with other developers on the #nosql channel on the Microsoft Open Source Discord.

Author

James Codella
Principal Product Manager

0 comments