Announcing the latest Azure Cosmos DB Python SDK: Powering the Future of AI with OpenAI

We’re thrilled to announce the stable release of Azure Cosmos DB Python SDK version 4.14.0! This release brings together months of innovation and collaboration, featuring ground-breaking capabilities that have been battle-tested in production environments. Many of these features were developed in close partnership with OpenAI, who rely heavily on Cosmos DB to store chat data for ChatGPT at massive scale.

What Makes This Release Special

After extensive beta testing, we’re proud to deliver a stable release that combines performance, intelligence, and developer productivity. The features in this release have been proven in real-world scenarios, including powering some of the most demanding AI workloads in the world.

🚀 Major New Features

Semantic Reranking – AI powered document intelligence (Preview)

One of the most exciting additions is our new Semantic Reranking API, currently in private preview, which brings AI-powered document reranking directly to your Cosmos DB containers. This feature uses Azure’s inference services to rank documents by semantic relevance. To be onboarded to the semantic reranking private preview, sign up here. For more information, contact us at CosmosDBSemanticReranker@Microsoft.com. Check out our demo sample here to test drive this and other semantic search features in Python for Azure Cosmos DB.

from azure.cosmos import CosmosClient

# Initialize your client
client = CosmosClient(endpoint, key)
container = client.get_database("MyDatabase").get_container("MyContainer")

# Perform semantic reranking
results = container.semantic_rerank(
    context="What is the capital of France?",
    documents=[
        "Berlin is the capital of Germany.",
        "Paris is the capital of France.", 
        "Madrid is the capital of Spain."
    ],
    options={
        "return_documents": True,
        "top_k": 10,
        "batch_size": 32,
        "sort": True
    }
)

# Results are intelligently ranked by relevance
print(results)
# Output:
# {
#   "Scores": [
#     {
#       "index": 1,
#       "document": "Paris is the capital of France.",
#       "score": 0.9921875
#     },
#     ...
#   ]
# }

This feature enables you to build more intelligent applications that can understand context and meaning, not just keyword matching. Perfect for RAG (Retrieval-Augmented Generation) patterns in AI applications.
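As a rough illustration of how reranked output might feed a RAG prompt, the sketch below filters a result shaped like the sample output above. The `select_context` helper, the 0.5 threshold, and the prompt wording are hypothetical. The sample dictionary is hand-written rather than returned by a live call, and its two low scores are made up for illustration.

```python
from typing import Any, Dict, List


def select_context(results: Dict[str, Any], min_score: float = 0.5, limit: int = 3) -> List[str]:
    """Return the highest-scoring documents at or above min_score (hypothetical helper)."""
    scores = results.get("Scores", [])
    # Sort defensively in case the reranker was called with "sort": False.
    ranked = sorted(scores, key=lambda entry: entry["score"], reverse=True)
    return [entry["document"] for entry in ranked if entry["score"] >= min_score][:limit]


# Hand-written sample in the shape shown above; the two low scores are illustrative only.
sample_results = {
    "Scores": [
        {"index": 1, "document": "Paris is the capital of France.", "score": 0.9921875},
        {"index": 0, "document": "Berlin is the capital of Germany.", "score": 0.02},
        {"index": 2, "document": "Madrid is the capital of Spain.", "score": 0.01},
    ]
}

context_docs = select_context(sample_results)
prompt = "Answer using only this context:\n" + "\n".join(context_docs)
print(prompt)
```

Only the Paris sentence clears the threshold here, so it is the only document passed on as context.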

Read Many Items – Optimized Batch Retrieval

The new read_items API revolutionizes how you retrieve multiple documents, offering significant performance improvements and cost savings over individual point reads.

# Define the items you want to retrieve
item_list = [
    ("item1", "partition1"),
    ("item2", "partition1"), 
    ("item3", "partition2")
]

# Retrieve all items in a single optimized request
items = list(container.read_items(
    items=item_list,
    max_concurrency=100
))

# The SDK intelligently groups items by partition and uses
# optimized backend queries (often IN clauses) to minimize
# network round trips and RU consumption

Performance Benefits:

Reduces network round trips by up to 90%
Lower RU consumption compared to individual reads
Intelligent query optimization based on partition distribution
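To make the grouping idea concrete, here is a minimal sketch of bucketing (id, partition key) pairs so that each bucket could be served by one request. It is a simplified illustration, not the SDK’s internal implementation, which may group by physical partition and differ in other details. `group_by_partition` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def group_by_partition(items: Sequence[Tuple[str, Hashable]]) -> Dict[Hashable, List[str]]:
    """Bucket item ids by partition key, preserving first-seen order."""
    grouped: Dict[Hashable, List[str]] = defaultdict(list)
    for item_id, partition_key in items:
        grouped[partition_key].append(item_id)
    return dict(grouped)


item_list = [("item1", "partition1"), ("item2", "partition1"), ("item3", "partition2")]
batches = group_by_partition(item_list)
print(batches)
print(f"{len(item_list)} point reads become {len(batches)} partition-scoped batches")
```

With the three items from the example above, the sketch produces two batches, one per partition key.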

Automatic Write Retries – Enhanced Resilience

Say goodbye to manual retry logic for write operations! The SDK now includes built-in retry capabilities for write operations that encounter transient failures.

# Enable retries at the client level
client = CosmosClient(
    endpoint, 
    key,
    connection_policy=ConnectionPolicy(retry_write=1)
)

# Or enable per-request
container.create_item(
    body=my_document,
    retry_write=1 # Automatic retry on timeouts/server errors
)

What Gets Retried:

Timeout errors (408)
Server errors (5xx status codes)
Transient connectivity issues

Smart Retry Logic:

Single-region accounts: One additional attempt to the same region
Multi-region accounts: Cross-regional failover capability
Patch operations require explicit opt-in due to potential non-idempotency
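For comparison, this is roughly the kind of hand-written retry wrapper that the built-in option is meant to replace. It is a sketch under stated assumptions: `TransientWriteError`, `is_retryable`, and `write_with_retries` are hypothetical names, the status-code check mirrors only the list above, and cross-regional failover is not modeled.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientWriteError(Exception):
    """Hypothetical stand-in for an HTTP failure carrying a status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"write failed with status {status_code}")
        self.status_code = status_code


def is_retryable(status_code: int) -> bool:
    # Mirrors the list above: timeouts (408) and server errors (5xx).
    return status_code == 408 or 500 <= status_code <= 599


def write_with_retries(operation: Callable[[], T], retries: int = 1, delay_seconds: float = 0.0) -> T:
    """Call operation, retrying up to `retries` extra times on retryable failures."""
    attempts = retries + 1
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientWriteError as error:
            if attempt == attempts or not is_retryable(error.status_code):
                raise
            time.sleep(delay_seconds)
    raise RuntimeError("unreachable when retries >= 0")


calls = {"count": 0}


def flaky_write() -> str:
    # Fails once with a timeout, then succeeds.
    calls["count"] += 1
    if calls["count"] == 1:
        raise TransientWriteError(408)
    return "created"


result = write_with_retries(flaky_write, retries=1)
print(result, calls["count"])
```

A wrapper like this blindly repeats the write, which is why non-idempotent operations such as patch need extra care.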

Enhanced Developer Experience

Client-Level Configuration Options

Custom User Agent: Identify your applications in telemetry:

# Set custom user agent suffix for better tracking
client = CosmosClient(
    endpoint, 
    key,
    user_agent_suffix="MyApplication/1.0"
)

Throughput Bucket Headers: Optimize performance monitoring (see here for more information on throughput buckets):

# Enable throughput bucket headers for detailed RU tracking
client = CosmosClient(
    endpoint, 
    key,
    throughput_bucket=2  # Set at client level
)

# Or set per request
container.create_item(
    body=document,
    throughput_bucket=2
)

Excluded Locations: Fine-tune regional preferences:

# Exclude specific regions at client level
client = CosmosClient(
    endpoint, 
    key,
    excluded_locations=["West US", "East Asia"]
)

# Or exclude regions for specific requests
container.read_item(
    item="item-id",
    partition_key="partition-key", 
    excluded_locations=["Central US"]
)

Return Properties with Container Operations

Streamline your workflows with the new return_properties parameter:

# Get both the container proxy and properties in one call
container, properties = database.create_container(
    id="MyContainer",
    partition_key=PartitionKey(path="/id"),
    return_properties=True
)

# Now you have immediate access to container metadata
print(f"Container RID: {properties['_rid']}")
print(f"Index Policy: {properties['indexingPolicy']}")

Feed Range Support in Queries

Unlock parallel query and change feed processing by scoping work to individual feed ranges:

# Get feed ranges for parallel processing
feed_ranges = container.read_feed_ranges()

# Query specific feed ranges for optimal parallelism
for feed_range in feed_ranges:
    items = container.query_items(
        query="SELECT * FROM c WHERE c.category = @category",
        parameters=[{"name": "@category", "value": "electronics"}],
        feed_range=feed_range
    )
    # Consume this range's results before moving on to the next range
    for item in items:
        print(item["id"])
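The loop above visits feed ranges one after another. One way to fan the same work out across threads is sketched below. It assumes the per-range query is safe to call concurrently. `query_ranges_in_parallel` is a hypothetical helper, and the feed ranges and data are stand-ins so the example runs without a Cosmos DB account.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, Iterable, List

FeedRange = Dict[str, Any]  # placeholder type; treat real feed ranges as opaque values


def query_ranges_in_parallel(
    feed_ranges: Iterable[FeedRange],
    run_query: Callable[[FeedRange], Iterable[Dict[str, Any]]],
    max_workers: int = 4,
) -> List[Dict[str, Any]]:
    """Run run_query once per feed range on a thread pool and merge the results."""
    ranges = list(feed_ranges)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in the same order as the input ranges.
        per_range = list(pool.map(lambda fr: list(run_query(fr)), ranges))
    return [item for chunk in per_range for item in chunk]


# Stand-in data so the sketch runs without a Cosmos DB account.
fake_ranges = [{"name": "range-0"}, {"name": "range-1"}]
fake_store = {
    "range-0": [{"id": "a", "category": "electronics"}],
    "range-1": [{"id": "b", "category": "electronics"}, {"id": "c", "category": "electronics"}],
}


def fake_query(feed_range: FeedRange) -> List[Dict[str, Any]]:
    return fake_store[feed_range["name"]]


merged = query_ranges_in_parallel(fake_ranges, fake_query)
print([item["id"] for item in merged])
```

In a real application, `run_query` would wrap a call such as `container.query_items(..., feed_range=fr)`.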

Enhanced Change Feed: More flexible change feed processing:

from datetime import datetime, timedelta, timezone

# New change feed mode support for fine-grained control
change_feed_iter = container.query_items_change_feed(
    feed_range=feed_range,
    mode="LatestVersion",  # incremental changes; the SDK also defines "AllVersionsAndDeletes"
    start_time=datetime.now(timezone.utc) - timedelta(hours=1)
)

Vector Embedding Policy Management

Enhanced support for AI workloads with vector embedding policy updates:

# Update indexing policy for containers with vector embeddings
indexing_policy = {
    "indexingMode": "consistent",
    "vectorIndexes": [
        {
            "path": "/vector",
            "type": "quantizedFlat"
        }
    ]
}

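# Now you can replace indexing policies even when vector embeddings are present.
# replace_container is a database-level call and requires the existing partition key.
database.replace_container(
    container=container_properties,
    partition_key=PartitionKey(path="/id"),  # assumed path; must match the container
    indexing_policy=indexing_policy
)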

Advanced Query Capabilities

Weighted RRF for Hybrid Search: Enhance your search relevance with Reciprocal Rank Fusion:

# Use weighted RRF in hybrid search queries
query = """
SELECT c.id, c.title, c.content 
FROM c 
WHERE CONTAINS(c.title, "machine learning") 
ORDER BY RANK RRF(VectorDistance(c.embedding, @vector), 
                  FullTextScore(c.content, "artificial intelligence"), 
                  [0.7, 0.3])
"""

items = container.query_items(query=query, parameters=[
    {"name": "@vector", "value": search_vector}
])
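To show what the weights do, the sketch below computes a weighted Reciprocal Rank Fusion by hand. It uses the textbook formula, score(d) = sum of weight_i / (k + rank_i(d)), with the conventional k = 60. The constant and other details used by Cosmos DB are not stated in this post, so treat both as assumptions. The document ids and rankings are made up for illustration.

```python
from typing import Dict, List, Sequence


def weighted_rrf(rankings: Sequence[Sequence[str]], weights: Sequence[float], k: int = 60) -> List[str]:
    """Fuse ranked lists: score(d) = sum_i weights[i] / (k + rank_i(d)), ranks starting at 1."""
    if len(rankings) != len(weights):
        raise ValueError("need exactly one weight per ranking")
    scores: Dict[str, float] = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


vector_ranking = ["doc-a", "doc-b", "doc-c"]  # best vector match first
text_ranking = ["doc-c", "doc-a", "doc-b"]    # best full-text match first

print(weighted_rrf([vector_ranking, text_ranking], [0.7, 0.3]))
print(weighted_rrf([vector_ranking, text_ranking], [0.5, 0.5]))
```

With weights [0.7, 0.3] the fused order is doc-a, doc-b, doc-c, because the vector ranking dominates. With equal weights doc-c moves ahead of doc-b, since its first-place full-text rank now counts for more.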

Computed Properties (Now GA)

Computed Properties have graduated from preview to general availability:

# Define computed properties for efficient querying
computed_properties = [
    {
        "name": "lowerCaseName", 
        "query": "SELECT VALUE LOWER(c.name) FROM c"
    }
]

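# Replace container with computed properties.
# replace_container is a database-level call and requires the existing partition key.
database.replace_container(
    container=container_properties,
    partition_key=PartitionKey(path="/id"),  # assumed path; must match the container
    computed_properties=computed_properties
)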

# Query using computed properties for better performance
items = container.query_items(
    query="SELECT * FROM c WHERE c.lowerCaseName = 'john doe'"
)

Reliability and Performance Improvements

Advanced Session Management

The SDK now includes sophisticated session token management:

Automatically optimizes session tokens
Sends only relevant partition-local tokens for reads
Eliminates unnecessary session tokens for single-region writes
Improves performance and reduces request size

Circuit Breaker Support

Enable partition-level circuit breakers for enhanced fault tolerance:

import os

# Enable circuit breaker via environment variable
os.environ['AZURE_COSMOS_ENABLE_CIRCUIT_BREAKER'] = 'true'

# The SDK will automatically isolate failing partitions
# while keeping healthy partitions available
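For readers unfamiliar with the pattern, here is a minimal sketch of a per-partition circuit breaker. It shows the general technique only. The class name, failure threshold, and cooldown are hypothetical, and the SDK’s actual thresholds and recovery logic are not described in this post.

```python
import time
from typing import Dict, Optional


class PartitionCircuitBreaker:
    """Track failures per partition and stop routing to a partition once it trips."""

    def __init__(self, failure_threshold: int = 3, cooldown_seconds: float = 30.0) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._failures: Dict[str, int] = {}
        self._opened_at: Dict[str, float] = {}

    def record_failure(self, partition_id: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._failures[partition_id] = self._failures.get(partition_id, 0) + 1
        if self._failures[partition_id] >= self.failure_threshold:
            self._opened_at[partition_id] = now  # open (or re-open) the circuit

    def record_success(self, partition_id: str) -> None:
        # A success closes the circuit and clears the failure count.
        self._failures.pop(partition_id, None)
        self._opened_at.pop(partition_id, None)

    def is_available(self, partition_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        opened_at = self._opened_at.get(partition_id)
        if opened_at is None:
            return True
        # After the cooldown, let a probe request through (half-open state).
        return now - opened_at >= self.cooldown_seconds


breaker = PartitionCircuitBreaker(failure_threshold=2, cooldown_seconds=10.0)
breaker.record_failure("p0", now=0.0)
breaker.record_failure("p0", now=1.0)        # second failure trips the breaker for p0
print(breaker.is_available("p0", now=2.0))   # p0 is isolated
print(breaker.is_available("p1", now=2.0))   # healthy partitions are unaffected
print(breaker.is_available("p0", now=11.0))  # cooldown elapsed, probe allowed
```

The explicit `now` arguments exist only to make the example deterministic. Production code would rely on the monotonic clock default.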

Enhanced Error Handling

More resilient retry logic with cross-regional capabilities.

Monitoring and Diagnostics

Enhanced Logging and Diagnostics

Automatic failover improvements:

Better handling of bounded staleness consistency
Cross-region retries when no preferred locations are set
Improved database account call resilience

import logging
from azure.cosmos import CosmosClient, CosmosHttpLoggingPolicy

# Set up enhanced logging
logging.basicConfig(level=logging.INFO)
client = CosmosClient(
    endpoint,
    key,
    logging_policy=CosmosHttpLoggingPolicy(logger=logging.getLogger())
)

The OpenAI Connection

Many of these features were developed in collaboration with OpenAI, who use Cosmos DB extensively for ChatGPT’s data storage needs. This partnership ensures our SDK can handle:

Massive Scale: Billions of operations per day
Low Latency: Sub-10ms response times for AI workloads
High Availability: 99.999% uptime requirements
Global Distribution: Seamless worldwide data replication

When you use the Python SDK for Azure Cosmos DB, you’re leveraging the same technology that powers some of the world’s most advanced AI applications.

Real-World Impact

Performance Benchmarks

Based on testing with synthetic workloads:

Read Many Items: Up to 85% reduction in latency for batch retrieval scenarios
Write Retries: 99.5% reduction in transient failure impact
Session Optimization: 60% reduction in session token overhead
Circuit Breaker: 90% faster recovery from partition-level failures

Cost Optimization

Reduced RU Consumption: Batch operations can reduce costs by up to 40%
Fewer Network Calls: Significant bandwidth savings in high-throughput scenarios
Optimized Retries: Intelligent retry logic prevents unnecessary RU charges

Breaking Changes (Important!)

If you have been using beta versions of the Python SDK released since the last stable version (4.9.0), there is one breaking change:

Changed `retry_write` Parameter Type

# Before (4.13.x and earlier)
retry_write = True  # boolean

# After (4.14.0)
retry_write = 3  # integer (number of retries)

This change aligns with other retry configuration options and provides more granular control.

Migration Guide

Upgrading from any beta higher than 4.9.0 to 4.14.0

Update your dependencies:
```
pip install azure-cosmos==4.14.0
```

Update retry_write usage (if applicable):

# Old way
client = CosmosClient(endpoint, key, retry_write=True)

# New way  
client = CosmosClient(endpoint, key, retry_write=3)
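If retry settings come from configuration files written for the beta releases, a small shim can convert the old boolean into the new integer form before it reaches the client. This is a sketch. `normalize_retry_write` is a hypothetical helper, and mapping True to a single retry and False to zero is an assumption, so pick the count that suits your workload.

```python
from typing import Union


def normalize_retry_write(value: Union[bool, int]) -> int:
    """Convert a legacy boolean retry_write setting to an integer retry count."""
    # bool is a subclass of int in Python, so it has to be checked first.
    if isinstance(value, bool):
        return 1 if value else 0  # assumption: True meant "retry once"
    if value < 0:
        raise ValueError("retry_write must be zero or a positive integer")
    return value


print(normalize_retry_write(True))
print(normalize_retry_write(3))
```

The explicit boolean check matters because Python would otherwise accept True as the integer 1 silently, which hides the fact that the setting still uses the old type.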

Leverage new features (optional but recommended):
- Take advantage of read_items for batch operations
- Enable automatic write retries for resilience
- Use return_properties to reduce API calls

What’s Next?

This release establishes the foundation for even more exciting AI-focused features coming in future versions:

Enhanced vector search capabilities
Advanced semantic search integration
Expanded AI inference service integrations
Performance optimizations for RAG patterns

Additional Resources

Full Changelog – Complete list of changes including bug fixes since 4.9.0
SDK Documentation – Comprehensive API reference
Sample Code – Working examples for all new features
Migration Guide – Step-by-step upgrade instructions

Get Involved

Have feedback or questions? We’d love to hear from you!

GitHub Issues: Report bugs or request features
Stack Overflow: Tag your questions with azure-cosmosdb and python
Documentation: Contribute to our docs

Ready to upgrade? Install Azure Cosmos DB Python SDK v4.14.0 today and experience the power of AI-enhanced database operations!

pip install --upgrade azure-cosmos==4.14.0

The future of AI-powered applications starts with the right data foundation. With the latest Cosmos DB Python SDK, you have the tools to build intelligent, scalable, and resilient applications that can handle anything the world throws at them.
