We’re thrilled to announce the stable release of Azure Cosmos DB Python SDK version 4.14.0! This release brings together months of innovation and collaboration, featuring ground-breaking capabilities that have been battle-tested in production environments. Many of these features were developed in close partnership with OpenAI, who rely heavily on Cosmos DB to store chat data for ChatGPT at massive scale.
What Makes This Release Special
After extensive beta testing, we’re proud to deliver a stable release that combines performance, intelligence, and developer productivity. The features in this release have been proven in real-world scenarios, including powering some of the most demanding AI workloads in the world.
🚀 Major New Features
Semantic Reranking – AI-powered document intelligence (Private Preview)
One of the most exciting additions is the new Semantic Reranking API, currently in private preview, which brings AI-powered document reranking directly to your Cosmos DB containers. It uses Azure’s inference services to rank documents by semantic relevance. To join the semantic reranking private preview, sign up here. For more information, contact us at CosmosDBSemanticReranker@Microsoft.com. Check out our demo sample here to test drive this and other semantic search features in Python for Azure Cosmos DB.
from azure.cosmos import CosmosClient
# Initialize your client
client = CosmosClient(endpoint, key)
container = client.get_database_client("MyDatabase").get_container_client("MyContainer")
# Perform semantic reranking
results = container.semantic_rerank(
    context="What is the capital of France?",
    documents=[
        "Berlin is the capital of Germany.",
        "Paris is the capital of France.",
        "Madrid is the capital of Spain."
    ],
    options={
        "return_documents": True,
        "top_k": 10,
        "batch_size": 32,
        "sort": True
    }
)
# Results are ranked by relevance
print(results)
# Output:
# {
#     "Scores": [
#         {
#             "index": 1,
#             "document": "Paris is the capital of France.",
#             "score": 0.9921875
#         },
#         ...
#     ]
# }
This feature enables you to build more intelligent applications that can understand context and meaning, not just keyword matching. Perfect for RAG (Retrieval-Augmented Generation) patterns in AI applications.
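As a rough illustration of how reranked results might feed a RAG prompt, the sketch below picks the best-scoring documents from a response shaped like the sample output above. The helper name `top_documents`, the prompt wording, and the two lower scores in the sample data are assumptions made for this example, not part of the SDK.

```python
from typing import Any


def top_documents(rerank_response: dict[str, Any], limit: int = 3,
                  min_score: float = 0.0) -> list[str]:
    """Return the highest-scoring document texts from a rerank-style response."""
    scores = rerank_response.get("Scores", [])
    # Sort defensively in case the response was requested without sorting.
    ranked = sorted(scores, key=lambda entry: entry["score"], reverse=True)
    return [entry["document"] for entry in ranked if entry["score"] >= min_score][:limit]


# Hypothetical response; only the first entry mirrors the sample output above.
sample_response = {
    "Scores": [
        {"index": 1, "document": "Paris is the capital of France.", "score": 0.9921875},
        {"index": 0, "document": "Berlin is the capital of Germany.", "score": 0.12},
        {"index": 2, "document": "Madrid is the capital of Spain.", "score": 0.08},
    ]
}

context_block = "\n".join(top_documents(sample_response, limit=2))
prompt = (
    "Answer using only this context:\n"
    f"{context_block}\n\n"
    "Question: What is the capital of France?"
)
```

Trimming the context to the top few documents keeps the prompt short while still grounding the model in the most relevant text.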
Read Many Items – Optimized Batch Retrieval
The new read_items API revolutionizes how you retrieve multiple documents, offering significant performance improvements and cost savings over individual point reads.
# Define the items you want to retrieve as (item id, partition key) pairs
item_list = [
    ("item1", "partition1"),
    ("item2", "partition1"),
    ("item3", "partition2")
]
# Retrieve all items in a single optimized request
items = list(container.read_items(
    item_list=item_list,
    max_degree_of_parallelism=4,
    max_items_per_batch=100
))
# The SDK intelligently groups items by partition and uses
# optimized backend queries (often IN clauses) to minimize
# network round trips and RU consumption
Performance Benefits:
- Reduces network round trips by up to 90%
- Lower RU consumption compared to individual reads
- Intelligent query optimization based on partition distribution
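To make the grouping idea concrete, here is a minimal sketch of how a list of (id, partition key) pairs could be split into per-partition batches. It is a conceptual illustration only: the function `plan_batches` is hypothetical, and the SDK's real grouping logic is internal and may differ.

```python
from collections import defaultdict
from typing import Hashable, Iterator


def plan_batches(
    item_list: list[tuple[str, Hashable]],
    max_items_per_batch: int = 100,
) -> Iterator[tuple[Hashable, list[str]]]:
    """Group item ids by partition key, then split each group into batches."""
    by_partition: dict[Hashable, list[str]] = defaultdict(list)
    for item_id, partition_key in item_list:
        by_partition[partition_key].append(item_id)
    for partition_key, ids in by_partition.items():
        for start in range(0, len(ids), max_items_per_batch):
            yield partition_key, ids[start:start + max_items_per_batch]


item_list = [
    ("item1", "partition1"),
    ("item2", "partition1"),
    ("item3", "partition2"),
]
batches = list(plan_batches(item_list))
```

In this example three point reads collapse into two batches, one per partition, which is where the savings in round trips come from.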
Automatic Write Retries – Enhanced Resilience
Say goodbye to manual retry logic for write operations! The SDK now includes built-in retry capabilities for write operations that encounter transient failures.
# Enable retries at the client level
client = CosmosClient(
    endpoint,
    key,
    retry_write=1
)
# Or enable per-request
container.create_item(
    body=my_document,
    retry_write=1  # Automatic retry on timeouts/server errors
)
What Gets Retried:
- Timeout errors (408)
- Server errors (5xx status codes)
- Transient connectivity issues
Smart Retry Logic:
- Single-region accounts: One additional attempt to the same region
- Multi-region accounts: Cross-regional failover capability
- Patch operations require explicit opt-in due to potential non-idempotency
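For readers who want to see the shape of this behavior, the sketch below mimics it with plain Python: a wrapper that retries an operation a bounded number of times when it fails with a 408 or 5xx status. `TransientError`, `is_retryable`, and `write_with_retries` are hypothetical names for illustration; the SDK's own retry policy is internal and also handles region failover, which this sketch does not.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for an HTTP error carrying a status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"request failed with status {status_code}")
        self.status_code = status_code


def is_retryable(status_code: int) -> bool:
    """Timeouts (408) and server errors (5xx) are treated as retryable."""
    return status_code == 408 or 500 <= status_code <= 599


def write_with_retries(operation: Callable[[], T], retry_write: int = 1,
                       delay_seconds: float = 0.0) -> T:
    """Run operation, retrying up to retry_write extra times on retryable errors."""
    attempts = 0
    while True:
        try:
            return operation()
        except TransientError as error:
            if attempts >= retry_write or not is_retryable(error.status_code):
                raise
            attempts += 1
            time.sleep(delay_seconds)


# Simulated write that times out once, then succeeds.
calls = {"count": 0}


def flaky_write() -> dict:
    calls["count"] += 1
    if calls["count"] == 1:
        raise TransientError(408)
    return {"id": "doc-1"}


result = write_with_retries(flaky_write, retry_write=1)
```

A retried write may reach the server twice if the first attempt actually succeeded before timing out, which is why non-idempotent operations such as patch need extra care.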
Enhanced Developer Experience
Client-Level Configuration Options
Custom User Agent: Identify your applications in telemetry:
# Set custom user agent suffix for better tracking
client = CosmosClient(
    endpoint,
    key,
    user_agent_suffix="MyApplication/1.0"
)
Throughput Bucket Headers: Optimize performance monitoring (see here for more information on throughput buckets):
# Enable throughput bucket headers for detailed RU tracking
client = CosmosClient(
    endpoint,
    key,
    throughput_bucket=2  # Set at client level
)
# Or set per request
container.create_item(
    body=document,
    throughput_bucket=2
)
Excluded Locations: Fine-tune regional preferences:
# Exclude specific regions at client level
client = CosmosClient(
    endpoint,
    key,
    excluded_locations=["West US", "East Asia"]
)
# Or exclude regions for specific requests
container.read_item(
    item="item-id",
    partition_key="partition-key",
    excluded_locations=["Central US"]
)
Return Properties with Container Operations
Streamline your workflows with the new return_properties parameter:
from azure.cosmos import PartitionKey
# Get both the container proxy and properties in one call
container, properties = database.create_container(
    id="MyContainer",
    partition_key=PartitionKey(path="/id"),
    return_properties=True
)
# Now you have immediate access to container metadata
print(f"Container RID: {properties['_rid']}")
print(f"Index Policy: {properties['indexingPolicy']}")
Feed Range Support in Queries
Scope queries to individual feed ranges so a container can be processed in parallel:
# Get feed ranges for parallel processing
feed_ranges = container.get_feed_ranges()
# Query specific feed ranges for optimal parallelism
for feed_range in feed_ranges:
    items = container.query_items(
        query="SELECT * FROM c WHERE c.category = @category",
        parameters=[{"name": "@category", "value": "electronics"}],
        feed_range=feed_range
    )
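The loop above visits feed ranges one after another. A minimal sketch of fanning the same work out across threads might look like the following; `process_ranges_in_parallel` is a hypothetical helper, and `fake_query` stands in for a real per-range call such as `list(container.query_items(..., feed_range=feed_range))`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable


def process_ranges_in_parallel(
    feed_ranges: Iterable[Any],
    query_one_range: Callable[[Any], list],
    max_workers: int = 4,
) -> list:
    """Run query_one_range for each feed range on a thread pool and merge the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input feed ranges.
        per_range = pool.map(query_one_range, feed_ranges)
        return [item for items in per_range for item in items]


# Stand-in data: two pretend feed ranges and the items each would return.
fake_data = {
    "range-0": [{"id": "1"}],
    "range-1": [{"id": "2"}, {"id": "3"}],
}


def fake_query(feed_range: str) -> list:
    return fake_data[feed_range]


all_items = process_ranges_in_parallel(list(fake_data), fake_query)
```

Threads suit this workload because each per-range query spends most of its time waiting on the network.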
Enhanced Change Feed: More flexible change feed processing:
from datetime import datetime, timedelta, timezone
# New change feed mode support for fine-grained control
change_feed_iter = container.query_items_change_feed(
    feed_range=feed_range,
    mode="LatestVersion",  # or "AllVersionsAndDeletes"
    start_time=datetime.now(timezone.utc) - timedelta(hours=1)
)
Vector Embedding Policy Management
Enhanced support for AI workloads with vector embedding policy updates:
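from azure.cosmos import PartitionKey
# Update indexing policy for containers with vector embeddings
indexing_policy = {
    "indexingMode": "consistent",
    "vectorIndexes": [
        {
            "path": "/vector",
            "type": "quantizedFlat"
        }
    ]
}
# Now you can replace indexing policies even when vector embeddings are present.
# replace_container is called on the database; partition_key must match the
# container's existing partition key ("/id" is assumed here).
database.replace_container(
    container=container,
    partition_key=PartitionKey(path="/id"),
    indexing_policy=indexing_policy
)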
Advanced Query Capabilities
Weighted RRF for Hybrid Search: Enhance your search relevance with Reciprocal Rank Fusion:
# Use weighted RRF in hybrid search queries
query = """
SELECT c.id, c.title, c.content
FROM c
WHERE CONTAINS(c.title, "machine learning")
ORDER BY RANK RRF(VectorDistance(c.embedding, @vector),
                  FullTextScore(c.content, "artificial intelligence"),
                  [0.7, 0.3])
"""
items = container.query_items(query=query, parameters=[
    {"name": "@vector", "value": search_vector}
])
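To show what the weights do, here is a small client-side sketch of weighted Reciprocal Rank Fusion: each document scores the sum of weight / (k + rank) across the input rankings. The function `weighted_rrf`, the sample document ids, and the smoothing constant k = 60 (a common convention for RRF) are assumptions for illustration; the service computes its own fusion, and its internal constant is not stated in this post.

```python
def weighted_rrf(rankings: list[list[str]], weights: list[float], k: int = 60) -> list[str]:
    """Fuse ranked lists: score(doc) = sum over lists of weight / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical rankings from a vector search and a full-text search.
vector_ranking = ["doc-a", "doc-b", "doc-c"]
text_ranking = ["doc-c", "doc-a", "doc-b"]

fused_weighted = weighted_rrf([vector_ranking, text_ranking], weights=[0.7, 0.3])
fused_equal = weighted_rrf([vector_ranking, text_ranking], weights=[1.0, 1.0])
```

With equal weights, doc-c overtakes doc-b thanks to its top full-text rank; weighting the vector ranking at 0.7 restores the vector order, which is the kind of tuning the weights array allows.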
Computed Properties (Now GA)
Computed Properties have graduated from preview to general availability:
from azure.cosmos import PartitionKey
# Define computed properties for efficient querying
computed_properties = [
    {
        "name": "lowerCaseName",
        "query": "SELECT VALUE LOWER(c.name) FROM c"
    }
]
# Replace container with computed properties.
# replace_container is called on the database; partition_key must match the
# container's existing partition key ("/id" is assumed here).
database.replace_container(
    container=container,
    partition_key=PartitionKey(path="/id"),
    computed_properties=computed_properties
)
# Query using computed properties for better performance
items = container.query_items(
    query="SELECT * FROM c WHERE c.lowerCaseName = 'john doe'"
)
Reliability and Performance Improvements
Advanced Session Management
The SDK now includes sophisticated session token management:
- Automatically optimizes session tokens
- Sends only relevant partition-local tokens for reads
- Eliminates unnecessary session tokens for single-region writes
- Improves performance and reduces request size
Circuit Breaker Support
Enable partition-level circuit breakers for enhanced fault tolerance:
import os
# Enable circuit breaker via environment variable
os.environ['AZURE_COSMOS_ENABLE_CIRCUIT_BREAKER'] = 'true'
# The SDK will automatically isolate failing partitions
# while keeping healthy partitions available
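As a mental model of partition-level isolation, the sketch below tracks consecutive failures per partition and marks a partition as unavailable once a threshold is reached. The class `PartitionCircuitBreaker` and its threshold are hypothetical; the SDK's actual thresholds and recovery probing are internal and not described here.

```python
class PartitionCircuitBreaker:
    """Toy circuit breaker keyed by partition id (conceptual sketch only)."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold
        self._consecutive_failures: dict[str, int] = {}

    def record_success(self, partition_id: str) -> None:
        # Any success resets the failure streak and closes the circuit.
        self._consecutive_failures[partition_id] = 0

    def record_failure(self, partition_id: str) -> None:
        self._consecutive_failures[partition_id] = (
            self._consecutive_failures.get(partition_id, 0) + 1
        )

    def is_open(self, partition_id: str) -> bool:
        """True when requests to this partition should be routed elsewhere."""
        return self._consecutive_failures.get(partition_id, 0) >= self.failure_threshold


breaker = PartitionCircuitBreaker(failure_threshold=2)
breaker.record_failure("range-0")
breaker.record_failure("range-0")
breaker.record_success("range-1")
```

Because state is kept per partition, a failing partition trips its own circuit without affecting requests to healthy ones.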
Enhanced Error Handling
More resilient retry logic with cross-regional capabilities.
Monitoring and Diagnostics
Enhanced Logging and Diagnostics
Automatic failover improvements:
- Better handling of bounded staleness consistency
- Cross-region retries when no preferred locations are set
- Improved database account call resilience
import logging
from azure.cosmos import CosmosClient, CosmosHttpLoggingPolicy
# Set up enhanced logging
logging.basicConfig(level=logging.INFO)
client = CosmosClient(
    endpoint,
    key,
    logging_policy=CosmosHttpLoggingPolicy(logger=logging.getLogger())
)
The OpenAI Connection
Many of these features were developed in collaboration with OpenAI, who use Cosmos DB extensively for ChatGPT’s data storage needs. This partnership ensures our SDK can handle:
- Massive Scale: Billions of operations per day
- Low Latency: Sub-10ms response times for AI workloads
- High Availability: 99.999% uptime requirements
- Global Distribution: Seamless worldwide data replication
When you use the Python SDK for Azure Cosmos DB, you’re leveraging the same technology that powers some of the world’s most advanced AI applications.
Real-World Impact
Performance Benchmarks
Based on testing with synthetic workloads:
- Read Many Items: Up to 85% reduction in latency for batch retrieval scenarios
- Write Retries: 99.5% reduction in transient failure impact
- Session Optimization: 60% reduction in session token overhead
- Circuit Breaker: 90% faster recovery from partition-level failures
Cost Optimization
- Reduced RU Consumption: Batch operations can reduce costs by up to 40%
- Fewer Network Calls: Significant bandwidth savings in high-throughput scenarios
- Optimized Retries: Intelligent retry logic prevents unnecessary RU charges
Breaking Changes (Important!)
If you have been using beta versions of the Python SDK released since the last stable version (4.9.0), there is one breaking change:
Changed retry_write Parameter Type
# Before (4.13.x and earlier)
retry_write = True # boolean
# After (4.14.0)
retry_write = 3 # integer (number of retries)
This change aligns with other retry configuration options and provides more granular control.
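If a codebase still passes booleans in some places, a small shim can ease the transition. The helper below is a hypothetical sketch, not part of the SDK, and mapping True to a single retry is an assumption you may want to adjust.

```python
def normalize_retry_write(value: object, retries_when_true: int = 1) -> int:
    """Convert a legacy boolean retry_write setting to the integer form."""
    # bool is a subclass of int in Python, so it must be checked first.
    if isinstance(value, bool):
        return retries_when_true if value else 0
    if isinstance(value, int) and value >= 0:
        return value
    raise ValueError(f"unsupported retry_write value: {value!r}")


legacy_setting = True  # value read from older configuration
retry_write = normalize_retry_write(legacy_setting)
```

Passing the normalized integer keeps older configuration files working while the call sites move to the new type.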
Migration Guide
Upgrading from any beta higher than 4.9.0 to 4.14.0
- Update your dependencies:
pip install azure-cosmos==4.14.0
- Update retry_write usage (if applicable):
# Old way
client = CosmosClient(endpoint, key, retry_write=True)
# New way
client = CosmosClient(endpoint, key, retry_write=3)
- Leverage new features (optional but recommended):
  - Take advantage of read_items for batch operations
  - Enable automatic write retries for resilience
  - Use return_properties to reduce API calls
What’s Next?
This release establishes the foundation for even more exciting AI-focused features coming in future versions:
- Enhanced vector search capabilities
- Advanced semantic search integration
- Expanded AI inference service integrations
- Performance optimizations for RAG patterns
Additional Resources
- Full Changelog – Complete list of changes including bug fixes since 4.9.0
- SDK Documentation – Comprehensive API reference
- Sample Code – Working examples for all new features
- Migration Guide – Step-by-step upgrade instructions
Get Involved
Have feedback or questions? We’d love to hear from you!
- GitHub Issues: Report bugs or request features
- Stack Overflow: Tag your questions with azure-cosmosdb and python
- Documentation: Contribute to our docs
Ready to upgrade? Install Azure Cosmos DB Python SDK v4.14.0 today and experience the power of AI-enhanced database operations!
pip install --upgrade azure-cosmos==4.14.0
The future of AI-powered applications starts with the right data foundation. With the latest Cosmos DB Python SDK, you have the tools to build intelligent, scalable, and resilient applications that can handle anything the world throws at them.