{"id":12379,"date":"2026-06-02T12:15:18","date_gmt":"2026-06-02T19:15:18","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/cosmosdb\/?p=12379"},"modified":"2026-06-02T03:03:10","modified_gmt":"2026-06-02T10:03:10","slug":"azure-cosmos-db-agent-kit-now-battle-tested-for-ga","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/cosmosdb\/azure-cosmos-db-agent-kit-now-battle-tested-for-ga\/","title":{"rendered":"Azure Cosmos DB Agent Kit now battle tested for GA"},"content":{"rendered":"<p>Back in January, we <a href=\"https:\/\/devblogs.microsoft.com\/cosmosdb\/azure-cosmos-db-agent-kit-ai-coding-assistants\/\">shipped the Azure Cosmos DB Agent Kit in preview<\/a> with 45 rules and a hypothesis: if we package Azure Cosmos DB expertise into a format that AI coding agents understand, developers will stop making the same expensive mistakes. That hypothesis held up. What surprised us was how much the rules themselves needed to evolve once we started systematically testing them.<\/p>\n<p>Today the <a href=\"https:\/\/devblogs.microsoft.com\/cosmosdb\/azure-cosmos-db-agent-kit-ai-coding-assistants\/\">Agent Kit is generally available<\/a>. It now contains <strong>120+ rules across 12 categories<\/strong>. But the number that matters more: we&#8217;ve run over 200 automated test iterations where AI agents build real applications from scratch using these rules, and we&#8217;ve fixed every gap those tests exposed.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-content\/uploads\/sites\/52\/2026\/05\/ChatGPT-Image-May-29-2026-08_45_12-PM.png\"><img decoding=\"async\" class=\"alignnone wp-image-12385 size-full\" src=\"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-content\/uploads\/sites\/52\/2026\/05\/ChatGPT-Image-May-29-2026-08_45_12-PM.png\" alt=\"Promotional graphic for the Azure Cosmos DB Agent Kit general availability announcement. 
The design features a futuristic dark blue and purple cosmic background with glowing neon effects, the Azure Cosmos DB planet logo, and a large \u201cAgent Kit\u201d headline. An open \u201cAgent Kit\u201d container emits feature cards highlighting capabilities such as full-text search, vector search, partition strategy, indexing, and multi-agent coordination. Additional panels showcase production readiness, open-source installation commands, community contributions, real-world AI application scenarios, and performance improvements validated through testing.\" width=\"1672\" height=\"941\" srcset=\"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-content\/uploads\/sites\/52\/2026\/05\/ChatGPT-Image-May-29-2026-08_45_12-PM.png 1672w, https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-content\/uploads\/sites\/52\/2026\/05\/ChatGPT-Image-May-29-2026-08_45_12-PM-300x169.png 300w, https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-content\/uploads\/sites\/52\/2026\/05\/ChatGPT-Image-May-29-2026-08_45_12-PM-1024x576.png 1024w, https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-content\/uploads\/sites\/52\/2026\/05\/ChatGPT-Image-May-29-2026-08_45_12-PM-768x432.png 768w, https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-content\/uploads\/sites\/52\/2026\/05\/ChatGPT-Image-May-29-2026-08_45_12-PM-1536x864.png 1536w\" sizes=\"(max-width: 1672px) 100vw, 1672px\" \/><\/a><\/p>\n<h2>Why We Spent Four Months on Testing Infrastructure<\/h2>\n<p>We could have just kept adding rules. That&#8217;s what most knowledge bases do \u2014 accumulate content and hope it&#8217;s correct. Instead, we built something we hadn&#8217;t seen anyone else do for agent skills: a closed-loop testing system that runs the rules through real code generation and checks the output.<\/p>\n<p>The setup works like this. We have five application scenarios \u2014 an e-commerce order API, a gaming leaderboard, an IoT telemetry pipeline, a RAG chat app, and a multi-tenant SaaS platform. 
For each scenario, we define an API contract (what endpoints exist, what they return, what edge cases they handle). Then we let GitHub Copilot generate the entire application with our skill loaded, spin up the <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/cosmos-db\/emulator-linux\">Azure Cosmos DB Emulator<\/a> in CI, build the app, run it, and hit it with a full test suite covering API behavior, Cosmos infrastructure setup, and data integrity.<\/p>\n<p>When tests fail, we examine the cause. In some cases, the rule itself is wrong. In others, the rule is technically correct but vague enough that agents interpret it differently. Sometimes the issue exposes a gap: an Azure Cosmos DB behavior that had not yet been documented. In every case, we update the rule and run the batch again.<\/p>\n<p>We don&#8217;t run each scenario once. We run it 5+ times per language to get statistical confidence. One passing iteration might be luck. Five passing iterations means the rule is solid.<\/p>\n<p>Some results that gave us confidence to ship GA:<\/p>\n<ul>\n<li>The IoT telemetry scenario in .NET scored 9.5\/10 \u2014 the agent correctly applied 30+ rules including hierarchical partition keys, autoscale, TTL, composite indexes, and singleton client patterns, all in a single generation pass.<\/li>\n<li>The gaming leaderboard in Python went from 5\/10 without the skill to 9\/10 with it. The delta was entirely in Cosmos-specific gotchas that general-purpose agents don&#8217;t know about.<\/li>\n<li>The multi-tenant SaaS scenario in Java hit 100% test pass rate on API contract, Cosmos infrastructure, and data integrity tests across all iterations where the build succeeded. (The 40% build failure rate turned out to be a Netty\/OpenSSL issue with the local emulator, not a skill gap.)<\/li>\n<\/ul>\n<h2>Rules We Discovered the Hard Way<\/h2>\n<p>The most useful rules in the kit didn&#8217;t come from documentation reviews. 
They came from watching AI agents repeatedly make the same mistake and figuring out how to teach them not to.<\/p>\n<p><strong>The enum serialization trap.<\/strong> In our e-commerce scenario, the .NET SDK was storing order status as integers (0, 1, 2) while the generated queries were filtering by string values (&#8220;Pending&#8221;, &#8220;Shipped&#8221;, &#8220;Delivered&#8221;). Every status query returned zero results. The app looked like it worked: no errors, no crashes, just silently empty arrays. We added <code>sdk-serialization-enums<\/code> and the problem disappeared across all subsequent iterations.<\/p>\n<p><strong>The <code>TOP<\/code> parameter surprise.<\/strong> Every SQL developer knows you should parameterize values. So AI agents parameterize everything, including <code>TOP<\/code>. But Azure Cosmos DB requires <code>TOP<\/code> to be a literal integer \u2014 parameterize it and you get a 400 Bad Request. We watched this happen in three separate gaming-leaderboard iterations before adding <code>query-top-literal<\/code>. Agents were applying the &#8220;right&#8221; general practice that happens to be wrong for Cosmos DB specifically.<\/p>\n<p><strong>The missing <code>aiohttp<\/code> dependency.<\/strong> Python&#8217;s async Azure Cosmos DB client needs <code>aiohttp<\/code>, but it&#8217;s not listed as a hard dependency that pip resolves automatically. AI agents generate <code>from azure.cosmos.aio import CosmosClient<\/code>, the code passes linting, the import succeeds at module load&#8230; and then the first actual database call throws a confusing runtime error. The fix is three lines in a requirements file, but agents never think to add them because nothing in the obvious documentation says to.<\/p>\n<p><strong>Composite index direction mismatches.<\/strong> Agents would create a composite index with ASC order, then write a query with <code>ORDER BY c.score DESC, c.timestamp DESC<\/code>. It works fine in testing with small datasets (Cosmos DB can scan) but falls over in production. 
The rule now teaches agents to define both ASC and DESC variants upfront.<\/p>\n<h2>What&#8217;s New Since Preview<\/h2>\n<p>If you installed the preview back in January, a lot has changed beyond the rule count.<\/p>\n<p><strong>Four new categories.<\/strong> Vector Search (6 rules covering embedding policies, DiskANN vs QuantizedFlat, distance queries, normalization). Full-Text Search (6 rules for BM25, <code>fullTextPolicy<\/code>, hybrid queries). Design Patterns (change feed materialized views, efficient ranking, multi-agent coordination). Developer Tooling (emulator setup, local dev config, build validation).<\/p>\n<p><strong>Multi-agent patterns.<\/strong> If you&#8217;re building LangGraph applications backed by Cosmos DB, we added rules for wrapping sync database calls in <code>asyncio.to_thread<\/code> inside routing functions, attributing messages to specific agents, and preventing the infinite recursion loop that happens when agents check all messages instead of only new ones.<\/p>\n<p><strong>Java\/Spring got serious attention.<\/strong> The preview rules were mostly .NET and Python. Now there&#8217;s deep coverage for Spring Data Cosmos \u2014 the <code>@PostConstruct<\/code> circular-dependency trap, Jackson config for Cosmos system metadata fields, JPA migration patterns, and the SSL certificate handling you need for the emulator in Java CI environments.<\/p>\n<p><strong>Cascade delete semantics.<\/strong> This one bit people in production. If you denormalize data across containers (which you should, for read performance), deleting the source document has to cascade to all derived copies. Updating a field that&#8217;s used as a partition key in a derived container means delete-and-recreate, not update-in-place. 
The rule now includes Python and C# examples for both patterns.<\/p>\n<h2>If You&#8217;re New Here<\/h2>\n<p><iframe src=\"\/\/www.youtube.com\/embed\/PSUF27AeMac\" width=\"560\" height=\"314\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<p>The Agent Kit is an open-source skill that plugs into your AI coding assistant \u2014 GitHub Copilot, Claude Code, Gemini CLI, Cursor, Windsurf, anything that supports the <u>Agent Skills<\/u> format. Once installed, it activates automatically when you&#8217;re working with Cosmos DB code.<\/p>\n<pre class=\"prettyprint language-ts\"><code class=\"language-ts\">npx skills add AzureCosmosDB\/cosmosdb-agent-kit\n<\/code><\/pre>\n<p>Then just work normally. Ask your agent to review a data model, design a partition strategy, optimize a query, set up vector search \u2014 it now has 111 rules of Cosmos DB-specific knowledge to draw from instead of relying on generic database intuition.<\/p>\n<p>The kinds of problems it catches:<\/p>\n<ul>\n<li>Creating a new <code>CosmosClient<\/code> per request instead of reusing a singleton (connection exhaustion under load)<\/li>\n<li>Running <code>SELECT *<\/code> when you only need three fields (unnecessary RU burn)<\/li>\n<li>Choosing <code>\/id<\/code> as a partition key for a multi-tenant app (guaranteed hot partition)<\/li>\n<li>Missing retry configuration for 429 throttling responses (intermittent failures that only show up at scale)<\/li>\n<li>Using cross-partition queries where a materialized view would give you single-partition reads<\/li>\n<\/ul>\n<p>These aren&#8217;t obscure edge cases. 
They&#8217;re the top five mistakes we see in production support cases, and AI agents make all of them by default because they&#8217;re optimizing for &#8220;code that compiles&#8221; rather than &#8220;code that scales.&#8221;<\/p>\n<p>If you&#8217;re setting up your development environment for Azure Cosmos DB, watch this session: <a href=\"https:\/\/www.youtube.com\/watch?v=PrO_42ZC_1M\">Azure Cosmos DB Dev Environment with AI | Azure Cosmos DB Conf 2026<\/a>.<\/p>\n<p class=\"isSelectedEnd\">We\u2019ve received rules from 9 contributors so far, and the best submissions came from people who hit a real problem, spent hours debugging it, and realized \u201cmy AI agent should have known this.\u201d Here are sample PRs and issues that directly shaped the GA release:<\/p>\n<ul data-spread=\"true\">\n<li><a href=\"https:\/\/github.com\/AzureCosmosDB\/cosmosdb-agent-kit\/pull\/95\">PR #95<\/a> \u2014 <a href=\"https:\/\/github.com\/DavideDelVecchio\">@DavideDelVecchio<\/a> contributed 4 new SDK best-practice rules and the entire Full-Text Search section (6 rules covering BM25 ranking, <code dir=\"ltr\">FullTextContains<\/code>, hybrid queries, and <code dir=\"ltr\">fullTextPolicy<\/code> configuration). 
A 927-line addition from a community fork.<\/li>\n<li><a href=\"https:\/\/github.com\/AzureCosmosDB\/cosmosdb-agent-kit\/pull\/19\">PR #19<\/a> \u2014 <a href=\"https:\/\/github.com\/sesmyrnov\">@sesmyrnov<\/a> updated the Data Modeling, Partitioning, and Change Feed materialized-view rules early in the project\u2019s life, strengthening the core knowledge base before automated testing even existed.<\/li>\n<li><a href=\"https:\/\/github.com\/AzureCosmosDB\/cosmosdb-agent-kit\/issues\/144\">Issue #144<\/a> \u2014 <a href=\"https:\/\/github.com\/sevoku\">@sevoku<\/a> caught a token-efficiency regression: after PR #95 landed, <code dir=\"ltr\">SKILL.md<\/code> was linking into the compiled <code dir=\"ltr\">AGENTS.md<\/code> instead of individual rule files, blowing up token consumption for every agent that loaded the skill. Fixed within days in <a href=\"https:\/\/github.com\/AzureCosmosDB\/cosmosdb-agent-kit\/pull\/145\">PR #145<\/a>.<\/li>\n<\/ul>\n<p><strong>The Story Behind <span style=\"font-family: terminal, monaco, monospace;\">sdk-dotnet-namespace-collision<\/span><\/strong><\/p>\n<p data-start=\"51\" data-end=\"310\">The most satisfying issue-to-rule pipeline we&#8217;ve seen started with <a class=\"decorated-link\" href=\"https:\/\/github.com\/jaydestro\" target=\"_new\" rel=\"noopener\" data-start=\"118\" data-end=\"160\">@jaydestro<\/a> filing <a class=\"decorated-link\" href=\"https:\/\/github.com\/AzureCosmosDB\/cosmosdb-agent-kit\/issues\/142\" target=\"_new\" rel=\"noopener\" data-start=\"168\" data-end=\"244\">Issue #142<\/a>: \u201cusing Microsoft.Azure.Cosmos; collides with domain User model.\u201d<\/p>\n<p data-start=\"312\" data-end=\"799\">Here\u2019s what happened. 
During our automated gap-analysis tool runs against the e-commerce scenario, AI agents kept generating code that put <code data-start=\"451\" data-end=\"482\">using Microsoft.Azure.Cosmos;<\/code> and <code data-start=\"487\" data-end=\"517\">using ECommerce.Core.Models;<\/code> in the same file. Both namespaces contain a type called <code data-start=\"574\" data-end=\"580\">User<\/code> \u2014 the SDK ships one as a control-plane type for Cosmos user\/permission principals, while the app defines its own domain entity. The result: <code data-start=\"721\" data-end=\"769\">error CS0104: 'User' is an ambiguous reference<\/code> \u2014 an immediate build failure.<\/p>\n<p data-start=\"801\" data-end=\"1070\">The insidious part? Microsoft\u2019s own quickstart documentation uses <code data-start=\"867\" data-end=\"898\">using Microsoft.Azure.Cosmos;<\/code> without noting the collision. Models trained on those docs reproduce the pattern verbatim. Without the Agent Kit loaded, AI agents had zero signal that this was dangerous.<\/p>\n<p data-start=\"1072\" data-end=\"1438\">Jay\u2019s issue included reproducers from two separate test profiles, the exact compiler diagnostic citing <code data-start=\"1175\" data-end=\"1204\">Microsoft.Azure.Cosmos.User<\/code>, and a minimal 25-line repro that anyone could validate with <code data-start=\"1266\" data-end=\"1280\">dotnet build<\/code>. He even validated the fix against SDK 3.59.0 \u2014 confirming that <code data-start=\"1345\" data-end=\"1385\">using Cosmos = Microsoft.Azure.Cosmos;<\/code> as a namespace alias resolves the ambiguity cleanly.<\/p>\n<p data-start=\"1440\" data-end=\"1929\">Two weeks later, <a class=\"decorated-link cursor-pointer\" target=\"_new\" rel=\"noopener\" data-start=\"1457\" data-end=\"1528\">PR #149<\/a> landed: the new <code data-start=\"1545\" data-end=\"1577\">sdk-dotnet-namespace-collision<\/code> rule. 
It warns agents that <code data-start=\"1605\" data-end=\"1629\">Microsoft.Azure.Cosmos<\/code> ships top-level types including <code data-start=\"1662\" data-end=\"1668\">User<\/code>, <code data-start=\"1670\" data-end=\"1680\">Database<\/code>, <code data-start=\"1682\" data-end=\"1693\">Container<\/code>, <code data-start=\"1695\" data-end=\"1705\">Conflict<\/code>, <code data-start=\"1707\" data-end=\"1716\">Trigger<\/code>, and <code data-start=\"1722\" data-end=\"1734\">Permission<\/code>, and teaches them to alias the import or fully qualify SDK types when domain models use the same names. Since that rule shipped, the CS0104 error has appeared in zero subsequent test iterations.<\/p>\n<p data-start=\"1931\" data-end=\"2129\" data-is-last-node=\"\" data-is-only-node=\"\">That\u2019s the loop: community member hits a real production problem \u2192 files an issue with evidence \u2192 rule gets written and validated \u2192 every AI agent using the kit now avoids that class of bug forever.<\/p>\n<h2 style=\"margin-bottom: 10.0pt;\">How to Contribute<\/h2>\n<p>The process: add a rule file to <code>\/skills\/cosmosdb-best-practices\/rules\/<\/code>, open a PR with the scenario that triggered it, and our CI pipeline will validate it against the testing framework. If it improves outcomes in batch evaluations, it ships.<\/p>\n<p>The kit is GA and it works \u2014 111+ rules, 200+ test iterations, real production coverage across .NET, Python, Java, and Node.js. But no knowledge base is ever complete. If you hit a Cosmos DB gotcha that your AI agent should have caught, we want to hear about it.<\/p>\n<h2><strong>About Azure Cosmos DB<\/strong><\/h2>\n<p>Azure Cosmos DB is a fully managed and serverless NoSQL and vector database for modern app development, including AI applications. 
With its SLA-backed speed and availability as well as instant dynamic scalability, it is ideal for real-time NoSQL and MongoDB applications that require high performance and distributed computing over massive volumes of NoSQL and vector data.<\/p>\n<p>To stay in the loop on Azure Cosmos DB updates, follow us on\u00a0<a href=\"https:\/\/twitter.com\/AzureCosmosDB\">X<\/a>,\u00a0<a href=\"https:\/\/aka.ms\/AzureCosmosDBYouTube\">YouTube<\/a>, and\u00a0<a href=\"https:\/\/www.linkedin.com\/company\/azure-cosmos-db\/\">LinkedIn<\/a>.\u00a0 Join the discussion with other developers on the\u00a0<a href=\"https:\/\/discord.gg\/pczdC2SU\">#nosql channel on the Microsoft Open Source Discord<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Back in January, we shipped the Azure Cosmos DB Agent kit in preview with 45 rules and a hypothesis: if we package Azure Cosmos DB expertise into a format that AI coding agents understand, developers will stop making the same expensive mistakes. That hypothesis held up. What surprised us was how much the rules themselves [&hellip;]<\/p>\n","protected":false},"author":80443,"featured_media":12385,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1610,1980,2017],"tags":[1986,1864,957,1872,1992],"class_list":["post-12379","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-azure-cosmos-db","category-azure-cosmos-db-tools","tag-agentkit","tag-azurecosmosdb","tag-cosmosdb","tag-nosql","tag-skills"],"acf":[],"blog_post_summary":"<p>Back in January, we shipped the Azure Cosmos DB Agent kit in preview with 45 rules and a hypothesis: if we package Azure Cosmos DB expertise into a format that AI coding agents understand, developers will stop making the same expensive mistakes. That hypothesis held up. 
What surprised us was how much the rules themselves [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/posts\/12379","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/users\/80443"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/comments?post=12379"}],"version-history":[{"count":2,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/posts\/12379\/revisions"}],"predecessor-version":[{"id":12448,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/posts\/12379\/revisions\/12448"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/media\/12385"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/media?parent=12379"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/categories?post=12379"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/tags?post=12379"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}