{"id":2381,"date":"2026-05-30T23:18:28","date_gmt":"2026-05-31T06:18:28","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/foundry\/?p=2381"},"modified":"2026-05-30T23:18:28","modified_gmt":"2026-05-31T06:18:28","slug":"whats-new-in-microsoft-foundry-may-2026","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/foundry\/whats-new-in-microsoft-foundry-may-2026\/","title":{"rendered":"What&#8217;s new in Microsoft Foundry | May 2026"},"content":{"rendered":"<h2>TL;DR<\/h2>\n<ul>\n<li><strong>Trace-based evaluation for external and hosted agents:<\/strong> Grade real production traces from Foundry, GCP, AWS, or any framework \u2014 no hand-curated datasets required.<\/li>\n<li><strong>Grok 4.3:<\/strong> xAI&#8217;s latest model lands in Foundry for advanced agentic and domain-specific workloads.<\/li>\n<li><strong>DeepSeek V4:<\/strong> DeepSeek&#8217;s newest model family expands open-model choice in the catalog.<\/li>\n<li><strong>Fireworks AI \u2014 May update:<\/strong> DeepSeek V4 Pro and Kimi 2.6 arrive via Fireworks for high-performance open-model inference.<\/li>\n<li><strong>GPT-5 Reinforcement Fine-Tuning (Gated GA):<\/strong> RFT graduates to gated GA with enterprise-ready compliance and SLA coverage.<\/li>\n<li><strong>MagenticBrain, Fara1.5-9B, MagenticLite:<\/strong> Microsoft Research ships three on-device agent projects for agentic reasoning, screen-based UI automation, and local browser\/file-system workflows \u2014 with examples in Foundry Labs.<\/li>\n<li><strong>SocialReasoning-Bench + STATE-Bench:<\/strong> Two open-source benchmarks for evaluating agent negotiation, coordination, and memory quality.<\/li>\n<li><strong>Managed VNET (GA):<\/strong> Microsoft-managed network isolation reaches general availability.<\/li>\n<li><strong>Project-level cost attribution:<\/strong> See LLM costs by project for budget tracking and governance.<\/li>\n<li><strong>Content Understanding improvements (GA):<\/strong> Read and layout 
analyzers reach GA alongside a Logic App connector and Foundry NextGen integration.<\/li>\n<li><strong>Evaluation tooling updates:<\/strong> Skill evaluation, workflow evaluation UX improvements, and evaluation alignment across VS Code and the portal.<\/li>\n<li><strong>Foundry Local 1.1 + 1.2:<\/strong> Live audio transcription, text embeddings, Qwen 3.5 Vision, multilingual ASR, cancellable downloads, Linux ARM64, and ONNX Runtime 1.26 across Python, JavaScript, C#, and Rust.<\/li>\n<li><strong>Foundry Agent Service:<\/strong> <code>azure-ai-projects<\/code> 2.2.0 adds preview skills and toolboxes; the sample below registers a design skill, exposes it through a toolbox MCP endpoint, and invokes a GPT-5.4 prompt agent with image input.<\/li>\n<li><strong>SDK updates:<\/strong> <code>azure-ai-projects<\/code> ships 2.2.0 across Python, JS\/TS, and .NET with external agent definitions, skills, toolboxes, model weight registry, routines, and optimization jobs.<\/li>\n<li><strong>Microsoft Build:<\/strong> Register for Microsoft Build and save Microsoft Foundry sessions to watch online.<\/li>\n<\/ul>\n<div class=\"d-flex\"><a class=\"cta_button_link\" href=\"https:\/\/build.microsoft.com\/\" target=\"_blank\" rel=\"noopener\">Register for Microsoft Build<\/a><\/div>\n<p>Looking for Microsoft Foundry sessions to watch online? Start with these Microsoft Build breakout sessions. 
Times are shown in Pacific time; check the session page for the latest schedule.<\/p>\n<table>\n<thead>\n<tr>\n<th>Session<\/th>\n<th>Date\/time<\/th>\n<th>Speaker(s)<\/th>\n<th>Description<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK230\">Confident model selection and integration with Microsoft Foundry (BRK230)<\/a><\/td>\n<td>June 2, 12:30-1:15 PM PT<\/td>\n<td>Yina Arenas, Naomi Moneypenny<\/td>\n<td>Choose, integrate, and validate AI models in Microsoft Foundry, including benchmarking and integrated developer workflows.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK250\">Govern open-source AI agents, any framework, any scale (BRK250)<\/a><\/td>\n<td>June 2, 2:30-3:15 PM PT<\/td>\n<td>Sarah Bird, Mehrnoosh Sameki<\/td>\n<td>Learn governance patterns for Microsoft Agent Framework and open-source agent stacks, including evaluations and risk controls.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK241\">From prototype to production: build and run agents at scale (BRK241)<\/a><\/td>\n<td>June 2, 3:45-4:30 PM PT<\/td>\n<td>Tina Schuchman, Jeff Hollan<\/td>\n<td>Walk through the lifecycle for production-grade agents with Foundry Agent Service and Microsoft Agent Framework.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK252\">From observability to ROI for AI agents on any framework (BRK252)<\/a><\/td>\n<td>June 2, 3:45-4:30 PM PT<\/td>\n<td>Sebastian Kohlmeier, Filisha Shah<\/td>\n<td>Cover cross-framework tracing, evaluations, production observability, and ROI measurement for AI agents.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRKSP94\">Orchestrate special agents with Nemotron models on Microsoft AI Foundry (BRKSP94)<\/a><\/td>\n<td>June 2, 3:45-4:30 PM PT<\/td>\n<td>Stephen McCullough<\/td>\n<td>Route tasks across frontier models, NVIDIA Nemotron, 
and local models for tiered agentic AI architectures.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK231\">Deploy. Observe. Learn. Reinforcement learning for production agents (BRK231)<\/a><\/td>\n<td>June 2, 5:00-5:45 PM PT<\/td>\n<td>Alicia Frame, Omkar More<\/td>\n<td>Use fine-tuning and reinforcement learning on Microsoft Foundry to improve production agents with real usage signals.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK240\">Build context-aware agents at scale with Microsoft IQ (BRK240)<\/a><\/td>\n<td>June 2, 5:00-5:45 PM PT<\/td>\n<td>Marco Casalaina<\/td>\n<td>Learn how Foundry IQ, Fabric IQ, and Work IQ provide an enterprise intelligence layer for AI agents.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK246\">Context engineering for agents: connect agents with enterprise knowledge (BRK246)<\/a><\/td>\n<td>June 3, 9:00-9:45 AM PT<\/td>\n<td>Pablo Castro Castro<\/td>\n<td>Explore Foundry IQ, Azure AI Search, knowledge sources, agentic retrieval-augmented generation (RAG), and enterprise security.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK235\">Local models, developer control, and the future of AI runtimes (BRK235)<\/a><\/td>\n<td>June 3, 10:15-11:00 AM PT<\/td>\n<td>Parth Sareen<\/td>\n<td>Learn how local and hybrid model execution can reshape developer workflows, privacy, and experimentation.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK243\">Claw and agent harness in Microsoft Foundry (BRK243)<\/a><\/td>\n<td>June 3, 11:30 AM-12:15 PM PT<\/td>\n<td>Glenn Condron, Amanda Foster, Shawn Henry<\/td>\n<td>Go deep on multi-agent systems, Claw agent patterns, hosted agents architecture, triggers, state management, and file access.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK251\">Build secure and enterprise-ready 
agents with Agent 365 (BRK251)<\/a><\/td>\n<td>June 3, 11:30 AM-12:15 PM PT<\/td>\n<td>Neta Haiby<\/td>\n<td>Build enterprise-ready agents with runtime visibility, identity-aware access, data protection, and policy-based governance.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRKSP92\">Build distributed agentic apps from edge to cloud (BRKSP92)<\/a><\/td>\n<td>June 3, 11:30 AM-12:15 PM PT<\/td>\n<td>Colin Helms, Eddy Rodriguez<\/td>\n<td>Design and run multi-agent applications across client, edge, and Azure environments.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK232\">Train and deploy custom OSS reasoning models with Foundry (BRK232)<\/a><\/td>\n<td>June 3, 2:45-3:30 PM PT<\/td>\n<td>Vijay Aski, Manoj Bableshwar, Chris Lauren<\/td>\n<td>Train and tune open-source reasoning models in Microsoft Foundry with code-first workflows and curated reinforcement learning environments.<\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK242\">Turn your agents into action: connect tools, APIs, and data (BRK242)<\/a><\/td>\n<td>June 3, 4:00-4:45 PM PT<\/td>\n<td>Ronak Chokshi, Joe Filcik, Maria Naggaga<\/td>\n<td>See how to connect agents with toolsets, application programming interfaces (APIs), and data without overloading context windows.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Want the full online breakout catalogs? 
Browse <a href=\"https:\/\/build.microsoft.com\/en-US\/sessions?filter=topic%2FlogicalValue%3EAgents+%26+apps&amp;filter=deliveryTypes%2FlogicalValue%3EOnline&amp;filter=sessionType%2FlogicalValue%3EBreakout&amp;pageSize=96\">Agents &amp; apps<\/a>, <a href=\"https:\/\/build.microsoft.com\/en-US\/sessions?filter=deliveryTypes%2FlogicalValue%3EOnline&amp;filter=sessionType%2FlogicalValue%3EBreakout&amp;filter=topic%2FlogicalValue%3EResponsible+AI&amp;pageSize=96\">Responsible AI<\/a>, and <a href=\"https:\/\/build.microsoft.com\/en-US\/sessions?filter=deliveryTypes%2FlogicalValue%3EOnline&amp;filter=sessionType%2FlogicalValue%3EBreakout&amp;filter=topic%2FlogicalValue%3EWorking+with+models&amp;pageSize=96\">Working with models<\/a>.<\/p>\n<h2>Join the community<\/h2>\n<p>Connect with 50,000+ developers on <a href=\"https:\/\/aka.ms\/foundry\/discord\">Discord<\/a>, ask questions in <a href=\"https:\/\/aka.ms\/foundry\/forum\">GitHub Discussions<\/a>, or <a href=\"https:\/\/devblogs.microsoft.com\/foundry\/category\/whats-new\/feed\/\">subscribe via RSS<\/a> to get this digest monthly.<\/p>\n<hr \/>\n<h2>Models<\/h2>\n<h3>Grok 4.3<\/h3>\n<p><strong>Grok 4.3<\/strong> from xAI is available in the Microsoft Foundry model catalog. This is a step up from the Grok 4.2 GA that shipped in March \u2014 focused on advanced agentic workloads and domain-specific scenarios where you need a high-capability external model with Foundry&#8217;s production controls, safety tooling, and enterprise compliance.<\/p>\n<p>If you&#8217;re already using Grok models in Foundry, 4.3 is a direct upgrade path through the same deployment flow.<\/p>\n<p>One practical note before you move traffic: review the model card and run your own evaluations for your target use case. The catalog calls out additional responsible AI considerations for Grok 4.3, including higher safety and jailbreak risk than some other Azure Direct models. 
Treat that as a deployment checklist item, not a footnote.<\/p>\n<p>Grok 4.3 uses the Chat Completions API path, so call the deployment directly. Set <code>FOUNDRY_ENDPOINT<\/code> to your deployment endpoint ending in <code>\/openai\/v1\/chat\/completions<\/code>, then trim the trailing <code>\/chat\/completions<\/code> so the OpenAI client base URL ends in <code>\/openai\/v1<\/code>:<\/p>\n<pre><code class=\"language-python\">import os\r\nfrom openai import OpenAI\r\n\r\nendpoint = os.environ[\"FOUNDRY_ENDPOINT\"]\r\n\r\n# The client appends \/chat\/completions itself, so drop it from the base URL.\r\nclient = OpenAI(\r\n    api_key=os.environ[\"FOUNDRY_API_KEY\"],\r\n    base_url=endpoint.removesuffix(\"\/chat\/completions\"),\r\n)\r\n\r\nresponse = client.chat.completions.create(\r\n    model=\"grok-4.3\",\r\n    messages=[\r\n        {\"role\": \"system\", \"content\": \"You are Grok, a highly intelligent, helpful AI assistant.\"},\r\n        {\r\n            \"role\": \"user\",\r\n            \"content\": \"In one sentence, explain why developers should evaluate agent tool calls before production.\",\r\n        },\r\n    ],\r\n    temperature=0.2,\r\n    max_tokens=80,\r\n)\r\n\r\nprint(response.choices[0].message.content)<\/code><\/pre>\n<p>Sample output:<\/p>\n<pre><code class=\"language-text\">Developers should evaluate agent tool calls before production to catch incorrect parameters, unsafe actions, or unintended side effects that autonomous agents can generate in real environments.<\/code><\/pre>\n<blockquote><p><strong>Action:<\/strong> Deploy Grok 4.3 from the model catalog and compare against your current Grok 4.2 workloads.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/techcommunity.microsoft.com\/blog\/azure-ai-foundry-blog\/introducing-grok-4-3-on-microsoft-foundry-latest-generation-agentic-capabilities\/4517096\" target=\"_blank\" rel=\"noopener\">Read the Grok 4.3 Announcement<\/a><\/div>\n<h3>DeepSeek V4<\/h3>\n<p><strong>DeepSeek V4<\/strong> variants are now available in Microsoft Foundry. 
DeepSeek has been building momentum in the open-model space, and V4 reinforces the breadth of choice in the Foundry catalog \u2014 particularly for teams that want competitive reasoning and coding performance from open-weight models with full Foundry deployment and monitoring.<\/p>\n<blockquote><p><strong>Action:<\/strong> Browse the DeepSeek V4 models in the catalog and test against your existing model mix.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/aka.ms\/deepseek-v4-blog\" target=\"_blank\" rel=\"noopener\">Read the DeepSeek V4 Announcement<\/a><\/div>\n<h3>Fireworks AI on Foundry \u2014 May Update<\/h3>\n<p>The Fireworks AI integration continues to expand. Two new models arrived in May:<\/p>\n<table>\n<thead>\n<tr>\n<th>Model<\/th>\n<th>What it does<\/th>\n<th>Best for<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>DeepSeek V4 Pro<\/strong><\/td>\n<td>High-precision open-model reasoning via Fireworks<\/td>\n<td>Complex reasoning, coding, agent workflows<\/td>\n<\/tr>\n<tr>\n<td><strong>Kimi 2.6<\/strong><\/td>\n<td>Long-horizon reasoning and agentic workflows via Fireworks<\/td>\n<td>Extended reasoning chains, coding, multi-step agents<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Both run on Fireworks&#8217; high-throughput inference infrastructure with Azure enterprise security and compliance. 
The broader narrative here is model choice without compromise: open-model breadth, production scalability, and customer control over deployment, all within Foundry&#8217;s operational envelope.<\/p>\n<blockquote><p><strong>Action:<\/strong> Deploy DeepSeek V4 Pro or Kimi 2.6 via Fireworks from the model catalog for high-throughput open-model inference.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/ai.azure.com\/catalog\" target=\"_blank\" rel=\"noopener\">Browse Fireworks Models in Catalog<\/a><\/div>\n<h3>GPT-5 Reinforcement Fine-Tuning (Gated GA)<\/h3>\n<p><strong>GPT-5 Reinforcement Fine-Tuning (RFT)<\/strong> moves from preview to gated GA. This means enterprise-ready access with stronger compliance guarantees and SLA coverage \u2014 the kind of stability teams need before committing production fine-tuning pipelines to a service.<\/p>\n<p>RFT lets you train GPT-5 on domain-specific tasks using reinforcement learning, without managing training infrastructure. The gated GA status means access requires approval, but once you are approved, you get production-grade support and commitments.<\/p>\n<blockquote><p><strong>Action:<\/strong> If you have RFT workloads in preview, start planning the move to the GA tier. If you&#8217;re new to RFT, request access through the model catalog.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/learn.microsoft.com\/azure\/foundry\/openai\/how-to\/reinforcement-fine-tuning\" target=\"_blank\" rel=\"noopener\">Explore Reinforcement Fine-Tuning<\/a><\/div>\n<h3>Microsoft Research Agent Models \u2014 MagenticBrain, Fara1.5-9B, MagenticLite<\/h3>\n<p>Three related Foundry Labs projects from Microsoft Research were released this month: <strong>MagenticLite<\/strong>, the local agentic app; <strong>MagenticBrain<\/strong>, the planner\/coder\/orchestrator; and <strong>Fara1.5<\/strong>, the computer-use model family for browser work. 
Microsoft Research developed these as one co-designed system: MagenticLite is the app and harness, MagenticBrain handles reasoning\/delegation\/terminal use, and Fara1.5 handles browser-based tasks. Each component is useful on its own, but they work best together.<\/p>\n<p>Start with the <a href=\"https:\/\/github.com\/microsoft\/magentic-ui\">MagenticLite GitHub repo<\/a> when you want to run it locally; the <a href=\"https:\/\/labs.ai.azure.com\/projects\/magenticlite\/\">Foundry Labs page<\/a> is a good overview of the research artifact. The app expects OpenAI-compatible <code>\/v1<\/code> endpoints for the two recommended model roles. <a href=\"https:\/\/aka.ms\/MagenticLite\">MagenticLite on GitHub<\/a>, <a href=\"https:\/\/aka.ms\/MagenticBrain-foundry\">MagenticBrain on Microsoft Foundry<\/a>, and <a href=\"https:\/\/aka.ms\/fara-foundry\">Fara1.5 on Microsoft Foundry<\/a> are available to get you started. To set up <a href=\"https:\/\/github.com\/microsoft\/magentic-ui\/blob\/main\/docs\/model-hosting-guide.md\">Foundry Managed Compute<\/a>, deploy <strong>Fara1.5-9B<\/strong> for browser use and <strong>MagenticBrain-14B<\/strong> for orchestration, then paste each deployment&#8217;s <code>\/v1<\/code> endpoint, model ID, and primary key into MagenticLite.<\/p>\n<p>Here is a MagenticLite expense-form demo from the project:<\/p>\n<p><div style=\"width: 640px;\" class=\"wp-video\"><video class=\"wp-video-shortcode\" id=\"video-2381-1\" width=\"640\" height=\"360\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2026\/05\/magentic-lite-expense-form-demo.mp4?_=1\" \/><a href=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2026\/05\/magentic-lite-expense-form-demo.mp4\">https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2026\/05\/magentic-lite-expense-form-demo.mp4<\/a><\/video><\/div><\/p>\n<p>You can 
also browse the full <a href=\"https:\/\/github.com\/microsoft\/magentic-ui#see-it-in-action\">demo set<\/a>.<\/p>\n<blockquote><p><strong>Action:<\/strong> Try MagenticLite locally if you&#8217;re exploring browser-plus-filesystem agents, then connect MagenticBrain and Fara1.5 endpoints through the Foundry Managed Compute guide.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/github.com\/microsoft\/magentic-ui\" target=\"_blank\" rel=\"noopener\">Run MagenticLite from GitHub<\/a><\/div>\n<hr \/>\n<h2>Evaluations &amp; Benchmarks<\/h2>\n<p>May was a strong month for evaluation infrastructure. The headline: you can now grade real production traces from agents running <em>anywhere<\/em> \u2014 not just Foundry-hosted agents, and not just from synthetic test sets.<\/p>\n<h3>Trace-Based Evaluation<\/h3>\n<p>Two capabilities shipped back-to-back:<\/p>\n<p><strong>Trace-based evaluation for external agents<\/strong> (May 4) lets you grade production traces from agents running on Foundry, GCP, AWS, or any other platform. Instead of hand-curating evaluation datasets, you point evaluators at real traces and get quality scores on live agent behavior.<\/p>\n<p><strong>Trace-based evaluation for hosted agents<\/strong> (May 6) brings the same approach to Foundry-hosted agents. Assess agent quality using live interactions rather than relying exclusively on synthetic test sets.<\/p>\n<p>This is a meaningful shift. Evaluation is moving from &#8220;test before you ship&#8221; to &#8220;measure what&#8217;s actually happening in production&#8221; \u2014 and it works across clouds and frameworks.<\/p>\n<blockquote><p><strong>Action:<\/strong> Connect your agent traces to Foundry evaluations. 
If you&#8217;re running agents on external platforms, use trace-based evaluation to get quality signals without migrating workloads.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/learn.microsoft.com\/azure\/ai-foundry\/observability\/how-to\/evaluate-agent\" target=\"_blank\" rel=\"noopener\">Set Up Trace-Based Evaluation<\/a><\/div>\n<h3>Evaluation Tooling Updates<\/h3>\n<p>Three evaluation improvements shipped at mid-month:<\/p>\n<ul>\n<li><strong>Evaluation alignment with AI Toolkit and VS Code<\/strong>: Consistent evaluation workflow across portal and IDE surfaces. Run the same evaluators from VS Code that you use in the portal \u2014 less friction switching between environments.<\/li>\n<li><strong>Skill evaluation<\/strong>: Skills become a first-class concept in Foundry evaluations. This is foundational platform work for structured skill evaluation \u2014 expect more here in coming months.<\/li>\n<li><strong>Workflow evaluation UX<\/strong>: Improved UX for workflow evaluations with new workflow evaluators. If you&#8217;ve found the workflow evaluation experience rough, this is the update to revisit.<\/li>\n<\/ul>\n<blockquote><p><strong>Action:<\/strong> Update your Foundry Toolkit extension and try running evaluations from VS Code alongside the portal.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/learn.microsoft.com\/azure\/foundry\/how-to\/develop\/cloud-evaluation\" target=\"_blank\" rel=\"noopener\">Develop an Evaluation<\/a><\/div>\n<h3>SocialReasoning-Bench and STATE-Bench<\/h3>\n<p>Two open-source benchmarks shipped for teams building agents that need to work <em>with<\/em> other agents or remember things across sessions:<\/p>\n<p><strong>SocialReasoning-Bench<\/strong>\u00a0tests whether agents negotiate and coordinate competently. It measures both outcome quality (did the agent get a good deal?) 
and process quality (did it negotiate reasonably?) in adversarial, realistic scenarios. Useful if you&#8217;re building multi-agent systems where agents interact with external parties.<\/p>\n<p><strong>STATE-Bench<\/strong>\u00a0measures whether agent memory actually improves performance on realistic enterprise tasks. If you&#8217;re investing in memory architectures for your agents, this gives you a reproducible way to evaluate whether that memory is helping or just adding complexity.<\/p>\n<blockquote><p><strong>Action:<\/strong> Use SocialReasoning-Bench and STATE-Bench to test your agents&#8217; coordination and memory capabilities against reproducible baselines.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/labs.ai.azure.com\/projects\/socialreasoning-bench\/\" target=\"_blank\" rel=\"noopener\">Explore SocialReasoning-Bench in Labs<\/a><\/div>\n<hr \/>\n<h2>Platform<\/h2>\n<h3>Managed VNET (GA)<\/h3>\n<p><strong>Managed VNET<\/strong> is now generally available for Microsoft Foundry projects. This is one of those platform updates that sounds like plumbing until your security review is staring at you from across the table.<\/p>\n<p>Instead of asking every app team to own virtual network design, subnet sizing, endpoint approval, and firewall details up front, Foundry can now provision a Microsoft-managed network boundary for agent outbound traffic. 
You still choose the isolation posture:<\/p>\n<table>\n<thead>\n<tr>\n<th>Mode<\/th>\n<th>Use it when<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>Allow internet outbound<\/strong><\/td>\n<td>You need managed isolation, but broad outbound access is acceptable.<\/td>\n<\/tr>\n<tr>\n<td><strong>Allow only approved outbound<\/strong><\/td>\n<td>You need curated egress through service tags, private endpoints, or fully qualified domain name (FQDN) rules.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The developer payoff is simple: hosted and prompt agents can reach approved resources like Azure Storage, Azure Cosmos DB, Azure Key Vault, and Azure AI Search through managed private endpoints, while the networking primitives stay mostly out of your application code. Evaluations also plug into the same story, with required outbound rules for evaluation catalogs and Application Insights result reporting.<\/p>\n<p>Two gotchas are worth calling out before you flip the switch. First, the isolation mode is a creation-time architecture decision \u2014 you can&#8217;t disable Managed VNET later or convert a custom VNET deployment in place. Second, FQDN outbound rules create a managed Azure Firewall, which means firewall charges can appear even though the Managed VNET feature itself is free. 
Platform plumbing: still plumbing, just with a price tag if you ask for fancy valves.<\/p>\n<blockquote><p><strong>Action:<\/strong> Use Managed VNET for new regulated agent projects, and decide early whether you need <code>allow_internet_outbound<\/code> or <code>allow_only_approved_outbound<\/code>.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/learn.microsoft.com\/azure\/foundry\/how-to\/managed-virtual-network\" target=\"_blank\" rel=\"noopener\">Configure Managed VNET<\/a><\/div>\n<h3>Quota GA for Global and Data Zone<\/h3>\n<p>Quota management for <strong>Global Standard<\/strong> and <strong>Data Zone Standard<\/strong> deployments is now generally available. This matters if your production model deployments already span regions, or if you&#8217;re trying to move from \u201cwhy did I get a 429?\u201d archaeology to deliberate capacity planning.<\/p>\n<p>Quota is scoped per subscription, region, model, and deployment type. Foundry now gives you a cleaner operating model for that capacity:<\/p>\n<table>\n<thead>\n<tr>\n<th>Question<\/th>\n<th>Where to look<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>How much quota have I consumed?<\/td>\n<td>The Foundry portal quota page or the Usages API.<\/td>\n<\/tr>\n<tr>\n<td>Where can I deploy this model right now?<\/td>\n<td>The Model Capacities API.<\/td>\n<\/tr>\n<tr>\n<td>Why am I seeing 429s below my token usage chart?<\/td>\n<td>Response headers such as <code>x-ratelimit-limit-tokens<\/code>, <code>x-ratelimit-remaining-*<\/code>, and <code>retry-after-ms<\/code>.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The subtle but important change: quota is not the same thing as billed tokens. Rate limiting estimates the request&#8217;s maximum processed tokens at request time, including <code>max_tokens<\/code>, and RPM enforcement looks at short windows inside the minute. 
If your app bursts traffic or sets <code>max_tokens<\/code> to \u201cjust in case,\u201d you can throttle yourself even when Azure Monitor usage looks calm.<\/p>\n<blockquote><p><strong>Action:<\/strong> Add quota checks to your deployment automation and log rate-limit headers in production clients before you request more quota.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/learn.microsoft.com\/azure\/foundry\/openai\/quotas-limits\" target=\"_blank\" rel=\"noopener\">Manage Quota<\/a><\/div>\n<h3>Project-Level Cost Attribution<\/h3>\n<p><strong>Project-level cost attribution<\/strong> gives teams a more useful answer to the classic AI bill question: \u201cwhich project did this?\u201d<\/p>\n<p>That sounds small, but it changes how you run shared Foundry environments. If one workspace supports a chatbot prototype, an evaluation harness, a fine-tuning experiment, and a production agent, subscription-level cost graphs are too blunt. Project-level attribution gives platform teams a better unit for budgets, chargeback, anomaly review, and \u201cplease stop load-testing the expensive model at 5 PM\u201d conversations.<\/p>\n<p>Use it alongside Azure Cost Management rather than instead of it. 
Foundry project attribution helps explain model and project usage; Azure Cost Management still gives the full bill across supporting resources such as Azure AI Search, Storage, Key Vault, Application Insights, Private Link, virtual machines, and Marketplace model offers.<\/p>\n<blockquote><p><strong>Action:<\/strong> Review project-level spend after each evaluation or model-routing experiment, then set Azure budgets at the subscription or resource-group scope for the infrastructure around it.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/learn.microsoft.com\/azure\/ai-foundry\/how-to\/costs-plan-manage\" target=\"_blank\" rel=\"noopener\">View Cost Management<\/a><\/div>\n<h3>Data-Zone Support for OSS Models (Public Preview)<\/h3>\n<p><strong>Data-zone deployment support for open-source models<\/strong> is now in public preview. This is useful when the model choice is no longer the hard part \u2014 the hard part is where the inference runs.<\/p>\n<p>Global deployments are great when you want Azure to route traffic for availability. Data-zone deployments are the middle ground for teams that want Azure-managed routing inside a Microsoft-defined geography, with more control over data residency than a global deployment and less operational burden than stitching together regional deployments yourself.<\/p>\n<p>For developers building with open models, this gives you a cleaner path to test model quality, latency, and residency constraints together. Don&#8217;t treat preview as a free pass to production, though. 
Validate model availability, quota, content filters, and your fallback strategy before you put a user-facing workload behind it.<\/p>\n<blockquote><p><strong>Action:<\/strong> Use data-zone deployments for OSS model experiments that need geography-aware routing, then compare latency and quality against Global Standard and regional options.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/learn.microsoft.com\/azure\/foundry\/foundry-models\/concepts\/deployment-types\" target=\"_blank\" rel=\"noopener\">Compare Deployment Types<\/a><\/div>\n<h3>In-App Pay-As-You-Go Subscription<\/h3>\n<p>You can now create a <strong>pay-as-you-go subscription directly inside Foundry<\/strong> instead of detouring through the Azure portal. This is not the flashiest update in the post, but it removes one of the most annoying first-run speed bumps: \u201cI came here to try a model, why am I three tabs deep in billing setup?\u201d<\/p>\n<p>The signed-out experience also got a refresh, which helps new developers understand what Foundry does before they authenticate. That matters for internal enablement. 
If you&#8217;re sending teammates, customers, or workshop attendees into Foundry for the first time, fewer onboarding redirects means more time spent deploying models and fewer \u201cwhich portal am I in?\u201d messages in chat.<\/p>\n<blockquote><p><strong>Action:<\/strong> Update your workshop and onboarding links to point directly at the Foundry portal, especially for developers who do not already have an Azure subscription ready.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/ai.azure.com\" target=\"_blank\" rel=\"noopener\">Try Microsoft Foundry<\/a><\/div>\n<h3>Private Connectivity for Azure AI Search<\/h3>\n<p><strong>Private connectivity between Azure AI Search and Foundry<\/strong> is the update to look at if your agent architecture includes retrieval-augmented generation (RAG) over enterprise data.<\/p>\n<p>In a network-isolated Foundry setup, Azure AI Search can be reached through a private endpoint rather than public networking. That means the retrieval leg of your agent flow \u2014 query, top-k results, grounding chunks, metadata \u2014 can stay inside the approved network boundary. This is especially important when your search index contains sensitive internal documents, customer records, or regulated data that should not ride over public endpoints.<\/p>\n<p>The implementation detail developers need to remember: private Search with a private Foundry agent tool is supported in the new Foundry portal path, and your architecture should use bring-your-own resources for Storage, Azure AI Search, and Azure Cosmos DB when you need end-to-end network isolation. Also check tool support. 
MCP tools, Azure AI Search, OpenAPI, Azure Functions, and Agent-to-Agent (A2A) can run through the VNET path; some tools still use public endpoints or are not yet supported in isolated environments.<\/p>\n<blockquote><p><strong>Action:<\/strong> If your RAG agent uses Azure AI Search, move the Search connection into your network isolation design instead of treating it as an app-layer detail.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/learn.microsoft.com\/azure\/foundry\/how-to\/configure-private-link\" target=\"_blank\" rel=\"noopener\">Configure Network Isolation<\/a><\/div>\n<hr \/>\n<h2>Speech &amp; Content Understanding<\/h2>\n<h3>Speech Updates<\/h3>\n<p>Four speech capabilities shipped in May:<\/p>\n<table style=\"width: 89.3678%; height: 120px;\">\n<thead>\n<tr style=\"height: 24px;\">\n<th style=\"height: 24px; width: 37.1328%;\">Feature<\/th>\n<th style=\"height: 24px; width: 55.4642%;\">What it does<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr style=\"height: 24px;\">\n<td style=\"height: 24px; width: 37.1328%;\"><strong>Local live transcription<\/strong><\/td>\n<td style=\"height: 24px; width: 55.4642%;\">Local live transcription capability for speech scenarios in Foundry<\/td>\n<\/tr>\n<tr style=\"height: 24px;\">\n<td style=\"height: 24px; width: 37.1328%;\"><strong>Custom speech for Fast Transcription<\/strong><\/td>\n<td style=\"height: 24px; width: 55.4642%;\">Custom speech model support extended to fast transcription<\/td>\n<\/tr>\n<tr style=\"height: 24px;\">\n<td style=\"height: 24px; width: 37.1328%;\"><strong>Fast Transcribe API customization<\/strong><\/td>\n<td style=\"height: 24px; width: 55.4642%;\">Customization support for Fast Transcribe API (private preview)<\/td>\n<\/tr>\n<tr style=\"height: 24px;\">\n<td style=\"height: 24px; width: 37.1328%;\"><strong>Stereo support for realtime STT<\/strong><\/td>\n<td style=\"height: 24px; width: 55.4642%;\">Multi-channel audio fidelity for 
real-time speech-to-text<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The custom speech and Fast Transcribe items are particularly relevant if you&#8217;re running domain-specific transcription \u2014 medical, legal, or technical vocabularies that benefit from custom models. Stereo support matters for call-center and meeting scenarios where you need to distinguish speakers across channels.<\/p>\n<blockquote><p><strong>Action:<\/strong> If you&#8217;re using custom speech models, test them with Fast Transcription for faster turnaround on domain-specific audio.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/learn.microsoft.com\/azure\/ai-services\/speech-service\/fast-transcription-create\" target=\"_blank\" rel=\"noopener\">Try Fast Transcription<\/a><\/div>\n<h3>Content Understanding Improvements (GA)<\/h3>\n<p><strong>Content Understanding<\/strong> had a strong May. The read and layout analyzers reached GA \u2014 these are the document extraction primitives that power RAG pipelines, form processing, and document intelligence workflows.<\/p>\n<p>Alongside the GA:<\/p>\n<ul>\n<li><strong>Logic App connector<\/strong>: Plug Content Understanding extraction workflows into broader automation pipelines via Logic Apps. If you&#8217;re building document processing that feeds into business workflows, this is the integration point.<\/li>\n<li><strong>Content Understanding in Foundry<\/strong>: Part of the broader UX modernization inside Foundry \u2014 Content Understanding capabilities are surfaced in the new portal experience.<\/li>\n<li><strong>NER playground for TA4H<\/strong>: Next-generation playground for named entity recognition (NER) in Text Analytics for Health workflows.<\/li>\n<\/ul>\n<blockquote><p><strong>Action:<\/strong> If you&#8217;re using Content Understanding in preview, your read and layout analyzers are now GA. 
Test the Logic App connector for end-to-end document automation pipelines.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/learn.microsoft.com\/azure\/ai-services\/content-understanding\/overview\" target=\"_blank\" rel=\"noopener\">Explore Content Understanding<\/a><\/div>\n<hr \/>\n<h2>Foundry Local<\/h2>\n<h3>Foundry Local 1.1<\/h3>\n<p><strong>Foundry Local 1.1<\/strong> shipped in May with four headline features for on-device AI:<\/p>\n<ul>\n<li><strong>Live audio transcription<\/strong> \u2014 real-time speech-to-text streaming using the Nemotron ASR model (<code>nemotron-speech-streaming-en-0.6b<\/code>), with an OpenAI Realtime-compatible API surface. Available across Python, JavaScript, C#, and Rust SDKs.<\/li>\n<li><strong>Text embeddings<\/strong> \u2014 on-device embedding generation via a new embedding client. Ships with <code>qwen3-0.6b-embedding<\/code>.<\/li>\n<li><strong>Qwen 3.5 Vision<\/strong> \u2014 multimodal vision-language model running fully on-device.<\/li>\n<li><strong>WebGPU execution provider<\/strong> \u2014 delivered as a separate downloadable plugin to keep the default install lean (~20 MB base).<\/li>\n<\/ul>\n<p>The JavaScript SDK also dropped its <code>koffi<\/code> FFI dependency in favor of a prebuilt N-API addon \u2014 faster installs and a leaner <code>node_modules<\/code>. 
The C# SDK now dual-targets <code>netstandard2.0<\/code> and <code>net8.0<\/code>, enabling .NET Framework 4.6.1+ and Unity support.<\/p>\n<p>Qwen 3.5 Vision turns Foundry Local from \u201clocal chat model\u201d into \u201clocal model that can inspect the same screenshots, diagrams, whiteboards, and product photos your app already handles.\u201d No upload step, no cloud round trip, no awkward \u201cplease ignore the sensitive customer data in this screenshot\u201d moment.<\/p>\n<p>Use a small local image like this one:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2026\/05\/foundry-local-qwen-vision-sample.webp\" alt=\"Developer desk with a laptop chart, coffee mug, notebook, potted plant, and sensor board for a local vision model demo\" \/><\/p>\n<p>Then pass the image to the local Responses API:<\/p>\n<pre><code class=\"language-python\">import base64\r\nimport io\r\n\r\nfrom openai import OpenAI\r\nfrom PIL import Image\r\n\r\nfrom foundry_local_sdk import Configuration, FoundryLocalManager\r\n\r\nconfig = Configuration(app_name=\"foundry_local_vision_demo\")\r\nFoundryLocalManager.initialize(config)\r\nmanager = FoundryLocalManager.instance\r\n\r\nmodel = manager.catalog.get_model(\"qwen3-vl-2b-instruct\")\r\nif not model.is_cached:\r\n    model.download()\r\n\r\nclient = None\r\nservice_started = False\r\nmodel.load()\r\ntry:\r\n    manager.start_web_service()\r\n    service_started = True\r\n    client = OpenAI(base_url=manager.urls[0].rstrip(\"\/\") + \"\/v1\", api_key=\"notneeded\")\r\n\r\n    image = Image.open(\"images\/foundry-local-qwen-vision-sample.jpg\")\r\n    image.thumbnail((512, 512))\r\n\r\n    buffer = io.BytesIO()\r\n    image.save(buffer, format=\"JPEG\")\r\n    image_b64 = base64.b64encode(buffer.getvalue()).decode()\r\n\r\n    vision_input = [\r\n        {\r\n            \"type\": \"message\",\r\n            \"role\": \"user\",\r\n            \"content\": [\r\n              
  {\r\n                    \"type\": \"input_text\",\r\n                    \"text\": \"Describe the scene and identify anything useful for a developer demo.\",\r\n                },\r\n                {\r\n                    \"type\": \"input_image\",\r\n                    \"image_data\": image_b64,\r\n                    \"media_type\": \"image\/jpeg\",\r\n                },\r\n            ],\r\n        }\r\n    ]\r\n\r\n    stream = client.responses.create(\r\n        model=model.id,\r\n        input=\"placeholder\",\r\n        extra_body={\"input\": vision_input},\r\n        stream=True,\r\n    )\r\n\r\n    for event in stream:\r\n        if getattr(event, \"type\", None) == \"response.output_text.delta\":\r\n            print(getattr(event, \"delta\", \"\"), end=\"\", flush=True)\r\nfinally:\r\n    if client is not None:\r\n        client.close()\r\n    if service_started:\r\n        manager.stop_web_service()\r\n    model.unload()<\/code><\/pre>\n<p>Sample output:<\/p>\n<pre><code class=\"language-text\">This image depicts a typical developer's workspace, likely for a data scientist or software developer working on a project involving data analysis and development.\r\n\r\nThe scene is a wooden desk with a modern, functional setup, suggesting a focused and creative work environment.\r\n\r\n- Laptop: A silver laptop is the central focus. 
Its screen displays a digital graph, which is a bar chart with blue bars and an accompanying line graph, indicating some form of data analysis or progress tracking.\r\n- Development Tools: A small, green circuit board with USB ports and other connections is connected via cables to a breadboard-style adapter.\r\n- Notebook and Pen: A spiral-bound notebook and a pen are placed on the desk, indicating that the developer is taking notes and documenting their work.<\/code><\/pre>\n<p>Use Foundry Local to load the model, start the local web service, then call it through the same OpenAI-style Responses API shape you would use elsewhere. If your app already has screenshot capture, document preview, or image upload flows, this is a very low-friction way to add local visual reasoning.<\/p>\n<blockquote><p><strong>Action:<\/strong> Upgrade to <code>foundry-local-sdk<\/code> 1.1+ and try Qwen 3.5 Vision with one of your app&#8217;s real screenshots or diagrams.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/devblogs.microsoft.com\/foundry\/foundry-local-v1.1\/\" target=\"_blank\" rel=\"noopener\">Read the 1.1 Release Post<\/a><\/div>\n<h3>Foundry Local 1.2<\/h3>\n<p><strong>Foundry Local 1.2<\/strong> followed in May with operational improvements:<\/p>\n<ul>\n<li><strong>Cancellable downloads<\/strong> \u2014 model and execution provider downloads can be canceled using each platform&#8217;s native idiom (<code>CancellationToken<\/code> in C#, <code>AbortController<\/code> in JS, <code>threading.Event<\/code> in Python).<\/li>\n<li><strong>Multilingual ASR<\/strong> \u2014 speech recognition extended beyond English-only.<\/li>\n<li><strong>Linux ARM64<\/strong> \u2014 new platform target for aarch64 deployments.<\/li>\n<li><strong>WinML 2.0 upgrade<\/strong> \u2014 no longer requires the WinAppSDK Runtime bootstrapper, extends support to Windows 10.0.18362.0+, and adds WebGPU EP and plug-in 
auto-update.<\/li>\n<li><strong>ONNX Runtime 1.26.0 + GenAI 0.14.0<\/strong> \u2014 runtime upgrades with region-based downloads and no more 5-minute timeout cap on large models.<\/li>\n<\/ul>\n<p>For Python apps, the upgrade path stays simple:<\/p>\n<pre><code class=\"language-bash\">pip install --upgrade foundry-local-sdk<\/code><\/pre>\n<blockquote><p><strong>Action:<\/strong> Upgrade to <code>foundry-local-sdk<\/code> 1.2 for multilingual speech, ARM64 support, and cancellable downloads.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/github.com\/microsoft\/foundry-local\/releases\/tag\/v1.2.0\" target=\"_blank\" rel=\"noopener\">View 1.2 Release Notes<\/a><\/div>\n<hr \/>\n<h2>Foundry Agent Service<\/h2>\n<p>May&#8217;s most interesting agent developer update is the new prerelease surface in <code>azure-ai-projects<\/code>: <strong>skills<\/strong> and <strong>toolboxes<\/strong>. This is the vanilla Foundry Agent Service path \u2014 no Microsoft Agent Framework required.<\/p>\n<p>The useful shape is: register reusable guidance as a project skill, bundle it into a toolbox, expose that toolbox as a Model Context Protocol (MCP) endpoint, and attach the MCP endpoint to a prompt agent. 
I tested this with a scenario where Zava Studio&#8217;s agent reviews rough product screenshots and turns them into design guidance.<\/p>\n<p>Here is the rough input the agent received:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2026\/05\/zava-rough-product-screenshots.webp\" alt=\"Three rough Zava Studio product screenshots showing an upload queue, annotation canvas, and generated design-review notes\" \/><\/p>\n<p>The agent used <code>gpt-5.4<\/code> with <code>reasoning.effort=\"high\"<\/code>, plus a <code>frontend-design<\/code> skill registered through the project skill APIs.<\/p>\n<pre><code class=\"language-python\">import base64\r\nimport os\r\nfrom pathlib import Path\r\n\r\nfrom azure.ai.projects import AIProjectClient, models\r\nfrom azure.identity import DefaultAzureCredential\r\n\r\nendpoint = os.environ[\"FOUNDRY_PROJECT_ENDPOINT\"]\r\nmodel = \"gpt-5.4\"\r\n\r\ncredential = DefaultAzureCredential()\r\n\r\nwith AIProjectClient(\r\n    endpoint=endpoint,\r\n    credential=credential,\r\n    allow_preview=True,\r\n) as project_client, project_client.get_openai_client() as openai_client:\r\n    # 1. Register reusable design guidance as a project skill.\r\n    skill = project_client.beta.skills.create(\r\n        \"zava-frontend-design\",\r\n        inline_content=models.SkillInlineContent(\r\n            description=(\r\n                \"Zava Studio frontend-design skill: distinctive UI review \"\r\n                \"guidance for product screenshot workflows.\"\r\n            ),\r\n            instructions=Path(\"skills\/frontend-design\/SKILL.md\").read_text(encoding=\"utf-8\"),\r\n            metadata={\"scenario\": \"Zava Studio\"},\r\n        ),\r\n        default=True,\r\n    )\r\n\r\n    # 2. 
Put the skill in a toolbox with tool search enabled.\r\n    toolbox = project_client.beta.toolboxes.create_version(\r\n        \"zava-design-toolbox\",\r\n        description=(\r\n            \"Zava Studio design-review toolbox: frontend-design skill plus \"\r\n            \"a named web search tool for current UI guidance.\"\r\n        ),\r\n        tools=[\r\n            models.WebSearchTool(\r\n                type=\"web_search\",\r\n                name=\"zava_frontend_research\",\r\n                description=(\r\n                    \"Find frontend design guidance, product UI references, \"\r\n                    \"accessibility guidance, and verification ideas for Zava Studio.\"\r\n                ),\r\n                search_context_size=\"low\",\r\n            ),\r\n            models.ToolboxSearchPreviewTool(\r\n                type=\"toolbox_search_preview\",\r\n                name=\"zava_tool_search\",\r\n            ),\r\n        ],\r\n        skills=[\r\n            models.ToolboxSkillReference(\r\n                type=\"skill_reference\",\r\n                name=skill.name,\r\n                version=skill.version,\r\n            )\r\n        ],\r\n    )\r\n\r\n    # 3. 
Attach the toolbox to a prompt agent through its MCP endpoint.\r\n    token = credential.get_token(\"https:\/\/ai.azure.com\/.default\").token\r\n    toolbox_mcp_url = (\r\n        f\"{endpoint.rstrip('\/')}\/toolboxes\/zava-design-toolbox\/\"\r\n        f\"versions\/{toolbox.version}\/mcp?api-version=v1\"\r\n    )\r\n\r\n    toolbox_mcp_tool = models.MCPTool(\r\n        server_label=\"zava_design_toolbox\",\r\n        server_url=toolbox_mcp_url,\r\n        authorization=token,\r\n        headers={\"Foundry-Features\": \"Toolboxes=V1Preview\"},\r\n        require_approval=\"never\",\r\n    )\r\n\r\n    agent = project_client.agents.create_version(\r\n        \"zava-design-agent\",\r\n        definition=models.PromptAgentDefinition(\r\n            kind=\"prompt\",\r\n            model=model,\r\n            instructions=(\r\n                \"You are Zava Studio's frontend design agent. Use the attached \"\r\n                \"rough screenshots as visual context. First use tool_search, \"\r\n                \"then call_tool when useful. Return exactly three bullets: \"\r\n                \"aesthetic direction, concrete UI change, anti-pattern to avoid.\"\r\n            ),\r\n            reasoning=models.Reasoning(effort=\"high\"),\r\n            tools=[toolbox_mcp_tool],\r\n        ),\r\n    )\r\n\r\n    # 4. Invoke the agent with rough product screenshots as image input.\r\n    image_b64 = base64.b64encode(\r\n        Path(\"images\/zava-rough-product-screenshots.png\").read_bytes()\r\n    ).decode(\"ascii\")\r\n\r\n    response = openai_client.responses.create(\r\n        input=[\r\n            {\r\n                \"role\": \"user\",\r\n                \"content\": [\r\n                    {\r\n                        \"type\": \"input_text\",\r\n                        \"text\": (\r\n                            \"Review these Zava Studio rough product screenshots. 
\"\r\n                            \"The product turns messy product screenshots into \"\r\n                            \"clear design-review notes.\"\r\n                        ),\r\n                    },\r\n                    {\r\n                        \"type\": \"input_image\",\r\n                        \"image_url\": f\"data:image\/png;base64,{image_b64}\",\r\n                    },\r\n                ],\r\n            }\r\n        ],\r\n        extra_body={\r\n            \"agent_reference\": {\r\n                \"name\": agent.name,\r\n                \"type\": \"agent_reference\",\r\n            }\r\n        },\r\n    )\r\n\r\n    print(response.output_text)<\/code><\/pre>\n<p>Sample output from the live run:<\/p>\n<pre><code class=\"language-text\">- Aesthetic direction: Treat Zava as a calm review workbench for turning raw screenshots into decisions: neutral surfaces (canvas #F6F7F9, panel #FFFFFF, border #D9DEE7, text #0F172A \/ #475569), one blue action accent (#2563EB), semantic colors only for severity, 12px radius, 8\/16\/24 spacing, and very light elevation so the screenshot and annotations \u2014 not the app chrome \u2014 stay primary.\r\n\r\n- Concrete UI change: Recompose the first viewport into one desktop-first tri-pane flow \u2014 left 280px intake\/review queue, center flexible annotation canvas, right 360px generated notes \u2014 with the right pane opening on Actionable notes and moving Tokens \/ Accessibility into tabs or accordions below; that makes the product promise visible in one glance: \"messy screenshot in \u2192 clear review notes out.\" Verify on a 1440px desktop that all three panes fit without horizontal scroll, tab order moves left \u2192 center \u2192 right, spacing stays on an 8px grid, and text\/controls meet at least 4.5:1 contrast.\r\n\r\n- Anti-pattern to avoid: Don't turn this into three disconnected admin screens or a mini-Figma clone with dense toolbars, loud gradients, and multicolor cards everywhere; that 
overbuilds the UI, increases cognitive load, and hides the core before\/after value of the product.<\/code><\/pre>\n<blockquote><p><strong>Action:<\/strong> Try <code>azure-ai-projects==2.2.0<\/code> with a small project skill and toolbox. Start with the CRUD samples for <code>beta.skills<\/code>, then use the toolbox-search MCP pattern when you want agents to discover tools dynamically.<\/p><\/blockquote>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/github.com\/Azure\/azure-sdk-for-python\/tree\/main\/sdk\/ai\/azure-ai-projects\/samples\" target=\"_blank\" rel=\"noopener\">Browse azure-ai-projects Samples<\/a><\/div>\n<hr \/>\n<h2>SDK &amp; Language Changelog (May 2026)<\/h2>\n<p>May&#8217;s SDK story is about expanding the preview surface \u2014 external agents, model weight management, routines, optimization jobs, and memory stores all landed as beta operations. The pattern is the same as April: stable GA core, fast-moving <code>.beta<\/code> namespace.<\/p>\n<h3>Python<\/h3>\n<p><strong><code>azure-ai-projects<\/code> 2.2.0<\/strong><\/p>\n<p>The biggest addition is the <code>.beta.models<\/code> sub-client for the AI model weight registry \u2014 create, list, update, and delete model versions, and retrieve their credentials. 
This opens up programmatic model management workflows that previously required the portal.<\/p>\n<p>Other highlights:<\/p>\n<ul>\n<li><strong>External agent integration<\/strong> (preview) \u2014 new <code>ExternalAgentDefinition<\/code> for third-party agent integration.<\/li>\n<li><strong>New agent tools<\/strong> \u2014 <code>FabricIQPreviewTool<\/code> and <code>ToolboxSearchPreviewTool<\/code>.<\/li>\n<li><strong>Optimization jobs<\/strong> \u2014 create, monitor, and promote optimization candidates for hosted agents.<\/li>\n<li><strong>Routines<\/strong> \u2014 CRUD for triggered automations via <code>.beta.routines<\/code>.<\/li>\n<li><strong>Data generation jobs<\/strong> \u2014 synthetic data generation via <code>.beta.datasets<\/code>.<\/li>\n<li><strong>Memory store item CRUD<\/strong> \u2014 management of individual memories in <code>.beta.memory_stores<\/code>.<\/li>\n<li><strong>Versioned skill management<\/strong> \u2014 create, list, download, and delete skill versions.<\/li>\n<\/ul>\n<p>Breaking changes are confined to the <code>.beta<\/code> namespace: <code>isolation_key<\/code> was removed from session operations, several classes were renamed (<code>AgentEndpoint<\/code> \u2192 <code>AgentEndpointConfig<\/code>, <code>SkillObject<\/code> \u2192 <code>SkillDetails<\/code>, <code>Target<\/code> \u2192 <code>EvaluationTarget<\/code>), and signatures changed in skills and evaluation taxonomy methods.<\/p>\n<pre><code class=\"language-bash\">pip install --upgrade azure-ai-projects==2.2.0<\/code><\/pre>\n<blockquote><p><strong>Action:<\/strong> Upgrade to <code>azure-ai-projects==2.2.0<\/code>. 
Breaking changes affect only the <code>.beta<\/code> surface \u2014 stable operations are unchanged.<\/p><\/blockquote>\n<p><a href=\"https:\/\/pypi.org\/project\/azure-ai-projects\/2.2.0\/\">Changelog<\/a><\/p>\n<h3>JavaScript \/ TypeScript<\/h3>\n<p><strong><code>@azure\/ai-projects<\/code> 2.1.1 + 2.2.0<\/strong><\/p>\n<p>The <code>2.1.1<\/code> patch fixes agent list operations that returned only the first page of results because cursor-based pagination was missing \u2014 upgrade if you have more than one page of agents.<\/p>\n<p><code>2.2.0<\/code> mirrors the Python release: external agent definitions, model weight registry, routines, optimization jobs, memory store CRUD, <code>FabricIQPreviewTool<\/code>, <code>WorkIQPreviewTool<\/code>, and <code>ToolboxSearchPreviewTool<\/code>. It carries the same beta-scoped breaking changes as Python.<\/p>\n<pre><code class=\"language-bash\">npm install @azure\/ai-projects@2.2.0<\/code><\/pre>\n<blockquote><p><strong>Action:<\/strong> Take <code>2.1.1<\/code> immediately for the pagination fix. Move to <code>2.2.0<\/code> when you&#8217;re ready for the new beta surface.<\/p><\/blockquote>\n<p><a href=\"https:\/\/www.npmjs.com\/package\/@azure\/ai-projects\">Changelog<\/a><\/p>\n<h3>.NET<\/h3>\n<p><strong><code>Azure.AI.Projects<\/code> 2.1.0-beta.2 + 2.1.0-beta.3<\/strong><\/p>\n<p><code>2.1.0-beta.2<\/code> adds the <code>DataGenerationJobs<\/code> client for synthetic data generation \u2014 useful for creating evaluation datasets programmatically. New samples cover evaluation cluster insights, AI-assisted evaluators, and image grading.<\/p>\n<p><code>2.1.0-beta.3<\/code> adds the <code>AIProjectModels<\/code> client for model weight management and memory store item CRUD.<\/p>\n<blockquote><p><strong>Action:<\/strong> Upgrade to <code>Azure.AI.Projects<\/code> 2.1.0-beta.3 for the latest preview surface. 
GA operations remain on the stable 2.0.x line.<\/p><\/blockquote>\n<p><a href=\"https:\/\/github.com\/Azure\/azure-sdk-for-net\/blob\/main\/sdk\/ai\/Azure.AI.Projects\/CHANGELOG.md\">Changelog<\/a><\/p>\n<h3>Java<\/h3>\n<p><strong><code>azure-ai-projects<\/code> 2.1.0-beta.1<\/strong><\/p>\n<p>Java adds the <code>SkillsClient<\/code> and <code>SkillsAsyncClient<\/code> for end-to-end skill management (create, download, list, update, delete). A new <code>buildAgentScopedOpenAIClient(agentName)<\/code> method on <code>AIProjectClientBuilder<\/code> returns an OpenAI client scoped to a specific agent endpoint \u2014 useful when you need per-agent routing.<\/p>\n<p>The release also adds <code>threshold<\/code> to <code>EvaluatorMetric<\/code> and new properties on <code>CodeBasedEvaluatorDefinition<\/code> for code-based evaluator workflows.<\/p>\n<blockquote><p><strong>Action:<\/strong> Upgrade to <code>com.azure:azure-ai-projects:2.1.0-beta.1<\/code> for skills management and agent-scoped clients.<\/p><\/blockquote>\n<p><a href=\"https:\/\/github.com\/Azure\/azure-sdk-for-java\/blob\/main\/sdk\/ai\/azure-ai-projects\/CHANGELOG.md\">Changelog<\/a><\/p>\n<hr \/>\n<h2>Resources &amp; Community<\/h2>\n<p><div class=\"alert alert-info\"><p class=\"alert-divider\"><i class=\"fabric-icon fabric-icon--Info\"><\/i><strong>Register for Microsoft Build<\/strong><\/p>Microsoft Build runs June 2-3, 2026, in San Francisco and online. Register now, sign in, and save Microsoft Foundry sessions to your schedule so you can watch them online. 
<a href=\"https:\/\/build.microsoft.com\/\">Register for Microsoft Build<\/a><\/div><\/p>\n<div class=\"d-flex\"><a class=\"cta_button_link btn-secondary\" href=\"https:\/\/build.microsoft.com\/\" target=\"_blank\" rel=\"noopener\">Register for Microsoft Build<\/a><\/div>\n<ul>\n<li><strong>Foundry docs:<\/strong> Start with the <a href=\"https:\/\/learn.microsoft.com\/azure\/ai-foundry\/\">Microsoft Foundry documentation<\/a><\/li>\n<li><strong>Discord:<\/strong> Join <a href=\"https:\/\/aka.ms\/foundry\/discord\">50,000+ developers<\/a> building with Foundry<\/li>\n<li><strong>GitHub Discussions:<\/strong> Ask questions in <a href=\"https:\/\/aka.ms\/foundry\/forum\">the forum<\/a><\/li>\n<li><strong>RSS:<\/strong> <a href=\"https:\/\/devblogs.microsoft.com\/foundry\/category\/whats-new\/feed\/\">Subscribe<\/a> to get this digest monthly<\/li>\n<li><strong>Foundry Labs:<\/strong> Explore research projects, model experiments, and runnable examples in <a href=\"https:\/\/labs.ai.azure.com\/\">Foundry Labs<\/a><\/li>\n<li><strong>Microsoft Build recap:<\/strong> Catch up on <a href=\"https:\/\/build.microsoft.com\/\">Microsoft Build sessions<\/a> if you missed them live<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>May ships trace-based evaluation for any agent on any cloud, Grok 4.3 and DeepSeek V4 in the model catalog, GPT-5 Reinforcement Fine-Tuning at gated GA, three Microsoft Research on-device agent models, Managed VNET at GA, project-level cost attribution, Content Understanding improvements reaching GA, Foundry Local 1.1 and 1.2 with live audio and vision, and azure-ai-projects 2.2.0 with skills, toolboxes, external agents, and model weight registry \u2014 plus a guide to Microsoft Foundry sessions at Microsoft 
Build.<\/p>\n","protected":false},"author":185793,"featured_media":2382,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1,27],"tags":[148,25,146,144,143,66,116,38,142,145,34,2,147,103,123,104,114],"class_list":["post-2381","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-microsoft-foundry","category-whats-new","tag-agent-service","tag-agents","tag-benchmarks","tag-content-understanding","tag-deepseek","tag-evaluations","tag-fireworks","tag-foundry-local","tag-grok","tag-managed-vnet","tag-microsoft-build","tag-microsoft-foundry","tag-microsoft-research","tag-models","tag-reinforcement-fine-tuning","tag-sdk","tag-speech"],"acf":[],"blog_post_summary":"<p>May ships trace-based evaluation for any agent on any cloud, Grok 4.3 and DeepSeek V4 in the model catalog, GPT-5 Reinforcement Fine-Tuning at gated GA, three Microsoft Research on-device agent models, Managed VNET at GA, project-level cost attribution, Content Understanding improvements reaching GA, Foundry Local 1.1 and 1.2 with live audio and vision, and azure-ai-projects 2.2.0 with skills, toolboxes, external agents, and model weight registry \u2014 plus a guide to Microsoft Foundry sessions at Microsoft 
Build.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts\/2381","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/users\/185793"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/comments?post=2381"}],"version-history":[{"count":1,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts\/2381\/revisions"}],"predecessor-version":[{"id":2386,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts\/2381\/revisions\/2386"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/media\/2382"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/media?parent=2381"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/categories?post=2381"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/tags?post=2381"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}