How to run evals for the model router
Walk through running quality, cost, and latency evaluations for the Foundry model router using an open-source GitHub repo designed for router-aware eval pipelines.
Walk through running quality, cost, and latency evaluations for the Foundry model router using an open-source GitHub repo designed for router-aware eval pipelines.
Foundry Local 1.1 adds live transcription, embeddings, Responses API, WebGPU plugin, and download cancellation.
April brings a wave of model arrivals — GPT-5.5, GPT-image-2, Microsoft first-party MAI models for image, voice, and transcription, Gemma 4, and Claude Opus 4.7 — alongside Foundry Local GA, Microsoft Agent Framework 1.0 GA, the Microsoft Foundry Toolkit for VS Code GA, batch evaluations for third-party agents, new tracing and monitoring capabiliti...
Available in Public Preview Today Toolbox is a new way to curate, configure, and reuse tools across all of your AI agents without rewiring them every time from Foundry. Today, teams build agents across different frameworks and runtimes. Each agent often wires tools directly, with its own authentication, credentials, and integration code. As ...
When we launched Microsoft Agent Framework last October, we made a promise: building production-grade AI agents should feel as natural and structured as building any other software. Today, we’re delivering on that promise — with the v1.0 release of Microsoft Agent Framework and the general availability of Foundry Toolkit for Visual Studio Code (...
Agents are already transforming how developers solve problems. Whether it's a coding agent that refactors your repo overnight, a research agent that synthesizes hundreds of documents into a brief, or an ops agent that monitors and remediates infrastructure — the pattern is clear. Developers are building agents that don't just answer questions, they...
April 2026 brings three major Reinforcement Fine-Tuning updates: Global Training for o4-mini with lower per-token rates across 12+ regions, new GPT-4.1 model graders for richer reward signals, and a comprehensive RFT best practices guide to help you ship specialized models faster.
March ships Foundry Agent Service GA with private networking, GPT-5.4 and GPT-5.4 Mini, Priority Processing, Phi-4 Reasoning Vision, SDK 2.0 GA across Python, JS/TS, Java, and .NET, Fireworks AI and NVIDIA Nemotron models, and third-party guardrails from Palo Alto and Zenity.
Ship local AI to millions of devices - fast, private on-device inference with no per-token costs.
The next-gen Foundry Agent Service is generally available today with end-to-end private networking, Voice Live integration, expanded MCP authentication, GA evaluations with continuous monitoring, and hosted agent deployments in six new Azure regions.