Author’s note: So… it has been a while. I have to level with you — I returned from paternity leave in January, and between Microsoft Ignite 2025 and today, things have changed a lot. Without further ado, here is your monthly (late) drop for all things new with Microsoft Foundry. Expect future editions to be back on track. Thanks for your patience!
TL;DR
December 2025 was one of the biggest months in Microsoft Foundry history. Here’s everything that shipped:
- GPT‑5.2 (GA): New enterprise reasoning standard — top benchmark scores across math, science, coding, and multimodal tasks; available as `gpt-5.2` and `gpt-5.2-chat-latest`.
- GPT‑5.1 Codex Max (GA): 77.9% on SWE-Bench, 400K context, 50+ languages — built for autonomous multi-agent coding pipelines, PR generation, and CI/CD integration.
- Mistral Large 3 (Public Preview): Apache 2.0, 41B active / 675B total parameters; $0.50 / $1.50 per million tokens. Strong instruction following and multimodal reasoning.
- DeepSeek V3.2 + V3.2‑Speciale (Public Preview): 128K context, up to 3× faster reasoning via Sparse Attention; Speciale drops tool calling entirely for maximum reasoning accuracy.
- Kimi‑K2 Thinking (Public Preview): Moonshot AI’s deep reasoning model with a 256K context window, now Direct from Azure.
- Cohere Rerank 4 (Fast + Pro): Cross-encoding reranker for RAG pipelines; 100+ languages, serverless pay-as-you-go.
- GPT‑image‑1.5 (GA): 4× faster generation, ~20% lower cost vs. GPT‑image‑1; adds inpainting and face preservation.
- FLUX.2 [pro] (Public Preview): Black Forest Labs’ next-gen image model with multi-reference support, improved text rendering, and enterprise SLAs on Azure.
- Audio models (GA, Dec 15): Realtime Mini, ASR (`gpt-4o-mini-transcribe`), and TTS (`gpt-4o-mini-tts`) — all GA with significant accuracy and latency improvements.
- Fine-tuning base models: Ministral 3B, Qwen3 32B, OSS-20B, and Llama 3.3 70B now available for serverless fine-tuning.
- ⚠️ AzureML SDK v1 EOL: June 30, 2026 — migrate to SDK v2 now; CLI v1 already sunset September 2025.
- `azure-ai-projects` v2 beta: Agents, inference, evaluations, and memory are now unified in a single package — the `azure-ai-agents` dependency is gone; `2.0.0b3` shipped January 6, 2026.
- Memory in Foundry Agent Service (Public Preview): Managed long-term memory store with automatic extraction, consolidation, and retrieval across agent sessions. Free during preview; pay only for the underlying model calls.
- Agent-to-Agent (A2A) Tool (Preview): Let Foundry agents call any A2A-protocol endpoint with explicit auth and clean call/response semantics — the structured evolution of Connected Agents.
- Foundry MCP Server (Preview): Cloud-hosted MCP at `mcp.ai.azure.com`, live since December 3. Connect from VS Code, Visual Studio, or the Foundry portal — zero local process management, Entra auth included.
- Microsoft Foundry for VS Code — January 2026: Multi-workflow visualizer, all prompt agents testable in Playground, and code samples for every agent type.
Models
GPT‑5.2 — The New Enterprise Reasoning Standard
GPT‑5.2 is now generally available in Microsoft Foundry. Built for multi-step problem solving, long-context understanding, and agentic tool-calling, GPT‑5.2 achieves top scores across math, science, coding, and multimodal benchmarks. Whether you’re orchestrating agents, reasoning over large document sets, or building production-grade pipelines, GPT‑5.2 delivers more coherent, compliant, and shippable outputs than any prior generation.
- `gpt-5.2` — primary reasoning model for complex enterprise tasks
- `gpt-5.2-chat-latest` — optimized for conversational and everyday professional workflows
GPT‑5.1 Codex Max — AI for Autonomous Enterprise Coding
GPT‑5.1 Codex Max is now generally available in Microsoft Foundry, purpose-built for engineering-scale coding tasks. It achieves 77.9% on SWE-Bench, supports a 400K token context window across 50+ programming languages, and is designed end-to-end for multi-agent coding workflows — from refactoring legacy .NET and Java apps to automated pull requests, secure API generation, and CI/CD pipeline integration. It can be triggered directly from the terminal, VS Code, or GitHub Actions runners.
Mistral Large 3 — Open-Weight Enterprise Intelligence
Mistral Large 3 is now in public preview in Microsoft Foundry, released under the Apache 2.0 license — meaning it is free for commercial use, modification, and redistribution. With 41B active parameters in a sparse mixture-of-experts architecture (675B total), it delivers strong instruction following, long-context comprehension, and multimodal reasoning, with a straightforward $0.50 / $1.50 per million input/output token price point.
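The per-token pricing is easy to reason about. As a quick sketch — the rates are the ones quoted above; the token counts are made up purely for illustration:

```python
# Mistral Large 3 serverless rates quoted above (USD per 1M tokens).
INPUT_RATE = 0.50
OUTPUT_RATE = 1.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one call at the quoted per-million-token rates."""
    return (input_tokens / 1_000_000) * INPUT_RATE + (output_tokens / 1_000_000) * OUTPUT_RATE

# Example: a 10K-token prompt producing a 2K-token completion.
print(f"${request_cost(10_000, 2_000):.4f}")  # → $0.0080
```

At these rates, even a long-context request stays in fractions of a cent, which is the main appeal for high-volume pipelines.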
DeepSeek V3.2 and V3.2‑Speciale
DeepSeek V3.2 and DeepSeek V3.2‑Speciale launched in public preview on December 15, 2025. Both feature a 128K context window and DeepSeek Sparse Attention for up to 3× faster reasoning paths. The Speciale variant is purpose-tuned for maximum reasoning accuracy — it omits native function/tool calling entirely to reserve all compute for pure reasoning, making it ideal for research labs, scientific workflows, and high-stakes evaluation pipelines.
Kimi‑K2 Thinking — Deep Reasoning from Moonshot AI
Kimi‑K2 Thinking from Moonshot AI is now in public preview as a Direct from Azure model. With a 256K context window, it excels at deep reasoning, tool orchestration, and complex multi-step problem solving — a strong addition to the growing catalog of non-OpenAI frontier reasoning models on Foundry.
Cohere Rerank 4 — State-of-the-Art Retrieval for RAG
Cohere Rerank v4.0 — available in Fast and Pro variants — is now in the Microsoft Foundry model catalog, deployable via pay-as-you-go serverless endpoints. Designed to improve search relevance and reduce LLM hallucinations in RAG pipelines, Rerank 4 uses cross-encoding AI to re-sort retrieved documents by semantic similarity to your query. Supports 100+ languages and drops into existing keyword or semantic retrieval stacks with minimal code changes.
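Mechanically, a reranker slots in between retrieval and generation: retrieve a candidate set, score each (query, document) pair with the cross-encoder, and keep the top-k by score. A minimal sketch of that wiring — `score_pair` here is a hypothetical stand-in for a call to a Rerank 4 endpoint, not the real API:

```python
def rerank(query, documents, score_pair, top_k=3):
    """Re-sort retrieved documents by relevance score, keeping the top k.

    score_pair(query, doc) -> float is a placeholder for the actual
    cross-encoder call (e.g. a Rerank 4 serverless endpoint).
    """
    scored = [(score_pair(query, doc), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

# Toy scorer for demonstration: counts words shared by query and document.
def toy_score(query, doc):
    return len(set(query.lower().split()) & set(doc.lower().split()))

docs = ["paris is in france", "the moon orbits earth", "france borders spain"]
print(rerank("where is france", docs, toy_score, top_k=2))
```

Because the reranker only re-sorts whatever your existing retriever returns, it drops in behind keyword or vector search without touching the rest of the stack — which is the "minimal code changes" claim above.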
Images
GPT‑image‑1.5 — Faster, Higher-Quality Image Generation
GPT‑image‑1.5 is now generally available in Microsoft Foundry, delivering up to 4× faster generation and approximately 20% lower API costs compared to GPT‑image‑1. Improvements span text-to-image generation, image-to-image transformation, inpainting, and face preservation — with output resolutions up to 1024×1536. Access at launch is gated for enterprise customers (MCA-E and EA).
FLUX.2 [pro] from Black Forest Labs
FLUX.2 [pro] from Black Forest Labs is now in public preview in Microsoft Foundry. Building on FLUX.1, it adds multi-reference support (up to 8 images), improved text rendering for infographics and UI mockups, and enhanced adherence to complex, multi-part prompts. Available with Microsoft-backed SLAs, Responsible AI controls, and global standard deployment across Azure regions.
Audio
Updated Audio Models: Realtime Mini, ASR, and TTS
Three new audio models reached general availability on December 15, 2025, raising the bar across the real-time voice stack:
| Model | What’s new |
|---|---|
| gpt-realtime-mini-2025-12-15 | Feature parity with full gpt-realtime in instruction-following and function-calling; new voices Marin and Cedar; glitch-free audio |
| gpt-4o-mini-transcribe-2025-12-15 | ~50% lower WER on English benchmarks; better multilingual support; up to 4× fewer silence hallucinations in noisy environments |
| gpt-4o-mini-tts-2025-12-15 | More natural, human-like multilingual speech synthesis with reduced artifacts |
All three are API-only deployments accessible through the Azure OpenAI endpoint in Microsoft Foundry.
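For context on the transcription numbers: word error rate (WER) is the standard ASR metric — the word-level edit distance between the model transcript and a reference transcript, divided by the reference length. A minimal reference implementation, if you want to benchmark the new model against your own audio:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = min edits (substitute/insert/delete) turning ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution / match
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deletion → 1/6
```

A "~50% lower WER" claim therefore means roughly half as many word-level edits needed to recover the reference transcript.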
Fine-Tuning
New Open-Source Base Models for Fine-Tuning
Microsoft Foundry expanded its fine-tuning catalog with four new open-source base models on serverless infrastructure — pre-announced at Ignite 2025 and now live:
| Model | Best for |
|---|---|
| Ministral 3B | Lightweight, cost-sensitive scenarios |
| Qwen3 32B | Multilingual applications |
| OSS-20B | Balanced enterprise workloads |
| Llama 3.3 70B | Complex reasoning at scale |
Fine-tuning is available on either serverless or managed compute, with Microsoft’s security, compliance, and Responsible AI guardrails applied uniformly across all models.
Agents
Memory in Foundry Agent Service (Public Preview)
Most agents today are stateless — every conversation starts from zero. Memory in Foundry Agent Service is a fully managed, long-term memory store natively integrated with the agent runtime. It extracts, consolidates, and retrieves user preferences and context across sessions and devices — no custom embedding database or retrieval pipeline required.
The process runs in four phases: Extract (preferences, facts, and key context from each conversation turn), Consolidate (LLM merges duplicates and resolves conflicts), Retrieve (hybrid search surfaces relevant memories at conversation start, with core facts like allergies or preferences injected immediately), and Customize (the user_profile_details parameter focuses extraction on what matters for your specific use case).
Enable it with a single click in the Foundry portal, or via SDK:
```python
from azure.ai.projects.models import MemoryStoreDefaultDefinition, MemoryStoreDefaultOptions

definition = MemoryStoreDefaultDefinition(
    chat_model="gpt-5",
    embedding_model="text-embedding-3-small",
    options=MemoryStoreDefaultOptions(
        user_profile_enabled=True,
        user_profile_details="Food preferences for a meal planning agent",
        chat_summary_enabled=True,
    ),
)

memory_store = project_client.memory_stores.create(
    name="my_memory_store",
    description="Example memory store for conversations",
    definition=definition,
)
```
Free during preview — you pay only for the underlying chat and embedding model calls.
Agent-to-Agent (A2A) Tool (Preview)
The A2A tool adds inter-agent communication to Foundry agents — point it at any endpoint that implements the A2A protocol, and your agent can invoke it as a first-class tool. This is the structured evolution of “Connected Agents” in Foundry Classic, with cleaner semantics and explicit authentication options: key-based, OAuth2, or Entra Agent Identity.
The distinction that matters for system design:
- A2A tool: Agent A calls Agent B; B’s answer returns to A; A synthesizes the final user response. Agent A stays in control of the thread.
- Multi-agent workflow: Agent B takes full ownership of the thread from the point of handoff — Agent A is out of the loop.
Configure via the Foundry portal (Tools → Connect tool → Custom → Agent2Agent) or in code:
```python
import os

from azure.ai.projects.models import A2ATool, PromptAgentDefinition

a2a_conn = project_client.connections.get(os.environ["A2A_PROJECT_CONNECTION_NAME"])

agent = project_client.agents.create_version(
    agent_name="my-agent",
    definition=PromptAgentDefinition(
        model=os.environ["FOUNDRY_MODEL_DEPLOYMENT_NAME"],
        instructions="You are a helpful assistant.",
        tools=[A2ATool(project_connection_id=a2a_conn.id)],
    ),
)
```
Tools
Computer Use (Preview)
Computer Use lets Foundry agents visually interact with desktop and browser environments using the computer-use-preview model. Instead of calling structured APIs, the agent receives a screenshot and acts — click, type, scroll, navigate. Use it for UI testing automation, navigating legacy web apps that predate REST APIs, extracting data from visual-only interfaces, and RPA-style workflows where brittle CSS selectors previously dominated.
The .NET SDK (Azure.AI.Agents.Persistent 1.2.0-beta.8, December 2025) added first-class Computer Use tool support. Python and TypeScript support is in active development — track the changelogs.
Platform
Foundry MCP Server (Preview)
The Foundry MCP Server is a cloud-hosted, fully managed MCP endpoint at https://mcp.ai.azure.com — the production successor to the experimental local MCP server shipped at Build 2025. It went live December 3, 2025. No local uptime to manage. Connect it from VS Code (mcp.json), Visual Studio 2026 Insiders, or add it as a tool connection in the Foundry portal with one click.
Conversational workflows you can drive through it today:
- Model operations: Browse the catalog, compare benchmarks, get upgrade recommendations based on capabilities and deprecation schedules, check quota headroom, deploy and deprecate deployments
- Agent management: Create, update, and version agents without leaving your editor
- Evaluation pipelines: Chain `evaluation_dataset_create` → `evaluation_create` → `evaluation_comparison_create` for automated quality loops in a single chat thread
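Conceptually, that evaluation chain is just three dependent calls — each step feeds its id into the next. A schematic sketch with local stand-ins that mirror the MCP tool names (these are hypothetical stubs, not the real tool signatures; the actual tools are invoked conversationally through the MCP server):

```python
# Hypothetical stand-ins mirroring the Foundry MCP evaluation tool chain.
def evaluation_dataset_create(rows):
    return {"dataset_id": "ds-1", "rows": rows}

def evaluation_create(dataset_id, model):
    return {"eval_id": f"eval-{model}", "dataset_id": dataset_id}

def evaluation_comparison_create(eval_ids):
    return {"comparison_id": "cmp-1", "compared": eval_ids}

# Dataset → one evaluation run per model → side-by-side comparison.
dataset = evaluation_dataset_create([{"query": "q1", "expected": "a1"}])
baseline = evaluation_create(dataset["dataset_id"], model="gpt-5.1")
candidate = evaluation_create(dataset["dataset_id"], model="gpt-5.2")
report = evaluation_comparison_create([baseline["eval_id"], candidate["eval_id"]])
print(report["compared"])
```

The point of running this through MCP is that the whole loop — dataset, two runs, comparison — happens in one chat thread without leaving your editor.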
Security: Entra ID authentication end-to-end (OBO tokens scoped to https://mcp.ai.azure.com). Every operation runs under the signed-in user’s Azure RBAC permissions with full audit logging. Tenant admins control access via Conditional Access policies.
```jsonc
// .vscode/mcp.json
{
  "servers": {
    "foundry-mcp": { "type": "http", "url": "https://mcp.ai.azure.com" }
  }
}
```
Microsoft Foundry for VS Code — January 2026 Update
The Microsoft Foundry extension for VS Code shipped a focused update on January 20, 2026:
- Multi-workflow Visualizer: View, navigate, and debug multiple interconnected workflows in a single project panel — previously limited to one at a time.
- Prompt agents in Playground: All prompt agents in your project are now surfaced directly in the Playground for interactive testing. No context-switching.
- Open code for any agent type: The extension generates and opens sample code for prompt agents, YAML-based workflows, hosted agents, and Foundry classic agents. Drop it straight into your existing project.
- Separated v1/v2 resource view: Classic Foundry resources and new-gen agents display in clearly distinct views, eliminating the common confusion about which generation a resource belongs to.
New Foundry Experience at ai.azure.com
The new unified Foundry portal experience — available at ai.azure.com via the “New Foundry” toggle — introduced a meaningfully different mental model from Foundry Classic:
- The Tools tab is the single entry point for discovering, connecting, and managing agentic integrations: MCP servers, A2A endpoints, Azure AI Search, SharePoint, Fabric, and more — across more than 1,400 business systems.
- Multi-agent workflows are built visually in the portal, distinct from the single-agent flow of Foundry Classic.
- The separated v1/v2 resource view ensures Classic and new-gen agents don’t share ambiguous panels.
Deprecation Notice
AzureML SDK v1 — End of Life June 30, 2026
The Azure Machine Learning SDK v1 reaches end of support on June 30, 2026. After this date, existing workflows may face security risks and breaking changes without active Microsoft support. Note that the AzureML CLI v1 extension already reached end of support on September 30, 2025. If you’re still running v1-based training pipelines, the SDK v2 migration guide is the place to start — v2 brings a significantly improved authoring experience, YAML-first job definitions, and continued investment from the Azure ML team.
SDK & Language Changelog (Dec 2025 – Jan 2026)
All Microsoft Foundry SDK development is consolidating into a single azure-ai-projects package per language. Agents, inference, evaluations, and memory operations that previously lived in separate packages (azure-ai-agents, etc.) are unified under the azure-ai-projects v2 beta line. All active development happens on preview/beta branches — pin accordingly.
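If you want to adopt the v2 line today, pin the betas explicitly — pip will not resolve to a pre-release unless you name the exact version. A minimal `requirements.txt` covering the Python packages discussed below (versions are the ones from this post; check PyPI for newer betas):

```
# Pin pre-release versions explicitly; pip skips them otherwise.
azure-ai-projects==2.0.0b3
azure-ai-evaluation==1.14.0
```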
Python
azure-ai-projects 2.0.0b3 (2026-01-06)
The v2 line is the new canonical SDK for everything Foundry: agents (now built on the OpenAI Responses protocol), evaluations, memory stores, and model inference. This release bundles openai and azure-identity as direct dependencies — no separate installs required.
Key changes across the 2.0.0b1–b3 window:
- Agents on Responses protocol: `AIProjectClient` now handles agent ops directly; the `azure-ai-agents` dependency is dropped.
- `get_openai_client()` now returns an `openai.OpenAI` client pre-configured for your Foundry project endpoint (Responses API).
- Class renames: `AgentObject` → `AgentDetails`, `MemoryStoreObject` → `MemoryStoreDetails`, `AgentVersionObject` → `AgentVersionDetails`.
- Tracing overhaul: span names, attribute keys, and operation names changed to align with OpenTelemetry `gen_ai.*` conventions (e.g. `gen_ai.provider.name` is now `"microsoft.foundry"`).
- New operations: `.memory_stores`, `.evaluation_rules`, `.evaluators`, `.insights`, `.schedules` on `AIProjectClient`.
Action: Upgrade to `azure-ai-projects==2.0.0b3`. Remove any standalone `azure-ai-agents` pins — agent creation and runs are now first-class methods on `AIProjectClient`.
azure-ai-evaluation 1.14.0 (2026-01-05)
Evaluation still ships as a standalone package while consolidation completes — expect this to merge into `azure-ai-projects` in a future beta. 1.14.0 is primarily a bug-fix release: corrected binary scoring for the `CodeVulnerability` and `UngroundedAttributes` evaluators in the RedTeam scanner, and fixed `GroundednessEvaluator` not honoring `is_reasoning_model` when the `query` parameter was supplied.
.NET
Azure.AI.Agents.Persistent 1.2.0-beta.8 (2025-12-01)
Added first-class Computer Use support for agents, letting you wire up computer-use-preview model runs directly from the persistent agents client. PersistentAgentsChatClient got improved error handling for incomplete-state streaming runs.
Breaking: none in this release.
Action: Pin Azure.AI.Agents.Persistent to 1.2.0-beta.8 to get Computer Use.
Azure.AI.Projects 1.2.0-beta.5 (2025-12-12)
Updated for transitive compatibility with OpenAI 2.8.0, including substantial changes to the `[Experimental]` Responses API surface. Also fixes file uploading for fine-tuning jobs. The 1.2.0-beta.1 entry (November) is also worth noting if you haven’t upgraded — it introduced the full Microsoft Foundry Agents Service feature set: memory, evaluations, red teaming, schedules, and insights on `AIProjectClient`.
Breaking: Responses API surface changed with OpenAI 2.8.0 compatibility update — review your [Experimental] Responses code paths.
Action: Upgrade to Azure.AI.Projects 1.2.0-beta.5.
JavaScript / TypeScript
@azure/ai-projects 2.0.0-beta.2 → 2.0.0-beta.4 (Dec 2025 – Jan 2026)
Three betas landed in quick succession — the highlights:
- 2.0.0-beta.2 (2025-12-02): Re-added the `project.telemetry` route to restore access to the Application Insights connection string (removed in beta.1).
- 2.0.0-beta.3 (2026-01-09): Fixed a response JSON schema deserializer bug.
- 2.0.0-beta.4 (2026-01-29): Major class renames to align with OpenAI naming conventions — GA tools now use a `Tool` suffix; preview tools use `PreviewTool`. Key renames: `AzureAISearchAgentTool` → `AzureAISearchTool`, `BrowserAutomationAgentTool` → `BrowserAutomationPreviewTool`, `A2ATool` → `A2APreviewTool`, `SharepointAgentTool` → `SharepointPreviewTool`, `MicrosoftFabricAgentTool` → `MicrosoftFabricPreviewTool`.
Breaking: The 2.0.0-beta.4 class renames. If you reference any `*AgentTool` class, update to the new suffixed name.
Action: Upgrade to `@azure/ai-projects@2.0.0-beta.4` and search your codebase for the renamed classes. The same rename convention is coming to the Python 2.0.0b4 release.
Stay Connected
Plenty more is in flight — the February edition will land on a much shorter timeline. In the meantime, explore any of these models directly in the Microsoft Foundry model catalog or join the developer community to share what you’re building.