Author’s note: So… it has been a while. I have to level with you — I returned from paternity leave in January, and between Microsoft Ignite 2025 and today, things have changed a lot. Without further ado, here is your monthly (late) drop for all things new with Microsoft Foundry. Expect future editions to be back on track. Thanks for your patience!
TL;DR
December 2025 was one of the biggest months in Microsoft Foundry history. Here’s everything that shipped:
- GPT‑5.2 (GA): New enterprise reasoning standard — top benchmark scores across math, science, coding, and multimodal tasks; available as `gpt-5.2` and `gpt-5.2-chat-latest`.
- GPT‑5.1 Codex Max (GA): 77.9% on SWE-Bench, 400K context, 50+ languages — built for autonomous multi-agent coding pipelines, PR generation, and CI/CD integration.
- Mistral Large 3 (Public Preview): Apache 2.0, 41B active / 675B total parameters; $0.50 / $1.50 per million tokens. Strong instruction following and multimodal reasoning.
- DeepSeek V3.2 + V3.2‑Speciale (Public Preview): 128K context, up to 3× faster reasoning via Sparse Attention; Speciale drops tool calling entirely for maximum reasoning accuracy.
- Kimi‑K2 Thinking (Public Preview): Moonshot AI’s deep reasoning model with a 256K context window, now Direct from Azure.
- Cohere Rerank 4 (Fast + Pro): Cross-encoding reranker for RAG pipelines; 100+ languages, serverless pay-as-you-go.
- GPT‑image‑1.5 (GA): 4× faster generation, ~20% lower cost vs. GPT‑image‑1; adds inpainting and face preservation.
- FLUX.2 [pro] (Public Preview): Black Forest Labs’ next-gen image model with multi-reference support, improved text rendering, and enterprise SLAs on Azure.
- Audio models (GA, Dec 15): Realtime Mini, ASR (`gpt-4o-mini-transcribe`), and TTS (`gpt-4o-mini-tts`) — all GA with significant accuracy and latency improvements.
- Fine-tuning base models: Ministral 3B, Qwen3 32B, OSS-20B, and Llama 3.3 70B now available for serverless fine-tuning.
- ⚠️ AzureML SDK v1 EOL: June 30, 2026 — migrate to SDK v2 now; CLI v1 already sunset September 2025.
- `azure-ai-projects` v2 beta: Agents, inference, evaluations, and memory are now unified in a single package — the `azure-ai-agents` dependency is gone; `2.0.0b3` shipped January 6, 2026.
- Memory in Foundry Agent Service (Public Preview): Managed long-term memory store with automatic extraction, consolidation, and retrieval across agent sessions. Free during preview; pay only for the underlying model calls.
- Agent-to-Agent (A2A) Tool (Preview): Let Foundry agents call any A2A-protocol endpoint with explicit auth and clean call/response semantics — the structured evolution of Connected Agents.
- Foundry MCP Server (Preview): Cloud-hosted MCP at `mcp.ai.azure.com`, live since December 3. Connect from VS Code, Visual Studio, or the Foundry portal — zero local process management, Entra auth included.
- Microsoft Foundry for VS Code — January 2026: Multi-workflow visualizer, all prompt agents testable in Playground, and code samples for every agent type.
Models
GPT‑5.2 — The New Enterprise Reasoning Standard
GPT‑5.2 is now generally available in Microsoft Foundry. Built for multi-step problem solving, long-context understanding, and agentic tool-calling, GPT‑5.2 achieves top scores across math, science, coding, and multimodal benchmarks. Whether you’re orchestrating agents, reasoning over large document sets, or building production-grade pipelines, GPT‑5.2 delivers more coherent, compliant, and shippable outputs than any prior generation.
- `gpt-5.2` — primary reasoning model for complex enterprise tasks
- `gpt-5.2-chat-latest` — optimized for conversational and everyday professional workflows
GPT‑5.1 Codex Max — AI for Autonomous Enterprise Coding
GPT‑5.1 Codex Max is now generally available in Microsoft Foundry, purpose-built for engineering-scale coding tasks. It achieves 77.9% on SWE-Bench, supports a 400K token context window across 50+ programming languages, and is designed end-to-end for multi-agent coding workflows — from refactoring legacy .NET and Java apps to automated pull requests, secure API generation, and CI/CD pipeline integration. It can be triggered directly from the terminal, VS Code, or GitHub Actions runners.
Mistral Large 3 — Open-Weight Enterprise Intelligence
Mistral Large 3 is now in public preview in Microsoft Foundry, released under the Apache 2.0 license — meaning it is free for commercial use, modification, and redistribution. With 41B active parameters in a sparse mixture-of-experts architecture (675B total), it delivers strong instruction following, long-context comprehension, and multimodal reasoning, with a straightforward $0.50 / $1.50 per million input/output token price point.
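The per-token pricing is easy to reason about. As a quick sketch — the rates are the ones quoted above; the token counts are made up purely for illustration:

```python
# Mistral Large 3 serverless rates quoted above (USD per 1M tokens).
INPUT_RATE = 0.50
OUTPUT_RATE = 1.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one call at the quoted per-million-token rates."""
    return (input_tokens / 1_000_000) * INPUT_RATE + (output_tokens / 1_000_000) * OUTPUT_RATE

# Example: a 10K-token prompt producing a 2K-token completion.
print(f"${request_cost(10_000, 2_000):.4f}")  # → $0.0080
```

At these rates, even a long-context request stays in fractions of a cent, which is the main appeal for high-volume pipelines.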
DeepSeek V3.2 and V3.2‑Speciale
DeepSeek V3.2 and DeepSeek V3.2‑Speciale launched in public preview on December 15, 2025. Both feature a 128K context window and DeepSeek Sparse Attention for up to 3× faster reasoning paths. The Speciale variant is purpose-tuned for maximum reasoning accuracy — it omits native function/tool calling entirely to reserve all compute for pure reasoning, making it ideal for research labs, scientific workflows, and high-stakes evaluation pipelines.
Kimi‑K2 Thinking — Deep Reasoning from Moonshot AI
Kimi‑K2 Thinking from Moonshot AI is now in public preview as a Direct from Azure model. With a 256K context window, it excels at deep reasoning, tool orchestration, and complex multi-step problem solving — a strong addition to the growing catalog of non-OpenAI frontier reasoning models on Foundry.
Cohere Rerank 4 — State-of-the-Art Retrieval for RAG
Cohere Rerank v4.0 — available in Fast and Pro variants — is now in the Microsoft Foundry model catalog, deployable via pay-as-you-go serverless endpoints. Designed to improve search relevance and reduce LLM hallucinations in RAG pipelines, Rerank 4 uses cross-encoding AI to re-sort retrieved documents by semantic similarity to your query. Supports 100+ languages and drops into existing keyword or semantic retrieval stacks with minimal code changes.
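Mechanically, a reranker slots in between retrieval and generation: retrieve a candidate set, score each (query, document) pair with the cross-encoder, and keep the top-k by score. A minimal sketch of that wiring — `score_pair` here is a hypothetical stand-in for a call to a Rerank 4 endpoint, not the real API:

```python
def rerank(query, documents, score_pair, top_k=3):
    """Re-sort retrieved documents by relevance score, keeping the top k.

    score_pair(query, doc) -> float is a placeholder for the actual
    cross-encoder call (e.g. a Rerank 4 serverless endpoint).
    """
    scored = [(score_pair(query, doc), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

# Toy scorer for demonstration: counts words shared by query and document.
def toy_score(query, doc):
    return len(set(query.lower().split()) & set(doc.lower().split()))

docs = ["paris is in france", "the moon orbits earth", "france borders spain"]
print(rerank("where is france", docs, toy_score, top_k=2))
```

Because the reranker only re-sorts whatever your existing retriever returns, it drops in behind keyword or vector search without touching the rest of the stack — which is the "minimal code changes" claim above.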
Images
GPT‑image‑1.5 — Faster, Higher-Quality Image Generation
GPT‑image‑1.5 is now generally available in Microsoft Foundry, delivering up to 4× faster generation and approximately 20% lower API costs compared to GPT‑image‑1. Improvements span text-to-image generation, image-to-image transformation, inpainting, and face preservation — with output resolutions up to 1024×1536. Access at launch is gated for enterprise customers (MCA-E and EA).
FLUX.2 [pro] from Black Forest Labs
FLUX.2 [pro] from Black Forest Labs is now in public preview in Microsoft Foundry. Building on FLUX.1, it adds multi-reference support (up to 8 images), improved text rendering for infographics and UI mockups, and enhanced adherence to complex, multi-part prompts. Available with Microsoft-backed SLAs, Responsible AI controls, and global standard deployment across Azure regions.
Audio
Updated Audio Models: Realtime Mini, ASR, and TTS
Three new audio models reached general availability on December 15, 2025, raising the bar across the real-time voice stack:
| Model | What’s new |
|---|---|
| gpt-realtime-mini-2025-12-15 | Feature parity with full gpt-realtime in instruction-following and function-calling; new voices Marin and Cedar; glitch-free audio |
| gpt-4o-mini-transcribe-2025-12-15 | ~50% lower WER on English benchmarks; better multilingual support; up to 4× fewer silence hallucinations in noisy environments |
| gpt-4o-mini-tts-2025-12-15 | More natural, human-like multilingual speech synthesis with reduced artifacts |
All three are API-only deployments accessible through the Azure OpenAI endpoint in Microsoft Foundry.
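For context on the transcription numbers: word error rate (WER) is the standard ASR metric — the word-level edit distance between the model transcript and a reference transcript, divided by the reference length. A minimal reference implementation, if you want to benchmark the new model against your own audio:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = min edits (substitute/insert/delete) turning ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution / match
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deletion → 1/6
```

A "~50% lower WER" claim therefore means roughly half as many word-level edits needed to recover the reference transcript.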
Fine-Tuning
New Open-Source Base Models for Fine-Tuning
Microsoft Foundry expanded its fine-tuning catalog with four new open-source base models on serverless infrastructure — pre-announced at Ignite 2025 and now live:
| Model | Best for |
|---|---|
| Ministral 3B | Lightweight, cost-sensitive scenarios |
| Qwen3 32B | Multilingual applications |
| OSS-20B | Balanced enterprise workloads |
| Llama 3.3 70B | Complex reasoning at scale |
Fine-tuning is available on either serverless or managed compute, with Microsoft’s security, compliance, and Responsible AI guardrails applied uniformly across all models.
Agents
Memory in Foundry Agent Service (Public Preview)
Most agents today are stateless — every conversation starts from zero. Memory in Foundry Agent Service is a fully managed, long-term memory store natively integrated with the agent runtime. It extracts, consolidates, and retrieves user preferences and context across sessions and devices — no custom embedding database or retrieval pipeline required.
The process runs in four phases: Extract (preferences, facts, and key context from each conversation turn), Consolidate (LLM merges duplicates and resolves conflicts), Retrieve (hybrid search surfaces relevant memories at conversation start, with core facts like allergies or preferences injected immediately), and Customize (the user_profile_details parameter focuses extraction on what matters for your specific use case).
Enable it with a single click in the Foundry portal, or via SDK:
```python
from azure.ai.projects.models import MemoryStoreDefaultDefinition, MemoryStoreDefaultOptions

definition = MemoryStoreDefaultDefinition(
    chat_model="gpt-5",
    embedding_model="text-embedding-3-small",
    options=MemoryStoreDefaultOptions(
        user_profile_enabled=True,
        user_profile_details="Food preferences for a meal planning agent",
        chat_summary_enabled=True,
    ),
)

memory_store = project_client.memory_stores.create(
    name="my_memory_store",
    description="Example memory store for conversations",
    definition=definition,
)
```
Free during preview — you pay only for the underlying chat and embedding model calls.
Agent-to-Agent (A2A) Tool (Preview)
The A2A tool adds inter-agent communication to Foundry agents — point it at any endpoint that implements the A2A protocol, and your agent can invoke it as a first-class tool. This is the structured evolution of “Connected Agents” in Foundry Classic, with cleaner semantics and explicit authentication options: key-based, OAuth2, or Entra Agent Identity.
The distinction that matters for system design:
- A2A tool: Agent A calls Agent B; B’s answer returns to A; A synthesizes the final user response. Agent A stays in control of the thread.
- Multi-agent workflow: Agent B takes full ownership of the thread from the point of handoff — Agent A is out of the loop.
Configure via the Foundry portal (Tools → Connect tool → Custom → Agent2Agent) or in code:
```python
import os

from azure.ai.projects.models import A2ATool, PromptAgentDefinition

a2a_conn = project_client.connections.get(os.environ["A2A_PROJECT_CONNECTION_NAME"])

agent = project_client.agents.create_version(
    agent_name="my-agent",
    definition=PromptAgentDefinition(
        model=os.environ["FOUNDRY_MODEL_DEPLOYMENT_NAME"],
        instructions="You are a helpful assistant.",
        tools=[A2ATool(project_connection_id=a2a_conn.id)],
    ),
)
```
Tools
Computer Use (Preview)
Computer Use lets Foundry agents visually interact with desktop and browser environments using the computer-use-preview model. Instead of calling structured APIs, the agent receives a screenshot and acts — click, type, scroll, navigate. Use it for UI testing automation, navigating legacy web apps that predate REST APIs, extracting data from visual-only interfaces, and RPA-style workflows where brittle CSS selectors previously dominated.
The .NET SDK (Azure.AI.Agents.Persistent 1.2.0-beta.8, December 2025) added first-class Computer Use tool support. Python and TypeScript support is in active development — track the changelogs.
Platform
Foundry MCP Server (Preview)
The Foundry MCP Server is a cloud-hosted, fully managed MCP endpoint at https://mcp.ai.azure.com — the production successor to the experimental local MCP server shipped at Build 2025. It went live December 3, 2025. No local uptime to manage. Connect it from VS Code (mcp.json), Visual Studio 2026 Insiders, or add it as a tool connection in the Foundry portal with one click.
Conversational workflows you can drive through it today:
- Model operations: Browse the catalog, compare benchmarks, get upgrade recommendations based on capabilities and deprecation schedules, check quota headroom, deploy and deprecate deployments
- Agent management: Create, update, and version agents without leaving your editor
- Evaluation pipelines: Chain `evaluation_dataset_create` → `evaluation_create` → `evaluation_comparison_create` for automated quality loops in a single chat thread
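Conceptually, that evaluation chain is just three dependent calls — each step feeds its id into the next. A schematic sketch with local stand-ins that mirror the MCP tool names (these are hypothetical stubs, not the real tool signatures; the actual tools are invoked conversationally through the MCP server):

```python
# Hypothetical stand-ins mirroring the Foundry MCP evaluation tool chain.
def evaluation_dataset_create(rows):
    return {"dataset_id": "ds-1", "rows": rows}

def evaluation_create(dataset_id, model):
    return {"eval_id": f"eval-{model}", "dataset_id": dataset_id}

def evaluation_comparison_create(eval_ids):
    return {"comparison_id": "cmp-1", "compared": eval_ids}

# Dataset → one evaluation run per model → side-by-side comparison.
dataset = evaluation_dataset_create([{"query": "q1", "expected": "a1"}])
baseline = evaluation_create(dataset["dataset_id"], model="gpt-5.1")
candidate = evaluation_create(dataset["dataset_id"], model="gpt-5.2")
report = evaluation_comparison_create([baseline["eval_id"], candidate["eval_id"]])
print(report["compared"])
```

The point of running this through MCP is that the whole loop — dataset, two runs, comparison — happens in one chat thread without leaving your editor.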
Security: Entra ID authentication end-to-end (OBO tokens scoped to https://mcp.ai.azure.com). Every operation runs under the signed-in user’s Azure RBAC permissions with full audit logging. Tenant admins control access via Conditional Access policies.
```jsonc
// .vscode/mcp.json
{
  "servers": {
    "foundry-mcp": { "type": "http", "url": "https://mcp.ai.azure.com" }
  }
}
```
Microsoft Foundry for VS Code — January 2026 Update
The Microsoft Foundry extension for VS Code shipped a focused update on January 20, 2026:
- Multi-workflow Visualizer: View, navigate, and debug multiple interconnected workflows in a single project panel — previously limited to one at a time.
- Prompt agents in Playground: All prompt agents in your project are now surfaced directly in the Playground for interactive testing. No context-switching.
- Open code for any agent type: The extension generates and opens sample code for prompt agents, YAML-based workflows, hosted agents, and Foundry classic agents. Drop it straight into your existing project.
- Separated v1/v2 resource view: Classic Foundry resources and new-gen agents display in clearly distinct views, eliminating the common confusion about which generation a resource belongs to.
New Foundry Experience at ai.azure.com
The new unified Foundry portal experience — available at ai.azure.com via the “New Foundry” toggle — introduced a meaningfully different mental model from Foundry Classic:
- The Tools tab is the single entry point for discovering, connecting, and managing agentic integrations: MCP servers, A2A endpoints, Azure AI Search, SharePoint, Fabric, and more — across more than 1,400 business systems.
- Multi-agent workflows are built visually in the portal, distinct from the single-agent flow of Foundry Classic.
- The separated v1/v2 resource view ensures Classic and new-gen agents don’t share ambiguous panels.
Deprecation Notice
AzureML SDK v1 — End of Life June 30, 2026
The Azure Machine Learning SDK v1 reaches end of support on June 30, 2026. After this date, existing workflows may face security risks and breaking changes without active Microsoft support. Note that the AzureML CLI v1 extension already reached end of support on September 30, 2025. If you’re still running v1-based training pipelines, the SDK v2 migration guide is the place to start — v2 brings a significantly improved authoring experience, YAML-first job definitions, and continued investment from the Azure ML team.
SDK & Language Changelog (Dec 2025 – Jan 2026)
All Microsoft Foundry SDK development is consolidating into a single azure-ai-projects package per language. Agents, inference, evaluations, and memory operations that previously lived in separate packages (azure-ai-agents, etc.) are unified under the azure-ai-projects v2 beta line. All active development happens on preview/beta branches — pin accordingly.
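If you want to adopt the v2 line today, pin the betas explicitly — pip will not resolve to a pre-release unless you name the exact version. A minimal `requirements.txt` covering the Python packages discussed below (versions are the ones from this post; check PyPI for newer betas):

```
# Pin pre-release versions explicitly; pip skips them otherwise.
azure-ai-projects==2.0.0b3
azure-ai-evaluation==1.14.0
```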
Python
azure-ai-projects 2.0.0b3 (2026-01-06)
The v2 line is the new canonical SDK for everything Foundry: agents (now built on the OpenAI Responses protocol), evaluations, memory stores, and model inference. This release bundles openai and azure-identity as direct dependencies — no separate installs required.
Key changes across the 2.0.0b1–b3 window:
- Agents on Responses protocol: `AIProjectClient` now handles agent ops directly; the `azure-ai-agents` dependency is dropped.
- `get_openai_client()` now returns an `openai.OpenAI` client pre-configured for your Foundry project endpoint (Responses API).
- Class renames: `AgentObject` → `AgentDetails`, `MemoryStoreObject` → `MemoryStoreDetails`, `AgentVersionObject` → `AgentVersionDetails`.
- Tracing overhaul: span names, attribute keys, and operation names changed to align with OpenTelemetry `gen_ai.*` conventions (e.g. `gen_ai.provider.name` is now `"microsoft.foundry"`).
- New operations: `.memory_stores`, `.evaluation_rules`, `.evaluators`, `.insights`, `.schedules` on `AIProjectClient`.
Action: Upgrade to `azure-ai-projects==2.0.0b3`. Remove any standalone `azure-ai-agents` pins — agent creation and runs are now first-class methods on `AIProjectClient`.
azure-ai-evaluation 1.14.0 (2026-01-05)
Evaluation still ships as a standalone package while consolidation completes — expect this to merge into `azure-ai-projects` in a future beta. 1.14.0 is primarily a bug-fix release: corrected binary scoring for the `CodeVulnerability` and `UngroundedAttributes` evaluators in the RedTeam scanner, and fixed `GroundednessEvaluator` not honoring `is_reasoning_model` when the `query` parameter was supplied.
.NET
Azure.AI.Agents.Persistent 1.2.0-beta.8 (2025-12-01)
Added first-class Computer Use support for agents, letting you wire up computer-use-preview model runs directly from the persistent agents client. PersistentAgentsChatClient got improved error handling for incomplete-state streaming runs.
Breaking: none in this release.
Action: Pin Azure.AI.Agents.Persistent to 1.2.0-beta.8 to get Computer Use.
Azure.AI.Projects 1.2.0-beta.5 (2025-12-12)
Updated for transitive compatibility with OpenAI 2.8.0, including substantial changes to the `[Experimental]` Responses API surface. Also fixes file uploading for fine-tuning jobs. The 1.2.0-beta.1 entry (November) is also worth noting if you haven’t upgraded — it introduced the full Microsoft Foundry Agents Service feature set: memory, evaluations, red teaming, schedules, and insights on `AIProjectClient`.
Breaking: Responses API surface changed with OpenAI 2.8.0 compatibility update — review your [Experimental] Responses code paths.
Action: Upgrade to Azure.AI.Projects 1.2.0-beta.5.
JavaScript / TypeScript
@azure/ai-projects 2.0.0-beta.2 → 2.0.0-beta.4 (Dec 2025 – Jan 2026)
Three betas landed in quick succession — the highlights:
- 2.0.0-beta.2 (2025-12-02): Re-added the `project.telemetry` route to restore access to the Application Insights connection string (removed in beta.1).
- 2.0.0-beta.3 (2026-01-09): Fixed a response JSON schema deserializer bug.
- 2.0.0-beta.4 (2026-01-29): Major class renames to align with OpenAI naming conventions — GA tools now use a `Tool` suffix; preview tools use `PreviewTool`. Key renames: `AzureAISearchAgentTool` → `AzureAISearchTool`, `BrowserAutomationAgentTool` → `BrowserAutomationPreviewTool`, `A2ATool` → `A2APreviewTool`, `SharepointAgentTool` → `SharepointPreviewTool`, `MicrosoftFabricAgentTool` → `MicrosoftFabricPreviewTool`.
Breaking: The 2.0.0-beta.4 class renames. If you reference any `*AgentTool` class, update to the new suffixed name.
Action: Upgrade to `@azure/ai-projects@2.0.0-beta.4` and search your codebase for the renamed classes. The same rename convention is coming to the Python 2.0.0b4 release.
Stay Connected
Plenty more is in flight — the February edition will land on a much shorter timeline. In the meantime, explore any of these models directly in the Microsoft Foundry model catalog or join the developer community to share what you’re building.