TL;DR
Long-context GPT-4.1, GPT-image-1, new o-series reasoning and GPT-4o audio models headline this month’s releases. On the agent side we get cross-cloud A2A, BYO thread storage, an MCP server starter, and a turnkey AI Red Team. Developers also gain a VS Code extension, richer evaluation metrics, persistent memory via Mem0, a full RAG demo suite, and new Content Understanding & Document Intelligence endpoints—everything you need to build, test, and ship safer GenAI apps on a single platform.
Join the new Azure AI Foundry Developer Forum on GitHub
We launched the new GitHub Discussions Developer Forum last week and we’re inviting you to connect with engineers and peers to ask questions, showcase your projects, vote in polls, and shape the roadmap—all in one place. Bring your ideas, code, and curiosity!
Models
GPT-4.1 One-Million-Token Context
GPT-4.1 (and its nano/mini variants) lifts Azure’s context ceiling to 1 million tokens, letting you pass entire codebases or multi-gigabyte corpora in one shot, while retaining GPT-4-class reasoning and function calling. That means fewer chunk-and-stitch hacks, simpler prompts, and major latency savings for large-document RAG or full-repo code reviews. Learn how to call it with the Responses API:
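A minimal sketch of that pattern using the OpenAI Python SDK against an Azure endpoint: `build_repo_prompt` is a hypothetical helper that packs a whole repository into a single prompt (practical only with a context window this large), and the `gpt-4.1` deployment name and API version in the commented call are assumptions for your resource.

```python
from pathlib import Path

def build_repo_prompt(root: str, question: str, exts=(".py", ".md")) -> str:
    """Concatenate an entire repo into one prompt -- feasible with a
    1M-token window (roughly 4 characters per token as a rule of thumb)."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in exts:
            parts.append(f"### FILE: {path}\n{path.read_text(errors='ignore')}")
    return "\n\n".join(parts) + f"\n\nQuestion: {question}"

# Hypothetical Responses API call (reads AZURE_OPENAI_* env vars):
# from openai import AzureOpenAI
# client = AzureOpenAI(api_version="2025-03-01-preview")
# resp = client.responses.create(
#     model="gpt-4.1",  # your deployment name
#     input=build_repo_prompt("my-repo", "Find potential race conditions."),
# )
# print(resp.output_text)
```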
GPT-image-1 Text-&-Image Generation
GPT-image-1 arrives in limited preview with sharper fidelity, reliable text rendering, editing/in-painting, and image-as-input support, so you can build marketing creatives, design mocks, and visual KB answers directly in Foundry using the same REST patterns as DALL·E 3.
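A sketch of the request shape, mirroring the DALL·E 3 pattern; the `gpt-image-1` deployment name and the API version in the commented call are assumptions for your resource.

```python
def build_image_request(prompt: str, size: str = "1024x1024") -> dict:
    """Request body for the images/generations endpoint."""
    return {"model": "gpt-image-1", "prompt": prompt, "size": size, "n": 1}

# Hypothetical call, same REST pattern as DALL-E 3:
# import base64
# from openai import AzureOpenAI
# client = AzureOpenAI(api_version="2025-04-01-preview")
# result = client.images.generate(**build_image_request("isometric foundry floor"))
# with open("image.png", "wb") as f:
#     f.write(base64.b64decode(result.data[0].b64_json))
```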
o4-mini & o3 Reasoning Models
Need faster, cheaper reasoning? The new o-series pairs GPT-4-level logical depth with lower latency and aggressive pricing, making them ideal for agent planning, re-ranking, or embedded analytics where every millisecond (and penny) counts.
GPT-4o Audio (Transcribe & TTS)
`gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, and `gpt-4o-mini-tts` bring high-quality speech-to-text and controllable text-to-speech to Azure. Stream captions, build multilingual voice bots, or generate audio replies, all via the familiar `/audio` and `/realtime` endpoints.
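As a sketch of how the three models divide the work, the hypothetical helper below maps a task to a model name; the commented calls to the `/audio` endpoints follow the standard OpenAI SDK shape.

```python
def pick_audio_model(task: str, low_cost: bool = True) -> str:
    """Map a task to the new GPT-4o audio model family."""
    models = {
        "transcribe": "gpt-4o-mini-transcribe" if low_cost else "gpt-4o-transcribe",
        "tts": "gpt-4o-mini-tts",
    }
    return models[task]

# Hypothetical usage (client is an AzureOpenAI instance):
# with open("meeting.wav", "rb") as f:
#     text = client.audio.transcriptions.create(
#         model=pick_audio_model("transcribe"), file=f).text
# speech = client.audio.speech.create(
#     model=pick_audio_model("tts"), voice="alloy", input="Hello!")
```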
Agents
AI Red Teaming Agent (Preview)
Built atop Microsoft’s PyRIT toolkit, this agent fires automated jailbreak and prompt-injection probes at your models, scores Attack Success Rate, and logs findings into Foundry dashboards—making shift-left safety a one-command reality.
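The agent reports Attack Success Rate (ASR); as a standalone illustration of that metric (not PyRIT itself), here is the per-category computation: the fraction of probes that broke through.

```python
from dataclasses import dataclass

@dataclass
class ProbeResult:
    category: str    # e.g. "jailbreak" or "prompt-injection"
    succeeded: bool  # did the model produce disallowed output?

def attack_success_rate(results: list) -> dict:
    """Per-category Attack Success Rate: successful probes / total probes."""
    by_cat = {}
    for r in results:
        by_cat.setdefault(r.category, []).append(r.succeeded)
    return {cat: sum(hits) / len(hits) for cat, hits in by_cat.items()}
```

A falling ASR across builds is the signal that your mitigations (system prompts, content filters, guardrails) are holding.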
Semantic Kernel + A2A Interop
A new plug-in teaches Semantic Kernel to speak Google’s Agent-to-Agent JSON-RPC protocol, enabling secure cross-cloud agent collaboration—exchange context, not credentials, and orchestrate multi-modal workflows spanning Azure, GCP, and OSS runtimes.
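To make the wire format concrete, here is a sketch of a JSON-RPC 2.0 envelope in the shape A2A uses for task exchange; the `tasks/send` method name and message/parts layout follow the public A2A spec, but treat the exact fields as illustrative rather than authoritative.

```python
import json
import uuid

def a2a_task_send(text: str, task_id: str = "") -> str:
    """Serialize an A2A-style JSON-RPC 2.0 request carrying one user message."""
    payload = {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "tasks/send",
        "params": {
            "id": task_id or str(uuid.uuid4()),
            "message": {
                "role": "user",
                "parts": [{"type": "text", "text": text}],
            },
        },
    }
    return json.dumps(payload)
```

Because the envelope carries only context (the message parts), not credentials, each runtime keeps its own auth boundary.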
MCP Server Starter (TypeScript)
Spin up an MCP-compliant server in minutes; the template wires Azure AI Agents to Claude Desktop (or any MCP client) via standard JSON messages—no bespoke glue code required.
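The starter itself is TypeScript; purely to illustrate the "standard JSON messages" involved, here is a toy MCP-style dispatcher in Python handling the `tools/list` and `tools/call` methods (the method names match the MCP spec; everything else is simplified).

```python
import json

def handle_mcp_message(raw: str, tools: dict) -> str:
    """Dispatch one MCP-style JSON-RPC request against a dict of tool callables."""
    req = json.loads(raw)
    if req["method"] == "tools/list":
        result = {"tools": [{"name": name} for name in tools]}
    elif req["method"] == "tools/call":
        fn = tools[req["params"]["name"]]
        text = fn(**req["params"].get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}]}
    else:
        return json.dumps({"jsonrpc": "2.0", "id": req["id"],
                           "error": {"code": -32601, "message": "unknown method"}})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})
```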
BYO Thread Storage + Monitor
Agent Service now lets you store conversation threads in your own Cosmos DB and surfaces run metrics in Azure Monitor—boosting data residency compliance and giving SREs first-class observability out of the box.
Tools
VS Code Foundry Extension
Test models, deploy agents, and copy sample code without leaving VS Code—goodbye portal context-switching, hello faster inner loops.
Quality & Safety Evaluators
Four new quality metrics (intent-resolution, tool-call accuracy, task adherence, completeness) plus code-vulnerability and ungrounded-attribute safety checks plug straight into CI/CD so every build ships with score gates.
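A score gate of this kind can be as simple as a threshold check in the pipeline; the sketch below assumes you already have a dict of metric scores from an evaluation run (the `run_evaluation` call in the comment is hypothetical).

```python
def score_gate(scores: dict, thresholds: dict) -> list:
    """Return the metrics that fall below their minimum; fail the build if any do."""
    return [m for m, minimum in thresholds.items() if scores.get(m, 0.0) < minimum]

# Hypothetical CI step:
# failures = score_gate(run_evaluation(suite),
#                       {"task_adherence": 0.8, "tool_call_accuracy": 0.9})
# if failures:
#     raise SystemExit(f"Eval gate failed: {failures}")
```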
Fine-tuning support for GPT-4.1, GPT-4.1-mini, Phi-4 and Mistral models and more
The new GPT-4.1 and GPT-4.1-mini models now support fine-tuning, offering enhanced reasoning and instruction following that make them well suited to complex enterprise applications. Serverless fine-tuning has also expanded to models such as Mistral, Phi, and NTT in every U.S. region where base-model inferencing is available, improving latency and compliance with data-residency requirements. Finally, the Evaluation API now supports code-first grading: developers can score model outputs with built-in or custom graders, simplifying A/B testing, regression validation, and iterative model refinement.
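To give a flavor of code-first grading, here is a self-contained custom grader: exact match scores 1.0, otherwise partial credit by token overlap. The sample dict shape (`output`/`reference` keys) is an assumption for illustration, not the Evaluation API's exact schema.

```python
def exact_match_grader(sample: dict) -> float:
    """Score 1.0 on case-insensitive exact match, else the fraction of
    reference tokens that appear in the output."""
    out = sample["output"].strip().lower()
    ref = sample["reference"].strip().lower()
    if out == ref:
        return 1.0
    out_tokens, ref_tokens = set(out.split()), set(ref.split())
    return len(out_tokens & ref_tokens) / max(len(ref_tokens), 1)
```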
Mem0 Persistent Memory Layer
Mem0 + Azure AI Search lets assistants remember user details across sessions via semantic retrieval—boosting personalization without extra infra.
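To show the pattern without any infrastructure, here is a toy stand-in for a Mem0-style memory layer; Mem0 backed by Azure AI Search would use semantic (embedding-based) retrieval where this sketch uses naive keyword overlap.

```python
class SessionMemory:
    """Toy per-user memory: add facts, retrieve the most relevant by
    keyword overlap. A real Mem0 + Azure AI Search setup would rank by
    embedding similarity instead."""

    def __init__(self):
        self._facts = {}

    def add(self, user_id: str, fact: str) -> None:
        self._facts.setdefault(user_id, []).append(fact)

    def search(self, user_id: str, query: str, k: int = 3) -> list:
        q = set(query.lower().split())
        ranked = sorted(self._facts.get(user_id, []),
                        key=lambda f: len(q & set(f.lower().split())),
                        reverse=True)
        return ranked[:k]
```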
Content Understanding 2024-12-01 Preview
The new API adds generative & classification fields, faster video segmentation, and multi-analyzer docs, producing structured JSON ready for LLM ingestion across docs, audio, and video.
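As a sketch of consuming that structured JSON, the helper below flattens an analyzer result into prompt-ready lines; the `fields`/`value` shape here is an assumption for illustration, not the exact 2024-12-01 schema.

```python
def fields_to_context(analysis: dict) -> str:
    """Flatten extracted fields (name -> value) into lines an LLM prompt
    can ingest directly."""
    lines = [f"{name}: {field.get('value')}"
             for name, field in analysis.get("fields", {}).items()]
    return "\n".join(lines)
```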
Document Intelligence v4.0 Container
Run the Layout model on-prem or at the edge via new v4.0 containers—perfect for air-gapped PDF/OCR scenarios that need local processing yet Azure-compatible APIs.
Happy building—and let us know what you ship with #AzureAIFoundry!