TL;DR
Long-context GPT-4.1, GPT-image-1, new o-series reasoning and GPT-4o audio models headline this month’s releases. On the agent side we get cross-cloud A2A, BYO thread storage, an MCP server starter, and a turnkey AI Red Team. Developers also gain a VS Code extension, richer evaluation metrics, persistent memory via Mem0, a full RAG demo suite, and new Content Understanding & Document Intelligence endpoints—everything you need to build, test, and ship safer GenAI apps on a single platform.
Join the new Azure AI Foundry Developer Forum on GitHub
We launched the new GitHub Discussions Developer Forum last week and we’re inviting you to connect with engineers and peers to ask questions, showcase your projects, vote in polls, and shape the roadmap—all in one place. Bring your ideas, code, and curiosity!
Models
GPT-4.1 One-Million-Token Context
GPT-4.1 (and its nano/mini variants) lifts Azure’s context ceiling to 1 million tokens, letting you pass entire codebases or multi-gigabyte corpora in one shot, while retaining GPT-4-class reasoning and function calling. That means fewer chunk-and-stitch hacks, simpler prompts, and major latency savings for large-document RAG or full-repo code reviews. Learn how to call it with the Responses API:
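A minimal sketch of that pattern using the OpenAI Python SDK against an Azure endpoint: `build_repo_prompt` is a hypothetical helper that packs a whole repository into a single prompt (practical only with a context window this large), and the `gpt-4.1` deployment name and API version in the commented call are assumptions for your resource.

```python
from pathlib import Path

def build_repo_prompt(root: str, question: str, exts=(".py", ".md")) -> str:
    """Concatenate an entire repo into one prompt -- feasible with a
    1M-token window (roughly 4 characters per token as a rule of thumb)."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in exts:
            parts.append(f"### FILE: {path}\n{path.read_text(errors='ignore')}")
    return "\n\n".join(parts) + f"\n\nQuestion: {question}"

# Hypothetical Responses API call (reads AZURE_OPENAI_* env vars):
# from openai import AzureOpenAI
# client = AzureOpenAI(api_version="2025-03-01-preview")
# resp = client.responses.create(
#     model="gpt-4.1",  # your deployment name
#     input=build_repo_prompt("my-repo", "Find potential race conditions."),
# )
# print(resp.output_text)
```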
GPT-image-1 Text-&-Image Generation
GPT-image-1 arrives in limited preview with sharper fidelity, reliable text rendering, editing/in-painting, and image-as-input support, so you can build marketing creatives, design mocks, and visual KB answers directly in Foundry using the same REST patterns as DALL·E 3.
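A sketch of the request shape, mirroring the DALL·E 3 pattern; the `gpt-image-1` deployment name and the API version in the commented call are assumptions for your resource.

```python
def build_image_request(prompt: str, size: str = "1024x1024") -> dict:
    """Request body for the images/generations endpoint."""
    return {"model": "gpt-image-1", "prompt": prompt, "size": size, "n": 1}

# Hypothetical call, same REST pattern as DALL-E 3:
# import base64
# from openai import AzureOpenAI
# client = AzureOpenAI(api_version="2025-04-01-preview")
# result = client.images.generate(**build_image_request("isometric foundry floor"))
# with open("image.png", "wb") as f:
#     f.write(base64.b64decode(result.data[0].b64_json))
```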
o4-mini & o3 Reasoning Models
Need faster, cheaper reasoning? The new o-series pairs GPT-4-level logical depth with lower latency and aggressive pricing, making them ideal for agent planning, re-ranking, or embedded analytics where every millisecond (and penny) counts.
GPT-4o Audio (Transcribe & TTS)
`gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, and `gpt-4o-mini-tts` bring high-quality speech-to-text and controllable text-to-speech to Azure. Stream captions, build multilingual voice bots, or generate audio replies, all via the familiar `/audio` and `/realtime` endpoints.
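As a sketch of how the three models divide the work, the hypothetical helper below maps a task to a model name; the commented calls to the `/audio` endpoints follow the standard OpenAI SDK shape.

```python
def pick_audio_model(task: str, low_cost: bool = True) -> str:
    """Map a task to the new GPT-4o audio model family."""
    models = {
        "transcribe": "gpt-4o-mini-transcribe" if low_cost else "gpt-4o-transcribe",
        "tts": "gpt-4o-mini-tts",
    }
    return models[task]

# Hypothetical usage (client is an AzureOpenAI instance):
# with open("meeting.wav", "rb") as f:
#     text = client.audio.transcriptions.create(
#         model=pick_audio_model("transcribe"), file=f).text
# speech = client.audio.speech.create(
#     model=pick_audio_model("tts"), voice="alloy", input="Hello!")
```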
Agents
AI Red Teaming Agent (Preview)
Built atop Microsoft’s PyRIT toolkit, this agent fires automated jailbreak and prompt-injection probes at your models, scores Attack Success Rate, and logs findings into Foundry dashboards—making shift-left safety a one-command reality.
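The agent reports Attack Success Rate (ASR); as a standalone illustration of that metric (not PyRIT itself), here is the per-category computation: the fraction of probes that broke through.

```python
from dataclasses import dataclass

@dataclass
class ProbeResult:
    category: str    # e.g. "jailbreak" or "prompt-injection"
    succeeded: bool  # did the model produce disallowed output?

def attack_success_rate(results: list) -> dict:
    """Per-category Attack Success Rate: successful probes / total probes."""
    by_cat = {}
    for r in results:
        by_cat.setdefault(r.category, []).append(r.succeeded)
    return {cat: sum(hits) / len(hits) for cat, hits in by_cat.items()}
```

A falling ASR across builds is the signal that your mitigations (system prompts, content filters, guardrails) are holding.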
Semantic Kernel + A2A Interop
A new plug-in teaches Semantic Kernel to speak Google’s Agent-to-Agent JSON-RPC protocol, enabling secure cross-cloud agent collaboration—exchange context, not credentials, and orchestrate multi-modal workflows spanning Azure, GCP, and OSS runtimes.
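To make the wire format concrete, here is a sketch of a JSON-RPC 2.0 envelope in the shape A2A uses for task exchange; the `tasks/send` method name and message/parts layout follow the public A2A spec, but treat the exact fields as illustrative rather than authoritative.

```python
import json
import uuid

def a2a_task_send(text: str, task_id: str = "") -> str:
    """Serialize an A2A-style JSON-RPC 2.0 request carrying one user message."""
    payload = {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "tasks/send",
        "params": {
            "id": task_id or str(uuid.uuid4()),
            "message": {
                "role": "user",
                "parts": [{"type": "text", "text": text}],
            },
        },
    }
    return json.dumps(payload)
```

Because the envelope carries only context (the message parts), not credentials, each runtime keeps its own auth boundary.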
MCP Server Starter (TypeScript)
Spin up an MCP-compliant server in minutes; the template wires Azure AI Agents to Claude Desktop (or any MCP client) via standard JSON messages—no bespoke glue code required.
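The starter itself is TypeScript; purely to illustrate the "standard JSON messages" involved, here is a toy MCP-style dispatcher in Python handling the `tools/list` and `tools/call` methods (the method names match the MCP spec; everything else is simplified).

```python
import json

def handle_mcp_message(raw: str, tools: dict) -> str:
    """Dispatch one MCP-style JSON-RPC request against a dict of tool callables."""
    req = json.loads(raw)
    if req["method"] == "tools/list":
        result = {"tools": [{"name": name} for name in tools]}
    elif req["method"] == "tools/call":
        fn = tools[req["params"]["name"]]
        text = fn(**req["params"].get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}]}
    else:
        return json.dumps({"jsonrpc": "2.0", "id": req["id"],
                           "error": {"code": -32601, "message": "unknown method"}})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})
```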
BYO Thread Storage + Monitor
Agent Service now lets you store conversation threads in your own Cosmos DB and surfaces run metrics in Azure Monitor—boosting data residency compliance and giving SREs first-class observability out of the box.
Tools
VS Code Foundry Extension
Test models, deploy agents, and copy sample code without leaving VS Code—goodbye portal context-switching, hello faster inner loops.
Quality & Safety Evaluators
Four new quality metrics (intent-resolution, tool-call accuracy, task adherence, completeness) plus code-vulnerability and ungrounded-attribute safety checks plug straight into CI/CD so every build ships with score gates.
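A score gate of this kind can be as simple as a threshold check in the pipeline; the sketch below assumes you already have a dict of metric scores from an evaluation run (the `run_evaluation` call in the comment is hypothetical).

```python
def score_gate(scores: dict, thresholds: dict) -> list:
    """Return the metrics that fall below their minimum; fail the build if any do."""
    return [m for m, minimum in thresholds.items() if scores.get(m, 0.0) < minimum]

# Hypothetical CI step:
# failures = score_gate(run_evaluation(suite),
#                       {"task_adherence": 0.8, "tool_call_accuracy": 0.9})
# if failures:
#     raise SystemExit(f"Eval gate failed: {failures}")
```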
Fine-tuning support for GPT-4.1, GPT-4.1-mini, Phi-4 and Mistral models and more
The new GPT-4.1 and GPT-4.1-mini models now support fine-tuning, offering enhanced reasoning and instruction following that make them well suited to complex enterprise applications. Serverless fine-tuning has also expanded to models such as Mistral, Phi, and NTT in every U.S. region where base-model inferencing is available, improving latency and compliance with data-residency requirements. Finally, the Evaluation API now supports code-first grading: developers can score model outputs with built-in or custom graders, simplifying A/B testing, regression validation, and iterative model refinement.
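To give a flavor of code-first grading, here is a self-contained custom grader: exact match scores 1.0, otherwise partial credit by token overlap. The sample dict shape (`output`/`reference` keys) is an assumption for illustration, not the Evaluation API's exact schema.

```python
def exact_match_grader(sample: dict) -> float:
    """Score 1.0 on case-insensitive exact match, else the fraction of
    reference tokens that appear in the output."""
    out = sample["output"].strip().lower()
    ref = sample["reference"].strip().lower()
    if out == ref:
        return 1.0
    out_tokens, ref_tokens = set(out.split()), set(ref.split())
    return len(out_tokens & ref_tokens) / max(len(ref_tokens), 1)
```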
Mem0 Persistent Memory Layer
Mem0 + Azure AI Search lets assistants remember user details across sessions via semantic retrieval—boosting personalization without extra infra.
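To show the pattern without any infrastructure, here is a toy stand-in for a Mem0-style memory layer; Mem0 backed by Azure AI Search would use semantic (embedding-based) retrieval where this sketch uses naive keyword overlap.

```python
class SessionMemory:
    """Toy per-user memory: add facts, retrieve the most relevant by
    keyword overlap. A real Mem0 + Azure AI Search setup would rank by
    embedding similarity instead."""

    def __init__(self):
        self._facts = {}

    def add(self, user_id: str, fact: str) -> None:
        self._facts.setdefault(user_id, []).append(fact)

    def search(self, user_id: str, query: str, k: int = 3) -> list:
        q = set(query.lower().split())
        ranked = sorted(self._facts.get(user_id, []),
                        key=lambda f: len(q & set(f.lower().split())),
                        reverse=True)
        return ranked[:k]
```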
Content Understanding 2024-12-01 Preview
The new API adds generative & classification fields, faster video segmentation, and multi-analyzer docs, producing structured JSON ready for LLM ingestion across docs, audio, and video.
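As a sketch of consuming that structured JSON, the helper below flattens an analyzer result into prompt-ready lines; the `fields`/`value` shape here is an assumption for illustration, not the exact 2024-12-01 schema.

```python
def fields_to_context(analysis: dict) -> str:
    """Flatten extracted fields (name -> value) into lines an LLM prompt
    can ingest directly."""
    lines = [f"{name}: {field.get('value')}"
             for name, field in analysis.get("fields", {}).items()]
    return "\n".join(lines)
```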
Document Intelligence v4.0 Container
Run the Layout model on-prem or at the edge via new v4.0 containers—perfect for air-gapped PDF/OCR scenarios that need local processing yet Azure-compatible APIs.
Happy building—and let us know what you ship with #AzureAIFoundry!