April 15th, 2026

Choosing the Right Azure Hosting Model for AI Agents: A Deep Dive into Foundry Hosted Agents

AI agents are quickly moving from experiments to production‑critical components of modern applications. But while many teams know how to build agents, far fewer are confident they’re hosting them on the right foundation.

Most organizations start by deploying agents the same way they deploy microservices—containers, functions, or app services. That approach works initially. But as agents evolve to support long‑running conversations, tool orchestration, stateful workflows, and continuous iteration, infrastructure decisions start to matter in new ways.

Azure offers multiple ways to host AI agents, each with different tradeoffs across control, operational overhead, scalability, and agent‑native capabilities. Choosing the right option isn’t about finding a single “best” service—it’s about matching your agent’s requirements to the right hosting model.

In this post, we’ll walk through the Azure agent hosting landscape—from serverless and container‑based options to fully managed agent platforms—and then take a deeper look at Microsoft Foundry Hosted Agents, a purpose‑built hosting model designed specifically for agentic workloads. Along the way, we’ll highlight when each option makes sense and how to choose the one that best aligns with your architecture, team, and scale.

The Azure Agent Hosting Landscape

When deploying AI agents to production, you have several Azure options to consider: Azure Container Apps, Azure Kubernetes Service (AKS), Azure App Service, Azure Functions, Microsoft Foundry Agents, and Microsoft Foundry Hosted Agents. Let’s examine each option before diving deep into Hosted Agents.

| Hosting Option | Control | Complexity | Best For | Strengths | Considerations |
| --- | --- | --- | --- | --- | --- |
| Azure Container Apps | High | Medium | General containerized workloads, custom orchestration, full control over your container runtime while avoiding Kubernetes complexity | Container control, KEDA autoscaling, VNet/Dapr | Manual observability, self-managed conversation state |
| Azure Kubernetes Service | Very High | High | Enterprise-scale, strict compliance, multi-cluster deployment, or complex networking | Max flexibility, multi-cluster, GitOps | High operational overhead, K8s expertise required |
| Azure App Service | Medium | Low | Simple web-based agents, PaaS deployment model with no container orchestration | Managed PaaS, CI/CD slots, Entra Auth | Instance-based scaling, no agent abstractions |
| Azure Functions | Low | Low | Event-driven agents, serverless triggers | Pay-per-execution, rich built-in triggers | Execution time limits, cold starts |
| Microsoft Foundry Agents | Low | Very Low | Prompt-based agents, built-in tools, managed, code-optional agent configured through the portal or SDK (no containers, no infrastructure) | Zero infra, built-in tools, portal-driven | No custom framework, limited to model + tools |
| Microsoft Foundry Hosted Agents | Medium | Very Low | Custom frameworks, agent-native deployment, simplicity of managed infrastructure with the flexibility of custom agent code | BYO framework, built-in OTel, scale-to-zero | Preview SLAs, no private networking yet |

 

Deep Dive: Microsoft Foundry Hosted Agents

For teams that want the simplicity of managed infrastructure with the flexibility of custom agent code, Hosted Agents in Microsoft Foundry represent the sweet spot. This is where Azure meets agent-native deployment.

What Are Hosted Agents?

Hosted Agents are containerized agentic AI applications that run on Foundry Agent Service—Microsoft’s managed platform for AI agents. Unlike traditional container hosting, Hosted Agents provide:

  • Agent-native abstractions: Conversations, responses, and tool calls are first-class concepts
  • Managed lifecycle: Create, start, update, stop, and delete with simple API calls
  • Built-in observability: OpenTelemetry traces, metrics, and logs out of the box
  • Framework support: Bring LangGraph, Microsoft Agent Framework, or custom code

Hosted Agent Architecture

The Hosting Adapter: Bridging Frameworks to Foundry

The secret sauce of Hosted Agents is the Hosting Adapter—a framework abstraction layer that exposes your agent as an HTTP service with built-in Foundry integration.

For LangGraph agents, you simply wrap your graph with the adapter:

from langgraph.graph import MessagesState, StateGraph
from azure.ai.agentserver.langgraph import from_langgraph

# Your LangGraph agent (call_model and tool_node are your own node functions, defined elsewhere)
graph = StateGraph(MessagesState)
graph.add_node("agent", call_model)
graph.add_node("tools", tool_node)
# ... build your graph

app = graph.compile()

# Wrap with the hosting adapter - that's it!
if __name__ == "__main__":
    from_langgraph(app).run()

For Microsoft Agent Framework:

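# The two agent_framework import paths below are an assumption about the
# agent-framework package layout; adjust them to your installed version
from agent_framework import ChatAgent
from agent_framework.azure import AzureAIAgentClient
from azure.ai.agentserver.agentframework import from_agent_framework

# get_local_time is your own tool function, defined elsewhere
agent = ChatAgent(
    chat_client=AzureAIAgentClient(...),
    instructions="You are a helpful assistant.",
    tools=[get_local_time],
)

if __name__ == "__main__":
    from_agent_framework(agent).run()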

 

The hosting adapter automatically provides:

| Capability | What It Handles |
| --- | --- |
| Protocol Translation | Converts Foundry Responses API ↔ your framework’s format |
| Conversation Management | Message serialization, history management |
| Streaming | Server-sent events for real-time responses |
| Observability | TracerProvider, MeterProvider, LoggerProvider via OpenTelemetry |
| Local Testing | Runs on localhost:8088 for local development |
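
To make the Local Testing row concrete, the sketch below builds (but does not send) an HTTP request for a locally running adapter. Port 8088 comes from the table; the `/responses` path and the `{"input": ...}` body shape are assumptions based on the Responses protocol the adapter speaks, and `build_local_request` is a hypothetical helper name, so check both against the adapter version you have installed.

```python
import json
import urllib.request

LOCAL_ADAPTER_URL = "http://localhost:8088"  # port taken from the table above


def build_local_request(prompt: str, base_url: str = LOCAL_ADAPTER_URL) -> urllib.request.Request:
    """Build a POST request for a locally running hosting adapter.

    Assumption: the adapter serves the Responses protocol at /responses
    and accepts a JSON body with an "input" field.
    """
    body = json.dumps({"input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/responses",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the agent running locally (`python main.py`), passing the returned object to `urllib.request.urlopen` sends the prompt to the adapter.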

Building and Deploying a Hosted Agent

Let’s walk through deploying a LangGraph calculator agent to Microsoft Foundry.

Step 1: Define Your Agent Code

Create main.py with your LangGraph agent:

from langchain_core.tools import tool
from langgraph.graph import MessagesState, StateGraph, START, END
from azure.ai.agentserver.langgraph import from_langgraph

@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

@tool
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

# Build the graph
tools = [multiply, add]
# ... graph construction: build a StateGraph(MessagesState) named `graph` that uses these tools ...

app = graph.compile()

if __name__ == "__main__":
    from_langgraph(app).run()

Step 2: Create the Agent Manifest

Define agent.yaml to describe your agent:

name: CalculatorAgent
description: A LangGraph agent that performs arithmetic calculations.
metadata:
  tags:
    - calculator
    - math
template:
  name: CalculatorAgentLG
  kind: hosted                    # This makes it a Hosted Agent
  protocols:
    - protocol: responses
      version: v1
  environment_variables:
    - name: AZURE_OPENAI_ENDPOINT
      value: ${AZURE_OPENAI_ENDPOINT}
    - name: AZURE_AI_MODEL_DEPLOYMENT_NAME
      value: "{{chat}}"           # Resolved at runtime
resources:
  - kind: model
    id: gpt-4o
    name: chat

Step 3: Containerize with Docker

FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# The hosting adapter listens on port 8088
EXPOSE 8088

CMD ["python", "main.py"]

Step 4: Deploy with Azure Developer CLI

The azd ai agent extension streamlines deployment:

# Install the extension
azd ext install azure.ai.agents

# Initialize your project
azd ai agent init

# Build, push, and deploy in one command
azd up

 

This single command:

  1. Builds your container image
  2. Pushes to Azure Container Registry
  3. Creates the Foundry project (if needed)
  4. Deploys model endpoints
  5. Creates and starts your Hosted Agent

Managing Hosted Agent Lifecycle

Once deployed, manage your agent with the Azure CLI:

# Start an agent (with scale-to-zero support)
az cognitiveservices agent start \
    --account-name myFoundry \
    --project-name myProject \
    --name CalculatorAgent \
    --agent-version 1 \
    --min-replicas 0 \
    --max-replicas 3

# Update replicas without creating a new version
az cognitiveservices agent update \
    --account-name myFoundry \
    --project-name myProject \
    --name CalculatorAgent \
    --agent-version 1 \
    --min-replicas 1 \
    --max-replicas 5

# Stop the agent
az cognitiveservices agent stop \
    --account-name myFoundry \
    --project-name myProject \
    --name CalculatorAgent \
    --agent-version 1
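
If you script these lifecycle calls from Python, it can help to assemble the argument list in one place. The sketch below uses only the flags shown in the commands above; `build_agent_command` is a hypothetical helper, and it does not run anything by itself.

```python
from typing import List, Optional


def build_agent_command(action: str, account: str, project: str, name: str, version: int,
                        min_replicas: Optional[int] = None,
                        max_replicas: Optional[int] = None) -> List[str]:
    """Build the argv for `az cognitiveservices agent <action>`."""
    if action not in {"start", "update", "stop"}:
        raise ValueError(f"unsupported action: {action}")
    command = [
        "az", "cognitiveservices", "agent", action,
        "--account-name", account,
        "--project-name", project,
        "--name", name,
        "--agent-version", str(version),
    ]
    # Replica flags are optional; only add the ones the caller supplied.
    if min_replicas is not None:
        command += ["--min-replicas", str(min_replicas)]
    if max_replicas is not None:
        command += ["--max-replicas", str(max_replicas)]
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)` on a machine where the Azure CLI is installed and logged in.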

Invoking Your Hosted Agent

Use the Azure AI Projects SDK to invoke your agent:

from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

client = AIProjectClient(
    endpoint="https://your-project.services.ai.azure.com/api/projects/your-project",
    credential=DefaultAzureCredential(),
    allow_preview=True,
)

# Get the OpenAI-compatible client
openai_client = client.get_openai_client()

# Invoke the agent
response = openai_client.responses.create(
    input=[{"role": "user", "content": "What is 25 * 17 + 42?"}],
    extra_body={
        "agent_reference": {
            "name": "CalculatorAgent",
            "type": "agent_reference"
        }
    }
)

print(response.output_text)
# Example output (exact wording varies by model): 25 * 17 = 425, then 425 + 42 = 467

 

Built-in Observability

Hosted Agents automatically export telemetry to Application Insights (or any OpenTelemetry collector):

# Traces are exported automatically - no code changes needed!
# View in Azure Portal → Application Insights → Transaction Search

You can also stream container logs for debugging:

curl -N "https://{endpoint}/api/projects/{project}/agents/{agent}/versions/1/containers/default:logstream?kind=console&tail=100" \
  -H "Authorization: Bearer $(az account get-access-token --resource https://ai.azure.com --query accessToken -o tsv)"
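
The same log stream URL can be assembled in Python. This sketch only builds the URL from the pattern in the curl command above; `build_logstream_url` is a hypothetical helper, and the endpoint host, project, and agent values are placeholders you supply.

```python
from urllib.parse import quote, urlencode


def build_logstream_url(endpoint: str, project: str, agent: str, version: int = 1,
                        kind: str = "console", tail: int = 100) -> str:
    """Assemble the container log stream URL using the path shape from the curl example."""
    path = (
        f"/api/projects/{quote(project, safe='')}"
        f"/agents/{quote(agent, safe='')}"
        f"/versions/{version}/containers/default:logstream"
    )
    query = urlencode({"kind": kind, "tail": tail})
    return f"https://{endpoint}{path}?{query}"
```

A streaming HTTP client can then request this URL with the same bearer token the curl command obtains from `az account get-access-token`.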

Conversation Management

Unlike raw container deployments where you manage state yourself, Hosted Agents integrate with Foundry’s conversation system:

# Create a persistent conversation
conversation = openai_client.conversations.create()

# First turn
response1 = openai_client.responses.create(
    conversation=conversation.id,
    extra_body={"agent_reference": {"name": "CalculatorAgent", "type": "agent_reference"}},
    input="Remember: my favorite number is 42.",
)

# Later turn - agent has context
response2 = openai_client.responses.create(
    conversation=conversation.id,
    extra_body={"agent_reference": {"name": "CalculatorAgent", "type": "agent_reference"}},
    input="Multiply my favorite number by 10.",
)
# Agent knows 42 from the previous turn

 

Resource Scaling Options

Hosted Agents support flexible resource allocation:

| CPU | Memory |
| --- | --- |
| 0.25 | 0.5 Gi |
| 0.5 | 1.0 Gi |
| 1.0 | 2.0 Gi |
| 2.0 | 4.0 Gi |
| 4.0 | 8.0 Gi |
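
Every row in the table pairs each CPU unit with 2 Gi of memory. Here is a minimal sketch that checks a requested size against the listed rows, assuming these five combinations are the only ones accepted (the table does not say whether other ratios are allowed). `ALLOWED_SIZES` and `is_supported_size` are hypothetical names.

```python
# CPU -> memory in Gi, copied from the table above
ALLOWED_SIZES = {0.25: 0.5, 0.5: 1.0, 1.0: 2.0, 2.0: 4.0, 4.0: 8.0}


def is_supported_size(cpu: float, memory_gi: float) -> bool:
    """Return True only when the (cpu, memory) pair matches a row in the table."""
    return ALLOWED_SIZES.get(cpu) == memory_gi
```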

And horizontal scaling with replica configuration:

| Setting | Description |
| --- | --- |
| min-replicas: 0 | Scale to zero when idle (cost savings, cold start on first request) |
| min-replicas: 1 | Always warm (no cold starts, steady cost) |
| max-replicas: 5 | Maximum horizontal scale (preview limit) |
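
These settings imply a few constraints: the minimum cannot be negative, the maximum cannot exceed the preview limit of 5, and the minimum cannot exceed the maximum. Below is a sketch of a pre-flight check built on those rules. `validate_replicas` is a hypothetical helper, the "at least 1" rule for the maximum is an assumption, and the service performs its own validation whose exact rules may differ.

```python
from typing import List

PREVIEW_MAX_REPLICAS = 5  # preview limit from the table above


def validate_replicas(min_replicas: int, max_replicas: int) -> List[str]:
    """Return a list of problems; an empty list means the settings look valid."""
    problems = []
    if min_replicas < 0:
        problems.append("min-replicas must be 0 or greater")
    if max_replicas < 1:  # assumption: at least one replica must be allowed
        problems.append("max-replicas must be at least 1")
    if max_replicas > PREVIEW_MAX_REPLICAS:
        problems.append(f"max-replicas exceeds the preview limit of {PREVIEW_MAX_REPLICAS}")
    if min_replicas > max_replicas:
        problems.append("min-replicas cannot exceed max-replicas")
    return problems
```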

Publishing to Channels

Once your agent is production-ready, publish it to multiple channels:

  • Web Application Preview: Shareable demo interface
  • Microsoft 365 Copilot & Teams: Appear in the agent store
  • Stable API Endpoint: Consistent REST API for custom apps

Publishing creates a dedicated agent identity, separate from the project managed identity.

Important: After publishing, reassign RBAC permissions to the new agent identity, because role assignments granted to the project managed identity don’t carry over.

Decision Framework: Choosing Your Hosting Option

Use this flowchart to select the right option:

[Figure: decision tree for choosing an Azure agent hosting option]
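
As a text companion to the flowchart, the sketch below encodes one plausible reading of the comparison table earlier in this post. The question order, the `Requirements` fields, and the `choose_hosting` function are assumptions for illustration, not the logic of the original diagram.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    needs_kubernetes_control: bool = False   # multi-cluster, GitOps, strict compliance
    event_driven_short_tasks: bool = False   # trigger-based, short executions
    needs_custom_framework: bool = False     # e.g. LangGraph or Agent Framework code
    needs_private_networking: bool = False   # not yet available on Hosted Agents per the table
    simple_web_app: bool = False             # plain PaaS web deployment


def choose_hosting(req: Requirements) -> str:
    """Map requirements to a hosting option, checking the strictest needs first."""
    if req.needs_kubernetes_control:
        return "Azure Kubernetes Service"
    if req.event_driven_short_tasks:
        return "Azure Functions"
    if req.needs_custom_framework:
        if req.needs_private_networking:
            return "Azure Container Apps"
        return "Microsoft Foundry Hosted Agents"
    if req.simple_web_app:
        return "Azure App Service"
    # Prompt-based agent with built-in tools and no custom code
    return "Microsoft Foundry Agents"
```

For example, `choose_hosting(Requirements(needs_custom_framework=True))` returns `"Microsoft Foundry Hosted Agents"`, while adding `needs_private_networking=True` falls back to `"Azure Container Apps"`.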

What’s Next

AI agents introduce new architectural considerations that traditional application hosting models don’t fully address. Conversations span multiple turns, tools execute on behalf of users, state must persist reliably, and observability needs to extend beyond simple request/response metrics.

Azure gives you flexibility across this spectrum. You can run agents on Functions for short, event‑driven tasks, on App Service or Container Apps for familiar PaaS and serverless container workflows, or on AKS when maximum control and enterprise customization are required. Each option has its place—and understanding those tradeoffs is key to building reliable agent‑powered systems.

Microsoft Foundry Hosted Agents represent a shift toward an agent‑centric hosting model: one that abstracts infrastructure management while preserving the freedom to bring your own frameworks, code, and tools. By combining managed lifecycle, built‑in observability, conversation management, and seamless integration with Azure AI services, Hosted Agents let teams focus more on agent behavior and less on platform plumbing.

As agentic applications become more central to intelligent systems, hosting choices will increasingly influence velocity, reliability, and long‑term maintainability. Whether you’re deploying your first agent or scaling a fleet of them across environments, choosing the right Azure hosting model early will pay dividends as your solution grows.

Ready to start building? Check out the Foundry Samples repository for complete working examples, deploy a Hosted Agent, and experience what agent‑native hosting on Azure looks like in practice.

 


 

Author

Principal Partner Solution Architect
