{"id":5227,"date":"2026-04-08T00:07:55","date_gmt":"2026-04-08T07:07:55","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/agent-framework\/?p=5227"},"modified":"2026-04-14T16:29:58","modified_gmt":"2026-04-14T23:29:58","slug":"ag-ui-multi-agent-workflow-demo","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/agent-framework\/ag-ui-multi-agent-workflow-demo\/","title":{"rendered":"Building a Real-Time Multi-Agent UI with AG-UI and Microsoft Agent Framework Workflows"},"content":{"rendered":"<p class=\"wp-block-paragraph\">Multi-agent systems demo beautifully. Putting them in front of real users is another story.<\/p>\n<p class=\"wp-block-paragraph\">In early prototypes, a terminal or a basic chat window is enough. But once agents start handing off to each other, pausing for approvals, or asking follow-up questions, those interfaces fall apart. Which agent is active? Why is the system waiting? What&#8217;s it about to do on the user&#8217;s behalf? Without answers to those questions, a multi-agent workflow stops feeling like a product and starts feeling opaque.<\/p>\n<p class=\"wp-block-paragraph\">This post shows what a better answer looks like. We&#8217;ll build a customer support workflow that pairs Microsoft Agent Framework (MAF) handoffs with <a href=\"https:\/\/github.com\/ag-ui-protocol\/ag-ui\">AG-UI<\/a>, an open protocol for streaming agent execution events to a frontend over Server-Sent Events (SSE). The result is a real-time UI that shows users what&#8217;s happening, lets them respond when agents need input, and keeps them in control of sensitive actions like issuing refunds.<\/p>\n<hr \/>\n<p><strong>Note:<\/strong> This demo runs on MAF <strong>Python<\/strong> today. 
C# support for MAF + AG-UI is still in development.<\/p>\n<hr \/>\n<h2 class=\"wp-block-heading\">What You Will Build<\/h2>\n<p class=\"wp-block-paragraph\">The demo is a customer support workflow with three specialized agents:<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Triage Agent<\/strong> analyzes the customer&#8217;s request and routes to the right specialist.<\/li>\n<li><strong>Refund Agent<\/strong> looks up order details, gathers context, and submits refund requests.<\/li>\n<li><strong>Order Agent<\/strong> handles replacements and shipping preferences.<\/li>\n<\/ul>\n<div style=\"position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;\"><iframe style=\"position: absolute; top: 0; left: 0; width: 100%; height: 100%; border: 0;\" title=\"Multi-Agent Handoff Workflow\" src=\"https:\/\/www.youtube.com\/embed\/0Yy1qZqMFUM?vq=hd1080&amp;rel=0&amp;modestbranding=1&amp;loop=1&amp;playlist=0Yy1qZqMFUM\" allowfullscreen=\"allowfullscreen\">\n  <\/iframe><\/div>\n<h2 class=\"wp-block-heading\">Defining the Workflow with HandoffBuilder<\/h2>\n<p class=\"wp-block-paragraph\">The orchestration layer uses MAF&#8217;s <code>HandoffBuilder<\/code>, which lets you declare agents, their tools, and an explicit handoff topology. This is not a simple chain. 
Each agent can route to specific other agents based on descriptions you provide, and the framework enforces these routing constraints at the orchestration level.<\/p>\n<pre class=\"wp-block-code\"><code class=\"language-python\">from agent_framework import Agent, tool\r\nfrom agent_framework.orchestrations import HandoffBuilder\r\n\r\n\r\n@tool(approval_mode=\"always_require\")\r\ndef submit_refund(\r\n    refund_description: str,\r\n    amount: str,\r\n    order_id: str,\r\n) -&gt; str:\r\n    \"\"\"Capture a refund request for manual review before processing.\"\"\"\r\n    return f\"refund recorded for order {order_id} (amount: {amount})\"\r\n\r\n\r\n@tool(approval_mode=\"always_require\")\r\ndef submit_replacement(\r\n    order_id: str,\r\n    shipping_preference: str,\r\n    replacement_note: str,\r\n) -&gt; str:\r\n    \"\"\"Capture a replacement request for manual review before processing.\"\"\"\r\n    return (\r\n        f\"replacement recorded for order {order_id} \"\r\n        f\"(shipping: {shipping_preference})\"\r\n    )\r\n\r\n\r\n# `client` is the chat client, created elsewhere in the sample (not shown here).\r\ntriage = Agent(\r\n    id=\"triage_agent\",\r\n    name=\"triage_agent\",\r\n    instructions=\"...\",\r\n    client=client,\r\n    require_per_service_call_history_persistence=True,\r\n)\r\n\r\nrefund = Agent(\r\n    id=\"refund_agent\",\r\n    name=\"refund_agent\",\r\n    instructions=\"...\",\r\n    client=client,\r\n    tools=[submit_refund],\r\n    require_per_service_call_history_persistence=True,\r\n)\r\n\r\norder = Agent(\r\n    id=\"order_agent\",\r\n    name=\"order_agent\",\r\n    instructions=\"...\",\r\n    client=client,\r\n    tools=[submit_replacement],\r\n    require_per_service_call_history_persistence=True,\r\n)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">The <code>@tool(approval_mode=\"always_require\")<\/code> decorator is the key integration point for human-in-the-loop (HITL) approval. When an agent calls one of these tools, the workflow pauses and emits an interrupt event that the frontend can render as an approval prompt. 
The workflow does not resume until the operator approves or rejects the call.<\/p>\n<p class=\"wp-block-paragraph\">With the agents defined, the handoff topology is explicit:<\/p>\n<pre class=\"wp-block-code\"><code class=\"language-python\">builder = HandoffBuilder(\r\n    name=\"ag_ui_handoff_workflow_demo\",\r\n    participants=[triage, refund, order],\r\n    termination_condition=termination_condition,\r\n)\r\n\r\n(\r\n    builder\r\n    .add_handoff(\r\n        triage,\r\n        [refund],\r\n        description=(\r\n            \"Refunds, damaged-item claims, \"\r\n            \"refund status updates.\"\r\n        ),\r\n    )\r\n    .add_handoff(\r\n        triage,\r\n        [order],\r\n        description=(\r\n            \"Replacement, exchange, \"\r\n            \"shipping preference changes.\"\r\n        ),\r\n    )\r\n    .add_handoff(\r\n        refund,\r\n        [order],\r\n        description=\"Replacement logistics needed after refund.\",\r\n    )\r\n    .add_handoff(\r\n        refund,\r\n        [triage],\r\n        description=(\r\n            \"Final case closure when refund-only work is complete.\"\r\n        ),\r\n    )\r\n    .add_handoff(\r\n        order,\r\n        [triage],\r\n        description=\"After replacement\/shipping tasks complete.\",\r\n    )\r\n    .add_handoff(\r\n        order,\r\n        [refund],\r\n        description=(\r\n            \"User pivots from replacement to refund processing.\"\r\n        ),\r\n    )\r\n)\r\n\r\nworkflow = builder.with_start_agent(triage).build()<\/code><\/pre>\n<p class=\"wp-block-paragraph\">Each <code>add_handoff<\/code> call declares a directed edge in the routing graph with a natural-language description. 
The framework uses these descriptions to generate handoff tools for each agent, so routing decisions are grounded in the orchestration topology rather than relying solely on prompt instructions.<\/p>\n<h2 class=\"wp-block-heading\">Connecting to AG-UI<\/h2>\n<p class=\"wp-block-paragraph\">AG-UI is a protocol that streams agent execution events over SSE. MAF&#8217;s <code>agent_framework.ag_ui<\/code> package provides a bridge that wraps any MAF workflow into an AG-UI-compatible endpoint with a single function call.<\/p>\n<pre class=\"wp-block-code\"><code class=\"language-python\">from agent_framework.ag_ui import (\r\n    AgentFrameworkWorkflow,\r\n    add_agent_framework_fastapi_endpoint,\r\n)\r\nfrom fastapi import FastAPI\r\n\r\napp = FastAPI()\r\n\r\ndemo_workflow = AgentFrameworkWorkflow(\r\n    workflow_factory=lambda _thread_id: create_handoff_workflow(),\r\n    name=\"ag_ui_handoff_workflow_demo\",\r\n)\r\n\r\nadd_agent_framework_fastapi_endpoint(\r\n    app=app,\r\n    agent=demo_workflow,\r\n    path=\"\/handoff_demo\",\r\n)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">That is it. The <code>workflow_factory<\/code> creates a fresh workflow instance per thread, maintaining isolated state for each conversation. The endpoint handles all the SSE plumbing: streaming <code>RUN_STARTED<\/code>, <code>STEP_STARTED<\/code>, <code>TEXT_MESSAGE_*<\/code>, <code>TOOL_CALL_*<\/code>, and <code>RUN_FINISHED<\/code> events as the workflow executes.<\/p>\n<h2 class=\"wp-block-heading\">Two Types of Interrupts<\/h2>\n<p class=\"wp-block-paragraph\">The demo showcases two distinct interrupt patterns that a real-world agent application needs:<\/p>\n<p class=\"wp-block-paragraph\"><strong>Tool approval interrupts<\/strong> fire when an agent calls a tool marked with <code>approval_mode=\"always_require\"<\/code>. 
The workflow pauses, the frontend renders an approval modal showing the tool name and arguments, and the operator decides whether to approve or reject the call. This is essential for actions like processing refunds or submitting replacement orders where you want a human in the loop.<\/p>\n<p class=\"wp-block-paragraph\"><strong>Information request interrupts<\/strong> fire when an agent needs additional input from the user, such as an order ID or a shipping preference. The workflow pauses and the frontend presents the agent&#8217;s question in the chat. The user responds normally, and their answer is submitted back to resume the workflow from where it left off.<\/p>\n<p class=\"wp-block-paragraph\">Under the hood, information request interrupts are powered by <code>HandoffAgentUserRequest<\/code>. When an agent completes its turn without requesting a handoff to another agent, the workflow issues a <code>HandoffAgentUserRequest<\/code> containing the agent&#8217;s response. This pauses execution and emits an interrupt event to the frontend. 
When the user replies, a response handler broadcasts the user&#8217;s message to all agents, appends it to the conversation cache, and resumes the active agent:<\/p>\n<pre class=\"wp-block-code\"><code class=\"language-python\">from agent_framework.orchestrations import HandoffAgentUserRequest\r\n\r\n\r\n# Inside the handoff executor, when no handoff is requested:\r\n# The workflow pauses and waits for user input.\r\nawait ctx.request_info(\r\n    HandoffAgentUserRequest(agent_response),\r\n    list[Message],\r\n)\r\n\r\n\r\n# When the user responds, the handler resumes the workflow:\r\n@response_handler\r\nasync def handle_response(\r\n    self,\r\n    original_request: HandoffAgentUserRequest,\r\n    response: list[Message],\r\n    ctx: WorkflowContext,\r\n) -&gt; None:\r\n    if not response:\r\n        # Empty response signals termination\r\n        await ctx.yield_output(self._full_conversation)\r\n        return\r\n\r\n    await self._broadcast_messages(response, ctx)\r\n    self._cache.extend(response)\r\n    await self._run_agent_and_emit(ctx)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">This means the conversation flow is fully controlled by the agents themselves. When the refund agent asks &#8220;What is your order ID?&#8221;, that question becomes a <code>HandoffAgentUserRequest<\/code> interrupt. The user&#8217;s answer flows back through the same resume mechanism, and the agent picks up exactly where it left off.<\/p>\n<p class=\"wp-block-paragraph\">Both interrupt types use the same underlying <code>resume.interrupts<\/code> mechanism in AG-UI. The frontend sends a resume payload with the interrupt ID and the response value, and the workflow picks up exactly where it paused.<\/p>\n<h2 class=\"wp-block-heading\">The Frontend Experience<\/h2>\n<p class=\"wp-block-paragraph\">The React frontend consumes the SSE stream and renders the workflow state in real time. 
Key UI elements include:<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Active agent indicator<\/strong> showing which specialist is currently handling the case<\/li>\n<li><strong>Case snapshot card<\/strong> that updates as the workflow gathers order details, amounts, and shipping preferences<\/li>\n<li><strong>Chat panel<\/strong> with streaming assistant messages and user input<\/li>\n<li><strong>Approval modal<\/strong> that surfaces tool call details for the operator to review before approving or rejecting<\/li>\n<\/ul>\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-content\/uploads\/sites\/78\/2026\/04\/ag-ui-handoff-1.webp\" alt=\"The HITL approval modal showing the submit_refund tool call with refund details and Approve\/Reject buttons\" \/><\/figure>\n<p class=\"wp-block-paragraph\">The frontend maintains a queue of pending interrupts so that multiple approval requests or information requests can be handled in sequence without losing state.<\/p>\n<h2 class=\"wp-block-heading\">Running the Demo<\/h2>\n<p class=\"wp-block-paragraph\">The backend is a FastAPI server and the frontend is a Vite + React application. 
To run:<\/p>\n<pre class=\"wp-block-code\"><code class=\"language-bash\"># Backend (from python\/ directory)\r\nuv sync\r\nuv run python samples\/05-end-to-end\/ag_ui_workflow_handoff\/backend\/server.py\r\n\r\n# Frontend (in a separate terminal)\r\ncd samples\/05-end-to-end\/ag_ui_workflow_handoff\/frontend\r\nnpm install\r\nnpm run dev<\/code><\/pre>\n<p class=\"wp-block-paragraph\">Open <code>http:\/\/127.0.0.1:5173<\/code> in a browser and try a prompt like &#8220;I need a refund for order 987654.&#8221; The triage agent routes to the refund specialist, which looks up order details, gathers a reason, and submits a refund request for your approval.<\/p>\n<h2 class=\"wp-block-heading\">What This Demonstrates<\/h2>\n<p class=\"wp-block-paragraph\">This sample validates several capabilities working together in a single application:<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>MAF workflows as AG-UI backends.<\/strong> Any workflow built with <code>HandoffBuilder<\/code> (or other MAF orchestration patterns) can be exposed as an AG-UI endpoint with minimal glue code.<\/li>\n<li><strong>Dynamic, non-linear routing.<\/strong> Agents hand off based on conversation context, not a fixed sequence. The routing graph is declared in code and enforced by the framework.<\/li>\n<li><strong>Human-in-the-loop at the tool level.<\/strong> Sensitive actions require explicit approval. 
The workflow pauses cleanly and resumes after the operator responds.<\/li>\n<li><strong>Thread-scoped state.<\/strong> Each conversation gets its own workflow instance, so multiple users or sessions can run concurrently without interference.<\/li>\n<li><strong>Real-time streaming UI.<\/strong> Every token, tool call, and state change is streamed to the frontend as it happens.<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\">Learn More<\/h2>\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/ag-ui-protocol\/ag-ui\">AG-UI Protocol specification<\/a><\/li>\n<li><a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-foundry\/agent-framework\/\">Microsoft Agent Framework documentation<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/microsoft\/agent-framework\/tree\/main\/python\/samples\/05-end-to-end\/ag_ui_workflow_handoff\">Full sample code on GitHub<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/microsoft\/agent-framework\/discussions\">Microsoft Agent Framework discussion board<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Multi-agent systems demo beautifully. Putting them in front of real users is another story. In early prototypes, a terminal or a basic chat window is enough. But once agents start handing off to each other, pausing for approvals, or asking follow-up questions, those interfaces fall apart. Which agent is active? Why is the system waiting? [&hellip;]<\/p>\n","protected":false},"author":150043,"featured_media":5230,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[148,143,34],"tags":[149,150,151,53],"class_list":["post-5227","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ag-ui","category-agent-framework","category-python-2","tag-ag-ui","tag-agent-framework","tag-demo","tag-python"],"acf":[],"blog_post_summary":"<p>Multi-agent systems demo beautifully. 
Putting them in front of real users is another story. In early prototypes, a terminal or a basic chat window is enough. But once agents start handing off to each other, pausing for approvals, or asking follow-up questions, those interfaces fall apart. Which agent is active? Why is the system waiting? [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/5227","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/users\/150043"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/comments?post=5227"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/5227\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media\/5230"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media?parent=5227"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/categories?post=5227"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/tags?post=5227"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}