Overview
Today, we are excited to announce the Browser Automation Tool (Preview) as the newest action tool in Azure AI Foundry Agent Service, available through the API and SDK. This tool enables developers to build agents that perform real-world browser tasks (such as searching, navigating, filling forms, and booking appointments) through natural language prompts. Powered by Playwright Workspaces, the Browser Automation Tool brings isolated, cloud-hosted browser automation to your AI agents, supporting multi-turn interactions that mimic a real user’s browsing experience.
Key Benefits
- Natural Language to Automation: Allow users to accomplish browser-based workflows simply by describing their goals in plain language.
- Realistic Web Interactions: Automate complex web UIs, including form fills, filters, reservations, and multi-step processes, just as a human user would.
- Isolated Execution: Each session runs in a sandboxed browser hosted within your own Azure subscription using Playwright Workspaces—no need to manage VMs or browsers manually.
- Multi-turn Conversations: Supports iterative, conversational automation. Users can refine or correct their request in real time.
- Modern, Reliable Automation: By parsing the page structure (DOM/accessibility tree), the tool lets the agent reason about web elements by their roles and labels, not just pixels.
- Flexible Use Cases: Automate bookings, product research, form submissions, customer support tasks, and more.
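To make the "roles and labels" idea concrete, here is a minimal sketch that uses only the Python standard library. It is not how the Browser Automation Tool is implemented; the `RoleLabelExtractor` class, the `IMPLICIT_ROLES` mapping, and the sample HTML are hypothetical and heavily simplified. It only illustrates what a role-and-label view of a page looks like compared with raw pixels.

```python
from html.parser import HTMLParser

# Simplified, illustrative mapping from tag names to ARIA-style roles (an assumption,
# not the full accessibility specification).
IMPLICIT_ROLES = {
    "button": "button",
    "a": "link",
    "input": "textbox",
    "select": "combobox",
    "textarea": "textbox",
}


class RoleLabelExtractor(HTMLParser):
    """Collects (role, label) pairs for interactive elements in an HTML snippet."""

    def __init__(self):
        super().__init__()
        self.elements = []   # finished (role, label) pairs
        self._open = None    # (tag, role, text_parts) while reading an element's text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        role = attrs.get("role") or IMPLICIT_ROLES.get(tag)
        if role is None:
            return
        label = attrs.get("aria-label") or attrs.get("placeholder")
        if label is not None or tag == "input":
            # Label comes from attributes (inputs have no inner text)
            self.elements.append((role, label or ""))
        else:
            # Label will be the element's visible text
            self._open = (tag, role, [])

    def handle_data(self, data):
        if self._open is not None:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open[0]:
            _, role, parts = self._open
            self.elements.append((role, " ".join("".join(parts).split())))
            self._open = None


def describe_page(html):
    """Return the interactive elements of an HTML snippet as (role, label) pairs."""
    parser = RoleLabelExtractor()
    parser.feed(html)
    parser.close()
    return parser.elements


SAMPLE = """
<form>
  <input type="text" aria-label="Search classes">
  <select aria-label="Day"><option>Monday</option></select>
  <button>Book now</button>
  <a href="/schedule">Full schedule</a>
</form>
"""

for role, label in describe_page(SAMPLE):
    print(f"{role}: {label}")
```

A view like this stays stable when the page layout or styling changes, which is one reason structure-based automation tends to be less brittle than coordinate-based clicking.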
Example Use Cases
- Booking & Reservations: Automate class sign-ups, table bookings, or appointment scheduling by navigating complex forms.
- Product Discovery: Search and summarize ecommerce listings or reviews based on user criteria.
- Web Form Interactions: Submit documents or update profile information automatically.
- Customer Support Tasks: Retrieve ticket updates, check account status, or navigate to specific customer information across web apps.
How It Works
- User Query: The user sends a natural language request to an agent (e.g., “Show me all available yoga classes this week from url xxxx”).
- Session Provisioning: When an agent receives a request to perform browser automation, Azure AI Foundry Agent Service connects to your Playwright Workspaces (which you have already provisioned in your Azure subscription). The Playwright Workspaces service then launches an isolated, sandboxed browser session to execute the requested actions. All browser automation runs within your Azure boundary, managed by Playwright Workspaces.
- Agent Reasoning: The model analyzes the web page by parsing its DOM structure, not just images, and determines the actions needed (such as clicks, form fills, navigation).
- Action Execution: The Browser Automation Tool performs each action inside the sandboxed session, capturing the updated state after every step.
- Multi-turn Loop: The agent receives feedback and iterates—continuing to execute actions and update the user until the workflow is complete or the user stops the session.
This approach combines the power of LLMs with reliable browser automation, offering much higher resilience and intelligence than pixel-based “mouse click” bots.
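The observe, reason, act cycle described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: `FakeBrowserSession` replaces the sandboxed browser, `decide_next_action` is a rule-based placeholder for the model's reasoning, and the goal dictionary is invented for illustration. The service's real internals are not shown in this post, so treat this purely as a mental model of the loop.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowserSession:
    """Hypothetical stand-in for a sandboxed browser session."""
    url: str = "about:blank"
    fields: dict = field(default_factory=dict)
    submitted: bool = False

    def snapshot(self):
        # A real tool would capture the DOM/accessibility tree; this is a tiny dict.
        return {"url": self.url, "fields": dict(self.fields), "submitted": self.submitted}

    def execute(self, action):
        kind = action["type"]
        if kind == "navigate":
            self.url = action["url"]
        elif kind == "fill":
            self.fields[action["label"]] = action["value"]
        elif kind == "click":
            self.submitted = True
        else:
            raise ValueError(f"Unknown action type: {kind}")


def decide_next_action(goal, state):
    """Rule-based placeholder for the model's reasoning step."""
    if state["url"] != goal["url"]:
        return {"type": "navigate", "url": goal["url"]}
    for label, value in goal["form"].items():
        if state["fields"].get(label) != value:
            return {"type": "fill", "label": label, "value": value}
    if not state["submitted"]:
        return {"type": "click", "label": "Submit"}
    return None  # nothing left to do


def run_agent_loop(goal, session, max_steps=10):
    """Observe the page, pick an action, execute it, and repeat until done."""
    history = []
    for _ in range(max_steps):
        state = session.snapshot()                # capture the updated state
        action = decide_next_action(goal, state)  # reason about the next step
        if action is None:
            break
        session.execute(action)                   # act inside the session
        history.append(action)
    return history


goal = {"url": "https://example.com/classes", "form": {"Name": "Ada", "Class": "Yoga"}}
session = FakeBrowserSession()
history = run_agent_loop(goal, session)
for step, action in enumerate(history, start=1):
    print(step, action)
```

The `max_steps` cap is a design choice worth keeping in any real loop of this shape: it bounds the work an agent can do if the page never reaches the expected state.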
Security & Responsible Use
WARNING:
Browser Automation Tool comes with significant security risks. Both errors in judgment by the AI and the presence of malicious or confusing instructions on web pages which the AI encounters may cause it to execute commands you or others do not intend, which could compromise the security of your or other users’ browsers, computers, and any accounts to which the browser or AI has access, including personal, financial, or enterprise systems. By using the Browser Automation Tool, you are acknowledging that you bear responsibility and liability for any use of it and of any resulting agents you create with it, including with respect to any other users to whom you make Browser Automation Tool functionality available, including through resulting agents.
We strongly recommend using the Browser Automation Tool on low-privilege virtual machines with no access to sensitive data or critical resources.
See the Transparency Note for more guidance.
Code Samples
from azure.identity import DefaultAzureCredential
from azure.ai.agents.models import MessageRole
from azure.ai.projects import AIProjectClient

# Create a project client from a project endpoint, copied from your AI Foundry project.
# Example: project_endpoint = "https://<your-ai-services-resource-name>.services.ai.azure.com/api/projects/<your-project-name>"
project_endpoint = "YOUR_PROJECT_ENDPOINT"

project_client = AIProjectClient(
    endpoint=project_endpoint,
    credential=DefaultAzureCredential()
)

# Look up the Playwright Workspace connection created in your Foundry project
playwright_connection = project_client.connections.get(
    name="YOUR_PLAYWRIGHT_CONNECTION_NAME"
)
print(playwright_connection.id)

with project_client:
    # Create an agent with the Browser Automation Tool attached
    agent = project_client.agents.create_agent(
        model="YOUR_MODEL_NAME",
        name="my-agent",
        instructions="use the tool to respond",
        tools=[{
            "type": "browser_automation",
            "browser_automation": {
                "connection": {
                    "id": playwright_connection.id,
                }
            }
        }],
    )
    print(f"Created agent, ID: {agent.id}")

    thread = project_client.agents.threads.create()
    print(f"Created thread, ID: {thread.id}")

    # Create message to thread
    message = project_client.agents.messages.create(
        thread_id=thread.id,
        role="user",
        content="YOUR_QUERY_TO_THE_AGENT")
    print(f"Created message: {message['id']}")

    # Create and process an Agent run in thread with tools
    run = project_client.agents.runs.create_and_process(
        thread_id=thread.id,
        agent_id=agent.id,
    )
    print(f"Run created, ID: {run.id}")
    print(f"Run finished with status: {run.status}")

    if run.status == "failed":
        print(f"Run failed: {run.last_error}")

    # Inspect each run step and any tool calls it made
    run_steps = project_client.agents.run_steps.list(thread_id=thread.id, run_id=run.id)
    for step in run_steps:
        print(step)
        print(f"Step {step['id']} status: {step['status']}")

        # Check if there are tool calls in the step details
        step_details = step.get("step_details", {})
        tool_calls = step_details.get("tool_calls", [])

        if tool_calls:
            print("  Tool calls:")
            for call in tool_calls:
                print(f"    Tool Call ID: {call.get('id')}")
                print(f"    Type: {call.get('type')}")

                function_details = call.get("function", {})
                if function_details:
                    print(f"    Function name: {function_details.get('name')}")
        print()  # add an extra newline between steps

    # Delete the Agent when done
    project_client.agents.delete_agent(agent.id)
    print("Deleted agent")

    # Fetch and log the agent's last response
    response_message = project_client.agents.messages.get_last_message_by_role(thread_id=thread.id, role=MessageRole.AGENT)
    if response_message:
        for text_message in response_message.text_messages:
            print(f"Agent response: {text_message.text.value}")
        for annotation in response_message.url_citation_annotations:
            print(f"URL Citation: [{annotation.url_citation.title}]({annotation.url_citation.url})")
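The sample above sends a single request. For a multi-turn conversation, one plausible pattern is to post another user message to the same thread and run the agent again. The helper below is a hedged sketch of that pattern: `ask_follow_up` is a hypothetical name, it reuses only the SDK calls that already appear in the sample, and it would need to be called before the agent is deleted. The `agent_role` argument is expected to be `MessageRole.AGENT`. So that the snippet runs without any Azure resources, the demonstration at the bottom uses a `MagicMock` stand-in rather than a real client.

```python
from unittest.mock import MagicMock


def ask_follow_up(project_client, thread_id, agent_id, content, agent_role):
    """Send one more user turn on an existing thread and return (run, reply_texts).

    agent_role is expected to be MessageRole.AGENT from azure.ai.agents.models.
    """
    # Add the follow-up request to the same thread as the earlier messages
    project_client.agents.messages.create(
        thread_id=thread_id, role="user", content=content
    )
    # Run the agent again over the extended thread
    run = project_client.agents.runs.create_and_process(
        thread_id=thread_id, agent_id=agent_id
    )
    if run.status == "failed":
        return run, []
    reply = project_client.agents.messages.get_last_message_by_role(
        thread_id=thread_id, role=agent_role
    )
    texts = [m.text.value for m in reply.text_messages] if reply else []
    return run, texts


# Offline dry run with a mock client, only to show the call sequence.
# In real use, pass the project_client, thread.id, and agent.id from the sample.
fake_client = MagicMock()
fake_client.agents.runs.create_and_process.return_value.status = "completed"
fake_reply = MagicMock()
fake_reply.text_messages = [MagicMock()]
fake_reply.text_messages[0].text.value = "Filtered to evening classes."
fake_client.agents.messages.get_last_message_by_role.return_value = fake_reply

run, texts = ask_follow_up(
    fake_client, "thread-id", "agent-id", "Only show evening classes", "agent-role"
)
print(run.status, texts)
```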
Getting Started
Prerequisites
- Azure subscription with permissions to create Playwright Workspaces and Azure AI Foundry resources
- Python 3.8+ (or use your preferred SDK)
Step-by-Step Setup
- Provision a Playwright Workspace
  - Create a Playwright Workspace Resource
  - Generate an Access Token
  - Note your Workspace Region Endpoint
- Configure Permissions
  - Assign your Project Identity the “Contributor” role on the Playwright Workspace, or set a custom role.
  - Role Assignment Guide
- Connect Playwright Workspace to Foundry
  - In the Azure AI Foundry portal, open your AI Project.
  - Go to Management Center → Connected Resources.
  - Create a new connection:
    - Type: Serverless Model
    - Target URI: Playwright Workspace Region Endpoint (e.g., wss://<region>.api.playwright.microsoft.com/playwrightworkspaces/<workspaceId>/browsers)
    - Key: Playwright Access Token
- Create Your Agent
  - Use the connection ID from the previous step when configuring your Browser Automation Tool in the agent code.
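A mistyped Target URI is an easy mistake to make when creating the connection. The sketch below checks a URI against the example shape given in the setup steps (wss scheme, a host under api.playwright.microsoft.com, and a /playwrightworkspaces/<workspaceId>/browsers path). The function name `check_target_uri` and the sample values are hypothetical, and the check reflects only the example pattern in this post, not an official validation rule.

```python
from urllib.parse import urlparse

EXPECTED_HOST_SUFFIX = ".api.playwright.microsoft.com"


def check_target_uri(uri):
    """Return a list of problems found; an empty list means the URI matches the example shape."""
    problems = []
    parsed = urlparse(uri)

    if parsed.scheme != "wss":
        problems.append("scheme should be wss")

    host = parsed.hostname or ""
    if not host.endswith(EXPECTED_HOST_SUFFIX) or host == EXPECTED_HOST_SUFFIX.lstrip("."):
        problems.append("host should look like <region>.api.playwright.microsoft.com")

    # Expected path segments: playwrightworkspaces / <workspaceId> / browsers
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) != 3 or parts[0] != "playwrightworkspaces" or parts[2] != "browsers":
        problems.append("path should be /playwrightworkspaces/<workspaceId>/browsers")

    return problems


# Placeholder values for illustration only
good_uri = "wss://eastus.api.playwright.microsoft.com/playwrightworkspaces/1234/browsers"
bad_uri = "https://eastus.api.playwright.microsoft.com/playwrightworkspaces/1234"

print(check_target_uri(good_uri))
print(check_target_uri(bad_uri))
```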
Learn More & Get Started
- Get started with Azure AI Foundry and jump directly into Visual Studio Code.
- Download the Azure AI Foundry SDK.
- Read the documentation to learn more about the feature.
- Take the Azure AI Foundry Learn courses.
We look forward to seeing the innovative automation experiences you build!