{"id":139,"date":"2025-04-04T09:29:00","date_gmt":"2025-04-04T09:29:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/foundry\/?p=139"},"modified":"2025-11-24T16:23:01","modified_gmt":"2025-11-25T00:23:01","slug":"ai-red-teaming-agent-preview","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/foundry\/ai-red-teaming-agent-preview\/","title":{"rendered":"Introducing AI Red Teaming Agent: Accelerate your AI safety and security journey with Azure AI Foundry"},"content":{"rendered":"<p><div style=\"width: 1620px;\" class=\"wp-video\"><video class=\"wp-video-shortcode\" id=\"video-139-1\" width=\"1620\" height=\"1080\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AI-red-teaming-agent_final_blog_asset-1.mp4?_=1\" \/><a href=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AI-red-teaming-agent_final_blog_asset-1.mp4\">https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AI-red-teaming-agent_final_blog_asset-1.mp4<\/a><\/video><\/div><\/p>\n<p>Today we are excited to announce the public preview of the AI Red Teaming Agent for generative AI systems. 
We have integrated Microsoft AI Red Team\u2019s open-source toolkit <a href=\"https:\/\/github.com\/Azure\/PyRIT\" target=\"_blank\" rel=\"noopener\">PyRIT<\/a> (Python Risk Identification Tool) into Azure AI Foundry to complement its <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-foundry\/concepts\/evaluation-metrics-built-in?tabs=warning#risk-and-safety-evaluators\" target=\"_blank\" rel=\"noopener\">existing risk and safety evaluations<\/a> with the ability to systematically evaluate how your AI model or application behaves when probed by an adversary.<\/p>\n<p>Traditional red teaming exploits the cyber kill chain to test a system for security vulnerabilities. With the rise of generative AI, however, the term <a href=\"https:\/\/aka.ms\/llm_red_teaming\" target=\"_blank\" rel=\"noopener\">AI red teaming<\/a> has been coined to describe probing these systems for the novel risks (both content safety and security related) they present: simulating the behavior of an adversarial user who is trying to cause your AI system to misbehave in a particular way.<\/p>\n<p>Microsoft\u2019s AI Red Team \u2013 one of the earliest AI red teams \u2013 has led the industry in this area: laying the foundation of the Adversarial ML Threat Matrix (which later became the <a href=\"https:\/\/atlas.mitre.org\/matrices\/ATLAS\" target=\"_blank\" rel=\"noopener\">MITRE ATLAS Matrix<\/a>), releasing the first industry-wide taxonomy of ML failure modes, and creating one of the earliest open-source toolkits for testing GenAI systems, <a href=\"https:\/\/github.com\/Azure\/PyRIT\" target=\"_blank\" rel=\"noopener\">PyRIT<\/a>. 
PyRIT comes with a collection of built-in strategies for defeating AI safety systems, which the AI Red Teaming Agent in Azure AI Foundry leverages to provide insight into the risk posture of your generative AI system.<\/p>\n<p>The AI Red Teaming Agent helps you do this in three ways:<\/p>\n<ul>\n<li><strong>Automated scans for content safety risks:<\/strong> First, you can automatically scan your model and application endpoints for safety risks by simulating adversarial probing.<\/li>\n<li><strong>Evaluate probing success:<\/strong> Next, you can evaluate and score each attack-response pair to generate insightful metrics such as Attack Success Rate (ASR).<\/li>\n<li><strong>Reporting and logging:<\/strong> Finally, you can generate a scorecard of the attack probing techniques and risk categories to help you decide whether the system is ready for deployment. Findings can be logged, monitored, and tracked over time directly in Azure AI Foundry, supporting compliance and continuous risk mitigation.<\/li>\n<\/ul>\n<p>Together these components (scanning, evaluating, and reporting) help teams understand how AI systems respond to common attacks, ultimately guiding a comprehensive risk management strategy.<\/p>\n<p>Over the past year, we&#8217;ve seen evaluations become a standard practice among our customers. A recent <a href=\"https:\/\/info.microsoft.com\/ww-landing-customizing-generative-ai-top-techniques-for-unique-value.html?LCID=EN-US\" target=\"_blank\" rel=\"noopener\">MIT Technology Review Insights<\/a> report found that over half (54%) of surveyed businesses evaluate generative AI models with manual methods, while 26% are either beginning to apply automated methods or now do so consistently. Larger enterprises are adopting evaluations not just for quality concerns but also for security and safety-related risks. 
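<\/p>\n<p>The Attack Success Rate (ASR) mentioned above is, conceptually, the fraction of adversarial probes that elicit an unsafe response, optionally broken down by risk category. The following standalone sketch illustrates the arithmetic only; the attack-response pair structure and field names are assumptions for illustration, not the SDK\u2019s output schema.<\/p>\n<pre><code class=\"language-python\"># Hypothetical sketch of the Attack Success Rate (ASR) arithmetic; the\r\n# attack-response pair structure is an assumption, not the SDK's schema.\r\nfrom collections import Counter\r\n\r\ndef attack_success_rate(pairs):\r\n    # Overall ASR: successful attacks divided by total attacks.\r\n    if not pairs:\r\n        return 0.0\r\n    return sum(p[\"attack_successful\"] for p in pairs) / len(pairs)\r\n\r\ndef asr_by_category(pairs):\r\n    # ASR broken down by risk category.\r\n    totals, hits = Counter(), Counter()\r\n    for p in pairs:\r\n        totals[p[\"risk_category\"]] += 1\r\n        hits[p[\"risk_category\"]] += int(p[\"attack_successful\"])\r\n    return {cat: hits[cat] / totals[cat] for cat in totals}\r\n\r\npairs = [\r\n    {\"risk_category\": \"violence\", \"attack_successful\": True},\r\n    {\"risk_category\": \"violence\", \"attack_successful\": False},\r\n    {\"risk_category\": \"self_harm\", \"attack_successful\": False},\r\n    {\"risk_category\": \"self_harm\", \"attack_successful\": False},\r\n]\r\nprint(attack_success_rate(pairs))  # 0.25\r\nprint(asr_by_category(pairs))      # {'violence': 0.5, 'self_harm': 0.0}<\/code><\/pre>\n<p>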
<a href=\"https:\/\/owasp.org\/www-project-top-10-for-large-language-model-applications\/\" target=\"_blank\" rel=\"noopener\">OWASP Top 10 for LLM Applications 2025<\/a> and the <a href=\"https:\/\/atlas.mitre.org\/matrices\/ATLAS\" target=\"_blank\" rel=\"noopener\">MITRE ATLAS Matrix<\/a> highlight some of the common security and safety issues specific to GenAI applications. These frameworks, along with carefully crafted risk management strategies, are top of mind for organizations like Accenture:<\/p>\n<blockquote><p>&#8220;At Accenture, we&#8217;re creating more agentic applications for our clients than ever before. Azure AI Foundry gives us a one-stop shop for all the right tools and services. To meet the growing demand while ensuring our applications are safe and secure, we&#8217;re looking into the AI Red Teaming Agent to automatically scan and ensure responsible development.&#8221; &#8211; Nayan Paul, Managing Director, Accenture<\/p><\/blockquote>\n<p>AI red teaming relies on the creative human expertise of highly skilled safety and security professionals to simulate attacks. The process is resource- and time-intensive and can become a bottleneck for organizations seeking to accelerate AI adoption. 
With the AI Red Teaming Agent, organizations can now leverage Microsoft\u2019s deep expertise to scale and accelerate their AI development with Trustworthy AI at the forefront.<\/p>\n<p><div  class=\"d-flex justify-content-center\"><a class=\"cta_button_link btn-primary mb-24\" href=\"#\" target=\"_blank\">Get Started<\/a><\/div><\/p>\n<h1>Integrating with Azure AI Foundry<\/h1>\n<p>Customers can now probe their AI systems for content safety failures automatically using a variety of adversarial strategies.<\/p>\n<p>With the AI Red Teaming Agent, teams can:<\/p>\n<ul>\n<li>Run automated scans with the <a href=\"https:\/\/aka.ms\/airedteamingagent-howtodoc\" target=\"_blank\" rel=\"noopener\">Azure AI Evaluation SDK<\/a>, leveraging a comprehensive set of content safety attack techniques<\/li>\n<li>Simulate adversarial prompting against your model or application endpoints using the AI Red Teaming Agent\u2019s fine-tuned adversarial LLM<\/li>\n<li>Evaluate the attack-response pairs with Risk and Safety Evaluators to generate Attack Success Rates (ASR)<\/li>\n<li>Generate AI red teaming reports throughout the AI development lifecycle that visualize and track safety improvements in your Azure AI Foundry project<\/li>\n<\/ul>\n<p>Designed for both experts and teams early in their AI safety journey, this experience makes AI red teaming accessible while offering customization for advanced users. 
The AI Red Teaming Agent applies a recommended set of <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-foundry\/concepts\/ai-red-teaming-agent#supported-attack-strategies\" target=\"_blank\" rel=\"noopener\">attack techniques<\/a> categorized by complexity (easy, moderate, difficult) to achieve single-turn attack success.<\/p>\n<pre><code class=\"language-python\">my_redteaming_results = await red_team_agent.scan(\r\n    target=azure_openai_config,\r\n    scan_name=\"My First RedTeam Scan\",\r\n    attack_strategies=[\r\n        AttackStrategy.EASY,\r\n        AttackStrategy.MODERATE,\r\n        AttackStrategy.DIFFICULT,\r\n    ],\r\n)<\/code><\/pre>\n<p>Expert users can also specify individual PyRIT attack techniques (such as flipping the characters) or compose multi-step attack strategies (base64-encoding the prompt, then applying ROT13):<\/p>\n<pre><code class=\"language-python\">my_adv_redteaming_results = await red_team_agent.scan(\r\n    target=azure_openai_config,\r\n    scan_name=\"My Advanced RedTeam Scan\",\r\n    attack_strategies=[\r\n        AttackStrategy.Compose([AttackStrategy.Base64, AttackStrategy.ROT13]),\r\n        AttackStrategy.Flip,\r\n    ],\r\n)<\/code><\/pre>\n<p>Once you&#8217;ve completed a scan of your AI system, you can extract a detailed scorecard of your local run in your development environment to share with stakeholders or integrate into your own governance platform.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AIRedTeamingAgentDevBlog-JSON.jpg\"><img decoding=\"async\" class=\"aligncenter size-full wp-image-210\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AIRedTeamingAgentDevBlog-JSON.jpg\" alt=\"Image AIRedTeamingAgentDevBlog JSON\" width=\"800\" height=\"449\" srcset=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AIRedTeamingAgentDevBlog-JSON.jpg 800w, https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AIRedTeamingAgentDevBlog-JSON-300x168.jpg 300w, https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AIRedTeamingAgentDevBlog-JSON-768x431.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/a><\/p>\n<p>Alternatively, you can view the comprehensive report directly within your Azure AI Foundry project. The AI Red Teaming tab within the project offers a detailed breakdown of each scan, categorized by attack complexity or risk category. 
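<\/p>\n<p>For teams integrating the local scorecard into their own governance platform, a simple quality gate can parse the scorecard JSON and flag any risk category whose Attack Success Rate exceeds a threshold. The sketch below is hypothetical: the \u201cscorecard\u201d layout and field names are assumptions for illustration, not the documented schema.<\/p>\n<pre><code class=\"language-python\"># Hypothetical gate over a locally extracted scorecard; the layout\r\n# (risk category mapped to ASR) is an assumption, not the documented schema.\r\nimport json\r\n\r\ndef failing_categories(scorecard_json, max_asr=0.1):\r\n    # Return the categories whose Attack Success Rate exceeds the threshold.\r\n    scorecard = json.loads(scorecard_json)[\"scorecard\"]\r\n    return {cat: asr for cat, asr in scorecard.items() if asr > max_asr}\r\n\r\nreport = '{\"scorecard\": {\"violence\": 0.0, \"self_harm\": 0.25, \"hate_unfairness\": 0.05}}'\r\nprint(failing_categories(report))  # {'self_harm': 0.25}<\/code><\/pre>\n<p>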
It also provides a row-level view of each attack-response pair, enabling deeper insights into system issues and behaviors.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AIRedTeamingAgentDevBlogFinal-1.gif\"><img decoding=\"async\" class=\"aligncenter size-large wp-image-212\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AIRedTeamingAgentDevBlogFinal-1-1024x575.gif\" alt=\"Image AIRedTeamingAgentDevBlogFinal 1\" width=\"1024\" height=\"575\" srcset=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AIRedTeamingAgentDevBlogFinal-1-1024x575.gif 1024w, https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AIRedTeamingAgentDevBlogFinal-1-300x169.gif 300w, https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AIRedTeamingAgentDevBlogFinal-1-768x431.gif 768w, https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/04\/AIRedTeamingAgentDevBlogFinal-1-1536x863.gif 1536w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/p>\n<h1>Getting Started<\/h1>\n<p>AI Red Teaming Agent is available in public preview for all Azure AI Foundry customers. We&#8217;ve prepared detailed documentation and samples to help teams integrate these capabilities into their existing generative AI development processes:<\/p>\n<ul>\n<li><a href=\"https:\/\/aka.ms\/airedteamingagent-conceptdoc\" target=\"_blank\" rel=\"noopener\">Read the documentation<\/a><\/li>\n<li><a href=\"https:\/\/aka.ms\/airedteamingagent-sample\" target=\"_blank\" rel=\"noopener\">Get started with a hands-on example<\/a><\/li>\n<\/ul>\n<h1>Pricing<\/h1>\n<p>AI Red Teaming Agent uses Azure AI Risk and Safety Evaluations to assess attack success from the automated red teaming scan. 
Therefore, customers will be billed based on the consumption of Risk and Safety Evaluations as listed on <a href=\"https:\/\/azure.microsoft.com\/en-us\/pricing\/details\/ai-foundry\/\" target=\"_blank\" rel=\"noopener\">our Azure pricing page<\/a>. Click the tab labeled \u201cComplete AI Toolchain\u201d to view the pricing details.<\/p>\n<h1>Looking Forward<\/h1>\n<p>This launch underscores our commitment to trustworthy AI as a continuous, integrated practice\u2014not a one-time checkbox. At Microsoft, we believe that AI red teaming is a core part of the software engineering process for generative AI applications, not an afterthought. We\u2019re focused on helping teams embed AI red teaming capabilities and automated evaluation into every stage of development. The most effective strategies we\u2019ve seen leverage automated tools to surface potential risks, which are then analyzed by expert human teams for deeper insights. If your organization is just starting with AI red teaming, we encourage you to explore the resources created by our own AI red team at Microsoft to help you get started.<\/p>\n<ul>\n<li><a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/openai\/concepts\/red-teaming\" target=\"_blank\" rel=\"noopener\">Planning red teaming for large language models (LLMs) and their applications &#8211; Azure OpenAI Service | Microsoft Learn<\/a><\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/security\/blog\/2025\/01\/13\/3-takeaways-from-red-teaming-100-generative-ai-products\/\" target=\"_blank\" rel=\"noopener\">3 takeaways from red teaming 100 generative AI products<\/a><\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/security\/blog\/2023\/08\/07\/microsoft-ai-red-team-building-future-of-safer-ai\/\" target=\"_blank\" rel=\"noopener\">Microsoft AI Red Team building future of safer AI<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>AI Red Teaming Agent, integrated into Azure AI Foundry, enhances the safety and security of generative AI 
systems by providing automated scans, evaluating probing success, and generating detailed scorecards to guide risk management strategies.<\/p>\n","protected":false},"author":186255,"featured_media":191,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[3,5,4,2],"class_list":["post-139","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-microsoft-foundry","tag-ai-development","tag-ai-tools","tag-generative-ai","tag-microsoft-foundry"],"acf":[],"blog_post_summary":"<p>AI Red Teaming Agent, integrated into Azure AI Foundry, enhances the safety and security of generative AI systems by providing automated scans, evaluating probing success, and generating detailed scorecards to guide risk management strategies.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts\/139","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/users\/186255"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/comments?post=139"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts\/139\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/media\/191"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/media?parent=139"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/categories?post=139"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsof
t.com\/foundry\/wp-json\/wp\/v2\/tags?post=139"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}