{"id":399,"date":"2025-05-19T09:00:53","date_gmt":"2025-05-19T16:00:53","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/foundry\/?p=399"},"modified":"2025-05-19T09:27:31","modified_gmt":"2025-05-19T16:27:31","slug":"unlock-instant-on-device-ai-with-foundry-local","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/foundry\/unlock-instant-on-device-ai-with-foundry-local\/","title":{"rendered":"Unlock Instant On-Device AI with Foundry Local"},"content":{"rendered":"<p>You\u2019re building a next generation AI-powered app. It needs to be fast, private, and work anywhere, even without internet connectivity. This isn\u2019t just about prototyping. You\u2019re shipping a real app to real users, with AI that delivers value and scales cost-effectively.<\/p>\n<p>Meet <strong>Foundry Local<\/strong>\u2014the high-performance local AI runtime stack that brings Azure AI Foundry\u2019s power to client devices. Now in preview on <strong>Windows<\/strong> and <strong>macOS<\/strong>, Foundry Local lets you build and ship cross-platform AI apps that run models, tools, and agents directly on-device. 
It is included in <a href=\"https:\/\/blogs.windows.com\/windowsdeveloper\/?p=57397\">Windows AI Foundry<\/a>, delivering best-in-class AI capabilities and excellent cross-silicon performance on hundreds of millions of Windows devices.<\/p>\n<p>This is local AI, efficient and ready for production.<\/p>\n<p><div class=\"d-flex justify-content-left\"><a class=\"cta_button_link btn-primary mb-24\" href=\"https:\/\/aka.ms\/foundry-local-docs\" target=\"_blank\">Get Started Today<\/a><\/div><\/p>\n<p><div style=\"width: 1920px;\" class=\"wp-video\"><video class=\"wp-video-shortcode\" id=\"video-399-1\" width=\"1920\" height=\"1080\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/endtoEnd_Sizzle-5-1.mp4?_=1\" \/><a href=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/endtoEnd_Sizzle-5-1.mp4\">https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/endtoEnd_Sizzle-5-1.mp4<\/a><\/video><\/div><\/p>\n<h2>What is Foundry Local?<\/h2>\n<p>Foundry Local brings the power and trust of Azure AI Foundry to your device. 
It includes everything you need to run AI apps locally.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/foundry_local.png\"><img decoding=\"async\" class=\"wp-image-533 size-large aligncenter\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/foundry_local-1024x567.png\" alt=\"Foundry Local Stack\" width=\"1024\" height=\"567\" srcset=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/foundry_local-1024x567.png 1024w, https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/foundry_local-300x166.png 300w, https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/foundry_local-768x425.png 768w, https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/foundry_local-1536x850.png 1536w, https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/foundry_local.png 1840w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/p>\n<h4>High-Performance Model Execution with ONNX Runtime<\/h4>\n<p>Foundry Local is built on ONNX Runtime for top-tier performance across CPUs, NPUs, and GPUs. On Windows, it is built on the foundation of Windows ML and optimized through deep collaboration with hardware vendors such as AMD, Intel, NVIDIA, and Qualcomm. On macOS, it includes GPU acceleration on Apple silicon. Foundry Local can choose the optimal silicon on the device to run local models, and you can also specify the silicon (CPU, GPU, or NPU) to use for execution.<\/p>\n<h4>Go from Exploration to Production with Foundry Local Management Service<\/h4>\n<p>With Foundry Local, moving from prototype to production is effortless. You can use a wide range of edge-optimized AI Foundry models, including DeepSeek R1, Qwen 2.5 Instruct, Phi-4 Reasoning, Mistral, and additional ONNX-format models from Hugging Face, directly in your app. 
Foundry Local Management Service takes care of downloading and loading models at runtime. You can build a Windows, macOS, or cross-platform application using Foundry Local and ship it to your customers.<\/p>\n<h4>Seamless Local AI Access with Foundry CLI &amp; SDK<\/h4>\n<p>Use the new Foundry CLI to manage local models, tools, and agents with ease. With the Foundry Local SDK and the Azure Inference SDK, you can interact with Foundry Local and integrate model management and local inference directly into your app. You can also use OpenAI-compatible chat completion APIs to integrate your application with Foundry Local.<\/p>\n<h4>Local AI Agents using MCP<\/h4>\n<p>Foundry Local is redefining local AI workflows with intelligent agents at the core. Using the Model Context Protocol (MCP) to call local tools, it offers a new path to smart automation, right on your device. 
To participate in the private preview of this capability, please fill out <a href=\"https:\/\/aka.ms\/FLAgentPrp\">this form<\/a>.<\/p>\n<p><div style=\"width: 3840px;\" class=\"wp-video\"><video class=\"wp-video-shortcode\" id=\"video-399-2\" width=\"3840\" height=\"2160\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/two_agents-1.mp4?_=2\" \/><a href=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/two_agents-1.mp4\">https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/two_agents-1.mp4<\/a><\/video><\/div><\/p>\n<h2>Get Started<\/h2>\n<p><strong>On Windows<\/strong><\/p>\n<ol>\n<li>Open Windows Terminal<\/li>\n<li>Install Foundry Local using winget\n<pre><code class=\"language-bash\">winget install Microsoft.FoundryLocal\n<\/code><\/pre>\n<\/li>\n<li>Run a model\n<pre><code class=\"language-bash\">foundry model run phi-3.5-mini\n<\/code><\/pre>\n<\/li>\n<\/ol>\n<p><strong>On macOS<\/strong><\/p>\n<ol>\n<li>Open Terminal<\/li>\n<li>Install Foundry Local\n<pre><code class=\"language-bash\">brew tap microsoft\/foundrylocal\nbrew install foundrylocal\n<\/code><\/pre>\n<\/li>\n<li>Run a model\n<pre><code class=\"language-bash\">foundry model run phi-3.5-mini\n<\/code><\/pre>\n<\/li>\n<\/ol>\n<p>For more information, see the <a href=\"https:\/\/aka.ms\/foundry-local-docs\">Foundry Local documentation and samples<\/a>.<\/p>\n<h2>What Our Private Preview Customers Are Telling Us<\/h2>\n<p>Over 100 customers, including ISVs and partners such as SoftwareOne, Pieces.app, and Avanade, have already been using Foundry Local and have helped shape it.<\/p>\n<p><div style=\"width: 1280px;\" class=\"wp-video\"><video class=\"wp-video-shortcode\" id=\"video-399-3\" width=\"1280\" height=\"720\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/Customer_local_captions_updated.mp4?_=3\" \/><a href=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/Customer_local_captions_updated.mp4\">https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2025\/05\/Customer_local_captions_updated.mp4<\/a><\/video><\/div><\/p>\n<blockquote><p>\u201cNow with Foundry Local, we have the flexibility to build hybrid agentic solutions leveraging the power of both cloud &amp; on-premise thereby allowing us to deliver greater value to our customers without compromising on compliance or innovation. 
This is a game-changer!\u201d &#8211; <em>Ratheesh Krishna Geeth, CEO \u2013 Digital Engineering &amp; AI, iLink Digital<\/em><\/p>\n<p>\u201cFoundry Local is positioned to provide the robust infrastructure needed to guarantee the integrity and continuous availability of these critical workflows, enabling us to deliver transformative healthcare solutions with the highest standards of reliability and quality.\u201d &#8211; <em>Brian Hartzer, CEO, Quantium Health<\/em><\/p><\/blockquote>\n<p><strong>Foundry Local makes local AI practical, powerful, and production-ready.<\/strong> Whether you\u2019re building a proof of concept or shipping a product, it gives you the performance, flexibility, and control to run AI where it matters most. Let\u2019s build the future of local AI, together.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>You\u2019re building a next generation AI-powered app. It needs to be fast, private, and work anywhere, even without internet connectivity. This isn\u2019t just about prototyping. You\u2019re shipping a real app to real users, with AI that delivers value and scales cost-effectively. 
Meet Foundry Local\u2014the high-performance local AI runtime stack that brings Azure AI Foundry\u2019s power [&hellip;]<\/p>\n","protected":false},"author":189733,"featured_media":571,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[3,34],"class_list":["post-399","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-microsoft-foundry","tag-ai-development","tag-microsoft-build"],"acf":[],"blog_post_summary":"<p>You\u2019re building a next generation AI-powered app. It needs to be fast, private, and work anywhere, even without internet connectivity. This isn\u2019t just about prototyping. You\u2019re shipping a real app to real users, with AI that delivers value and scales cost-effectively. Meet Foundry Local\u2014the high-performance local AI runtime stack that brings Azure AI Foundry\u2019s power [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts\/399","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/users\/189733"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/comments?post=399"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts\/399\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/media\/571"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/media?parent=399"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp
\/v2\/categories?post=399"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/tags?post=399"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}