{"id":2319,"date":"2026-06-04T06:00:24","date_gmt":"2026-06-04T13:00:24","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/foundry\/?p=2319"},"modified":"2026-07-01T08:40:56","modified_gmt":"2026-07-01T15:40:56","slug":"accelerate-edge-ai-development-with-foundry-local","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/foundry\/accelerate-edge-ai-development-with-foundry-local\/","title":{"rendered":"Accelerate Edge AI Development with Foundry Local"},"content":{"rendered":"<p aria-level=\"1\"><span style=\"font-size: 24pt;\"><strong>Why edge AI development is still hard\u00a0<\/strong><\/span><\/p>\n<p><span data-contrast=\"auto\">AI is no longer confined to cloud experiments. Developers are increasingly expected to deliver AI inside apps, devices, and edge systems where responsiveness, privacy, resilience, and local control are essential. But building those experiences for production is still difficult.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Teams often\u00a0have to\u00a0solve model packaging, runtime fragmentation, hardware differences, and deployment complexity before they can ship a single reliable feature. That slows iteration and makes it harder to move from prototype to product.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">At Microsoft Build 2026,\u00a0we\u2019re\u00a0announcing updates across Foundry Local and Foundry Local on Azure Local that help developers build once and run AI closer to where data is created and decisions are made. 
These updates expand platform support, improve control over inference and acceleration, add new on-device APIs, and simplify deployment across disconnected, regulated, and sovereign environments.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p aria-level=\"1\"><span style=\"font-size: 24pt;\"><strong>What\u2019s\u00a0new in Foundry Local\u00a0<\/strong><\/span><\/p>\n<p><span data-contrast=\"auto\">The latest Foundry Local updates focus on the areas developers care about most: broader platform reach, familiar APIs, better runtime control, and simpler access to hardware acceleration. Together, these improvements help teams move faster from experimentation to production on AI PCs, edge devices, and enterprise infrastructure.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p>&nbsp;<\/p>\n<p aria-level=\"2\"><span style=\"font-size: 18pt;\"><strong>Foundry Local\u00a0<\/strong><\/span><\/p>\n<p><span data-contrast=\"auto\">Last month we\u00a0announced the\u00a0<\/span><b><span data-contrast=\"auto\">1.1.0 release of Foundry Local\u00a0(<\/span><\/b><a href=\"https:\/\/devblogs.microsoft.com\/foundry\/foundry-local-v1-1\/\"><b><span data-contrast=\"none\">Foundry Local 1.1: Live Transcription, Embeddings, and Responses API | Microsoft Foundry Blog<\/span><\/b><\/a><b><span data-contrast=\"auto\">)<\/span><\/b><span data-contrast=\"auto\">\u202f\u2014 Microsoft\u2019s cross-platform local AI solution that\u00a0lets\u00a0developers bring\u202f<\/span><b><span data-contrast=\"auto\">AI directly into their applications<\/span><\/b><span data-contrast=\"auto\">\u202fwith no cloud dependency, no network latency, and no per-token costs.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">The 1.1.0 release added:<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"46\" 
data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}\" data-aria-posinset=\"1\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Live audio transcription<\/span><\/b><span data-contrast=\"auto\">\u202ffor real-time speech-to-text scenarios like captioning, voice UIs, and meeting transcription.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"46\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}\" data-aria-posinset=\"2\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Text embeddings<\/span><\/b><span data-contrast=\"auto\">\u202ffor semantic search, RAG, clustering, and similarity matching use cases.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"46\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}\" data-aria-posinset=\"3\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Responses API<\/span><\/b><span data-contrast=\"auto\">\u202fsupport for structured agentic interactions, including tool calling and multimodal vision-language input.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li 
aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"46\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}\" data-aria-posinset=\"4\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">WebGPU\u00a0execution provider plugin<\/span><\/b><span data-contrast=\"auto\">\u202fdelivered separately to reduce the default package size for applications that\u00a0don\u2019t\u00a0need it.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"46\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}\" data-aria-posinset=\"5\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Reduced JavaScript package size<\/span><\/b><span data-contrast=\"auto\">\u202fby replacing the\u00a0koffi\u00a0FFI layer with a custom Node-API C addon.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"46\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}\" data-aria-posinset=\"6\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Broader .NET compatibility<\/span><\/b><span data-contrast=\"auto\">\u202fby targeting lower framework versions 
in the C# SDK.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<p><span data-contrast=\"auto\">Today we are announcing the\u00a0<\/span><b><span data-contrast=\"auto\">1.2.0 release of\u00a0Foundry Local<\/span><\/b><span data-contrast=\"auto\">, which\u00a0expands\u00a0language support in the\u00a0Live Transcription API, extends Linux support to a wider range of\u00a0devices, improves cancellation and execution provider workflows, adds new on-device API options, and strengthens the Windows acceleration story with Windows ML (WinML) 2.0.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p aria-level=\"3\"><span data-contrast=\"none\">What\u2019s\u00a0new\u00a0in 1.2.0<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:160,&quot;335559739&quot;:80}\">\u00a0<\/span><\/p>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"45\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;singleLevel&quot;}\" data-aria-posinset=\"1\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Multilingual ASR:\u00a0<\/span><\/b><span data-contrast=\"auto\">Last month\u00a0we\u00a0added support for\u00a0<\/span><b><span data-contrast=\"auto\">real-time speech-to-text streaming<\/span><\/b><span data-contrast=\"auto\">\u202fdirectly from a microphone.\u00a0We\u00a0identified\u202f<\/span><b><span data-contrast=\"auto\">NVIDIA\u2019s\u00a0Nemotron\u00a0Speech Streaming<\/span><\/b><span data-contrast=\"auto\">\u202fas the strongest candidate\u00a0for real-time English streaming on resource-constrained hardware<\/span><span data-contrast=\"auto\">\u00a0(for<\/span><span data-contrast=\"auto\">\u00a0further details, read:\u00a0<\/span><span 
data-contrast=\"auto\"><a href=\"https:\/\/arxiv.org\/pdf\/2604.14493\">https:\/\/arxiv.org\/pdf\/2604.14493<\/a>)<\/span><span data-contrast=\"auto\">. Today we are happy to announce that Foundry Local 1.2.0\u00a0goes\u00a0multilingual with support for 40+ languages\u00a0via\u00a0the latest\u00a0<\/span><span data-contrast=\"auto\">Nemotron\u00a03.5 ASR Streaming Multilingual model.\u00a0<\/span><span data-contrast=\"auto\">Try out:\u00a0<\/span><a href=\"https:\/\/github.com\/microsoft\/Foundry-Local\/tree\/main\/samples\/python\/live-audio-transcription\"><span data-contrast=\"none\">https:\/\/github.com\/microsoft\/Foundry-Local\/tree\/main\/samples\/python\/live-audio-transcription<\/span><\/a><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559739&quot;:200,&quot;335559740&quot;:276,&quot;469777462&quot;:[720],&quot;469777927&quot;:[0],&quot;469777928&quot;:[0]}\">\u00a0<\/span><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<pre><span data-contrast=\"auto\">from\u00a0foundry_local_sdk\u00a0import Configuration,\u00a0FoundryLocalManager<\/span><span data-ccp-props=\"{&quot;335557856&quot;:15921906,&quot;335559685&quot;:1080,&quot;335559738&quot;:40,&quot;335559739&quot;:0}\">\u00a0<\/span>\r\n\r\n<span data-contrast=\"auto\">config = Configuration(app_name=\"my_app\")<\/span><span data-ccp-props=\"{&quot;335557856&quot;:15921906,&quot;335559685&quot;:1080,&quot;335559739&quot;:0}\">\u00a0<\/span>\r\n\r\n<span data-contrast=\"auto\">FoundryLocalManager.initialize(config)<\/span> \r\n<span data-contrast=\"auto\">manager =\u00a0FoundryLocalManager.instance<\/span><span data-ccp-props=\"{&quot;335557856&quot;:15921906,&quot;335559685&quot;:1080,&quot;335559739&quot;:0}\">\u00a0<\/span>\r\n\r\n<span data-contrast=\"auto\">model =\u00a0manager.catalog.get_model(<\/span><span data-ccp-props=\"{&quot;335557856&quot;:15921906,&quot;335559685&quot;:1080,&quot;335559739&quot;:0}\">\u00a0<\/span>\r\n\r\n<span data-contrast=\"auto\"> \u00a0\u00a0 
\"nemotron-3.5-asr-streaming-0.6b\"<\/span><span data-ccp-props=\"{&quot;335557856&quot;:15921906,&quot;335559685&quot;:1080,&quot;335559739&quot;:0}\">\u00a0<\/span>\r\n\r\n<span data-contrast=\"auto\">)<\/span><span data-ccp-props=\"{&quot;335557856&quot;:15921906,&quot;335559685&quot;:1080,&quot;335559739&quot;:0}\">\u00a0<\/span>\r\n\r\n<span data-contrast=\"auto\">model.download()<\/span> \r\n<span data-contrast=\"auto\">model.load()<\/span>  \r\n\r\n<span data-contrast=\"auto\">session =\u00a0model.get_audio_client().create_live_transcription_session()<\/span> \r\n<span data-contrast=\"auto\">session.settings.sample_rate\u00a0= 16000<\/span> \r\n<span data-contrast=\"auto\">session.settings.channels\u00a0= 1<\/span> \r\n<span data-contrast=\"auto\">session.settings.language\u00a0= \"auto\"\u00a0\u00a0 # or \"de\", \"zh-CN\", \"en\", ...<\/span><span data-ccp-props=\"{&quot;335557856&quot;:15921906,&quot;335559685&quot;:1080,&quot;335559739&quot;:0}\">\u00a0<\/span>\u00a0 \u00a0\r\n\r\n<span data-contrast=\"auto\">session.start()<\/span> \r\n<span data-contrast=\"auto\">session.append(pcm_bytes)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 # push audio chunks from a mic\/file<\/span> \r\n<span data-contrast=\"auto\">for result in\u00a0session.get_stream():<\/span> \r\n<span data-contrast=\"auto\">\u00a0\u00a0\u00a0 print(result.content[0].text)\u00a0\u00a0\u00a0\u00a0# clean text, inline language tags stripped<\/span> \r\n<span data-contrast=\"auto\">session.stop()<\/span><span data-ccp-props=\"{&quot;335557856&quot;:15921906,&quot;335559685&quot;:1080,&quot;335559739&quot;:40}\">\u00a0<\/span><\/pre>\n<p><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559685&quot;:720,&quot;335559731&quot;:0,&quot;335559739&quot;:200,&quot;335559740&quot;:276,&quot;335559991&quot;:360,&quot;469777462&quot;:[720],&quot;469777927&quot;:[0],&quot;469777928&quot;:[8]}\">\u00a0<\/span><\/p>\n<p><span 
data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559685&quot;:720,&quot;335559731&quot;:0,&quot;335559739&quot;:200,&quot;335559740&quot;:276,&quot;335559991&quot;:360,&quot;469777462&quot;:[720],&quot;469777927&quot;:[0],&quot;469777928&quot;:[8]}\">\u00a0<\/span><\/p>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"45\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;singleLevel&quot;}\" data-aria-posinset=\"2\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Linux ARM64 support:\u00a0<\/span><\/b><span data-contrast=\"auto\">Run Foundry Local on ARM-based Linux systems, including Raspberry Pi 5, NVIDIA Jetson, AWS Graviton, and Ampere, to extend local AI to more edge and embedded scenarios. Try out:\u00a0<\/span><a href=\"https:\/\/github.com\/microsoft\/Foundry-Local\/blob\/main\/README.md\"><span data-contrast=\"none\">https:\/\/github.com\/microsoft\/Foundry-Local\/blob\/main\/README.md<\/span><\/a><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559685&quot;:720,&quot;335559739&quot;:200,&quot;335559740&quot;:276,&quot;335559991&quot;:360,&quot;469777462&quot;:[720],&quot;469777927&quot;:[0],&quot;469777928&quot;:[8]}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"45\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;singleLevel&quot;}\" data-aria-posinset=\"3\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Faster model downloads via cross-region 
catalog<\/span><\/b><span data-contrast=\"auto\">:\u00a0Foundry Local\u00a0now fronts the model catalog with Azure Traffic Manager, routing each user to the best-performing region, so end users see noticeably faster first-run model downloads. No code changes\u00a0required\u00a0\u2014 developers just need to bump\u00a0to\u00a0the v1.2.0 SDK.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559685&quot;:720,&quot;335559739&quot;:200,&quot;335559740&quot;:276,&quot;335559991&quot;:360,&quot;469777462&quot;:[720],&quot;469777927&quot;:[0],&quot;469777928&quot;:[8]}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"45\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;singleLevel&quot;}\" data-aria-posinset=\"4\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Download and EP cancellation across all 5 SDKs:\u00a0<\/span><\/b><span data-contrast=\"auto\">Cancel model and execution-provider downloads from C#, Python, JavaScript, Rust, and C++ using each language&#8217;s native cancellation pattern. 
Try out:\u00a0<\/span><span data-contrast=\"none\">https:\/\/github.com\/microsoft\/Foundry-Local\/blob\/main\/README.md<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559685&quot;:720,&quot;335559739&quot;:200,&quot;335559740&quot;:276,&quot;335559991&quot;:360,&quot;469777462&quot;:[720],&quot;469777927&quot;:[0],&quot;469777928&quot;:[8]}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"45\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;singleLevel&quot;}\" data-aria-posinset=\"5\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Inference cancellation:\u00a0<\/span><\/b><span data-contrast=\"auto\">Cancel in-flight chat completions and transcription sessions cleanly when users move on, without wasted compute or orphaned streams. 
Try out:\u00a0<\/span><span data-contrast=\"none\">https:\/\/github.com\/microsoft\/Foundry-Local\/blob\/main\/README.md<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559685&quot;:720,&quot;335559739&quot;:200,&quot;335559740&quot;:276,&quot;335559991&quot;:360,&quot;469777462&quot;:[720],&quot;469777927&quot;:[0],&quot;469777928&quot;:[8]}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"45\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;singleLevel&quot;}\" data-aria-posinset=\"6\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Per-EP download progress in Python:\u00a0<\/span><\/b><span data-contrast=\"auto\">Surface per-provider download progress in Python instead of a generic spinner. 
Try out:\u00a0<\/span><span data-contrast=\"none\">https:\/\/github.com\/microsoft\/Foundry-Local\/tree\/main\/sdk\/python<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559685&quot;:720,&quot;335559739&quot;:200,&quot;335559740&quot;:276,&quot;335559991&quot;:360,&quot;469777462&quot;:[720],&quot;469777927&quot;:[0],&quot;469777928&quot;:[8]}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"45\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;singleLevel&quot;}\" data-aria-posinset=\"7\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Upgraded to Windows ML (WinML) 2.0:\u00a0<\/span><\/b><span data-contrast=\"auto\">The\u00a0Foundry Local\u00a0WinML\u00a0packages\u00a0now ship\u00a0with\u00a0the latest\u00a0WinML\u00a02.0, removing the previous Windows App SDK runtime dependency and bootstrap step so Python, JavaScript, Rust, and C++ apps get NPU and GPU acceleration with no extra installation or\u00a0initialization\u00a0code. 
Try out:\u00a0<\/span><span data-contrast=\"none\">https:\/\/learn.microsoft.com\/en-us\/windows\/ai\/new-windows-ml\/overview<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559685&quot;:720,&quot;335559739&quot;:200,&quot;335559740&quot;:276,&quot;335559991&quot;:360,&quot;469777462&quot;:[720],&quot;469777927&quot;:[0],&quot;469777928&quot;:[8]}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"45\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;singleLevel&quot;}\" data-aria-posinset=\"8\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">WebGPU\u00a0execution provider for\u00a0WinML:\u00a0<\/span><\/b><span data-contrast=\"auto\">Expand GPU acceleration coverage across more Windows hardware with the new\u00a0WebGPU\u00a0execution provider for the\u00a0WinML\u00a0SDK. Try out:\u00a0<\/span><span data-contrast=\"none\">https:\/\/learn.microsoft.com\/en-us\/windows\/ai\/new-windows-ml\/overview<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559685&quot;:720,&quot;335559739&quot;:200,&quot;335559740&quot;:276,&quot;335559991&quot;:360,&quot;469777462&quot;:[720],&quot;469777927&quot;:[0],&quot;469777928&quot;:[8]}\">\u00a0<\/span><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><span style=\"font-size: 18pt;\"><strong>Foundry Local in action: voice input in GitHub Copilot CLI\u00a0<\/strong><\/span><\/p>\n<p><span data-contrast=\"auto\">The GitHub Copilot CLI&#8217;s voice input is built on Foundry Local. 
When you dictate a prompt in the terminal, audio is captured from your mic, streamed into a Foundry Local live transcription session running the\u00a0Nemotron\u00a0ASR Streaming model, and the partial and final results are piped straight into the CLI&#8217;s input buffer \u2014 all on-device, no cloud hop, no audio leaving the machine.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">To enable it, run\u00a0<\/span><span data-contrast=\"auto\">\/voice on<\/span><span data-contrast=\"auto\">, then speak into your Copilot CLI\u00a0while holding\u00a0Space (or press\u00a0Ctrl+k\u00a0v to toggle):<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><div style=\"width: 1920px;\" class=\"wp-video\"><video class=\"wp-video-shortcode\" id=\"video-2319-1\" width=\"1920\" height=\"1080\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2026\/06\/ghcp-voice.mp4?_=1\" \/><a href=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2026\/06\/ghcp-voice.mp4\">https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2026\/06\/ghcp-voice.mp4<\/a><\/video><\/div><\/p>\n<p><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">There is no private API or custom integration here. The CLI uses the same\u00a0create_live_transcription_session() entry point shown in the snippet above, with the same\u00a0sample_rate\u00a0\/ channels \/ language=&quot;auto&quot; settings, the same append(pcm_bytes) push model, and the same\u00a0get_stream() iterator. 
When you hit Esc mid-utterance, cancellation goes through the new 1.2.0 inference cancellation path.\u00a0If you have the Copilot CLI installed, run a few prompts with voice and look at:<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"47\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"1\" data-aria-level=\"1\"><span data-contrast=\"auto\">End-to-end latency from speech to token \u2014\u00a0that&#8217;s\u00a0your floor for what a\u00a0streaming-ASR\u00a0UX feels like on the user&#8217;s hardware.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"47\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"2\" data-aria-level=\"1\"><span data-contrast=\"auto\">Quality: the\u00a0model reaches a word error rate (WER) of about\u00a08% in our internal testing.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"47\" 
data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"3\" data-aria-level=\"1\"><span data-contrast=\"auto\">Resource usage while transcribing:\u00a0the model uses\u00a0a low single-digit\u00a0percentage of\u00a0CPU.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<p aria-level=\"2\"><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:160,&quot;335559739&quot;:80}\">\u00a0<\/span><\/p>\n<p aria-level=\"2\"><span data-contrast=\"auto\">If the\u00a0behavior\u00a0works for your use case, you can reproduce it in your own app in a few lines using any of the five SDKs \u2014 no extra services to stand up, no per-minute transcription bill.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p aria-level=\"1\"><span style=\"font-size: 24pt;\"><strong>How developers are using Foundry Local\u00a0<\/strong><\/span><\/p>\n<p><span data-contrast=\"auto\">Foundry Local is already being used across privacy-sensitive, performance-sensitive, and hardware-diverse scenarios. 
From local assistants and document workflows to multimodal context collection and enterprise AI pipelines, developers are using it to reduce platform complexity and deliver production-ready AI experiences faster.<\/span><span data-ccp-props=\"{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:276}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">\u00a0<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559740&quot;:276}\">\u00a0<\/span><\/p>\n<p><div style=\"width: 1920px;\" class=\"wp-video\"><video class=\"wp-video-shortcode\" id=\"video-2319-2\" width=\"1920\" height=\"1072\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2026\/06\/FoundryLocal_Blog_updated.mp4?_=2\" \/><a href=\"https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2026\/06\/FoundryLocal_Blog_updated.mp4\">https:\/\/devblogs.microsoft.com\/foundry\/wp-content\/uploads\/sites\/89\/2026\/06\/FoundryLocal_Blog_updated.mp4<\/a><\/video><\/div><\/p>\n<p><span data-contrast=\"auto\">\u00a0<\/span><span data-ccp-props=\"{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:276}\">\u00a0<\/span><\/p>\n<h2><span style=\"font-size: 18pt;\">Privacy-first and secure local AI<\/span><\/h2>\n<p>Across consumer apps and enterprise workflows, developers are using Foundry Local to keep sensitive data closer to the device while delivering faster, more responsive AI experiences.<\/p>\n<p><strong>Foxit PDF Editor AI Assistant<\/strong><\/p>\n<p>Foxit uses Foundry Local to bring secure, local AI into document workflows such as question answering, summarization, translation, and document understanding. 
The result is a more practical path to on-device AI that helps keep sensitive information closer to the user while simplifying deployment at scale.<\/p>\n<p><em>\u201cFoundry Local gives us a practical way to bring powerful AI experiences directly into PDF workflows while keeping sensitive data closer to the user. Just as importantly, its managed local model approach helps simplify deployment, improve reliability, and reduce the operational burden of delivering on-device AI at scale.\u201d &#8211; Queena Wei, SVP of Product at Foxit<\/em><\/p>\n<p>&nbsp;<\/p>\n<p><strong>Raycast<\/strong><\/p>\n<p>Raycast uses Foundry Local to make privacy-first, on-device AI more accessible to end users. By simplifying model discovery and local interaction, it helps bring local AI into everyday workflows with less friction.<\/p>\n<p><em>\u201cThe integration of Foundry Local into Raycast gives our users the perfect option for privacy-first local AI. With it, they can easily leverage a variety of powerful models optimized for their Windows devices. Foundry Local made it super easy for us to implement the first step, a platform to browse and install models and a quick chat interface to use them, no internet required.\u201d &#8211; Thomas Paul Mann, CEO &amp; Founder at Raycast<\/em><\/p>\n<p>&nbsp;<\/p>\n<p><strong>Rakuten<\/strong><\/p>\n<p>Rakuten uses Foundry Local to bring responsive, privacy-sensitive AI experiences directly onto the device while balancing local responsiveness with broader cloud-connected capabilities. The result is a hybrid experience that feels more natural to end users while improving efficiency behind the scenes.<\/p>\n<p><em>&#8220;Through our partnership with HP, Rakuten AI for Desktop uses Foundry Local to bring AI closer to the user \u2014 running responsive, privacy-sensitive experiences directly on the device while reducing cloud inference costs. 
Combined with Rakuten AI\u2019s cloud intelligence and ecosystem integrations, this enables a hybrid AI experience that feels native to the desktop and scales efficiently for more advanced tasks.&#8221; &#8211; Vasanth Raju, Head of AI Product at Rakuten Group<\/em><\/p>\n<p><em>\u00a0<\/em><\/p>\n<p><strong>PhonePe<\/strong><\/p>\n<p>PhonePe uses Foundry Local to power AI-driven transaction insights in its digital payments app with strong data protection. This helps deliver more responsive, privacy-conscious AI experiences without requiring personal financial information to leave the device.<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Liquid AI\u2019s ShieldFlow<\/strong><\/p>\n<p><strong>ShieldFlow<\/strong> is an on-device privacy layer that redacts sensitive data and prevents prompt injection before any prompt leaves the device. Through Foundry Local, ShieldFlow runs efficiently on the CPU of any Windows device, including AI PCs, and enterprises can pull customized Liquid Foundational Models (LFMs) tuned to their own policies and roll them out across their Windows fleet through a single managed runtime.<\/p>\n<p>&nbsp;<\/p>\n<p><em>\u00a0<\/em><\/p>\n<h2>Hardware portability and cross-device optimization<\/h2>\n<p>For teams building across different chips and execution environments, Foundry Local helps reduce hardware-specific complexity and accelerate deployment across devices.<\/p>\n<p><strong>Cephable<\/strong><\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Cephable<\/strong><\/span> is a private AI assistant that runs entirely on device, enabling voice control, dictation, content generation, and task automation across apps. 
With Foundry Local, Cephable\u2019s AI features run faster, support more models across NPU, GPU, and CPU, and let the team focus on building the assistant instead of managing silicon-specific optimizations.<\/p>\n<p><em>&#8220;Since shifting from our custom inferencing implementation to Foundry Local, our engineers have been able to ship core features faster. We&#8217;re saving dozens of hours on optimizing models and managing build pipelines to handle the right acceleration in the right version of our app package. This directly leads to a better user experience and more choice for our users.&#8221; &#8211; Cordellia Yokum, Director and Principal Architect at Cephable<\/em><\/p>\n<p>&nbsp;<\/p>\n<p><strong>FlowyAIPC<\/strong><\/p>\n<p><strong>FlowyAIPC<\/strong> builds an intelligent assistant for the era of heterogeneous AIPC silicon. FlowyAIPC integrates Foundry Local and Windows ML to solve the fundamental challenge of model-hardware decoupling across Intel, AMD, Qualcomm, and NVIDIA chips spanning CPU, NPU, iGPU, and dGPU.<\/p>\n<p><em>\u201cBy leveraging Foundry Local&#8217;s automatic hardware detection and execution-provider abstraction, FlowyAIPC dynamically routes AI workloads to the optimal compute unit without user intervention: lightweight inference and sustained background tasks tap the NPU for power efficiency, while demanding generative workloads seamlessly spill to the GPU or CPU.\u201d &#8211; Guoliang QI, CEO at StarwaveAI<\/em><\/p>\n<p>&nbsp;<\/p>\n<p><strong>AnythingLLM<\/strong><\/p>\n<p>AnythingLLM is a local-first, zero-configuration AI desktop application that allows enterprises to run LLMs completely on-device. 
Instead of maintaining separate runtimes for each hardware configuration, AnythingLLM uses Foundry Local to deliver on-device AI across a broad range of silicon platforms.<\/p>\n<p><em>&#8220;With the rapid pace of AI software, maintaining custom runtimes for every specialized NPU and hardware configuration on the market creates a massive development bottleneck. The Foundry Local SDK helps us solve this by providing optimized, hardware-level, vendor agnostic performance out of the box, allowing us to deliver a consistent and secure local AI experience to our Windows users globally without the engineering overhead.&#8221; &#8211; Timothy Carambat, Founder &amp; CEO at AnythingLLM<\/em><\/p>\n<p>&nbsp;<\/p>\n<p><strong>LUCI Desktop by Memories.ai<\/strong><\/p>\n<p>Memories.ai uses Foundry Local in LUCI Desktop, which provides an on-device context layer for PCs, to run multimodal models efficiently across Qualcomm, Intel, and AMD devices. That portability helps the team scale on-device research and multimodal workflows without extensive per-chip optimization.<\/p>\n<p><em>&#8220;Foundry Local SDK took the silicon-portability problem off our plate \u2014 one SDK, simple APIs, and our multimodal models run efficiently across Qualcomm, Intel, and AMD without weeks of per-chip optimization. 
It lets us scale our on-device research globally on day one and keeps our team focused on the harder problems above the silicon layer.&#8221; &#8211; Shawn Shen, CEO at Memories.ai<\/em><\/p>\n<p>&nbsp;<\/p>\n<p><strong>Model HQ by LLMWare<\/strong><\/p>\n<p>Model HQ enables enterprise teams to build and run RAG pipelines and multi-step agents locally on AI PCs and private servers using a no-code interface.\u00a0By integrating Foundry Local, Model HQ delivers fast, offline-capable AI experiences directly on Windows devices built on chips from AMD, Intel, Qualcomm, and NVIDIA.<\/p>\n<p><em>\u201cThe Foundry Local SDK made it incredibly easy for us to integrate NPU-optimized local AI models directly into Model HQ and rapidly deliver high-performance on-device NPU inferencing with minimal engineering overhead. It has significantly accelerated our ability to fully leverage emerging NPU compute capabilities for fast, efficient, and power-optimized local AI experiences.\u201d &#8211; Darren Oberst, Co-Founder at LLMWare<\/em><\/p>\n<p>&nbsp;<\/p>\n<p>Taken together, these customer stories show what Foundry Local means for developers in practice: fewer runtime and hardware-specific hurdles, faster paths from prototype to production, and more control over how AI runs on real devices. 
Whether you\u2019re building privacy-sensitive apps, deploying across diverse silicon, or operationalizing local RAG and agent workflows, Foundry Local helps you spend less time stitching infrastructure together and more time shipping experiences that work.<\/p>\n<p>&nbsp;<\/p>\n<p aria-level=\"2\"><span style=\"font-size: 24pt;\"><strong>Foundry Local on Azure Local\u00a0<\/strong><\/span><\/p>\n<p><span data-contrast=\"auto\">At Build,\u00a0we\u2019re\u00a0also introducing\u00a0<\/span><b><span data-contrast=\"auto\">Foundry Local on Azure Local<\/span><\/b><span data-contrast=\"auto\">\u00a0in preview: a new on-premises AI platform for running models, agents, and tools at enterprise scale.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Designed for organizations that\u00a0seek\u00a0control, compliance, and low-latency execution, Foundry Local on Azure Local runs as containerized Kubernetes workloads on Azure Local and is orchestrated through Azure Arc. 
It helps teams deploy consistently across edge, hybrid, and fully disconnected environments while keeping AI close to the data and operations that depend on it.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Here are some of the key preview capabilities announced today:<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"2\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"1\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Model Catalog on Azure Local<\/span><\/b><span data-contrast=\"auto\">\u00a0&#8211;\u00a0Run and swap\u00a0models\u00a0and custom models locally through one API across prebuilt ONNX and\u00a0vLLM\u00a0inference, from single-node to multi-node deployments.\u00a0<\/span><a href=\"https:\/\/aka.ms\/FoundryLoca_Techcommunity_Build_blog\"><span data-contrast=\"none\">https:\/\/aka.ms\/FoundryLoca_Techcommunity_Build_blog<\/span><\/a><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<p><span data-contrast=\"none\">Register to get access to Foundry Local on Azure Local preview:\u00a0<\/span><a href=\"https:\/\/aka.ms\/FoundryLocalAzure_PreviewRequest\"><span 
data-contrast=\"none\">https:\/\/aka.ms\/FoundryLocalAzure_PreviewRequest<\/span><\/a><span data-contrast=\"none\">\u00a0<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559685&quot;:720,&quot;335559739&quot;:0,&quot;335559740&quot;:300}\">\u00a0<\/span><\/p>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"2\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"2\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Agentic Retrieval in Foundry Local<\/span><\/b><span data-contrast=\"auto\">\u00a0&#8211;\u00a0Ground agents in enterprise data with built-in multi-step agentic RAG and a local chat experience.\u00a0<\/span><a href=\"https:\/\/aka.ms\/AgentsAndToolsBuildBlog2026\"><span data-contrast=\"none\">https:\/\/aka.ms\/AgentsAndToolsBuildBlog2026<\/span><\/a><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"2\" 
data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"3\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Custom MCP tools<\/span><\/b><span data-contrast=\"auto\">\u00a0&#8211;\u00a0Extend agents with custom tool servers using the Model Context Protocol (MCP) standard.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"2\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"4\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Solution templates for Azure Local<\/span><\/b><span data-contrast=\"auto\">\u00a0&#8211;\u00a0Start faster with code samples for chat interfaces on Azure Local and video agents powered by Azure AI Video Indexer on Azure Local.\u00a0<\/span><a href=\"https:\/\/aka.ms\/foundry-local-model-catalog-blog\"><span 
data-contrast=\"none\">https:\/\/aka.ms\/foundry-local-model-catalog-blog<\/span><\/a><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"2\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"5\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">GitHub Enterprise Local<\/span><\/b><span data-contrast=\"auto\">\u00a0&#8211;\u00a0Build and deploy AI apps end to end on-premises with local repos, CI\/CD pipelines, and integrated security scanning.\u00a0<\/span><a href=\"https:\/\/aka.ms\/GHEL\"><span data-contrast=\"none\">https:\/\/aka.ms\/GHEL<\/span><\/a><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"2\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"6\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Azure Local for small form factor devices<\/span><\/b><span data-contrast=\"auto\">\u00a0&#8211;\u00a0Extend Azure Local to industrial PCs and ruggedized devices for manufacturing and retail edge deployments, with turnkey AI inference and Azure Arc-based device management.\u00a0<\/span><a href=\"https:\/\/aka.ms\/AzureSFF\"><span data-contrast=\"none\">https:\/\/aka.ms\/AzureSFF<\/span><\/a><span 
data-contrast=\"auto\">\u00a0<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<li aria-setsize=\"-1\" data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"2\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"6\" data-aria-level=\"1\"><strong>Watch the demo<\/strong> &#8211; <a href=\"http:\/\/aka.ms\/AzureSFFLaunchDemo\">aka.ms\/AzureSFFLaunchDemo<\/a><\/li>\n<\/ul>\n<p><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:300}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Early momentum is already visible across sovereign, industrial, and disconnected scenarios where organizations need AI to run reliably under strict operational and\u00a0compliance\u00a0constraints.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><i><span data-contrast=\"none\">&#8220;In energy operations, AI needs to run where the work happens \u2013 at remote facilities, offshore platforms, and field locations where connectivity is often limited, and safety is paramount.\u00a0Foundry Local\u00a0on Azure Local\u00a0gives us a path to bring AI-driven decision-making closer to our operational data, with the\u00a0governance\u00a0our industry demands. 
The ability to deploy and run AI workloads consistently across edge and field environments, even when disconnected, is critical as we advance\u00a0Chevron&#8217;s\u00a0vision for autonomous and intelligent operations.&#8221;\u00a0<\/span><\/i><span data-contrast=\"none\">&#8211; Ed Moore, OT Strategist and Distinguished Engineer at Chevron<\/span><span data-ccp-props=\"{&quot;335557856&quot;:16777215,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335557856&quot;:16777215,&quot;335559685&quot;:360,&quot;335559739&quot;:0,&quot;335559740&quot;:300,&quot;335559991&quot;:360}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Together, these capabilities help organizations support both sovereign AI requirements, such as data control and\u00a0compliance, and industrial edge scenarios that depend on real-time, localized execution.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p>&nbsp;<\/p>\n<p aria-level=\"1\"><span style=\"font-size: 24pt;\"><strong>Get started\u00a0<\/strong><\/span><\/p>\n<p><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">If you want to start building with Foundry Local, begin with the\u00a0<a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/foundry-local\/\">documentation<\/a> and <a href=\"https:\/\/aka.ms\/edgeai-for-beginners\">Edge AI for Beginners<\/a>, explore the available samples, and test local inference in your own application workflow. 
From there, you can evaluate the right model, runtime, and hardware path for your scenario, whether you\u2019re building for AI PCs, enterprise apps, edge devices, or disconnected environments.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">If you\u2019re following Microsoft Build 2026, these related sessions can help you go deeper into the announcements and developer scenarios supported by these releases:<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\u00b7\" data-font=\"Symbol\" data-listid=\"38\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\u00b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"1\" data-aria-level=\"1\"><strong>BRK260 &#8211; <\/strong>\u00a0<a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/BRK260?source=sessions\">Build Apps w\/ Local AI for Unmetered Intelligence on every Windows PC<\/a><\/li>\n<li aria-setsize=\"-1\" data-leveltext=\"\u00b7\" data-font=\"Symbol\" data-listid=\"38\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\u00b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"1\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">OD833 <\/span><\/b><span data-contrast=\"auto\">&#8211; \u00a0<\/span><a href=\"https:\/\/github.com\/microsoft\/Build26-OD833-deploy-ai-offline-creating-apps-with-foundry-local\"><span data-contrast=\"none\">Deploy AI offline: Creating apps with Foundry Local<\/span><\/a><\/li>\n<\/ul>\n<ul>\n<li 
aria-setsize=\"-1\" data-leveltext=\"\u00b7\" data-font=\"Symbol\" data-listid=\"38\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\u00b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"2\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">OD837 <\/span><\/b><span data-contrast=\"auto\">&#8211; <\/span><span data-contrast=\"auto\">\u00a0<\/span><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/OD837?source=sessions\"><span data-contrast=\"none\">Build and deploy AI at the edge for real-world impact<\/span><\/a><span data-contrast=\"auto\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li aria-setsize=\"-1\" data-leveltext=\"\u00b7\" data-font=\"Symbol\" data-listid=\"38\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\u00b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"3\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">OD839 <\/span><\/b><span data-contrast=\"auto\">&#8211; <\/span><a href=\"https:\/\/build.microsoft.com\/en-US\/sessions\/OD839?source=sessions\"><span data-contrast=\"none\">AI solutions built to power industrial innovation and sovereign control<\/span><\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Why edge AI development is still hard\u00a0 AI is no longer confined to cloud experiments. Developers are increasingly expected to deliver AI inside apps, devices, and edge systems where responsiveness, privacy, resilience, and local control are essential. 
But building those experiences for production is still difficult.\u00a0 Teams often\u00a0have to\u00a0solve model packaging, runtime fragmentation, hardware differences, [&hellip;]<\/p>\n","protected":false},"author":189734,"featured_media":2678,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[49,37,163,1],"tags":[25,3,10,16,38,35,34,28],"class_list":["post-2319","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-aiagent","category-foundrylocal","category-microsoft-build","category-microsoft-foundry","tag-agents","tag-ai-development","tag-ai-agents","tag-ai-applications","tag-foundry-local","tag-local-ai","tag-microsoft-build","tag-whats-new"],"acf":[],"blog_post_summary":"<p>Why edge AI development is still hard\u00a0 AI is no longer confined to cloud experiments. Developers are increasingly expected to deliver AI inside apps, devices, and edge systems where responsiveness, privacy, resilience, and local control are essential. 
But building those experiences for production is still difficult.\u00a0 Teams often\u00a0have to\u00a0solve model packaging, runtime fragmentation, hardware differences, [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts\/2319","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/users\/189734"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/comments?post=2319"}],"version-history":[{"count":2,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts\/2319\/revisions"}],"predecessor-version":[{"id":2701,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/posts\/2319\/revisions\/2701"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/media\/2678"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/media?parent=2319"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/categories?post=2319"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/foundry\/wp-json\/wp\/v2\/tags?post=2319"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}