September 23rd, 2025
0 reactions

Foundry Local Meets More Silicon

Foundry Local is a high-performance local AI runtime stack that brings Azure AI Foundry’s power to client devices. Foundry Local lets you build and ship cross-platform AI apps that run models with acceleration on a wide range of hardware.

The Evolution of AI Acceleration

On-device AI has progressed rapidly. Early workloads ran on CPUs, but performance and power limits made real-time inference difficult. GPUs improved things with parallelism, faster inference. The latest breakthrough is NPUs (Neural Processing Units), designed specifically for neural networks. NPUs deliver far greater efficiency and throughput, making advanced models practical on desktop and mobile applications. This shift enables AI that is faster, more energy-efficient, and privacy-preserving — without constant reliance on the cloud.

What’s New

With the release of Foundry Local v0.7, we are pleased to announce expanded NPU support, now including Intel and AMD NPUs on Windows 11. We are also continuing to make Foundry Local improvements for NVIDIA and Qualcomm silicon, in partnership with our silicon partners. Furthermore, leveraging Windows ML, Foundry Local automatically detects users’ silicon across NPU, GPU, and CPU and downloads the appropriate execution providers, removing the necessity for these to be bundled with Foundry Local.

Example screenshot

 

Get Started

 

On Windows

  1. Open Windows Terminal
  2. Install Foundry Local using winget
    winget install Microsoft.FoundryLocal
  3. Run a model
    foundry model run qwen2.5-0.5b

 

On MacOS

  1. Open Terminal
  2. Install Foundry Local
    brew tap microsoft/foundrylocal
    brew install foundrylocal
  3. Run a model
    foundry model run qwen2.5-0.5b

 

For more information, check out Foundry Local documentation and samples here
For developers interested in bringing their own models to Windows, check out Windows ML now available in general availability.

 

 

 

 

Author

0 comments