{"id":232297,"date":"2025-06-17T11:46:40","date_gmt":"2025-06-17T18:46:40","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/java\/?p=232297"},"modified":"2025-06-17T11:46:40","modified_gmt":"2025-06-17T18:46:40","slug":"connect-spring-ai-to-local-ai-models-with-foundry-local","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/java\/connect-spring-ai-to-local-ai-models-with-foundry-local\/","title":{"rendered":"Connect Spring AI to Local AI Models with Foundry Local"},"content":{"rendered":"<h2>What is Azure AI Foundry and Foundry Local?<\/h2>\n<p>Azure AI Foundry is Microsoft&#8217;s comprehensive platform for enterprise AI development and deployment,\nenabling organizations to build, customize, and operate AI solutions at scale. It provides tools,\nservices, and infrastructure to develop, fine-tune and deploy AI models in production environments\nwith enterprise-grade security and compliance.<\/p>\n<p>Foundry Local is the desktop companion to Azure AI Foundry that brings powerful AI model inference\nto your local machine. It&#8217;s an open-source tool that provides an OpenAI-compatible API, allowing\ndevelopers to run, test, and integrate large language models directly on their own hardware without\nsending data to the cloud. This makes it ideal for development, testing, and scenarios requiring\ndata privacy or offline operation.<\/p>\n<h2>Why Use Foundry Local with Spring AI?<\/h2>\n<p>Running AI models locally has become increasingly valuable for Java developers who want to reduce latency,\navoid network dependency, and eliminate API cost overhead. Foundry Local makes this seamless by offering\nan OpenAI-compatible API, allowing Spring AI to interact with local models as if they were hosted in the cloud.<\/p>\n<p>In this guide, we\u2019ll walk you through setting up Foundry Local, integrating it with a Spring Boot project using\nSpring AI, and building a simple REST API to interact with the AI model for chat and summarization tasks. 
This\nenables powerful local inference with models like Phi-3.5, Qwen, or DeepSeek, using standard Spring idioms.<\/p>\n<h2>Quick Setup<\/h2>\n<h3>Step 1: Install Foundry Local<\/h3>\n<p>On macOS, Foundry Local can be installed via Homebrew:<\/p>\n<pre>$ brew tap microsoft\/foundrylocal\r\n$ brew install foundrylocal\r\n$ foundry --help<\/pre>\n<p>This sets up the CLI tools necessary to run and manage language models locally.<\/p>\n<h3>Step 2: Download and Start a Model<\/h3>\n<p>Before interacting with any model, you need to list available models, download the one you want,\nand load it into memory:<\/p>\n<pre># Discover available models\r\n$ foundry model list\r\n\r\n# Download the Phi-3.5 mini model\r\n$ foundry model download phi-3.5-mini\r\n\r\n# Load it into memory for serving\r\n$ foundry model load phi-3.5-mini\r\n\r\n# Check model and service status\r\n$ foundry service status<\/pre>\n<p>Once loaded, the model will be available through a local HTTP endpoint using an OpenAI-compatible interface.<\/p>\n<h2>Create a Spring Boot Project<\/h2>\n<p>You can use Spring Initializr to quickly generate a project. Run the following cURL command to download a starter project with the required dependencies:<\/p>\n<pre>curl -G https:\/\/start.spring.io\/starter.zip \\\r\n    -d javaVersion=21 \\\r\n    -d baseDir=demo \\\r\n    -d language=java \\\r\n    -d bootVersion=3.5.0 \\\r\n    -d type=maven-project \\\r\n    -d dependencies=web,spring-ai-openai \\\r\n    -o demo.zip<\/pre>\n<p>Now unzip the project and open it in Visual Studio Code:<\/p>\n<pre>unzip demo.zip\r\ncode demo\/<\/pre>\n<p>If you\u2019re adding Spring AI to an existing project, use the following Maven dependencies:<\/p>\n<pre>&lt;dependencies&gt;\r\n    &lt;dependency&gt;\r\n        &lt;groupId&gt;org.springframework.boot&lt;\/groupId&gt;\r\n        &lt;artifactId&gt;spring-boot-starter-web&lt;\/artifactId&gt;\r\n    &lt;\/dependency&gt;\r\n    
&lt;dependency&gt;\r\n        &lt;groupId&gt;org.springframework.ai&lt;\/groupId&gt;\r\n        &lt;artifactId&gt;spring-ai-starter-model-openai&lt;\/artifactId&gt;\r\n        &lt;version&gt;1.0.0&lt;\/version&gt;\r\n    &lt;\/dependency&gt;\r\n&lt;\/dependencies&gt;\r\n<\/pre>\n<p>These dependencies enable the REST interface and Spring AI\u2019s OpenAI integration capabilities.<\/p>\n<h2>Connect Spring AI to Foundry Local<\/h2>\n<h3>Step 1: Configure Properties<\/h3>\n<p>Update <code>application.properties<\/code> to point Spring AI to your locally running Foundry instance:<\/p>\n<pre># Azure AI Foundry Local Configuration\r\nspring.ai.openai.api-key=not-used\r\nspring.ai.openai.base-url=http:\/\/localhost:8081\r\nspring.ai.openai.chat.options.model=Phi-3.5-mini-instruct-generic-gpu\r\n\r\n# Server Configuration\r\nserver.port=8080<\/pre>\n<p>Spring AI requires the API key property to be set, but Foundry Local ignores it, so any placeholder value works.<\/p>\n<h3>Step 2: Configuration Class<\/h3>\n<p>This configuration class sets up the OpenAI-compatible API client and initializes the chat model with your local settings:<\/p>\n<pre>@Configuration\r\npublic class FoundryLocalConfig {\r\n\r\n    @Value(\"${spring.ai.openai.base-url}\") private String baseUrl;\r\n    @Value(\"${spring.ai.openai.api-key}\") private String apiKey;\r\n    @Value(\"${spring.ai.openai.chat.options.model}\") private String modelName;\r\n\r\n    @Bean\r\n    public OpenAiApi openAiApi() {\r\n        return OpenAiApi.builder().baseUrl(baseUrl).apiKey(apiKey).build();\r\n    }\r\n\r\n    @Bean\r\n    public OpenAiChatModel chatModel(OpenAiApi openAiApi) {\r\n        OpenAiChatOptions options = OpenAiChatOptions.builder()\r\n                .model(modelName)\r\n                .temperature(0.7)\r\n                .build();\r\n        return OpenAiChatModel.builder().openAiApi(openAiApi).defaultOptions(options).build();\r\n    }\r\n}<\/pre>\n<p>This is the bridge that connects Spring AI\u2019s abstraction to the local model served by Foundry.<\/p>\n<h3>Step 3: Implement a Service Layer<\/h3>\n<p>Create a service class to interact with the chat model. This encapsulates the logic for calling the model:<\/p>\n<pre>@Service\r\npublic class AIService {\r\n\r\n    private final OpenAiChatModel chatModel;\r\n\r\n    public AIService(OpenAiChatModel chatModel) {\r\n        this.chatModel = chatModel;\r\n    }\r\n\r\n    public String chat(String message) {\r\n        try {\r\n            return chatModel.call(message);\r\n        } catch (Exception e) {\r\n            throw new RuntimeException(\"Error calling AI model: \" + e.getMessage(), e);\r\n        }\r\n    }\r\n\r\n    public String summarizeText(String text) {\r\n        String prompt = \"Please provide a concise summary of the following text:\\n\\n\" + text;\r\n        return chatModel.call(prompt);\r\n    }\r\n}<\/pre>\n<p>The <code>chat()<\/code> and <code>summarizeText()<\/code> methods abstract out the logic so your controller stays clean.<\/p>\n<h3>Step 4: Create REST Endpoints<\/h3>\n<p>Now expose those methods via HTTP endpoints:<\/p>\n<pre>@RestController\r\n@RequestMapping(\"\/api\/ai\")\r\npublic class AIController {\r\n\r\n    private final AIService aiService;\r\n\r\n    public AIController(AIService aiService) {\r\n        this.aiService = aiService;\r\n    }\r\n\r\n    @PostMapping(\"\/chat\")\r\n    public ResponseEntity&lt;Map&lt;String, String&gt;&gt; chat(@RequestBody Map&lt;String, String&gt; request) {\r\n        String message = request.get(\"message\");\r\n        if (message == null || message.trim().isEmpty()) {\r\n            return ResponseEntity.badRequest().body(Map.of(\"error\", \"Message is required\"));\r\n        }\r\n\r\n        String response = aiService.chat(message);\r\n        return ResponseEntity.ok(Map.of(\"response\", response));\r\n    }\r\n\r\n    @PostMapping(\"\/summarize\")\r\n    public ResponseEntity&lt;Map&lt;String, String&gt;&gt; summarize(@RequestBody Map&lt;String, String&gt; request) {\r\n        String text = request.get(\"text\");\r\n        if (text == null ||
text.trim().isEmpty()) {\r\n            return ResponseEntity.badRequest().body(Map.of(\"error\", \"Text is required\"));\r\n        }\r\n\r\n        String summary = aiService.summarizeText(text);\r\n        return ResponseEntity.ok(Map.of(\"summary\", summary));\r\n    }\r\n\r\n    @GetMapping(\"\/health\")\r\n    public ResponseEntity&lt;Map&lt;String, String&gt;&gt; health() {\r\n        try {\r\n            aiService.chat(\"Hello\");\r\n            return ResponseEntity.ok(Map.of(\"status\", \"healthy\"));\r\n        } catch (Exception e) {\r\n            return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE)\r\n                .body(Map.of(\"status\", \"unhealthy\", \"error\", e.getMessage()));\r\n        }\r\n    }\r\n}\r\n<\/pre>\n<p>With these endpoints, your app becomes a simple gateway for local AI-powered inference.<\/p>\n<h2>Running the Application<\/h2>\n<h3>Step 1: Start Foundry Local<\/h3>\n<p>Configure Foundry to serve the model on port 8081:<\/p>\n<pre>foundry service set --port 8081\r\nfoundry service start\r\nfoundry model load phi-3.5-mini<\/pre>\n<p>Make sure this service is running and healthy before starting your Spring Boot app.<\/p>\n<h3>Step 2: Start Spring Boot<\/h3>\n<p>Use Maven to build and run the app:<\/p>\n<pre>mvn clean compile\r\nmvn spring-boot:run<\/pre>\n<h3>Step 3: Test Your Endpoints<\/h3>\n<p>Perform a health check:<\/p>\n<pre>curl http:\/\/localhost:8080\/api\/ai\/health<\/pre>\n<p>Send a chat message:<\/p>\n<pre>curl -X POST http:\/\/localhost:8080\/api\/ai\/chat \\\r\n  -H \"Content-Type: application\/json\" \\\r\n  -d '{\"message\": \"What is Spring AI?\"}'<\/pre>\n<p>Summarize a block of text:<\/p>\n<pre>curl -X POST http:\/\/localhost:8080\/api\/ai\/summarize \\\r\n  -H \"Content-Type: application\/json\" \\\r\n  -d '{\"text\": \"Your long text here...\"}'<\/pre>\n<h2>Conclusion<\/h2>\n<p>In this post, we demonstrated how to build a fully local AI-powered application using Spring AI and Foundry
Local.<\/p>\n<p>Thanks to Foundry\u2019s compatibility with OpenAI\u2019s APIs, Spring AI required no special integration work\u2014we simply changed the base URL to localhost. This local-first approach has multiple advantages:<\/p>\n<ul>\n<li>Privacy &amp; Control: No data leaves your machine<\/li>\n<li>Performance: Lower latency compared to cloud APIs<\/li>\n<li>Cost: No API billing or token limits<\/li>\n<li>Simplicity: Familiar Spring configuration and idioms<\/li>\n<\/ul>\n<h2>References<\/h2>\n<ul style=\"list-style-type: square;\">\n<li><a href=\"https:\/\/learn.microsoft.com\/azure\/ai-foundry\/foundry-local\">Foundry Local<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/microsoft\/Foundry-Local\">Foundry Local &#8211; GitHub Repository<\/a><\/li>\n<li><a href=\"https:\/\/docs.spring.io\/spring-ai\/reference\/\">Spring AI Documentation<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>What is Azure AI Foundry and Foundry Local? Azure AI Foundry is Microsoft&#8217;s comprehensive platform for enterprise AI development and deployment, enabling organizations to build, customize, and operate AI solutions at scale. It provides tools, services, and infrastructure to develop, fine-tune and deploy AI models in production environments with enterprise-grade security and compliance. Foundry Local [&hellip;]<\/p>\n","protected":false},"author":192528,"featured_media":232302,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[803,1,15,17],"tags":[],"class_list":["post-232297","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-intelligent-apps","category-java","category-vscode","category-web"],"acf":[],"blog_post_summary":"<p>What is Azure AI Foundry and Foundry Local? 
Azure AI Foundry is Microsoft&#8217;s comprehensive platform for enterprise AI development and deployment, enabling organizations to build, customize, and operate AI solutions at scale. It provides tools, services, and infrastructure to develop, fine-tune and deploy AI models in production environments with enterprise-grade security and compliance. Foundry Local [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/posts\/232297","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/users\/192528"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/comments?post=232297"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/posts\/232297\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/media\/232302"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/media?parent=232297"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/categories?post=232297"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/java\/wp-json\/wp\/v2\/tags?post=232297"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}