{"id":9205,"date":"2024-12-05T07:00:11","date_gmt":"2024-12-05T15:00:11","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/cosmosdb\/?p=9205"},"modified":"2024-12-05T13:25:54","modified_gmt":"2024-12-05T21:25:54","slug":"announcing-azure-cosmos-db-integration-with-spring-ai-and-langchain4j","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/cosmosdb\/announcing-azure-cosmos-db-integration-with-spring-ai-and-langchain4j\/","title":{"rendered":"Announcing Azure Cosmos DB Integration with Spring AI and LangChain4j!"},"content":{"rendered":"<p>We\u2019re continuing to simplify AI app development by integrating <strong>Azure Cosmos DB\u2019s<\/strong> cost-effective and scalable vector search directly with <a href=\"https:\/\/spring.io\/projects\/spring-ai\" target=\"_blank\" rel=\"noopener\"><strong>Spring AI<\/strong><\/a> and <a href=\"https:\/\/docs.langchain4j.dev\/\" target=\"_blank\" rel=\"noopener\"><strong>LangChain4j<\/strong><\/a>! These frameworks let Java developers build fast and accurate data retrieval on top of Azure Cosmos DB\u2019s scalable <a href=\"https:\/\/learn.microsoft.com\/azure\/cosmos-db\/nosql\/vector-search\" target=\"_blank\" rel=\"noopener\">vector indexing and search<\/a> capabilities, which are designed to handle high-dimensional vectors efficiently. By using Azure Cosmos DB\u2019s vector store, developers can keep vectors directly within their documents and run vector search queries against them through Spring AI or LangChain4j.<\/p>\n<h3>Spring AI vs LangChain4j<\/h3>\n<p><a href=\"https:\/\/docs.langchain4j.dev\/\" target=\"_blank\" rel=\"noopener\">LangChain4j<\/a> and <a href=\"https:\/\/docs.spring.io\/spring-ai\/reference\/index.html\" target=\"_blank\" rel=\"noopener\">Spring AI<\/a> are both Java frameworks for integrating AI capabilities into applications, but they differ slightly in focus and approach. 
Both are designed to work with large language models (LLMs), and both emphasize modularity for constructing complex AI workflows such as chaining prompts, handling memory, and managing vector stores. <strong>Spring AI<\/strong> is also an extension of the Spring ecosystem, so it brings along <a href=\"https:\/\/spring.io\/projects\/spring-boot\" target=\"_blank\" rel=\"noopener\">Spring Boot\u2019s<\/a> familiar configuration model. While LangChain4j excels in flexibility and advanced LLM-specific tools, Spring AI leverages <a href=\"https:\/\/spring.io\/projects\/spring-framework\" target=\"_blank\" rel=\"noopener\">Spring&#8217;s robust framework<\/a> ecosystem for quick, standardized integration, making it ideal for developers already invested in Spring-based projects, including those that use Azure Cosmos DB&#8217;s existing <a href=\"https:\/\/learn.microsoft.com\/java\/api\/overview\/azure\/spring-data-cosmos-readme?view=azure-java-stable\" target=\"_blank\" rel=\"noopener\">Spring Data module<\/a>.<\/p>\n<h3>Easy AI app development with Spring AI<\/h3>\n<p>The examples below are taken from the full sample <a href=\"https:\/\/github.com\/Azure-Samples\/cosmosdb-spring-ai-sample\" target=\"_blank\" rel=\"noopener\">here<\/a>. 
Creating an AI Chat Bot with <a href=\"https:\/\/learn.microsoft.com\/azure\/cosmos-db\/gen-ai\/rag\" target=\"_blank\" rel=\"noopener\">Retrieval Augmented Generation (RAG)<\/a> using your own private data is easy in Spring AI once you have imported the required libraries:<\/p>\n<pre class=\"prettyprint language-xml\"><code class=\"language-xml\">&lt;dependencies&gt;\r\n\t&lt;dependency&gt;\r\n\t\t&lt;groupId&gt;org.springframework.ai&lt;\/groupId&gt;\r\n\t\t&lt;artifactId&gt;spring-ai-core&lt;\/artifactId&gt;\r\n\t\t&lt;version&gt;1.0.0-M4&lt;\/version&gt;\r\n\t&lt;\/dependency&gt;\r\n\t&lt;dependency&gt;\r\n\t\t&lt;groupId&gt;org.springframework.ai&lt;\/groupId&gt;\r\n\t\t&lt;artifactId&gt;spring-ai-azure-cosmos-db-store-spring-boot-starter&lt;\/artifactId&gt;\r\n\t\t&lt;version&gt;1.0.0-M4&lt;\/version&gt;\r\n\t&lt;\/dependency&gt;\r\n\t&lt;dependency&gt;\r\n\t\t&lt;groupId&gt;org.springframework.ai&lt;\/groupId&gt;\r\n\t\t&lt;artifactId&gt;spring-ai-spring-boot-autoconfigure&lt;\/artifactId&gt;\r\n\t\t&lt;version&gt;1.0.0-M4&lt;\/version&gt;\r\n\t\t&lt;scope&gt;compile&lt;\/scope&gt;\r\n\t&lt;\/dependency&gt;\r\n\t&lt;dependency&gt;\r\n\t\t&lt;groupId&gt;org.springframework.ai&lt;\/groupId&gt;\r\n\t\t&lt;artifactId&gt;spring-ai-azure-openai&lt;\/artifactId&gt;\r\n\t\t&lt;version&gt;1.0.0-M4&lt;\/version&gt;\r\n\t\t&lt;scope&gt;compile&lt;\/scope&gt;\r\n\t&lt;\/dependency&gt;\r\n&lt;\/dependencies&gt;<\/code><\/pre>\n<p>You can then create a configuration class for Spring AI&#8217;s ChatClient:<\/p>\n<pre class=\"prettyprint language-java\"><code class=\"language-java\">package com.microsoft.azure.spring.chatgpt.sample.webapi;\r\n\r\nimport org.springframework.context.annotation.Bean;\r\nimport org.springframework.context.annotation.Configuration;\r\nimport org.springframework.ai.chat.client.ChatClient;\r\n\r\n@Configuration\r\npublic class ChatClientConfiguration {\r\n\r\n    private final ChatClient.Builder chatClientBuilder;\r\n\r\n    public 
ChatClientConfiguration(ChatClient.Builder chatClientBuilder) {\r\n        this.chatClientBuilder = chatClientBuilder;\r\n    }\r\n\r\n    @Bean\r\n    public ChatClient chatClient() {\r\n        return chatClientBuilder.build();\r\n    }\r\n}<\/code><\/pre>\n<p>Next, create a custom message class that implements Spring AI&#8217;s Message interface:<\/p>\n<pre class=\"prettyprint language-java\"><code class=\"language-java\">package com.microsoft.azure.spring.chatgpt.sample.webapi.models;\r\n\r\nimport org.springframework.ai.chat.messages.Message;\r\nimport org.springframework.ai.chat.messages.MessageType;\r\n\r\nimport java.util.Map;\r\n\r\n\/\/ Minimal Message implementation that carries incoming chat messages to the ChatClient.\r\npublic class ChatMessage implements Message {\r\n    \/\/ Role sent by the caller; stored, but not mapped to a MessageType below.\r\n    private final String role;\r\n    private final String content;\r\n\r\n    public ChatMessage(String role, String content) {\r\n        this.role = role;\r\n        this.content = content;\r\n    }\r\n\r\n    @Override\r\n    public String getContent() {\r\n        return content;\r\n    }\r\n\r\n    @Override\r\n    public Map&lt;String, Object&gt; getMetadata() {\r\n        return Map.of();\r\n    }\r\n\r\n    \/\/ Every message is reported as a user message, regardless of the stored role.\r\n    @Override\r\n    public MessageType getMessageType() {\r\n        return MessageType.USER;\r\n    }\r\n}<\/code><\/pre>\n<p>Then, add the vector store and Azure OpenAI settings to your application.properties file, where Spring Boot&#8217;s auto-configuration picks them up:<\/p>\n<pre class=\"prettyprint language-default\"><code 
class=\"language-default\">spring.ai.vectorstore.cosmosdb.databaseName=cosmosdb\r\nspring.ai.vectorstore.cosmosdb.containerName=vectorstore\r\nspring.ai.vectorstore.cosmosdb.partitionKeyPath=\/id\r\nspring.ai.vectorstore.cosmosdb.metadataFields=metadata\r\nspring.ai.vectorstore.cosmosdb.vectorStoreThoughput=1000\r\nspring.ai.vectorstore.cosmosdb.endpoint=${COSMOSDB_AI_ENDPOINT}\r\nspring.ai.vectorstore.cosmosdb.key=${COSMOSDB_AI_KEY}\r\nspring.ai.azure.openai.api-key=${AZURE_OPENAI_APIKEY}\r\nspring.ai.azure.openai.endpoint=${AZURE_OPENAI_ENDPOINT}\r\nspring.ai.azure.openai.embedding.options.deployment-name=text-embedding-ada-002<\/code><\/pre>\n<p>Finally, you can autowire VectorStore and ChatClient, and then do chat completion with RAG in one simple method within your controller. Chat completion is handled by your chatClient, and RAG is handled by calling the QuestionAnswerAdvisor on the chatClient. Here, we also add a custom method to transform the incoming messages into the expected ChatMessage format we defined above. 
Find out more about the powerful <a href=\"https:\/\/docs.spring.io\/spring-ai\/reference\/api\/chatclient.html\" target=\"_blank\" rel=\"noopener\">Spring AI Chat Client API<\/a> here.<\/p>\n<pre class=\"prettyprint language-java\"><code class=\"language-java\">package com.microsoft.azure.spring.chatgpt.sample.webapi.controllers;\r\n\r\nimport com.fasterxml.jackson.core.JsonProcessingException;\r\nimport com.fasterxml.jackson.databind.ObjectMapper;\r\nimport com.microsoft.azure.spring.chatgpt.sample.webapi.models.ChatMessage;\r\nimport io.micrometer.observation.ObservationRegistry;\r\nimport org.springframework.ai.chat.client.ChatClient;\r\nimport org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor;\r\nimport org.springframework.ai.chat.messages.Message;\r\nimport org.springframework.ai.vectorstore.SearchRequest;\r\nimport org.springframework.ai.vectorstore.VectorStore;\r\nimport org.springframework.beans.factory.annotation.Autowired;\r\nimport org.springframework.context.annotation.Bean;\r\nimport org.springframework.context.annotation.Lazy;\r\nimport org.springframework.web.bind.annotation.PostMapping;\r\nimport org.springframework.web.bind.annotation.RequestBody;\r\nimport org.springframework.web.bind.annotation.RequestMapping;\r\nimport org.springframework.web.bind.annotation.RestController;\r\nimport java.util.List;\r\nimport java.util.stream.Collectors;\r\nimport java.util.stream.StreamSupport;\r\n\r\n@RestController\r\n@RequestMapping(\"\/chat\")\r\npublic class ChatController {\r\n\r\n    private static final String prompt = \"\"\"\r\n            Context information is below.\r\n            ---------------------\r\n            %s\r\n            ---------------------\r\n            Given the context information and not prior knowledge, answer the question: %s\r\n            \"\"\";\r\n\r\n    @Autowired\r\n    @Lazy\r\n    private VectorStore vectorStore;\r\n\r\n    @Autowired\r\n    @Lazy\r\n    private ChatClient chatClient;\r\n\r\n    
@PostMapping(\"\/completions\")\r\n    public String generation(@RequestBody String userInput) throws JsonProcessingException {\r\n        List&lt;Message&gt; messages = getMessages(userInput);\r\n        String response = this.chatClient.prompt(prompt)\r\n                .user(messages.get(0).getContent())\r\n                .messages(messages)\r\n                .advisors(new QuestionAnswerAdvisor(vectorStore, SearchRequest.query(messages.get(0).getContent()).withTopK(5)))\r\n                .call()\r\n                .content();\r\n        return response;\r\n    }\r\n\r\n    @Bean\r\n    public ObservationRegistry observationRegistry() {\r\n        return ObservationRegistry.create();\r\n    }\r\n\r\n    public List&lt;Message&gt; getMessages(String text) throws JsonProcessingException {\r\n        ObjectMapper objectMapper = new ObjectMapper();\r\n        return StreamSupport.stream(objectMapper.readTree(text).get(\"messages\").spliterator(), false)\r\n                .map(node -&gt; new ChatMessage(node.get(\"role\").asText(), node.get(\"content\").asText()))\r\n                .collect(Collectors.toList());\r\n    }\r\n}<\/code><\/pre>\n<h3>Spring AI Integration &#8211; features and benefits<\/h3>\n<p>The Spring AI integration for Azure Cosmos DB brings some powerful benefits. Here are just a few:<\/p>\n<h4>Efficient batch loading of vector embeddings<\/h4>\n<div class=\"paragraph\">\n<p>When working with vector stores, it\u2019s often necessary to embed large numbers of documents. While it might seem straightforward to make a single call to embed all documents at once, this approach can lead to issues. Embedding models process text as tokens and have a maximum token limit, often referred to as the context window size. This limit restricts the amount of text that can be processed in a single embedding request. Attempting to embed too many tokens in one call can result in errors or truncated embeddings. 
To address this token limit, Spring AI implements a batching strategy. This approach breaks down large sets of documents into smaller batches that fit within the embedding model\u2019s maximum context window. Batching not only solves the token limit issue but can also improve performance and make more efficient use of API rate limits. Azure Cosmos DB&#8217;s <a href=\"https:\/\/docs.spring.io\/spring-ai\/reference\/api\/vectordbs\/azure-cosmos-db.html\" target=\"_blank\" rel=\"noopener\">vector store implementation for Spring AI<\/a> combines bulk execution in the Java SDK with the batching strategy in Spring AI, so large volumes of vectors load efficiently. The <a href=\"https:\/\/github.com\/Azure-Samples\/cosmosdb-spring-ai-sample\" target=\"_blank\" rel=\"noopener\">sample<\/a> mentioned above includes a <a href=\"https:\/\/github.com\/Azure-Samples\/cosmosdb-spring-ai-sample\/blob\/main\/src\/main\/java\/com\/microsoft\/azure\/spring\/chatgpt\/sample\/cli\/CliApplication.java\" target=\"_blank\" rel=\"noopener\">CLI application<\/a> for loading vectors.<\/p>\n<h4>Vector indexing with DiskANN<\/h4>\n<div class=\"sectionbody\">\n<div class=\"paragraph\">\n<p>In Spring AI for Azure Cosmos DB, the integration creates and uses DiskANN indexes to keep similarity queries fast. DiskANN (Disk-based Approximate Nearest Neighbor Search) is an indexing technology used in Azure Cosmos DB to speed up vector searches. 
It enables efficient and scalable similarity searches across high-dimensional data by indexing embeddings stored in Azure Cosmos DB.<\/p>\n<\/div>\n<div class=\"paragraph\">\n<p>DiskANN provides the following benefits:<\/p>\n<\/div>\n<div class=\"ulist\">\n<ul>\n<li><strong>Efficiency<\/strong>: By utilizing disk-based structures, DiskANN significantly reduces the time required to find nearest neighbors compared to traditional methods.<\/li>\n<li><strong>Scalability<\/strong>: It can handle large datasets that exceed memory capacity, making it suitable for various applications, including machine learning and AI-driven solutions.<\/li>\n<li><strong>Low Latency<\/strong>: DiskANN minimizes latency during search operations, ensuring that applications can retrieve results quickly even with substantial data volumes.<\/li>\n<\/ul>\n<\/div>\n<div class=\"paragraph\">\n<h4 id=\"page-title\" class=\"page\">Function Calling API<\/h4>\n<p>You can register custom Java functions with the <code>ChatClient<\/code>. The AI model can then choose to output a JSON object containing the arguments for calling one or more of the registered functions. This allows you to connect the LLM&#8217;s capabilities with external tools and APIs. The AI models are trained to detect when a function should be called and to respond with JSON that adheres to the function signature. Learn more about how function calling works <a href=\"https:\/\/docs.spring.io\/spring-ai\/reference\/api\/functions.html\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/p>\n<h3>LangChain4j Integration<\/h3>\n<p>We&#8217;ve also integrated Azure Cosmos DB with <a href=\"https:\/\/docs.langchain4j.dev\/\" target=\"_blank\" rel=\"noopener\">LangChain4j<\/a>. 
Take a look at our sample <a href=\"https:\/\/github.com\/microsoft\/AzureDataRetrievalAugmentedGenerationSamples\/tree\/main\/Java\/CosmosDB-NoSQL-Langchain\" target=\"_blank\" rel=\"noopener\">here<\/a> for a quick start!<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-content\/uploads\/sites\/52\/2024\/12\/langchain4j-demo.gif\"><img decoding=\"async\" class=\"alignnone wp-image-9214 size-full\" src=\"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-content\/uploads\/sites\/52\/2024\/12\/langchain4j-demo.gif\" alt=\"Image langchain4j demo\" width=\"1280\" height=\"720\" \/><\/a><\/p>\n<\/div>\n<\/div>\n<\/div>\n<h2>About Azure Cosmos DB<\/h2>\n<p>Azure Cosmos DB is a fully managed and serverless NoSQL and vector database for modern app development, including AI applications. 
With its SLA-backed speed and availability as well as instant dynamic scalability, it is ideal for real-time NoSQL and MongoDB applications that require high performance and distributed computing over massive volumes of NoSQL and vector data.<\/p>\n<p><a href=\"https:\/\/cosmos.azure.com\/try\/\" target=\"_blank\" rel=\"noopener\">Try Azure Cosmos DB for free here.<\/a> To stay in the loop on Azure Cosmos DB updates, follow us on <a href=\"https:\/\/twitter.com\/AzureCosmosDB\" target=\"_blank\" rel=\"noopener\">X<\/a>, <a href=\"https:\/\/aka.ms\/AzureCosmosDBYouTube\" target=\"_blank\" rel=\"noopener\">YouTube<\/a>, and <a href=\"https:\/\/www.linkedin.com\/company\/azure-cosmos-db\/\" target=\"_blank\" rel=\"noopener\">LinkedIn<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We\u2019re continuing to simplify AI app development by integrating Azure Cosmos DB\u2019s cost-effective and scalable vector search directly with Spring AI and LangChain4J! These frameworks empower Java developers to efficiently manage fast and accurate data retrieval with Azure Cosmos DB\u2019s scalability and efficient vector indexing and search capabilities, allowing for efficient handling of high-dimensional vectors. [&hellip;]<\/p>\n","protected":false},"author":9387,"featured_media":9229,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1610,14,643,1849],"tags":[],"class_list":["post-9205","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-core-sql-api","category-java-sdk","category-spring-data"],"acf":[],"blog_post_summary":"<p>We\u2019re continuing to simplify AI app development by integrating Azure Cosmos DB\u2019s cost-effective and scalable vector search directly with Spring AI and LangChain4J! 
These frameworks empower Java developers to efficiently manage fast and accurate data retrieval with Azure Cosmos DB\u2019s scalability and efficient vector indexing and search capabilities, allowing for efficient handling of high-dimensional vectors. [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/posts\/9205","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/users\/9387"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/comments?post=9205"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/posts\/9205\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/media\/9229"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/media?parent=9205"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/categories?post=9205"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cosmosdb\/wp-json\/wp\/v2\/tags?post=9205"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}