We’re continuing to simplify AI app development by integrating Azure Cosmos DB’s cost-effective and scalable vector search directly with Spring AI and LangChain4j! These frameworks let Java developers combine fast, accurate retrieval with Azure Cosmos DB’s scalability and efficient vector indexing over high-dimensional data. By using Azure Cosmos DB’s vector store, developers can keep vectors directly within their documents, enabling seamless integration with Spring AI or LangChain4j for advanced vector search queries.
Spring AI vs Langchain4J
LangChain4j and Spring AI are both Java frameworks tailored for integrating AI capabilities into applications, but they differ slightly in focus and approach. Both are designed to work seamlessly with large language models (LLMs), emphasizing modularity for constructing complex AI workflows such as chaining prompts, handling memory, and managing vector stores. Spring AI is also an extension of the Spring ecosystem, bringing Spring Boot’s familiar configuration model for ease of use. While LangChain4j excels in flexibility and advanced LLM-specific tools, Spring AI leverages Spring’s robust framework ecosystem for quick, standardized integration, making it ideal for developers already invested in Spring-based projects, such as those using Azure Cosmos DB’s existing Spring Data module.
Easy AI app development with Spring AI
The examples below are taken from the full sample here. Creating an AI chat bot with Retrieval Augmented Generation (RAG) over your own private data is easy in Spring AI once you have imported the required libraries:
<dependencies>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-core</artifactId>
        <version>1.0.0-M4</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-azure-cosmos-db-store-spring-boot-starter</artifactId>
        <version>1.0.0-M4</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-spring-boot-autoconfigure</artifactId>
        <version>1.0.0-M4</version>
        <scope>compile</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-azure-openai</artifactId>
        <version>1.0.0-M4</version>
        <scope>compile</scope>
    </dependency>
</dependencies>
You can then create a configuration class for Spring AI’s ChatClient:
package com.microsoft.azure.spring.chatgpt.sample.webapi;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ChatClientConfiguration {

    private final ChatClient.Builder chatClientBuilder;

    public ChatClientConfiguration(ChatClient.Builder chatClientBuilder) {
        this.chatClientBuilder = chatClientBuilder;
    }

    @Bean
    public ChatClient chatClient() {
        return chatClientBuilder.build();
    }
}
Next, create a custom message class that implements Spring AI’s Message interface:
package com.microsoft.azure.spring.chatgpt.sample.webapi.models;

import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.messages.MessageType;

import java.util.Map;

public class ChatMessage implements Message {

    private String role;
    private String content;

    public ChatMessage(String role, String content) {
        this.role = role;
        this.content = content;
    }

    @Override
    public String getContent() {
        return content;
    }

    @Override
    public Map<String, Object> getMetadata() {
        return Map.of();
    }

    @Override
    public MessageType getMessageType() {
        return MessageType.USER;
    }
}
Then, add the appropriate configuration to your application.properties file; these values will be autowired:
spring.ai.vectorstore.cosmosdb.databaseName=cosmosdb
spring.ai.vectorstore.cosmosdb.containerName=vectorstore
spring.ai.vectorstore.cosmosdb.partitionKeyPath=/id
spring.ai.vectorstore.cosmosdb.metadataFields=metadata
spring.ai.vectorstore.cosmosdb.vectorStoreThroughput=1000
spring.ai.vectorstore.cosmosdb.endpoint=${COSMOSDB_AI_ENDPOINT}
spring.ai.vectorstore.cosmosdb.key=${COSMOSDB_AI_KEY}
spring.ai.azure.openai.api-key=${AZURE_OPENAI_APIKEY}
spring.ai.azure.openai.endpoint=${AZURE_OPENAI_ENDPOINT}
spring.ai.azure.openai.embedding.options.deployment-name=text-embedding-ada-002
Finally, you can autowire VectorStore and ChatClient, and then perform chat completion with RAG in a single method in your controller. Chat completion is handled by the chatClient, and RAG is handled by attaching a QuestionAnswerAdvisor to it. Here, we also add a helper method to transform incoming messages into the ChatMessage format we defined above. Find out more about the powerful Spring AI Chat Client API here.
package com.microsoft.azure.spring.chatgpt.sample.webapi.controllers;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.microsoft.azure.spring.chatgpt.sample.webapi.models.ChatMessage;
import io.micrometer.observation.ObservationRegistry;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Lazy;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

@RestController
@RequestMapping("/chat")
public class ChatController {

    private static final String prompt = """
            Context information is below.
            ---------------------
            %s
            ---------------------
            Given the context information and not prior knowledge, answer the question: %s
            """;

    @Autowired
    @Lazy
    private VectorStore vectorStore;

    @Autowired
    @Lazy
    private ChatClient chatClient;

    @PostMapping("/completions")
    public String generation(@RequestBody String userInput) throws JsonProcessingException {
        List<Message> messages = getMessages(userInput);
        return this.chatClient.prompt(prompt)
                .user(messages.get(0).getContent())
                .messages(messages)
                .advisors(new QuestionAnswerAdvisor(vectorStore,
                        SearchRequest.query(messages.get(0).getContent()).withTopK(5)))
                .call()
                .content();
    }

    @Bean
    public ObservationRegistry observationRegistry() {
        return ObservationRegistry.create();
    }

    // Parses the incoming JSON request body into the ChatMessage format defined above.
    public List<Message> getMessages(String text) throws JsonProcessingException {
        ObjectMapper objectMapper = new ObjectMapper();
        return StreamSupport.stream(objectMapper.readTree(text).get("messages").spliterator(), false)
                .map(node -> new ChatMessage(node.get("role").asText(), node.get("content").asText()))
                .collect(Collectors.toList());
    }
}
Spring AI Integration – features and benefits
The Spring AI integration for Azure Cosmos DB brings some powerful benefits. Here are just a few:
Efficient batch loading of vector embeddings
When working with vector stores, it’s often necessary to embed large numbers of documents. While it might seem straightforward to make a single call to embed all documents at once, this approach can lead to issues. Embedding models process text as tokens and have a maximum token limit, often referred to as the context window size. This limit restricts the amount of text that can be processed in a single embedding request, and attempting to embed too many tokens in one call can result in errors or truncated embeddings.

To address this token limit, Spring AI implements a batching strategy that breaks large sets of documents into smaller batches that fit within the embedding model’s maximum context window. Batching not only solves the token limit issue but can also lead to improved performance and more efficient use of API rate limits.

In Azure Cosmos DB’s vector store implementation for Spring AI, both bulk execution in the Java SDK and the batching strategy in Spring AI are implemented, leading to efficient loading of large volumes of vectors. The above-mentioned sample includes a CLI application for loading vectors.
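To illustrate the batching idea, here is a minimal, self-contained sketch of greedy token-budget batching. It is not Spring AI’s actual batching strategy implementation; the class name, the 4-characters-per-token heuristic, and the batch limit are all illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch: pack documents into batches that fit a token budget. */
class DocumentBatcher {

    /** Rough heuristic: roughly 4 characters per token for English text. */
    static int estimateTokens(String text) {
        return Math.max(1, text.length() / 4);
    }

    /**
     * Greedily packs documents into batches whose estimated token totals stay
     * within maxTokensPerBatch. A single oversized document still gets its own
     * batch rather than being dropped.
     */
    static List<List<String>> batch(List<String> docs, int maxTokensPerBatch) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentTokens = 0;
        for (String doc : docs) {
            int tokens = estimateTokens(doc);
            // Start a new batch when adding this document would exceed the budget.
            if (!current.isEmpty() && currentTokens + tokens > maxTokensPerBatch) {
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(doc);
            currentTokens += tokens;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each resulting batch would then be sent as one embedding request, keeping every call under the model’s context window.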
Vector indexing with DiskANN
In Spring AI for Azure Cosmos DB, vector searches will create and leverage DiskANN indexes to ensure optimal performance for similarity queries. DiskANN (Disk-based Approximate Nearest Neighbor Search) is an innovative technology used in Azure Cosmos DB to enhance the performance of vector searches. It enables efficient and scalable similarity searches across high-dimensional data by indexing embeddings stored in Cosmos DB.
DiskANN provides the following benefits:
- Efficiency: By utilizing disk-based structures, DiskANN significantly reduces the time required to find nearest neighbors compared to traditional methods.
- Scalability: It can handle large datasets that exceed memory capacity, making it suitable for various applications, including machine learning and AI-driven solutions.
- Low Latency: DiskANN minimizes latency during search operations, ensuring that applications can retrieve results quickly even with substantial data volumes.
Function Calling API
You can register custom Java functions with the ChatClient and have the AI model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions. This allows you to connect the LLM capabilities with external tools and APIs. The AI models are trained to detect when a function should be called and to respond with JSON that adheres to the function signature. Learn more about how function calling works here.
LangChain4j Integration
We’ve also integrated Azure Cosmos DB with LangChain4j. Take a look at our sample here for a quick start!
Leave a review
Tell us about your Azure Cosmos DB experience! Leave a review on PeerSpot and we’ll gift you $50. Get started here.
About Azure Cosmos DB
Azure Cosmos DB is a fully managed and serverless NoSQL and vector database for modern app development, including AI applications. With its SLA-backed speed and availability as well as instant dynamic scalability, it is ideal for real-time NoSQL and MongoDB applications that require high performance and distributed computing over massive volumes of NoSQL and vector data.
Try Azure Cosmos DB for free here. To stay in the loop on Azure Cosmos DB updates, follow us on X, YouTube, and LinkedIn.