December 16th, 2024

Exploring Microsoft.Extensions.VectorData with Qdrant and Azure AI Search

Bruno Capuano
Cloud Advocate

Discover how to use Microsoft.Extensions.VectorData to implement semantic search using Qdrant and Azure AI Search.

Dive into Semantic Search with Microsoft.Extensions.VectorData: Qdrant and Azure AI Search

Semantic search is transforming how applications find and interpret data by focusing on meaning rather than mere keyword matching. With the release of Microsoft.Extensions.VectorData, .NET developers have a new set of building blocks to integrate vector-based search capabilities into their applications. In this post, we’ll explore two practical implementations of semantic search: one using Qdrant locally and one using Azure AI Search.

Quick Introduction to Microsoft.Extensions.VectorData

Microsoft.Extensions.VectorData is a set of core .NET libraries designed for managing vector-based data in .NET applications. These libraries provide a unified layer of C# abstractions for interacting with vector stores, enabling developers to handle embeddings and perform vector similarity queries efficiently.
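
To make "vector similarity query" concrete, here is a minimal in-memory sketch. It is not how Qdrant or Azure AI Search work internally, and it does not use Microsoft.Extensions.VectorData at all: it simply scores every record against a query vector with cosine similarity and keeps the best matches. The SimilaritySketch class and its method names are made up for illustration, and cosine similarity is an assumption here; a real store uses whichever distance function the collection is configured with, plus an index instead of a full scan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SimilaritySketch
{
    // Cosine similarity: dot(a, b) / (|a| * |b|). The result ranges from -1 to 1.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // Treat a zero vector as having no similarity to anything.
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Brute-force search: score every record, then keep the `top` highest scores.
    public static List<(string Title, double Score)> Search(
        IReadOnlyDictionary<string, float[]> records, float[] query, int top)
    {
        return records
            .Select(r => (Title: r.Key, Score: CosineSimilarity(r.Value, query)))
            .OrderByDescending(r => r.Score)
            .Take(top)
            .ToList();
    }
}
```

The `top` parameter plays the same role as the Top option in the search calls shown later in this post, and the returned score is the kind of value those demos print for each result.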

To get a detailed overview of the library’s architecture and capabilities, I recommend reading Luis’s excellent blog post.

In this blog post, we’ll showcase two real-world use cases:

  1. Using Qdrant locally for semantic search.

  2. Leveraging Azure AI Search for enterprise-scale vector search.

To run the demos, you need one of the models provided by Ollama to generate the embeddings. This sample uses the all-minilm model.

Semantic Search with Qdrant

What is Qdrant?

Qdrant is a vector similarity search engine that provides a production-ready service with a convenient API to store, search, and manage points (i.e. vectors) with an additional payload. It’s perfect for applications that require efficient similarity searches. You can easily run Qdrant locally in a Docker container, making it a developer-friendly choice.

For setup instructions, refer to the Qdrant Quickstart Guide. For reference, here is a sample command to run a local container instance:

docker run -p 6333:6333 -p 6334:6334 -v $(pwd)/qdrant_storage:/qdrant/storage:z qdrant/qdrant

Once the container is created, you can check it in Docker.

[Image: Qdrant container running in Docker]

Qdrant and Semantic Kernel

Semantic Kernel provides a built-in connector for Qdrant, enabling .NET developers to store embeddings and execute vector-based queries seamlessly. This connector is built on top of Microsoft.Extensions.VectorData and the official .NET Qdrant Client.

This integration combines Qdrant’s high performance with Semantic Kernel’s ease of use.

To learn more about the connector, visit the official documentation for Semantic Kernel Vector Store Qdrant connector.

Scenario Overview – Qdrant

  • Setup: A Qdrant instance runs locally in a Docker container.
  • Functionality: A .NET console application uses the Semantic Kernel’s Qdrant connector to:
    • Store movie embeddings.
    • Perform semantic search queries.

Let’s look at the sample program that implements and runs this demo.

using Microsoft.Extensions.AI;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel.Connectors.Qdrant;
using Qdrant.Client;

var vectorStore = new QdrantVectorStore(new QdrantClient("localhost"));

// get movie list
var movies = vectorStore.GetCollection<ulong, MovieVector<ulong>>("movies");
await movies.CreateCollectionIfNotExistsAsync();
var movieData = MovieFactory<ulong>.GetMovieVectorList();

// get embeddings generator and generate embeddings for movies
IEmbeddingGenerator<string, Embedding<float>> generator =
    new OllamaEmbeddingGenerator(new Uri("http://localhost:11434/"), "all-minilm");
foreach (var movie in movieData)
{
    movie.Vector = await generator.GenerateEmbeddingVectorAsync(movie.Description);
    await movies.UpsertAsync(movie);
}

// perform the search
var query = "A family friendly movie that includes ogres and dragons";
var queryEmbedding = await generator.GenerateEmbeddingVectorAsync(query);

var searchOptions = new VectorSearchOptions()
{
    Top = 2,
    VectorPropertyName = "Vector"
};

var results = await movies.VectorizedSearchAsync(queryEmbedding, searchOptions);
await foreach (var result in results.Results)
{
    Console.WriteLine($"Title: {result.Record.Title}");
    Console.WriteLine($"Description: {result.Record.Description}");
    Console.WriteLine($"Score: {result.Score}");
    Console.WriteLine();
}

Once the demo is run, this is the sample output:

Title: Shrek
Description: Shrek is an animated film that tells the story of an ogre named Shrek who embarks on a quest to rescue Princess Fiona from a dragon and bring her back to the kingdom of Duloc.
Score: 0.5013245344161987

Title: Lion King
Description: The Lion King is a classic Disney animated film that tells the story of a young lion named Simba who embarks on a journey to reclaim his throne as the king of the Pride Lands after the tragic death of his father.
Score: 0.3225690722465515

Why Qdrant?

Using Qdrant for semantic search offers the advantage of scalable, high-speed similarity search, making it an excellent choice for applications requiring large-scale vector data management. Additionally, you have the option to run Qdrant Cloud on Microsoft Azure.

Semantic Search with Azure AI Search

What is Azure AI Search?

Azure AI Search is Microsoft’s search-as-a-service offering. It integrates traditional search capabilities with AI-powered features like semantic and vector search. Built for scalability and reliability, it is an ideal solution for enterprise applications requiring advanced search functionality. You can learn more about Azure AI Search.

For this sample, we will use the integrated vectorization in Azure AI Search, which improves indexing and querying by converting documents and queries into vectors.

Azure AI Search and Semantic Kernel

Semantic Kernel also provides a connector for Azure AI Search. This connector is built on top of Microsoft.Extensions.VectorData and the official Azure AI Search libraries for .NET.

For more information, refer to the Azure AI Search connector documentation.

Scenario Overview – Azure AI Search

  • Setup: An Azure AI Search service is created in your Azure subscription.
  • Functionality: The console application:
    • Stores vector embeddings of movies.
    • Executes vector-based semantic search queries.
  • Requirements: The Azure AI Search endpoint must be added as a User Secret in the application. With only the endpoint, the app creates an Azure Default Credential to connect to the service. If you prefer to authenticate with the service secret instead, add that value as a User Secret as well.

Here are sample console commands to add the User Secrets:

dotnet user-secrets init
dotnet user-secrets set "AZURE_AISEARCH_URI" "https://<AI Search Name>.search.windows.net"
dotnet user-secrets set "AZURE_AISEARCH_SECRET" "AI Search Secret"
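
The sample that follows calls a GetSearchIndexClient helper whose body is not shown in this post. Below is a minimal sketch of how such a helper might read the two User Secrets above and choose a credential. It is an assumption for illustration, not the code from the repository: the SearchClientSketch class name and the IConfiguration parameter are hypothetical, while SearchIndexClient, AzureKeyCredential, and DefaultAzureCredential are existing types in the Azure SDK for .NET.

```csharp
using System;
using Azure;
using Azure.Identity;
using Azure.Search.Documents.Indexes;
using Microsoft.Extensions.Configuration;

// Hypothetical helper class; the name and shape are not taken from the sample repository.
public static class SearchClientSketch
{
    public static SearchIndexClient GetSearchIndexClient(IConfiguration config)
    {
        // The endpoint is required; the secret is optional.
        var endpoint = config["AZURE_AISEARCH_URI"]
            ?? throw new InvalidOperationException("AZURE_AISEARCH_URI is not set.");
        var secret = config["AZURE_AISEARCH_SECRET"];

        // With a secret, authenticate with the service key. Without one, fall back
        // to DefaultAzureCredential (Azure CLI login, managed identity, and so on).
        return string.IsNullOrEmpty(secret)
            ? new SearchIndexClient(new Uri(endpoint), new DefaultAzureCredential())
            : new SearchIndexClient(new Uri(endpoint), new AzureKeyCredential(secret));
    }
}
```

In a console app, the configuration could be built with new ConfigurationBuilder().AddUserSecrets<Program>().Build() and passed to this method. Keeping the key optional lets the same code run with a developer login locally and with a key where one is configured.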

Let’s look at the sample program that implements and runs this demo.

using Microsoft.Extensions.AI;
using Microsoft.Extensions.VectorData;
using Azure;
using Azure.Search.Documents.Indexes;
using Microsoft.SemanticKernel.Connectors.AzureAISearch;
using Microsoft.Extensions.Configuration;
using Azure.Identity;
using Azure.Core;

// get the search index client using Azure Default Credentials or Azure Key Credential with the service secret
var client = GetSearchIndexClient();
var vectorStore = new AzureAISearchVectorStore(searchIndexClient: client);

// get movie list
var movies = vectorStore.GetCollection<string, MovieVector<string>>("movies");
await movies.CreateCollectionIfNotExistsAsync();
var movieData = MovieFactory<string>.GetMovieVectorList();

// get embeddings generator and generate embeddings for movies
IEmbeddingGenerator<string, Embedding<float>> generator =
    new OllamaEmbeddingGenerator(new Uri("http://localhost:11434/"), "all-minilm");
foreach (var movie in movieData)
{
    movie.Vector = await generator.GenerateEmbeddingVectorAsync(movie.Description);
    await movies.UpsertAsync(movie);
}

// perform the search
var query = "A family friendly movie that includes ogres and dragons";
var queryEmbedding = await generator.GenerateEmbeddingVectorAsync(query);

// show the results...

Once the demo is run, this is the sample output:

Title: Shrek
Description: Shrek is an animated film that tells the story of an ogre named Shrek who embarks on a quest to rescue Princess Fiona from a dragon and bring her back to the kingdom of Duloc.
Score: 0.6672559

We can also see the new index, with the Movie fields, in the Azure AI Search service in the Azure Portal.

[Image: Index with the Movie fields in the Azure AI Search service in the Azure Portal]

Why Azure AI Search?

Azure AI Search provides enterprise-grade scalability and integration, making it a robust solution for production-ready applications requiring advanced semantic search. Additionally, AI Search includes built-in security features, such as encryption and secure authentication, to protect your data. It also adheres to compliance standards, ensuring that your search solutions meet regulatory requirements.

Explaining the Code

Console Applications for Demonstrations

Each semantic search demo is implemented as a .NET 9 Console Application. The codebase for the samples can be traced back to the original demo provided by Luis, with extensions for both Azure AI Search and Qdrant scenarios.

[Image: Visual Studio Solution Explorer including all the sample projects]

Shared Class for Data Representation

A shared class represents a Movie entity, which includes:

  • Fields for Vector Embeddings: These embeddings are used to perform semantic search.
  • List of Movies: A static list of movies is generated to serve as sample data.
  • Type Factory for Keys: The class implements a factory pattern to handle differences in key data types.

Handling Different Data Types for Keys

  • Qdrant: Uses ulong as the data type for its key field.
  • Azure AI Search: Uses string as the key field’s data type.
  • MovieFactory: Ensures that the application generates the correct data type for each scenario, maintaining flexibility across implementations.
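
One way a generic factory could produce keys of the right type is sketched below. This is an assumption for illustration, not the code from the repository: KeyFactory and CreateKey are hypothetical names, and the approach simply converts a numeric index to ulong or string depending on the generic type argument.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; not part of the sample repository.
public static class KeyFactory
{
    // Converts a numeric index into the key type the target store expects:
    // ulong for the Qdrant scenario, string for the Azure AI Search scenario.
    public static T CreateKey<T>(int index)
    {
        if (index < 0)
            throw new ArgumentOutOfRangeException(nameof(index));

        if (typeof(T) == typeof(ulong))
            return (T)(object)(ulong)index;

        if (typeof(T) == typeof(string))
            return (T)(object)index.ToString(CultureInfo.InvariantCulture);

        throw new NotSupportedException($"Key type {typeof(T).Name} is not supported.");
    }
}
```

With this helper, KeyFactory.CreateKey<ulong>(1) would supply a key for the Qdrant collection and KeyFactory.CreateKey<string>(1) one for the Azure AI Search index, so the movie list itself can be written once.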

Movie Factory Implementation

public class MovieFactory<T>
{
    public static List<Movie<T>> GetMovieList()
    {
        var movieData = new List<Movie<T>>()
        {
            // all movie sample collection is defined here
        };
        return movieData;
    }

    public static List<MovieVector<T>> GetMovieVectorList()
    {
        var movieData = GetMovieList();
        var movieVectorData = new List<MovieVector<T>>();
        foreach (var movie in movieData)
        {
            movieVectorData.Add(new MovieVector<T>
            {
                Key = movie.Key,
                Title = movie.Title,
                Description = movie.Description
            });
        }
        return movieVectorData;
    }
}

You can browse the GitHub repository with the complete code samples.

What’s Coming Next?

The journey with Microsoft.Extensions.VectorData doesn’t stop here. You can choose other connectors such as SQLite in memory, Pinecone, or Redis, enabling developers to run lightweight semantic search solutions locally. This is a good fit for scenarios where performance and simplicity are essential.

We are also working with partners such as Elasticsearch, who are already building on top of Microsoft.Extensions.VectorData. You can learn more about this use case in Customer Case Study: Announcing the Microsoft Semantic Kernel Elasticsearch Connector.

Conclusion and Learn More

The combination of Microsoft.Extensions.VectorData and Semantic Kernel allows .NET developers to build intelligent, scalable, and context-aware applications. Whether you’re working on a small-scale project or a large enterprise system, these tools provide the foundation for delivering cutting-edge semantic search experiences.

Summary

Stay tuned for more tutorials and resources, and feel free to connect with us on social media for questions or feedback. Happy Coding!
