{"id":59608,"date":"2026-02-26T10:00:00","date_gmt":"2026-02-26T18:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/dotnet\/?p=59608"},"modified":"2026-02-25T23:01:01","modified_gmt":"2026-02-26T07:01:01","slug":"vector-data-in-dotnet-building-blocks-for-ai-part-2","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/dotnet\/vector-data-in-dotnet-building-blocks-for-ai-part-2\/","title":{"rendered":"Vector Data in .NET &#8211; Building Blocks for AI Part 2"},"content":{"rendered":"<p>Welcome back to the building blocks for AI in .NET series! In <a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/dotnet-ai-essentials-the-core-building-blocks-explained\/\">part one<\/a>, we explored Microsoft Extensions for AI (MEAI) and how it provides a unified interface for working with large language models. Today, we&#8217;re diving into the second building block: <strong>Microsoft.Extensions.VectorData<\/strong>.<\/p>\n<p>In the first post, we learned how to ask questions and even share some content for context with an LLM. Most applications, however, require more than just a simple question or small markdown file for context. You may want the LLM to have access to all of your product manuals to help troubleshoot customer issues, or provide your employee handbook for an HR chatbot.<\/p>\n<p>Another feature that is common in intelligent apps is semantic search. A semantic search uses the meaning of a query, not just the words or letters, to conduct the search. 
It does this by converting text into <em>embeddings<\/em>: numerical vectors that capture the semantic meaning of the text, so that related pieces of text end up with similar vectors.<\/p>\n<p>Imagine you have a simple database with just three entries:<\/p>\n<ol>\n<li>Hall pass<\/li>\n<li>Mountain pass<\/li>\n<li>Pass (verb)<\/li>\n<\/ol>\n<p>A traditional approach to answering queries like &#8220;How do I get over the pass?&#8221; or &#8220;Where do I pick up my pass?&#8221; breaks the query down into parts to search for. The word &#8220;pass&#8221; appears in all three database items, so I receive all three entries back despite the different contexts of my queries. Here is a simplified visualization:<\/p>\n<pre><code class=\"language-text\">\"How do I get over the pass?\" \r\nHow | do | I | get | over | the | pass \r\nPass - matches all three entries \r\n\r\n\"Where do I pick up my pass?\"\r\nWhere | do | I | pick | up | my | pass \r\nPass - matches all three entries <\/code><\/pre>\n<p>Now let&#8217;s assume I use embeddings to encode the semantic meaning of each query. The database entries have already been encoded, so I only need to create an embedding for my query. This time the embeddings give me a semantic match, not a text-based one. The semantic approach looks like this:<\/p>\n<pre><code class=\"language-text\">\"How do I get over the pass?\" \r\n0 | 5 | etc. | 2 \r\n2 - matches the 2nd entry, \"Mountain pass\" \r\n\r\n\"Where do I pick up my pass?\" \r\n6 | 9 | etc. | 1   \r\n1 - matches the 1st entry, \"Hall pass\" <\/code><\/pre>\n<p>A special embedding model, trained to understand the semantic meaning of words from context (such as the related terms that appear before and after them), is used to create the embeddings. Instead of generating embeddings every time the application runs, it makes much more sense to store them in a database. 
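<\/p>\n<p>To make the idea concrete, here is a small, self-contained sketch of how the similarity between two embedding vectors is typically scored using cosine similarity. The three-dimensional vectors below are invented for illustration (real embedding models produce hundreds or thousands of dimensions), and in practice this comparison runs inside the vector database rather than in your own code:<\/p>\n<pre><code class=\"language-csharp\">\/\/ Cosine similarity: dot(a, b) \/ (|a| * |b|); values near 1 mean the\r\n\/\/ vectors point the same way (closely related meaning), near 0 unrelated\r\nstatic double CosineSimilarity(float[] a, float[] b)\r\n{\r\n    double dot = 0, magA = 0, magB = 0;\r\n    for (int i = 0; i &lt; a.Length; i++)\r\n    {\r\n        dot += a[i] * b[i];\r\n        magA += a[i] * a[i];\r\n        magB += b[i] * b[i];\r\n    }\r\n    return dot \/ (Math.Sqrt(magA) * Math.Sqrt(magB));\r\n}\r\n\r\n\/\/ Toy stand-ins for embeddings of two database entries and one query\r\nfloat[] hallPass     = { 0.1f, 0.9f, 0.3f };\r\nfloat[] mountainPass = { 0.9f, 0.1f, 0.2f };\r\nfloat[] query        = { 0.8f, 0.2f, 0.1f }; \/\/ \"How do I get over the pass?\"\r\n\r\n\/\/ The mountain-pass entry scores highest, so it is the best semantic match\r\nConsole.WriteLine($\"hall pass:     {CosineSimilarity(query, hallPass):F3}\");\r\nConsole.WriteLine($\"mountain pass: {CosineSimilarity(query, mountainPass):F3}\");<\/code><\/pre>\n<p>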
This has the added bonus of being able to use the database&#8217;s ability to query and return results, rather than coding the logic yourself or doing it in a suboptimal way.<\/p>\n<p><em>Vector databases<\/em> are designed specifically to store vectors and embeddings. Qdrant, Redis, SQL Server and Cosmos DB are examples of services and products that support storing vector data. Just like MEAI unified LLM access, the vector data extensions provide a common abstraction for working with vector stores.<\/p>\n<h2>Why vectors matter for AI applications<\/h2>\n<p>Before we jump into the code, let&#8217;s look a little more closely at vectors. When you ask an LLM a question about your company&#8217;s documentation, the model doesn&#8217;t magically know your content. Instead, your application typically:<\/p>\n<ol>\n<li><strong>Converts your documents into embeddings<\/strong> &#8211; numerical representations that capture semantic meaning<\/li>\n<li><strong>Stores those embeddings in a vector database<\/strong> along with the original content<\/li>\n<li><strong>Converts the user&#8217;s query into an embedding<\/strong> using the same model<\/li>\n<li><strong>Performs a similarity search<\/strong> to find the most relevant documents<\/li>\n<li><strong>Passes the relevant context to the LLM<\/strong> along with the user&#8217;s query<\/li>\n<\/ol>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2026\/02\/rag-diagram-scaled.webp\"><img decoding=\"async\" class=\"alignnone size-large wp-image-59610\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2026\/02\/rag-diagram-759x1024.webp\" alt=\"rag diagram image\" width=\"759\" height=\"1024\" srcset=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2026\/02\/rag-diagram-759x1024.webp 759w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2026\/02\/rag-diagram-222x300.webp 222w, 
https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2026\/02\/rag-diagram-768x1036.webp 768w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2026\/02\/rag-diagram-1138x1536.webp 1138w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2026\/02\/rag-diagram-1518x2048.webp 1518w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2026\/02\/rag-diagram-scaled.webp 1853w\" sizes=\"(max-width: 759px) 100vw, 759px\" \/><\/a><\/p>\n<p>This pattern, known as RAG (Retrieval-Augmented Generation), allows models to provide accurate, grounded responses based on your specific data. The challenge? Every vector database has its own SDK, data structures, and query patterns. That&#8217;s where Microsoft.Extensions.VectorData comes in.<\/p>\n<h2>One interface, many vector stores<\/h2>\n<p>The Microsoft Extensions for Vector Data library provides abstractions that work across different vector database providers. Here&#8217;s what that looks like in practice. 
First, let&#8217;s look at using an example vector database, Qdrant, directly and without the abstractions:<\/p>\n<pre><code class=\"language-csharp\">var qdrantClient = new QdrantClient(\"localhost\", 6334);\r\n\r\nvar collection = \"my_collection\";\r\nawait qdrantClient.CreateCollectionAsync(collection, new VectorParams\r\n{\r\n    Size = 1536,\r\n    Distance = Distance.Cosine\r\n});\r\n\r\nvar points = new List&lt;PointStruct&gt;\r\n{\r\n    new()\r\n    {\r\n        Id = new PointId { Uuid = Guid.NewGuid().ToString() },\r\n        Vectors = embedding,\r\n        Payload =\r\n        {\r\n            [\"text\"] = \"Sample document text\",\r\n            [\"category\"] = \"documentation\"\r\n        }\r\n    }\r\n};\r\n\r\nawait qdrantClient.UpsertAsync(collection, points);\r\n\r\nvar searchResults = await qdrantClient.SearchAsync(collection, queryEmbedding, limit: 5);<\/code><\/pre>\n<p>Now let&#8217;s see the same thing using the universal abstractions:<\/p>\n<pre><code class=\"language-csharp\">\/\/ Configure embedding generation once on the vector store\r\nvar embeddingGenerator = new OpenAIClient(apiKey)\r\n    .GetEmbeddingClient(\"text-embedding-3-small\")\r\n    .AsIEmbeddingGenerator();\r\n\r\nvar vectorStore = new QdrantVectorStore(\r\n    new QdrantClient(\"localhost\"),\r\n    ownsClient: true,\r\n    new QdrantVectorStoreOptions { EmbeddingGenerator = embeddingGenerator });\r\n\r\nvar collection = vectorStore.GetCollection&lt;string, DocumentRecord&gt;(\"my_collection\");\r\nawait collection.EnsureCollectionExistsAsync();\r\n\r\nvar record = new DocumentRecord\r\n{\r\n    Key = Guid.NewGuid().ToString(),\r\n    Text = \"Sample document text\",\r\n    Category = \"documentation\"\r\n};\r\n\r\nawait collection.UpsertAsync(record);\r\n\r\nvar searchResults = collection.SearchAsync(\"find documents about sample topics\", top: 5);<\/code><\/pre>\n<p>The second example works with any supported vector store by simply changing the <code>VectorStore<\/code> 
implementation. Your business logic stays the same.<\/p>\n<h2>Defining your data model<\/h2>\n<p>The vector data abstractions use attributes to map your C# classes to vector database schemas. Here&#8217;s a practical example for a document store:<\/p>\n<pre><code class=\"language-csharp\">public class DocumentRecord\r\n{\r\n    [VectorStoreKey]\r\n    public string Key { get; set; }\r\n\r\n    [VectorStoreData]\r\n    public string Text { get; set; }\r\n\r\n    [VectorStoreData(IsIndexed = true)]\r\n    public string Category { get; set; }\r\n\r\n    [VectorStoreData(IsIndexed = true)]\r\n    public DateTimeOffset Timestamp { get; set; }\r\n\r\n    \/\/ The vector is automatically generated from Text when an\r\n    \/\/ IEmbeddingGenerator is configured on the collection or vector store\r\n    [VectorStoreVector(1536, DistanceFunction = DistanceFunction.CosineSimilarity)]\r\n    public string Embedding =&gt; this.Text;\r\n}<\/code><\/pre>\n<p>The attributes tell the library:<\/p>\n<ul>\n<li><strong><code>VectorStoreKey<\/code><\/strong> &#8211; This property uniquely identifies each record<\/li>\n<li><strong><code>VectorStoreData<\/code><\/strong> &#8211; These are metadata fields you can filter and retrieve<\/li>\n<li><strong><code>VectorStoreVector<\/code><\/strong> &#8211; This is the embedding vector with its dimensions and distance function<\/li>\n<\/ul>\n<h2>Working with collections<\/h2>\n<p>Once you&#8217;ve defined your data model, working with collections is straightforward. The library provides a consistent interface regardless of your underlying vector store:<\/p>\n<pre><code class=\"language-csharp\">\/\/ Get or create a collection\r\nvar collection = vectorStore.GetCollection&lt;string, DocumentRecord&gt;(\"documents\");\r\n\r\n\/\/ Check if the collection exists\r\nbool exists = await collection.CollectionExistsAsync();\r\nawait collection.EnsureCollectionExistsAsync();\r\n\r\n\/\/ Insert or update records\r\nawait collection.UpsertAsync(documentRecord);\r\n\r\n\/\/ Batch operations use overloads of the same methods\r\nawait collection.UpsertAsync(documentRecords);\r\n\r\n\/\/ Retrieve by key\r\nvar record = await collection.GetAsync(\"some-key\");\r\n\r\n\/\/ Delete records\r\nawait collection.DeleteAsync(\"some-key\");\r\nawait collection.DeleteAsync([\"key1\", \"key2\", \"key3\"]);<\/code><\/pre>\n<h2>Semantic search<\/h2>\n<p>The real power comes when you perform semantic searches using the <code>SearchAsync<\/code> method. When an <code>IEmbeddingGenerator<\/code> is configured on the vector store or collection, simply pass your query text and embeddings are generated automatically:<\/p>\n<pre><code class=\"language-csharp\">\/\/ Embeddings are generated automatically when IEmbeddingGenerator is configured\r\nawait foreach (var result in collection.SearchAsync(\"What is semantic search?\", top: 5))\r\n{\r\n    Console.WriteLine($\"Score: {result.Score}, Text: {result.Record.Text}\");\r\n}<\/code><\/pre>\n<p>If you already have a pre-computed <code>ReadOnlyMemory&lt;float&gt;<\/code> embedding\u2014for example, when batching embeddings yourself\u2014you can pass it directly instead:<\/p>\n<pre><code class=\"language-csharp\">\/\/ Pass a pre-computed embedding vector directly\r\nReadOnlyMemory&lt;float&gt; precomputedEmbedding = \/* your embedding *\/;\r\nawait foreach (var result in collection.SearchAsync(precomputedEmbedding, top: 5))\r\n{\r\n    Console.WriteLine($\"Score: {result.Score}, Text: 
{result.Record.Text}\");\r\n}<\/code><\/pre>\n<h2>Filtering results<\/h2>\n<p>You can combine vector similarity with metadata filtering to narrow down results:<\/p>\n<pre><code class=\"language-csharp\">var searchOptions = new VectorSearchOptions&lt;DocumentRecord&gt;\r\n{\r\n    Filter = r =&gt; r.Category == \"documentation\" &amp;&amp;\r\n                  r.Timestamp &gt; DateTimeOffset.UtcNow.AddDays(-30)\r\n};\r\n\r\nvar results = collection.SearchAsync(\"find relevant documentation\", top: 10, searchOptions);<\/code><\/pre>\n<p>Filters use standard LINQ expressions. The supported operations include:<\/p>\n<ul>\n<li>Equality comparisons (<code>==<\/code>, <code>!=<\/code>)<\/li>\n<li>Range queries (<code>&gt;<\/code>, <code>&lt;<\/code>, <code>&gt;=<\/code>, <code>&lt;=<\/code>)<\/li>\n<li>Logical operators (<code>&amp;&amp;<\/code>, <code>||<\/code>)<\/li>\n<li>Collection membership (<code>.Contains()<\/code>)<\/li>\n<\/ul>\n<h2>Integrating with embeddings<\/h2>\n<p>The recommended approach is to configure an <code>IEmbeddingGenerator<\/code> on the vector store or collection. 
Embeddings are then generated automatically during both upsert and search\u2014no manual preprocessing required:<\/p>\n<pre><code class=\"language-csharp\">\/\/ Configure an embedding generator on the vector store\r\nvar embeddingGenerator = new OpenAIClient(apiKey)\r\n    .GetEmbeddingClient(\"text-embedding-3-small\")\r\n    .AsIEmbeddingGenerator();\r\n\r\nvar vectorStore = new InMemoryVectorStore(new() { EmbeddingGenerator = embeddingGenerator });\r\nvar collection = vectorStore.GetCollection&lt;string, DocumentRecord&gt;(\"documents\");\r\nawait collection.EnsureCollectionExistsAsync();\r\n\r\n\/\/ Embeddings are generated automatically on upsert\r\nvar record = new DocumentRecord\r\n{\r\n    Key = Guid.NewGuid().ToString(),\r\n    Text = \"Sample text to store\"\r\n};\r\nawait collection.UpsertAsync(record);\r\n\r\n\/\/ Embeddings are also generated automatically on search\r\nawait foreach (var result in collection.SearchAsync(\"find similar text\", top: 5))\r\n{\r\n    Console.WriteLine($\"Score: {result.Score}, Text: {result.Record.Text}\");\r\n}<\/code><\/pre>\n<h2>Implementing RAG patterns<\/h2>\n<p>Bringing it all together, here&#8217;s a simplified RAG implementation using both Microsoft.Extensions.AI and Microsoft.Extensions.VectorData:<\/p>\n<pre><code class=\"language-csharp\">public async Task&lt;string&gt; AskQuestionAsync(string question)\r\n{\r\n    \/\/ Find relevant documents - embeddings are generated automatically\r\n    var contextParts = new List&lt;string&gt;();\r\n    await foreach (var result in collection.SearchAsync(question, top: 3))\r\n    {\r\n        contextParts.Add(result.Record.Text);\r\n    }\r\n\r\n    \/\/ Build context from results\r\n    var context = string.Join(\"\\n\\n\", contextParts);\r\n\r\n    \/\/ Create prompt with context\r\n    var messages = new List&lt;ChatMessage&gt;\r\n    {\r\n        new(ChatRole.System, \r\n            \"Answer questions based on the provided context. 
If the context doesn't contain relevant information, say so.\"),\r\n        new(ChatRole.User, \r\n            $\"Context:\\n{context}\\n\\nQuestion: {question}\")\r\n    };\r\n\r\n    \/\/ Get response from LLM\r\n    var response = await chatClient.GetResponseAsync(messages);\r\n    return response.Text;\r\n}<\/code><\/pre>\n<h2>Supported vector stores<\/h2>\n<p>Microsoft.Extensions.VectorData works with a wide range of vector databases through official connectors:<\/p>\n<ul>\n<li><strong>Azure AI Search<\/strong> &#8211; <code>Microsoft.SemanticKernel.Connectors.AzureAISearch<\/code><\/li>\n<li><strong>Qdrant<\/strong> &#8211; <code>Microsoft.SemanticKernel.Connectors.Qdrant<\/code><\/li>\n<li><strong>Redis<\/strong> &#8211; <code>Microsoft.SemanticKernel.Connectors.Redis<\/code><\/li>\n<li><strong>PostgreSQL<\/strong> &#8211; <code>Microsoft.SemanticKernel.Connectors.Postgres<\/code><\/li>\n<li><strong>Azure Cosmos DB (NoSQL)<\/strong> &#8211; <code>Microsoft.SemanticKernel.Connectors.AzureCosmosDBNoSQL<\/code><\/li>\n<li><strong>SQL Server<\/strong> &#8211; <code>Microsoft.SemanticKernel.Connectors.SqlServer<\/code><\/li>\n<li><strong>SQLite<\/strong> &#8211; <code>Microsoft.SemanticKernel.Connectors.Sqlite<\/code><\/li>\n<li><strong>In-Memory<\/strong> &#8211; <code>Microsoft.SemanticKernel.Connectors.InMemory<\/code> (great for testing and development)<\/li>\n<\/ul>\n<p>For the full list of supported connectors\u2014including Elasticsearch, MongoDB, Weaviate, Pinecone, and more\u2014see the <a href=\"https:\/\/learn.microsoft.com\/semantic-kernel\/concepts\/vector-store-connectors\/out-of-the-box-connectors\/?pivots=programming-language-csharp\">out-of-the-box connectors documentation<\/a>.<\/p>\n<h2>Why separate from the core AI extensions?<\/h2>\n<p>You might wonder why vector data is in a separate library from the core Microsoft.Extensions.AI package. The answer is simple: not every intelligent application needs vector storage. 
Many scenarios &#8211; like chatbots, content generation, or classification tasks &#8211; work perfectly fine with just the LLM abstractions. By keeping vector data separate, the core library remains lightweight and focused.<\/p>\n<p>When you do need vectors for semantic search, RAG, or long-term memory, you can add the vector data package and immediately benefit from the same consistent patterns you&#8217;re already using with MEAI.<\/p>\n<h2>Summary<\/h2>\n<p>Microsoft.Extensions.VectorData brings the same benefits to vector databases that Microsoft.Extensions.AI brings to LLMs: a unified, provider-agnostic interface that makes your code portable and your architecture flexible. Whether you&#8217;re implementing RAG patterns, building semantic search, or creating long-term memory for AI agents, these abstractions let you focus on your application logic instead of database-specific SDKs.<\/p>\n<p>In the next post, we&#8217;ll explore the Microsoft Agent Framework and see how these building blocks come together to create sophisticated agentic workflows. 
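<\/p>\n<p>Before we wrap up, here is one more sketch to underline the portability point from the summary: only the first assignment below is provider-specific. It assumes the <code>DocumentRecord<\/code> class and the <code>embeddingGenerator<\/code> configured earlier in the post, and it uses the in-memory connector so no external service is required; any other connector&#8217;s <code>VectorStore<\/code> can be swapped in without touching the remaining lines:<\/p>\n<pre><code class=\"language-csharp\">\/\/ Provider-specific: choose the store implementation to deploy against\r\nVectorStore vectorStore = new InMemoryVectorStore(\r\n    new() { EmbeddingGenerator = embeddingGenerator });\r\n\r\n\/\/ Provider-agnostic from here on: the same calls work for Qdrant, Redis,\r\n\/\/ SQL Server, and the other supported connectors\r\nvar collection = vectorStore.GetCollection&lt;string, DocumentRecord&gt;(\"documents\");\r\nawait collection.EnsureCollectionExistsAsync();\r\n\r\nawait collection.UpsertAsync(new DocumentRecord\r\n{\r\n    Key = Guid.NewGuid().ToString(),\r\n    Text = \"Sample document text\",\r\n    Category = \"documentation\"\r\n});\r\n\r\nawait foreach (var result in collection.SearchAsync(\"sample topics\", top: 1))\r\n{\r\n    Console.WriteLine($\"Score: {result.Score}, Text: {result.Record.Text}\");\r\n}<\/code><\/pre>\n<p>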
Until then, here are some resources to help you get started with vector data in .NET:<\/p>\n<ul>\n<li>Learn by code\n<ul>\n<li><a href=\"https:\/\/github.com\/dotnet\/ai-samples\">AI samples repository<\/a><\/li>\n<\/ul>\n<\/li>\n<li>Learn by following tutorials\n<ul>\n<li><a href=\"https:\/\/learn.microsoft.com\/dotnet\/ai\/\">.NET AI documentation<\/a><\/li>\n<\/ul>\n<\/li>\n<li>Learn by watching videos\n<ul>\n<li><a href=\"https:\/\/youtu.be\/qcp6ufe_XYo\">AI building blocks<\/a><\/li>\n<li><a href=\"https:\/\/youtu.be\/N0DzWMkEnzk\">Building intelligent apps with .NET<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Happy coding!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Discover how Microsoft.Extensions.VectorData brings unified vector database access to .NET &#8211; one interface for semantic search across any vector store with built-in support for embeddings, filtering, and RAG patterns.<\/p>\n","protected":false},"author":368,"featured_media":59612,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[685,7781,756],"tags":[8124,8123,7811,7877,8038],"class_list":["post-59608","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-dotnet","category-ai","category-csharp","tag-embeddings","tag-microsoft-extensions-vectordata","tag-rag","tag-semantic-search","tag-vector-search"],"acf":[],"blog_post_summary":"<p>Discover how Microsoft.Extensions.VectorData brings unified vector database access to .NET &#8211; one interface for semantic search across any vector store with built-in support for embeddings, filtering, and RAG 
patterns.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/59608","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/users\/368"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/comments?post=59608"}],"version-history":[{"count":2,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/59608\/revisions"}],"predecessor-version":[{"id":59614,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/59608\/revisions\/59614"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media\/59612"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media?parent=59608"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/categories?post=59608"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/tags?post=59608"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}