{"id":4940,"date":"2025-06-24T17:40:21","date_gmt":"2025-06-25T00:40:21","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/semantic-kernel\/?p=4940"},"modified":"2025-06-26T00:27:05","modified_gmt":"2025-06-26T07:27:05","slug":"semantic-kernel-python-gets-a-major-vector-store-upgrade","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/agent-framework\/semantic-kernel-python-gets-a-major-vector-store-upgrade\/","title":{"rendered":"Semantic Kernel Python Gets a Major Vector Store Upgrade"},"content":{"rendered":"<p>We&#8217;re excited to announce a significant update to Semantic Kernel Python&#8217;s vector store implementation. Version 1.34 brings a complete overhaul that makes working with vector data simpler, more intuitive, and more powerful. This update consolidates the API, improves developer experience, and adds new capabilities that streamline AI development workflows.<\/p>\n<h2>What Makes This Release Special?<\/h2>\n<p>The new vector store architecture consolidates everything under <code>semantic_kernel.data.vector<\/code> and delivers three key improvements:<\/p>\n<ol>\n<li><strong>Simplified API<\/strong>: One unified field model replaces multiple complex field types<\/li>\n<li><strong>Integrated Embeddings<\/strong>: Embedding generation happens automatically where you need it<\/li>\n<li><strong>Enhanced Features<\/strong>: Advanced filtering, hybrid search, and streamlined operations<\/li>\n<\/ol>\n<p>Let&#8217;s explore what makes these changes valuable.<\/p>\n<h2>Unified Field Model &#8211; Simplified Configuration<\/h2>\n<p>We&#8217;ve replaced three separate field types with one powerful <code>VectorStoreField<\/code> class that handles everything you need.<\/p>\n<h3>Before: The Old Way (Complex and Verbose)<\/h3>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">from semantic_kernel.data import (\r\n    VectorStoreRecordKeyField,\r\n    VectorStoreRecordDataField, \r\n    
VectorStoreRecordVectorField\r\n)\r\n\r\n# Multiple classes to remember and configure\r\nfields = [\r\n    VectorStoreRecordKeyField(name=\"id\"),\r\n    VectorStoreRecordDataField(name=\"text\", is_filterable=True, is_full_text_searchable=True),\r\n    VectorStoreRecordVectorField(name=\"vector\", dimensions=1536, distance_function=\"cosine\")\r\n]\r\n<\/code><\/pre>\n<h3>After: The New Way (Clean and Intuitive)<\/h3>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">from semantic_kernel.data.vector import VectorStoreField\r\nfrom semantic_kernel.connectors.ai.open_ai import OpenAITextEmbedding\r\n\r\nembedding_service = OpenAITextEmbedding(ai_model_id=\"text-embedding-3-small\")\r\n\r\n# One class handles all field types\r\nfields = [\r\n    VectorStoreField(\"key\", name=\"id\"),\r\n    VectorStoreField(\"data\", name=\"text\", is_indexed=True, is_full_text_indexed=True),\r\n    VectorStoreField(\"vector\", name=\"vector\", dimensions=1536, \r\n                    distance_function=\"cosine\", embedding_generator=embedding_service)\r\n]\r\n<\/code><\/pre>\n<p>This approach provides cleaner code with better IDE support, including improved autocomplete and clearer intentions.<\/p>\n<h2>Integrated Embeddings &#8211; Automatic Generation<\/h2>\n<p>The new architecture includes automatic embedding generation directly in your field definitions. 
No more manual embedding steps\u2014just define what you want embedded, and it happens automatically.<\/p>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">from semantic_kernel.data.vector import VectorStoreField, vectorstoremodel\r\nfrom semantic_kernel.connectors.ai.open_ai import OpenAITextEmbedding\r\nfrom typing import Annotated\r\nfrom dataclasses import dataclass\r\n\r\n@vectorstoremodel\r\n@dataclass\r\nclass MyRecord:\r\n    content: Annotated[str, VectorStoreField('data', is_indexed=True, is_full_text_indexed=True)]\r\n    title: Annotated[str, VectorStoreField('data', is_indexed=True, is_full_text_indexed=True)]\r\n    id: Annotated[str, VectorStoreField('key')]\r\n    vector: Annotated[list[float] | str | None, VectorStoreField(\r\n        'vector', \r\n        dimensions=1536, \r\n        distance_function=\"cosine\",\r\n        embedding_generator=OpenAITextEmbedding(ai_model_id=\"text-embedding-3-small\"),\r\n    )] = None\r\n\r\n    def __post_init__(self):\r\n        if self.vector is None:\r\n            # Combine multiple fields for richer embeddings\r\n            self.vector = f\"Title: {self.title}, Content: {self.content}\"\r\n<\/code><\/pre>\n<p>You can now easily combine multiple fields to create richer embeddings with simple field assignment.<\/p>\n<h2>Lambda-Powered Filtering &#8211; Type-Safe and Expressive<\/h2>\n<p>The new filtering system uses lambda expressions that are type-safe, IDE-friendly, and highly expressive, replacing the previous string-based <code>FilterClause<\/code> objects.<\/p>\n<h3>Before: String-Based Complexity<\/h3>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">from semantic_kernel.data.text_search import SearchFilter\r\n\r\n# Multiple objects and method calls\r\ntext_filter = SearchFilter()\r\ntext_filter.equal_to(\"category\", \"AI\")\r\ntext_filter.equal_to(\"status\", \"active\")\r\n<\/code><\/pre>\n<h3>After: Lambda Expression Power<\/h3>\n<pre class=\"prettyprint 
language-py\"><code class=\"language-py\"># Clean, readable, and type-safe\r\nresults = await collection.search(\r\n    \"query text\", \r\n    filter=lambda record: record.category == \"AI\" and record.status == \"active\"\r\n)\r\n\r\n# Complex filtering with multiple conditions\r\nresults = await collection.search(\r\n    \"machine learning concepts\",\r\n    filter=lambda record: (\r\n        record.category == \"AI\" and \r\n        record.score &gt; 0.8 and\r\n        \"important\" in record.tags and\r\n        0.5 &lt;= record.confidence_score &lt;= 0.9\r\n    )\r\n)\r\n<\/code><\/pre>\n<p>Your IDE can now provide full autocomplete support and catch errors at development time.<\/p>\n<h2>Streamlined Operations &#8211; Consistent Interface<\/h2>\n<p>The new API provides a consistent interface that works with both single records and batches:<\/p>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">from semantic_kernel.connectors.in_memory import InMemoryCollection\r\n\r\ncollection = InMemoryCollection(\r\n    record_type=MyRecord,\r\n    embedding_generator=OpenAITextEmbedding(ai_model_id=\"text-embedding-3-small\")\r\n)\r\n\r\n# Single record or batch - same method\r\nawait collection.upsert(single_record)\r\nawait collection.upsert([record1, record2, record3])\r\n\r\n# Flexible retrieval\r\nawait collection.get([\"id1\", \"id2\"])  # Get specific records\r\nawait collection.get(top=10, skip=0, order_by='title')  # Browse with pagination\r\n\r\n# Powerful search with automatic embedding\r\nresults = await collection.search(\"find AI articles\", top=10)\r\nresults = await collection.hybrid_search(\"machine learning\", top=10)\r\n<\/code><\/pre>\n<h2>Instant Search Functions &#8211; Simplified Creation<\/h2>\n<p>Creating search functions for your kernel is now straightforward:<\/p>\n<h3>Before: Multiple Steps and Setup<\/h3>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">from semantic_kernel.data import 
VectorStoreTextSearch\r\n\r\ncollection = InMemoryCollection(collection_name='collection', record_type=MyRecord)\r\nsearch = VectorStoreTextSearch.from_vectorized_search(\r\n    vectorized_search=collection, \r\n    embedding_generator=OpenAITextEmbedding(ai_model_id=\"text-embedding-3-small\")\r\n)\r\nsearch_function = search.create_search(function_name='search')\r\n<\/code><\/pre>\n<h3>After: Streamlined Creation<\/h3>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\"># Create a search function directly on your collection\r\nsearch_function = collection.create_search_function(\r\n    function_name=\"search\",\r\n    search_type=\"vector\",  # or \"keyword_hybrid\"\r\n    top=10,\r\n    vector_property_name=\"vector\"\r\n)\r\n\r\n# Add to kernel\r\nkernel.add_function(plugin_name=\"memory\", function=search_function)\r\n<\/code><\/pre>\n<h2>Enhanced Data Model Expressiveness<\/h2>\n<p>The simplified API doesn&#8217;t sacrifice expressiveness. Data models are more capable than before:<\/p>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">@vectorstoremodel(collection_name=\"documents\")\r\n@dataclass\r\nclass DocumentRecord:\r\n    # Rich metadata\r\n    id: Annotated[str, VectorStoreField('key')]\r\n    title: Annotated[str, VectorStoreField('data', is_indexed=True, is_full_text_indexed=True)]\r\n    content: Annotated[str, VectorStoreField('data', is_full_text_indexed=True)]\r\n    category: Annotated[str, VectorStoreField('data', is_indexed=True)]\r\n    tags: Annotated[list[str], VectorStoreField('data', is_indexed=True)]\r\n    created_date: Annotated[datetime, VectorStoreField('data', is_indexed=True)]\r\n    confidence_score: Annotated[float, VectorStoreField('data', is_indexed=True)]\r\n    \r\n    # Multiple vectors for different purposes\r\n    content_vector: Annotated[list[float] | str | None, VectorStoreField(\r\n        'vector', \r\n        dimensions=1536,\r\n        storage_name=\"content_embedding\",\r\n        
embedding_generator=OpenAITextEmbedding(ai_model_id=\"text-embedding-3-small\")\r\n    )] = None\r\n    \r\n    title_vector: Annotated[list[float] | str | None, VectorStoreField(\r\n        'vector',\r\n        dimensions=1536, \r\n        storage_name=\"title_embedding\",\r\n        embedding_generator=OpenAITextEmbedding(ai_model_id=\"text-embedding-3-small\")\r\n    )] = None\r\n\r\n    def __post_init__(self):\r\n        if self.content_vector is None:\r\n            self.content_vector = self.content\r\n        if self.title_vector is None:\r\n            self.title_vector = self.title\r\n<\/code><\/pre>\n<h2>Better Connector Experience<\/h2>\n<p>We&#8217;ve also streamlined the connector imports and naming. Everything is now logically organized under <code>semantic_kernel.connectors<\/code>:<\/p>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\"># Clean, consistent imports\r\nfrom semantic_kernel.connectors.azure_ai_search import AzureAISearchStore\r\nfrom semantic_kernel.connectors.chroma import ChromaVectorStore\r\nfrom semantic_kernel.connectors.pinecone import PineconeVectorStore\r\nfrom semantic_kernel.connectors.qdrant import QdrantVectorStore\r\n\r\n# Or use the convenient lazy loading\r\nfrom semantic_kernel.connectors.memory import (\r\n    AzureAISearchStore,\r\n    ChromaVectorStore,\r\n    PineconeVectorStore,\r\n    QdrantVectorStore\r\n)\r\n<\/code><\/pre>\n<h2>Real-World Example: Complete Implementation<\/h2>\n<p>Here&#8217;s how a complete example looks with the new architecture:<\/p>\n<h3>The New Way (Simple and Powerful)<\/h3>\n<pre class=\"prettyprint language-py\"><code class=\"language-py\">from semantic_kernel.data.vector import VectorStoreField, vectorstoremodel\r\nfrom semantic_kernel.connectors.in_memory import InMemoryCollection\r\nfrom semantic_kernel.connectors.ai.open_ai import OpenAITextEmbedding\r\nfrom typing import Annotated\r\nfrom dataclasses import 
dataclass\r\n\r\n@vectorstoremodel(collection_name=\"knowledge_base\")\r\n@dataclass\r\nclass KnowledgeBase:\r\n    id: Annotated[str, VectorStoreField('key')]\r\n    content: Annotated[str, VectorStoreField('data', is_full_text_indexed=True)]\r\n    category: Annotated[str, VectorStoreField('data', is_indexed=True)]\r\n    vector: Annotated[list[float] | str | None, VectorStoreField(\r\n        'vector', \r\n        dimensions=1536,\r\n        embedding_generator=OpenAITextEmbedding(ai_model_id=\"text-embedding-3-small\")\r\n    )] = None\r\n\r\n    def __post_init__(self):\r\n        if self.vector is None:\r\n            self.vector = self.content\r\n\r\n# Create collection with automatic embedding\r\nasync with InMemoryCollection(record_type=KnowledgeBase) as collection:\r\n    await collection.ensure_collection_exists()\r\n    \r\n    # Add documents (embeddings created automatically)\r\n    docs = [\r\n        KnowledgeBase(id=\"1\", content=\"Semantic Kernel is awesome\", category=\"general\"),\r\n        KnowledgeBase(id=\"2\", content=\"Python makes AI development easy\", category=\"programming\"),\r\n    ]\r\n    await collection.upsert(docs)\r\n    \r\n    # Search with intelligent filtering\r\n    results = await collection.search(\r\n        \"AI development\", \r\n        top=5,\r\n        filter=lambda doc: doc.category == \"programming\"\r\n    )\r\n    \r\n    # Create kernel search function\r\n    search_func = collection.create_search_function(\"knowledge_search\", search_type=\"vector\")\r\n    kernel.add_function(plugin_name=\"kb\", function=search_func)\r\n<\/code><\/pre>\n<h2>What This Means for Your Projects<\/h2>\n<p>This update brings several concrete benefits:<\/p>\n<ul>\n<li><strong>Faster Development<\/strong>: Less boilerplate, more focus on your AI logic<\/li>\n<li><strong>Better Maintainability<\/strong>: Clearer code that&#8217;s easier to understand and modify<\/li>\n<li><strong>Enhanced Performance<\/strong>: Built-in 
optimizations and batch operations<\/li>\n<li><strong>Future-Proof Architecture<\/strong>: Aligned with .NET SDK for consistent cross-platform development<\/li>\n<li><strong>Richer Functionality<\/strong>: Hybrid search, advanced filtering, and integrated embeddings<\/li>\n<\/ul>\n<h2>Ready to Upgrade?<\/h2>\n<p>The migration path is well-documented (<a href=\"https:\/\/learn.microsoft.com\/en-us\/semantic-kernel\/support\/migration\/vectorstore-python-june-2025\">here<\/a>), and the benefits are immediate. Check out the comprehensive migration guide and explore the updated samples in <code>samples\/concepts\/memory\/<\/code> to see these changes in action.<\/p>\n<p>This release represents a significant step forward in making vector search more accessible and powerful while maintaining the flexibility developers need for sophisticated AI applications.<\/p>\n<p>As part of this release, we have also deprecated the following: the MemoryStore abstractions, the MemoryStore implementations, Semantic Text Memory, and the TextMemoryPlugin. The connectors have been moved to <code>semantic_kernel.connectors.memory_stores<\/code> so that you can still find them if you really need them, but they will be removed in August.<\/p>\n<p>The future of vector search in Semantic Kernel Python is here. \ud83c\udf1f<\/p>\n<hr \/>\n<p><em>Ready to experience the new vector store architecture? Update to Semantic Kernel Python 1.34 and start building with the improved API today.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>We&#8217;re excited to announce a significant update to Semantic Kernel Python&#8217;s vector store implementation. Version 1.34 brings a complete overhaul that makes working with vector data simpler, more intuitive, and more powerful. This update consolidates the API, improves developer experience, and adds new capabilities that streamline AI development workflows. What Makes This Release Special? 
The [&hellip;]<\/p>\n","protected":false},"author":150044,"featured_media":4898,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-4940","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-semantic-kernel"],"acf":[],"blog_post_summary":"<p>We&#8217;re excited to announce a significant update to Semantic Kernel Python&#8217;s vector store implementation. Version 1.34 brings a complete overhaul that makes working with vector data simpler, more intuitive, and more powerful. This update consolidates the API, improves developer experience, and adds new capabilities that streamline AI development workflows. What Makes This Release Special? The [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/4940","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/users\/150044"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/comments?post=4940"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/posts\/4940\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media\/4898"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/media?parent=4940"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/categories?post=4940"},{"taxonomy":"post_tag","embeddable"
:true,"href":"https:\/\/devblogs.microsoft.com\/agent-framework\/wp-json\/wp\/v2\/tags?post=4940"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}