March 14th, 2025

Customer Case Study: Announcing the Microsoft Semantic Kernel Couchbase Connector

We’re thrilled to announce the launch of the Semantic Kernel Couchbase Vector Store Connector for .NET developers, created through our strategic partnership with Microsoft’s Semantic Kernel team. This powerful out-of-the-box connector transforms how developers integrate vector search capabilities into their AI applications.

What sets this connector apart is how it harnesses Couchbase’s distributed NoSQL platform alongside Semantic Kernel’s vector store abstractions, creating an integration that prioritizes both performance and developer experience.

The Semantic Kernel Couchbase Vector Store Connector eliminates traditional barriers between data storage and AI processing, giving developers the freedom to focus on creating intelligent, context-aware applications.

In this blog, we’ll explore how this connector enhances AI development workflows, demonstrating practical examples of how you can leverage Couchbase’s vector capabilities to build more responsive, data-driven AI agents.

Microsoft Semantic Kernel and Couchbase

Semantic Kernel is a lightweight, open-source development kit that lets you easily build AI agents and integrate AI models into your codebase. It serves as efficient middleware that enables rapid delivery of enterprise-grade solutions. By combining prompts with existing APIs, Semantic Kernel allows AI models to perform actions through your code: when the model decides a function is needed, Semantic Kernel translates that request into a call on your code and returns the results to the model. Its modular design lets you add your existing code as plugins, maximizing your investment through flexible integration.

The new Couchbase connector extends these capabilities by seamlessly integrating Couchbase—whether via Couchbase Server or Couchbase Capella—into the Semantic Kernel environment. It delivers efficient data storage, retrieval, and similarity search for high-dimensional embeddings using robust indexing and flexible JSON document management, enabling developers to quickly build responsive, enterprise-grade AI applications with minimal code changes.

Prerequisites

Create and Deploy Your Free Tier Operational cluster on Capella

To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraint.

To learn more, please follow the instructions.

Couchbase Capella Configuration

When running Couchbase using Capella, the following prerequisites need to be met.

  • Create the database credentials to access the travel-sample bucket (Read and Write) used in the application.
  • Allow access to the Cluster from the IP on which the application is running.

Creating Vector Search Index

To create a vector search index, please follow the instructions. Here is a sample index definition you can use for the RAG example below.

In this sample, the index is configured on the travel-sample bucket within the inventory scope, where a collection named semantickernel is created. This collection is used to store the data for the RAG example. The index configuration allows you to set the similarity function—such as dot product—and specify the dimensions of the embedding field (e.g., 1536). This flexibility lets you tailor the index to your application’s specific requirements.
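As a sketch of what that configuration looks like in the index definition JSON, the vector field mapping might resemble the following (the field name `descriptionEmbedding` matches the data model used later in this post; adjust the dimensions and similarity function to your embedding model):

```json
{
  "types": {
    "inventory.semantickernel": {
      "properties": {
        "descriptionEmbedding": {
          "dynamic": false,
          "enabled": true,
          "fields": [
            {
              "dims": 1536,
              "index": true,
              "name": "descriptionEmbedding",
              "similarity": "dot_product",
              "type": "vector"
            }
          ]
        }
      }
    }
  }
}
```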

If you wish to map the IsFilterable and IsFullTextSearchable attributes, you can configure them as shown below. In this example, the HotelName field is set to be filterable by using the keyword analyzer for exact matching, while the Description field is configured for full-text search using the standard analyzer. For more information on available analyzers, please refer to the documentation.

{
  "types": {
    "inventory.semantickernel": {
      "dynamic": false,
      "enabled": true,
      "properties": {
        "description": {
          "dynamic": false,
          "enabled": true,
          "fields": [
            {
              "analyzer": "standard",  // using standard analyzer for full-text search
              "index": true,
              "name": "description",
              "store": true,
              "type": "text"
            }
          ]
        },
        "hotelName": {
          "dynamic": false,
          "enabled": true,
          "fields": [
            {
              "analyzer": "keyword",  // using keyword analyzer for exact matching
              "index": true,
              "name": "hotelName",
              "store": true,
              "type": "text"
            }
          ]
        }
        // ... other fields ...
      }
    }
  }
}

High-Level Scenario & RAG Example

In this demonstration, we build a Retrieval Augmented Generation (RAG) application:

  • User Input: A user submits a question.
  • Processing: The application generates an embedding for the question and retrieves relevant entries from the Couchbase vector store.
  • Output: The LLM uses the retrieved context to generate a detailed answer, including data source references.

For instance, imagine an application where users can query a hotel database. In our demo, we generated 100 sample hotel entries—an intentionally small dataset so you can easily try out the connector. In a production scenario, however, the connector's performance and scalability advantages become far more apparent, especially when handling very large datasets.

The complete demo application is available in the Couchbase vector store connector repository.

Project Setup

Add the required NuGet packages:

dotnet add package "CouchbaseConnector.SemanticKernel" -v 0.2.2
dotnet add package "Microsoft.Extensions.Hosting" -v 9.0.2
dotnet add package "Microsoft.SemanticKernel.PromptTemplates.Handlebars" -v 1.40.0

Include the necessary using directives:

using System.Text.Json.Serialization;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Data;
using Microsoft.SemanticKernel.Embeddings;
using Microsoft.SemanticKernel.PromptTemplates.Handlebars;

Defining the Data Model

Create your data model for a hotel record:

/// <summary>
/// Data model for storing a "hotel" with a name, a description, a description embedding and an optional reference link.
/// </summary>
public sealed record Hotel
{
    [VectorStoreRecordKey]
    [JsonPropertyName("hotelId")]
    public required string HotelId { get; set; }

    [TextSearchResultName]
    [VectorStoreRecordData]
    [JsonPropertyName("hotelName")]
    public required string HotelName { get; set; }

    [TextSearchResultValue]
    [VectorStoreRecordData]
    [JsonPropertyName("description")]
    public required string Description { get; set; }

    [VectorStoreRecordVector(Dimensions: 1536, DistanceFunction: DistanceFunction.DotProductSimilarity)]
    [JsonPropertyName("descriptionEmbedding")]
    public ReadOnlyMemory<float> DescriptionEmbedding { get; set; }

    [TextSearchResultLink]
    [VectorStoreRecordData]
    [JsonPropertyName("referenceLink")]
    public string? ReferenceLink { get; set; }
}

The VectorStore* attributes define the storage schema, while the TextSearch* attributes enable dynamic text search in prompt templates.

Initializing the Semantic Kernel Engine

Set up the engine and register Couchbase settings (loaded from appsettings.json):

var builder = Host.CreateApplicationBuilder(args);
// Add configuration from appsettings.json
var couchbaseConfig = builder.Configuration.GetSection("Couchbase");
// Register AI services.
var kernelBuilder = builder.Services.AddKernel();
kernelBuilder.AddAzureOpenAIChatCompletion("gpt-4o", "https://my-service.openai.azure.com", "my_token");
kernelBuilder.AddAzureOpenAITextEmbeddingGeneration("ada-002", "https://my-service.openai.azure.com", "my_token");
// Register text search service.
kernelBuilder.AddVectorStoreTextSearch<Hotel>();
// Register Couchbase Vector Store using provided extensions.
builder.Services.AddCouchbaseFtsVectorStoreRecordCollection<Hotel>(
    connectionString: couchbaseConfig["ConnectionString"],
    username: couchbaseConfig["Username"],
    password: couchbaseConfig["Password"],
    bucketName: couchbaseConfig["BucketName"],
    scopeName: couchbaseConfig["ScopeName"],
    collectionName: couchbaseConfig["CollectionName"],
    options: new CouchbaseFtsVectorStoreRecordCollectionOptions<Hotel>
    {
        IndexName = couchbaseConfig["IndexName"]
    });
// Build the host.
using var host = builder.Build();
// Access services directly (for demo purposes).
var kernel = host.Services.GetService<Kernel>()!;
var embeddings = host.Services.GetService<ITextEmbeddingGenerationService>()!;
var vectorStoreCollection = host.Services.GetService<IVectorStoreRecordCollection<string, Hotel>>()!;
// Register search plugin.
var textSearch = host.Services.GetService<VectorStoreTextSearch<Hotel>>()!;
kernel.Plugins.Add(textSearch.CreateWithGetTextSearchResults("SearchPlugin"));
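For reference, the Couchbase section read above could be defined in appsettings.json along these lines (all values are placeholders—use your own cluster's connection string, credentials, and the name of the vector search index you created earlier):

```json
{
  "Couchbase": {
    "ConnectionString": "couchbases://cb.<your-endpoint>.cloud.couchbase.com",
    "Username": "<your-username>",
    "Password": "<your-password>",
    "BucketName": "travel-sample",
    "ScopeName": "inventory",
    "CollectionName": "semantickernel",
    "IndexName": "<your-vector-index-name>"
  }
}
```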

Ingesting Demo Data

Load demo hotel records (CSV format: ID;Hotel Name;Description;Reference Link) and ingest them into the vector store:

// CSV format: ID;Hotel Name;Description;Reference Link
var hotels = (await File.ReadAllLinesAsync(filePath))
             .Select(x => x.Split(';'));

foreach (var chunk in hotels.Chunk(25))
{
    var descriptionEmbeddings = await embeddings.GenerateEmbeddingsAsync(chunk.Select(x => x[2]).ToArray());
             
    for (var i = 0; i < chunk.Length; ++i)
    {
        var hotel = chunk[i];
        await vectorStoreCollection.UpsertAsync(new Hotel
        {
            HotelId = hotel[0],
            HotelName = hotel[1],
            Description = hotel[2],
            DescriptionEmbedding = descriptionEmbeddings[i],
            ReferenceLink = hotel[3]
        });
    }
}

The embeddings.GenerateEmbeddingsAsync() method transparently calls the Azure OpenAI text embedding generation service configured earlier.

Invoking the AI Agent

Invoke the LLM with a prompt that integrates search results:

var response = await kernel.InvokePromptAsync(
  promptTemplate: """
                  Please use this information to answer the question:
                  {{#with (SearchPlugin-GetTextSearchResults question)}}
                    {{#each this}}
                      Name: {{Name}}
                      Value: {{Value}}
                      Source: {{Link}}
                      -----------------
                    {{/each}}
                  {{/with}}

                  Include the source of relevant information in the response.

                  Question: {{question}}
                  """,
  arguments: new KernelArguments
  {
      { "question", "Please show me all hotels that have a rooftop bar." },
  },
  templateFormat: "handlebars",
  promptTemplateFactory: new HandlebarsPromptTemplateFactory());
Console.WriteLine(response.ToString());

The TextSearch* attributes we set in our data model automatically populate the prompt template's placeholders with the relevant information from the vector store. When a user asks, for example, “Please show me all hotels that have a rooftop bar,” the application generates an embedding for the question, searches the vector store, and constructs the answer from the template. Running the code and printing the response via Console.WriteLine(response.ToString()) produces output similar to this:

– **Skyline Suites**: Offering panoramic city views from every suite, this hotel includes a rooftop bar. [Source](https://example.com/yz567)

This result correctly corresponds to the hotel record in our CSV file:

9;Skyline Suites;Offering panoramic city views from every suite, this hotel is perfect for those who love the urban landscape. Enjoy luxurious amenities, a rooftop bar, and close proximity to attractions. Luxurious and contemporary.;https://example.com/yz567

This demonstration highlights how Microsoft Semantic Kernel’s well-designed abstractions simplify the integration process while offering great flexibility. The framework’s robust features—such as the InvokePrompt function, along with its templating and search plugin systems—streamline the development of intelligent, responsive applications.

What’s next?

  • Hybrid Search with Semantic Kernel: We’re thrilled to partner with Microsoft to bring hybrid search and advanced retrieval strategies to Semantic Kernel developers in the near future.

The Microsoft Semantic Kernel Couchbase Connector is a transformative tool for AI agent development. By merging Couchbase’s robust, scalable storage with the abstractions of Microsoft Semantic Kernel, developers can rapidly build intelligent, responsive applications with minimal overhead. Explore the complete demo in the Couchbase vector store connector repository and join us as we continue to innovate in the realm of AI-driven solutions.
