How to enhance your chatbot so it can retrieve data from multiple data sources & orchestrate its own plan with C# Semantic Kernel, planner & Azure OpenAI – part 1
In this multi-part series, Jordan Bean shares how to enhance a chatbot to retrieve data from multiple data sources and orchestrate plans with C# Semantic Kernel, planner, and Azure OpenAI.
As discussed in the previous post about Azure OpenAI, the Retrieval Augmented Generation (RAG) pattern is a simple and effective way to enable Azure OpenAI to “chat with your data”. With this pattern, you search your data with Azure Cognitive Search (a search engine), retrieve relevant snippets of information, and add those snippets to the prompt you send to the Azure OpenAI service so that it can respond in a human-readable fashion.
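The search, retrieve, and augment steps described above can be sketched in a few lines. This is a minimal illustration only, written in Python for brevity rather than the C# used in the series. The `DOCUMENTS` corpus, the `search` function (a toy keyword match standing in for Azure Cognitive Search), and the prompt wording in `build_prompt` are all assumptions made for this example. The final call to Azure OpenAI is deliberately left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    source: str
    text: str


# Hypothetical in-memory corpus standing in for a real search index.
DOCUMENTS = [
    Snippet("benefits.pdf", "Employees receive 20 days of paid vacation per year."),
    Snippet("handbook.docx", "The office is open from 8am to 6pm on weekdays."),
    Snippet("benefits.pdf", "Dental coverage starts after 90 days of employment."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def search(query: str, top: int = 2) -> list[Snippet]:
    """Toy retrieval step: rank documents by how many query words they share.

    A real deployment would query a search service here instead.
    """
    terms = tokenize(query)
    scored = [(len(terms & tokenize(doc.text)), doc) for doc in DOCUMENTS]
    matches = [pair for pair in scored if pair[0] > 0]
    ranked = sorted(matches, key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked[:top]]


def build_prompt(question: str, snippets: list[Snippet]) -> str:
    """Augmentation step: prepend the retrieved snippets to the user's question."""
    sources = "\n".join(f"[{s.source}] {s.text}" for s in snippets)
    return (
        "Answer the question using only the sources below.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    question = "How many vacation days do employees get?"
    # The resulting prompt is what would be sent to the chat model.
    print(build_prompt(question, search(question)))
```

Note that the code itself does the parsing and prompt assembly, and that it knows about exactly one data source.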
This pattern is simple, effective and should always be the first thing you deploy. There are several excellent reference implementations posted on GitHub to help you do this.
- Azure-Samples/azure-search-openai-demo: A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure Cognitive Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. (github.com)
- Azure-Samples/azure-search-openai-demo-csharp: A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure Cognitive Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. (github.com)
This isn’t always enough
However, if you look at the implementation of the RAG pattern, you will notice that it can be limiting. There is only a single data source (Cognitive Search). Technically, Cognitive Search can index many data sources (PDFs, Word docs, etc.), but it is best at searching unstructured data. In addition, I have to parse the Cognitive Search results and append them to the prompt myself in code.
This leads to several questions:
- What if I want to use multiple data sources to retrieve the data necessary to answer the user’s question?
- What if I don’t know or don’t want to hard-code the order of operations for calling multiple data sources?
- What if I don’t want to write the parsing code for calling multiple data sources with disparate data formats?
- How can I let AI “orchestrate” the API calls to answer questions & pull together data that I couldn’t predict beforehand?
To solve these more complex issues, I need more than the RAG pattern.
Continue reading this post, as well as the full series on Jordan’s dev blog.