Data is at the heart of every AI application, and efficient data ingestion is critical for success. With over 1,400 enterprise connectors, Logic Apps offers unmatched access to a diverse range of systems, applications, and databases, whether hosted in the cloud or on-premises. These connectors give businesses the flexibility to keep their data where it resides while seamlessly powering AI experiences.
By leveraging Azure Logic Apps‘ native capabilities, organizations can implement the Retrieval-Augmented Generation (RAG) pattern, enabling straightforward ingestion and retrieval of data from multiple sources to enrich AI-driven applications.
SQL as the vector store
Support for a dedicated vector data type in Azure SQL Database was recently announced in Public Preview, marking a significant step forward in data storage capabilities. Recognized for its robust, scalable, and secure infrastructure, Azure SQL continues to set the standard for enterprise-grade data solutions. But as AI evolves, so do the ways we interact with data. By transforming Azure SQL into a vector store, businesses can harness the power of relational databases to handle their existing data, enabling efficient document indexing and retrieval for Gen AI models. This approach opens doors to use cases where structured and unstructured data converge—without the need to adopt new search platforms.
Logic Apps for Document Ingestion and Retrieval
Building AI applications with business context awareness requires two fundamental steps. The first step is data ingestion—converting text data into vector embeddings and storing it in vector stores optimized for retrieval by large language models (LLMs). The second step is reasoning on this data using various LLMs to generate answers and insights.
In this blog post, we’ll demonstrate how Logic Apps can facilitate the data ingestion process to SQL to store data as vector embeddings. We’ll also cover how Logic Apps enables seamless data retrieval and integrates with Azure OpenAI’s LLMs for reasoning on the ingested data.
To demonstrate Logic Apps’ capabilities in this solution, we’ll use the RAG with Documents example from the Azure SQL DB Vector Search repository on GitHub. The table schema used is here.
Document Ingestion into SQL
In RAG (Retrieval-Augmented Generation), the ingestion process involves several stages to ensure that documents are processed, retrieved, and used effectively by generative AI models. Here’s a breakdown of each stage and how you can use Logic Apps Standard for them –
Data/Document Collection
Leverage 1400+ connectors in Logic Apps to gather relevant documents, datasets or other sources of information. You can read the data based on schedule or event (for example, when new files or records are added or updated at the data source).
Document Parsing/Chunking
Leverage Parse a document action to convert content, such as PDF document, CSV file, PPT and so on, into a tokenized string.
Leverage Chunk text action to split tokenized content into smaller, manageable chunks for processing in the subsequent steps by AI models. The action provides options to choose chunking strategy, token size, etc so that users can configure the chunks so that they are optimal size and in accordance to their AI models
Generating Embeddings
Leverage Azure Open AI connector, and specifically Generate Embeddings action to convert the tokenized chunks into vector embeddings. The embeddings represent text in a way that AI can understand and optimal for advanced similarity searches for efficient retrieval.
Store embeddings in vector store
Prepare data for ingestion using Select action by mapping the generated embeddings to the SQL table schema.
Write the embeddings into SQL table using the Execute Query action in SQL connector. The SQL query can be updated based on your specific table and its schema. Here is an example of the SQL action to insert records including vector embeddings in the SQL table
This completes all the steps required for document ingestion with out of box connectors and actions. Here is a sample workflow that triggers when a new file is added or updated in Blob Storage. The document is indexed into SQL database vector store with all out of box actions.
Chat with your Data – Retrieval
With your business data indexed in vector stores, you can now leverage that data to build contextually aware AI applications.
Here’s a breakdown of each stage and how Logic Apps facilitates the process:
Question The retrieval process is typically initiated by a question, which may come from chat agents, APIs, or other systems. Logic Apps supports this through its out-of-the-box connectors and actions, making it simple to receive and handle queries.
Prompt Based on the question, a prompt is generated. Using the SQL connector, you can write the Select query to perform a similarity search on the vector embeddings, retrieving the most relevant information.
Response Generation To produce a contextual and grounded response, the “Get Chat Completions” API from the Azure OpenAI connector is used. This API processes the prompt, vector search results, and system instructions to deliver an accurate answer from the LLM.
Below is a sample workflow that processes a question, retrieves relevant responses based on SQL vector search results, and enhances these results with insights from Azure OpenAI’s LLM.
Sample Response
Logic Apps workflow for retrieval uses HTTP trigger which creates a REST endpoint for that workflow. It can be invoked from any application, and we are using Postman to call the API
Getting Started
Are you ready to try out these capabilities? Everything you need to configure SQL and other pre-requisites such as Azure Open AI are available here – RAG with Documents example . Here are the workflow JSON for both the Logic Apps covered in this blog
Learn More
If you are new to Azure Logic Apps, here are few pointers to get started and use Logic Apps to accelerate developer productivity.
Is the source code for the Logic Apps posted anywhere? This is great and trying to recreate this for our AI Team to show them this new feature, but having a few issues getting everything to sync. I have the ingestion working, just not the query part. Would appreciate if we could get the Logic App to see how this was configured. Thanks, great article.
Hi Mike,
In two weeks, both these Logic Apps will be available as Templates (https://aka.ms/logicapps/templates) which you would be able to use from Logic Apps designer. For now, you can refer to the workflow json – here
Please let me know if you have any further questions.