LangChain is an open-source framework for creating applications that use and are powered by language models (LLM/MLM/SML). In this post, basic LangChain components (toolkits, chains, agents) will be used to create a natural language to SQL prompt that will allow interactions with an Azure SQL Database; just ask the database what you want as if speaking to another person.
Getting Started
Where's the code?
All the code that is used in this post is available in the Azure SQL AI samples GitHub repository located here. The notebook outlined in this post is located here.Looking at the python notebook used for this sample, you may notice in the repository there are 2. One is using OpenAI and the other is using Azure OpenAI. This post will be using the OpenAI notebook. The differences are very minor with only change being a reference to where the Azure OpenAI endpoints are located.
The first section in the notebook sets up our python environment by installing the necessary libraries via a requirements.txt file.
pip install -r requirements.txt
Next, install the SQL ODBC driver and referenced in the notebook are links for Windows, MacOS, and Linux. Included in the repository is the script for Ubuntu because that’s the OS that is used with GitHub codespace; the easiest place to run this sample.
Remember to fill out the values in the .env file. For this example, all that needs to be filled out is the OPENAI_API_KEY and connection information for the py-connectionString variable.
AZURE_OPENAI_API_KEY="" AZURE_OPENAI_ENDPOINT="" OPENAI_API_KEY="" py-connectionString="mssql+pyodbc://USERNAME:PASSWORD@SERVER_NAME.database.windows.net/DATABASE_NAME?driver=ODBC+Driver+18+for+SQL+Server"
The last setup section for the LangChain with Azure SQL Database example is the table creation script.
create table [dbo].[langtable] (id int Identity, username nvarchar(100)) GO insert into [dbo].[langtable] (username) values('sammy') insert into [dbo].[langtable] (username) values('mary') insert into [dbo].[langtable] (username) values('jane') insert into [dbo].[langtable] (username) values('fred') insert into [dbo].[langtable] (username) values('billy') insert into [dbo].[langtable] (username) values('jonny') insert into [dbo].[langtable] (username) values('kenny') insert into [dbo].[langtable] (username) values('dan') insert into [dbo].[langtable] (username) values('frank') insert into [dbo].[langtable] (username) values('jenny') GO select * from [dbo].[langtable] GO
The Notebook
Select Kernel
Be sure to select a kernel for the python notebook by use the Select Kernel button in the upper right of the notebook.If using VS Code/Codespace, it will install the Python and Jupyter extensions and result in the kernel being set.
The first stop in the notebook is setting up the libraries to be used in the python script. The section also loads any environment variables we set in the .env file to be used later.
import pyodbc import os from dotenv import load_dotenv from langchain.agents import create_sql_agent from langchain.agents.agent_types import AgentType from langchain.sql_database import SQLDatabase from langchain_community.agent_toolkits import SQLDatabaseToolkit from langchain_openai import OpenAI load_dotenv()
from sqlalchemy import create_engine connectionString=os.environ["py-connectionString"] db_engine = create_engine(connectionString) db = SQLDatabase(db_engine, view_support=True, schema="dbo") # test the connection print(db.dialect) print(db.get_usable_table_names()) db.run("select convert(varchar(25), getdate(), 120)")
The next section utilizes the LangChain SQL Database Toolkit to create a LangChain SQL agent that uses OpenAI as the LLM and the database connection.
toolkit = SQLDatabaseToolkit(db=db, llm=OpenAI(temperature=0)) agent_executor = create_sql_agent( llm=OpenAI(temperature=0), toolkit=toolkit, verbose=True, agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION, )
agent_executor.invoke("count the rows in the langtable table.")
> Entering new AgentExecutor chain... I need to use a SQL query to count the rows in the langtable table. Action: sql_db_query Action Input: SELECT COUNT(*) FROM langtable[(10,)]10 rows were counted in the langtable table. Final Answer: 10 > Finished chain. {'input': 'count the rows in the langtable table.', 'output': '10'}
Brian, i’m trying to do the same thing but using the Azure AI Assistant and code interpreter, but there are no data connections to SQL (yet..?)any ideas if its possible and if so, how to accomplish that?
We are looking into this and how best to have Azure SQL be present in AI Studio. Stay tuned!