{"id":2916,"date":"2024-02-19T04:00:03","date_gmt":"2024-02-19T12:00:03","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/azure-sql\/?p=2916"},"modified":"2024-02-19T07:07:31","modified_gmt":"2024-02-19T15:07:31","slug":"nl2sql-with-langchain-and-azure-sql-database","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/azure-sql\/nl2sql-with-langchain-and-azure-sql-database\/","title":{"rendered":"NL2SQL with LangChain and Azure SQL Database"},"content":{"rendered":"<p><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Picture1-copy-scaled.jpeg\"><img decoding=\"async\" class=\"wp-image-2919 aligncenter\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Picture1-copy-scaled.jpeg\" alt=\"Image Picture1 copy\" width=\"611\" height=\"253\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Picture1-copy-scaled.jpeg 2500w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Picture1-copy-300x124.jpeg 300w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Picture1-copy-1024x424.jpeg 1024w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Picture1-copy-768x318.jpeg 768w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Picture1-copy-1536x636.jpeg 1536w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Picture1-copy-2048x848.jpeg 2048w\" sizes=\"(max-width: 611px) 100vw, 611px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/www.langchain.com\/\" target=\"_blank\" rel=\"noopener\">LangChain<\/a> is an open-source framework for creating applications that use and are powered by language models (LLM\/MLM\/SML). In this post, basic LangChain components (toolkits, chains, agents) will be used to create a natural language to SQL prompt that will allow interactions with an Azure SQL Database; just ask the database what you want as if speaking to another person.<\/p>\n<h2>Getting Started<\/h2>\n<p><div class=\"alert alert-info\"><p class=\"alert-divider\"><i class=\"fabric-icon fabric-icon--Info\"><\/i><strong>Where's the code?<\/strong><\/p>All the code that is used in this post is available in the Azure SQL AI samples GitHub repository located <a href=\"https:\/\/github.com\/Azure-Samples\/SQL-AI-samples\" target=\"_blank\" rel=\"noopener\">here<\/a>. The notebook outlined in this post is located <a href=\"https:\/\/github.com\/Azure-Samples\/SQL-AI-samples\/blob\/main\/AzureSQLDatabase\/LangChain\/dbOpenAI.ipynb\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/div><\/p>\n<p>Looking at the python notebook used for this sample, you may notice in the repository there are 2. One is using OpenAI and the other is using Azure OpenAI. This post will be using the OpenAI notebook. The differences are very minor with only change being a reference to where the Azure OpenAI endpoints are located.<\/p>\n<p><div class=\"alert alert-primary\">You can run this example in GitHub Codespace.<\/div><\/p>\n<p>The first section in the notebook sets up our python environment by installing the necessary libraries via a requirements.txt file.<\/p>\n<pre>pip install -r requirements.txt<\/pre>\n<p>Next, install the SQL ODBC driver and referenced in the notebook are links for Windows, MacOS, and Linux. Included in the repository is the script for Ubuntu because that&#8217;s the OS that is used with GitHub codespace; the easiest place to run this sample.<\/p>\n<p>Remember to fill out the values in the .env file. <span style=\"font-size: 1rem;text-align: var(--bs-body-text-align)\">For this example, all that needs to be filled out is the <strong>OPENAI_API_KEY<\/strong> and connection information for the <\/span><span style=\"font-size: 1rem;text-align: var(--bs-body-text-align)\"><strong>py-connectionString<\/strong> variable. <\/span><\/p>\n<div>\n<pre>AZURE_OPENAI_API_KEY=\"\"\r\nAZURE_OPENAI_ENDPOINT=\"\"\r\n<strong>OPENAI_API_KEY=\"\"<\/strong>\r\n<strong>py-connectionString=\"mssql+pyodbc:\/\/USERNAME:PASSWORD@SERVER_NAME.database.windows.net\/DATABASE_NAME?driver=ODBC+Driver+18+for+SQL+Server\"<\/strong><\/pre>\n<div><span style=\"font-size: 1rem;text-align: var(--bs-body-text-align)\">You can get your OpenAI API key <a href=\"https:\/\/platform.openai.com\/account\/api-keys\" target=\"_blank\" rel=\"noopener\">here<\/a> once logged into the platform.<\/span><\/div>\n<div><\/div>\n<\/div>\n<p>The last setup section for the LangChain with Azure SQL Database example is the table creation script.<\/p>\n<p><div class=\"alert alert-success\">When using NL2SQL agents, its best to connect with a user with as few roles\/privileges as possible for maximum security.<\/div><\/p>\n<div>\n<pre>create table [dbo].[langtable] (id int Identity, username nvarchar(100))\r\nGO\r\n\r\ninsert into [dbo].[langtable] (username) values('sammy')\r\ninsert into [dbo].[langtable] (username) values('mary')\r\ninsert into [dbo].[langtable] (username) values('jane')\r\ninsert into [dbo].[langtable] (username) values('fred')\r\ninsert into [dbo].[langtable] (username) values('billy')\r\ninsert into [dbo].[langtable] (username) values('jonny')\r\ninsert into [dbo].[langtable] (username) values('kenny')\r\ninsert into [dbo].[langtable] (username) values('dan')\r\ninsert into [dbo].[langtable] (username) values('frank')\r\ninsert into [dbo].[langtable] (username) values('jenny')\r\nGO\r\n\r\nselect * from [dbo].[langtable]\r\nGO<\/pre>\n<h2>The Notebook<\/h2>\n<p><div class=\"alert alert-warning\"><p class=\"alert-divider\"><i class=\"fabric-icon fabric-icon--Warning\"><\/i><strong>Select Kernel<\/strong><\/p>Be sure to select a kernel for the python notebook by use the <strong>Select Kernel<\/strong> button in the upper right of the notebook.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Screenshot-2024-02-16-at-11.28.19\u202fAM.png\"><img decoding=\"async\" class=\"alignnone size-full wp-image-2930\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Screenshot-2024-02-16-at-11.28.19\u202fAM.png\" alt=\"Image Screenshot 2024 02 16 at 11 28 19 AM\" width=\"603\" height=\"122\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Screenshot-2024-02-16-at-11.28.19\u202fAM.png 603w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Screenshot-2024-02-16-at-11.28.19\u202fAM-300x61.png 300w\" sizes=\"(max-width: 603px) 100vw, 603px\" \/><\/a><\/p>\n<p>If using VS Code\/Codespace, it will install the Python and Jupyter extensions and result in the kernel being set.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Screenshot-2024-02-16-at-11.33.18\u202fAM.png\"><img decoding=\"async\" class=\"alignnone size-full wp-image-2932\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2024\/02\/Screenshot-2024-02-16-at-11.33.18\u202fAM.png\" alt=\"Image Screenshot 2024 02 16 at 11 33 18 AM\" width=\"144\" height=\"45\" \/><\/a><\/div><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-size: 1rem;text-align: var(--bs-body-text-align)\">The first stop in the notebook is setting up the libraries to be used in the python script. The section also loads any environment variables we set in the .env file to be used later.<\/span><\/p>\n<\/div>\n<div>\n<pre>import pyodbc\r\nimport os\r\nfrom dotenv import load_dotenv\r\nfrom langchain.agents import create_sql_agent\r\nfrom langchain.agents.agent_types import AgentType\r\nfrom langchain.sql_database import SQLDatabase\r\nfrom langchain_community.agent_toolkits import SQLDatabaseToolkit\r\nfrom langchain_openai import OpenAI\r\nload_dotenv()<\/pre>\n<\/div>\n<div><\/div>\n<div>Once this section is run, you will be creating and testing the database connection. Using SQL Alchemy, a database engine is created. This engine is then added to a db variable that initializes the\u00a0<a href=\"https:\/\/python.langchain.com\/docs\/integrations\/toolkits\/sql_database#toolkit\" target=\"_blank\" rel=\"noopener\">LangChain SQL Database Toolkit<\/a> integration. The connection is then tested by printing out the results of 3 tests which include listing all tables the agent has access to and a SQL statement.<\/div>\n<div><\/div>\n<div>\n<div>\n<div>\n<pre>from sqlalchemy import create_engine\r\n\r\nconnectionString=os.environ[\"py-connectionString\"]\r\n\r\ndb_engine = create_engine(connectionString)\r\n\r\ndb = SQLDatabase(db_engine, view_support=True, schema=\"dbo\")\r\n\r\n# test the connection\r\nprint(db.dialect)\r\nprint(db.get_usable_table_names())\r\ndb.run(\"select convert(varchar(25), getdate(), 120)\")<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<p>The next section utilizes the\u00a0<a href=\"https:\/\/python.langchain.com\/docs\/integrations\/toolkits\/sql_database\" target=\"_blank\" rel=\"noopener\">LangChain SQL Database Toolkit<\/a> to create a LangChain SQL agent that uses OpenAI as the LLM and the database connection.<\/p>\n<div>\n<pre>toolkit = SQLDatabaseToolkit(db=db, llm=OpenAI(temperature=0))\r\n\r\nagent_executor = create_sql_agent(\r\nllm=OpenAI(temperature=0),\r\ntoolkit=toolkit,\r\nverbose=True,\r\nagent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\r\n)<\/pre>\n<\/div>\n<div>Finally, once all the components are set up, we can execute the prompt with natural language asking the database to count the rows in our test table.<\/div>\n<div><\/div>\n<pre><span style=\"color: #292b2c;font-family: Menlo, Monaco, Consolas, 'Liberation Mono', 'Courier New', monospace;font-size: 14.4px;text-align: var(--bs-body-text-align)\">agent_executor.invoke(\"count the rows in the langtable table.\")<\/span><\/pre>\n<div>\n<div><\/div>\n<div>with the results being similar to the following output:<\/div>\n<div><\/div>\n<div>\n<div class=\"output_container\" data-vscode-context=\"{&quot;preventDefaultContextMenuItems&quot;:true}\">\n<div id=\"4633454c-6a19-4e38-9243-76f8f5dc65bf\" class=\"output remove-padding output-stream\">\n<div>\n<pre><span class=\"code-bold\">&gt; Entering new AgentExecutor chain...<\/span> \r\n<span class=\"code-foreground-colored code-bold code-italic\">I need to use a SQL query to count the rows in the langtable table. <\/span>\r\n<span class=\"code-foreground-colored code-bold code-italic\">Action: sql_db_query <\/span>\r\n<span class=\"code-foreground-colored code-bold code-italic\">Action Input: SELECT COUNT(*) FROM langtable<\/span><span class=\"code-foreground-colored code-bold code-italic\">[(10,)]<\/span><span class=\"code-foreground-colored code-bold code-italic\">10 rows were counted in the langtable table. <\/span>\r\n<span class=\"code-foreground-colored code-bold code-italic\">Final Answer: 10<\/span> \r\n<span class=\"code-bold\">&gt; Finished chain.\r\n<\/span>{'input': 'count the rows in the langtable table.', 'output': '10'}<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<h2>Summary<\/h2>\n<div>You have just created your first LangChain with Azure SQL Database application using NL2SQL agents. As AI becomes more important in our development environments, being able to use structured data with an LLM is key to creating prompts that can answers the questions users ask more accurately.<\/div>\n","protected":false},"excerpt":{"rendered":"<p>LangChain is an open-source framework for creating applications that use and are powered by language models (LLM\/MLM\/SML). In this post, basic LangChain components (toolkits, chains, agents) will be used to create a natural language to SQL prompt that will allow interactions with an Azure SQL Database; just ask the database what you want as if [&hellip;]<\/p>\n","protected":false},"author":95874,"featured_media":2918,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[601,1,605,582],"tags":[590,602,510,465,469,30,449,579,561,407],"class_list":["post-2916","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-azure-sql","category-langchain","category-openai","tag-ai","tag-azure-openai","tag-azure-sql-database","tag-azuresql","tag-azuresqldb","tag-developers","tag-development","tag-langchain","tag-openai","tag-python"],"acf":[],"blog_post_summary":"<p>LangChain is an open-source framework for creating applications that use and are powered by language models (LLM\/MLM\/SML). In this post, basic LangChain components (toolkits, chains, agents) will be used to create a natural language to SQL prompt that will allow interactions with an Azure SQL Database; just ask the database what you want as if [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/2916","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/users\/95874"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/comments?post=2916"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/2916\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media\/2918"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media?parent=2916"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/categories?post=2916"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/tags?post=2916"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}