Vanna.ai and Microsoft SQL
Vanna.ai, in partnership with Azure SQL Database Product Management, bring you a collaboration for using Vanna’s natural language to SQL (NL2SQL) agent with the Azure SQL Database.
How does Vanna.ai work?
First, Vanna is an open-source Python RAG (Retrieval-Augmented Generation) framework for SQL generation/natural language to SQL (NL2SQL) and provides a fully hosted platform for application development. Get started for free here.
(From the documentation)
At its core, Vanna is a Python package that uses retrieval augmentation to help you generate accurate SQL queries for your database using LLMs.
Vanna works in two easy steps – train a RAG “model” on your data, and then ask questions which will return SQL queries that can be set up to automatically run on your database.
BYOLLM
Already have an LLM and want to bring your own? Or have a vector DB for storage? Or both? No problem! Easily configure an environment here and out comes the python notebook needed to get started (Vanna has created notebooks for EVERY combination!).
Vanna.ai has always been simple to use, but now, with the addition of Microsoft SQL Server/Azure SQL as a database connecter, it’s even easier.
Why use AI to chat with your database in the first place?
To start, it allows you spend less time writing SQL and more time generating insights on your data. You no longer have to be an expert in SQL to get information from the database. And if you are an expert, save some time writing a query when you can just ask for it. In fact, everyone benefits by being able to ask the database about business questions and getting the answer; have Vanna find the relevant tables and columns across your data estate.
And why Vanna?
With it’s retrieval framework and feedback loop, allows you to quickly train up an “AI Analyst” in much the same way that you would onboard a human analyst by supervising it. While using it, you can give instant feedback if the generated SQL was correct and have those answers help train the AI for future questions. And as will be seen in the blog, Vanna can be used as a standalone web app, in Jupyter, as part of an embedded widget, or as part of a MS Teams bot. See examples of this at https://github.com/vanna-ai.
Vanna x SQL x Codespace
This blog post will go over creating a NL2SQL agent using Vanna.ai and SQL Server/Azure SQL using a python notebook and a GitHub codespace. The notebook will be based off of the one hosted by Vanna here which uses Vanna’s hosted LLM and vector database with SQL Server/Azure SQL. The codespace will also be using a local free instance of SQL Server using docker in docker loaded with the Adventure Works sample data.
Get started by:
- Going to this GitHub repository
- Clicking on the green Code button
- Clicking the 3 dots
- Selecting + New with options.
Then on the following page:
- Click the Default project configuration button
- Select Python (we will be adding many more configurations as we add to this repository)
- Click the green Create codespace button
Once the codespace has started, use the terminal to run the following code to install sqlcmd:
curl https://packages.microsoft.com/keys/microsoft.asc | sudo tee /etc/apt/trusted.gpg.d/microsoft.asc sudo add-apt-repository "$(wget -qO- https://packages.microsoft.com/config/ubuntu/20.04/prod.list)" sudo apt-get update sudo apt-get install sqlcmd
then, create the database with the following command:
sqlcmd create mssql --accept-eula --using https://aka.ms/AdventureWorksLT.bak
and once the database is finished creating, the connection string for ODBC will be needed for the notebook, run the following command to get the connection string:
sqlcmd config connection-strings
sss
Security Check
In the connect string, there is TrustServerCertificate=yes. For local development environments, this is OK but NEVER use this setting in live/production environments. More on this hereOne of the files included in this directory is the odbcDriverInstallUbuntu.txt file. This file contains the commands for installing the Microsoft ODBC driver for SQL Server on Ubuntu (Linux). Run the contents of this file in the terminal to install the driver. Once installed, run the following command for checking the version to ensure it has been installed successfully:
more /etc/odbcinst.ini
Open your notebook to…
Now that the database is created with the Adventure Works data, open the vanna_and_sql.ipynb notebook in the code editor by clicking on it.
The first Setup step installs the vanna and pyodbc python libraries to be used in the notebook. Click on the Execute Cell button to run this command.
Choose a Kernel
You may see a dialog box asking you to choose a Python Kernal for the notebook. Choose python environment for the first choice and then the python version installed in this codespace.Next, execute the following cell that prepares the libraries to be used for the python commands that follow:
The last cell in the Setup section of the notebook sets the Vanna API key and model name to be used for the Vanna hosted LLM and vector database. Remember, you can sign up for these free services here. Once you enter this information, execute the cell with the run button.
The next step in the notebook defines the database connection. Vanna has help out MS SQL/Azure SQL users here with vn.connect_to_mssql. All that needs to be passed to this function is the ODBC string that was shown previously by running sqlcmd config connection-strings. Copy and paste the ODBC connection string into this cell of the notebook then run it.
Training Grounds
Moving to the training section, if the previous sections were filled out correctly, executing this cell should result in seeing all the tables in the database that the user in the connection string allows (which should be all of Adventure Works tables and views).
The next cell uses some training to remind the NL2SQL agent to use tables with their schema names. As before, execute this cell:
Free Apps!
Part of the Vanna python package is a flask app that can be used as a chat session. Execute this cell to start the flask application:
Once the application has started, a dialog box will appear at the bottom of the code space:
Click the Open in Browser button to try out the application.
Using the chat box, ask the application a question about the Adventure Work data. For example, “What are the top 10 products sold and their colors? Format any currency columns as dollars.”.
The result is the SQL query itself, a table with the data as asked:
as well as a nice chart!
Wrap-up
Again, a huge thank you to Vanna for adding the simple, one line of code function for accessing SQL databases. It makes an already easy to use NL2SQL tool even easier (if that was possible!). The notebook sample is in the Azure SQL AI Samples repository here for anyone to try out and start using with their applications. As always, tell us how it goes, and we appreciate any feedback you may have.
0 comments