{"id":5898,"date":"2025-08-29T08:00:06","date_gmt":"2025-08-29T15:00:06","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/azure-sql\/?p=5898"},"modified":"2025-11-10T11:12:47","modified_gmt":"2025-11-10T19:12:47","slug":"using-sql-servers-new-ai-features-and-python-to-create-a-t-sql-assistant","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/azure-sql\/using-sql-servers-new-ai-features-and-python-to-create-a-t-sql-assistant\/","title":{"rendered":"Using SQL Server&#8217;s new AI features and Python to create a T-SQL assistant"},"content":{"rendered":"<p style=\"text-align: left;\"><div class=\"alert alert-info\"><p class=\"alert-divider\"><i class=\"fabric-icon fabric-icon--Info\"><\/i><strong>Guest Post<\/strong><\/p>Rodrigo Ribeiro Gomes, Head of Innovation at Power Tuning, has been leveraging the new AI features in the SQL Server engine to simplify the lives of DBAs and developers. He generously shared his experience with us: thank you, Rodrigo!<\/div><\/p>\n<p>I recently created a <a href=\"https:\/\/github.com\/rrg92\/sqlserver-lib\/tree\/main\">GitHub repository<\/a> to share the T-SQL scripts I&#8217;ve accumulated over 10 years as a DBA. However, I&#8217;ve always struggled to find the right script when I need it. For years, I relied on simple shell commands like <code>find<\/code> or just my memory, which could take 10-30 minutes if I was lucky enough to find the script at all.<\/p>\n<p>But we are in the AI age, so I created <a href=\"https:\/\/huggingface.co\/spaces\/rrg92\/sqlserver-lib-assistant\">a simple web application<\/a> where you type what you want, and it finds the best scripts for your needs. 
The solution was created using the following technologies:<\/p>\n<ul>\n<li>The <a href=\"https:\/\/www.gradio.app\/\">Gradio Python library<\/a>, for Chat UI and REST API access<\/li>\n<li>Open-source AI models, using the <a href=\"https:\/\/pypi.org\/project\/sentence-transformers\/\">SentenceTransformers Python library<\/a> for generating embeddings and reranking<\/li>\n<li><a href=\"https:\/\/huggingface.co\">Hugging Face<\/a> for hosting and the <a href=\"https:\/\/huggingface.co\/docs\/hub\/en\/spaces-zerogpu\">ZeroGPU<\/a> feature, which allows access to high-end GPUs at low cost<\/li>\n<li><a href=\"https:\/\/ai.google.dev\/gemini-api\/docs\/openai\">Google Gemini<\/a> model, to generate answers<\/li>\n<li><a href=\"https:\/\/github.com\/features\/actions\">GitHub Actions<\/a> to trigger PowerShell code that will index files in the GitHub repo<\/li>\n<li><a href=\"https:\/\/powersh.ai\/\">PowershAI module<\/a> to access Gradio<\/li>\n<li><a href=\"https:\/\/azure.microsoft.com\/en-us\/products\/azure-sql\/database\">Azure SQL Database<\/a> with new AI support, where we index the script file contents and search by meaning<\/li>\n<\/ul>\n<p>In this post, I will focus on SQL Server&#8217;s role, exploring its new AI features and their importance to this solution. 
If you would like a more detailed post that explains not only SQL, but also the Python code and AI models, read the expanded article version on <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/an-ai-powered-t-sql-assistant-built-with-python-and-sql-server\/\">Simple Talk<\/a>.<\/p>\n<p>Here is a diagram showing all the main components of the project (in this post we will focus on just how SQL was used):<\/p>\n<p><figure id=\"attachment_5901\" aria-labelledby=\"figcaption_attachment_5901\" class=\"wp-caption aligncenter\" ><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/Diagram-1.png\"><img decoding=\"async\" class=\"wp-image-5901\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/Diagram-1-300x217.png\" alt=\"A workflow diagram showing Rodrigo pushing code to GitHub, which triggers a PowerShell script that stores embeddings in Azure SQL Database. The data is processed with Hugging Face models for embedding, search, and reranking, then connected to Google Gemini\/OpenAI API, where the user interacts via a chat interface\" width=\"600\" height=\"433\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/Diagram-1-300x217.png 300w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/Diagram-1-768x554.png 768w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/Diagram-1.png 1024w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/a><figcaption id=\"figcaption_attachment_5901\" class=\"wp-caption-text\">Diagram showing main components of solution: SQL Database, GitHub Actions and Hugging Face Space, with python, models and GPUs<\/figcaption><\/figure><\/p>\n<h2>Indexing the data<\/h2>\n<p>This project is essentially a <a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/rag-with-azure-sql\/\">RAG (Retrieval Augmented Generation)<\/a> 
project where I need to locate the most pertinent scripts from my repository based on the text a user inputs in the chat interface. To accomplish this, we must search for these relevant scripts and utilize a large language model (LLM) to generate a response based on the content of the identified scripts.<\/p>\n<p>The first part of a project like this is indexing the data (our scripts\u2019 content). To do that, we need something called embeddings, which are basically arrays of floating-point numbers generated by an AI model from the input text; these values represent the meaning of the text. The flow here is very simple: when I push a specific tag (I chose the pattern <em>embed-vNN<\/em>) to the Git repo, a PowerShell script called <a href=\"https:\/\/github.com\/rrg92\/sqlserver-lib\/blob\/main\/SqlLibEmbeddings\/embed.ps1\">embed.ps1<\/a> will generate embeddings for each script:<\/p>\n<p><figure id=\"attachment_5908\" aria-labelledby=\"figcaption_attachment_5908\" class=\"wp-caption aligncenter\" ><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/foreach-code.png\"><img decoding=\"async\" class=\"wp-image-5908 size-full\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/foreach-code.png\" alt=\"PowerShell script that loops through files, prints each file name, reads its content, builds an embedding input string, generates embeddings, logs the embedding length, and stores a custom object with file path, chunk number, content, and JSON-compressed embeddings.\" width=\"725\" height=\"530\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/foreach-code.png 725w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/foreach-code-300x219.png 300w\" sizes=\"(max-width: 725px) 100vw, 725px\" \/><\/a><figcaption id=\"figcaption_attachment_5908\" class=\"wp-caption-text\">This is a snippet from embed.ps1, 
where it loops over the current files from the repo and generates embeddings for each file.<\/figcaption><\/figure><\/p>\n<p>The GetEmbeddings function connects to my Hugging Face Space using a PowerShell module I created called <a href=\"https:\/\/powersh.ai\/\">PowershAI<\/a>, which simplifies this process. This approach centralizes embedding generation in the Hugging Face Space.<\/p>\n<p>After generating all the embeddings, I insert them into the Scripts table (created by <a href=\"https:\/\/github.com\/rrg92\/sqlserver-lib\/blob\/main\/SqlLibEmbeddings\/tab.Scripts.sql\">tab.Scripts.sql<\/a>) using a bulk operation (via the .NET method <a href=\"https:\/\/learn.microsoft.com\/en-us\/dotnet\/api\/microsoft.data.sqlclient.sqlbulkcopy.writetoserver?view=sqlclient-dotnet-standard-5.2\">WriteToServer<\/a> from the legacy SqlClient class). This is the table structure:<\/p>\n<pre class=\"prettyprint language-sql\"><code class=\"language-sql\">\r\nCREATE TABLE Scripts (\r\n\t id int IDENTITY PRIMARY KEY WITH(DATA_COMPRESSION = PAGE)\r\n\t,RelPath varchar(1000) NOT NULL\r\n\t,ChunkNum int NOT NULL\r\n\t,ChunkContent nvarchar(max) NOT NULL\r\n\t,embeddings vector(1024)\r\n)<\/code><\/pre>\n<p>Note the data type of the <strong><em>embeddings<\/em>\u00a0<\/strong>column. It is a <em><strong>vector(1024)<\/strong><\/em>. The 1024 represents the number of positions in the array (also called the number of dimensions). This value depends on the model used to generate the embeddings. More dimensions do not necessarily mean higher quality; again, it depends on the AI model.<\/p>\n<p>For this project, I chose this AI model: <a href=\"https:\/\/huggingface.co\/mixedbread-ai\/mxbai-embed-large-v1\"><strong>https:\/\/huggingface.co\/mixedbread-ai\/mxbai-embed-large-v1<\/strong><\/a><strong>. <\/strong>I use the powershai module to connect to a Hugging Face Space via the Gradio API and generate embeddings. 
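<\/p>\n<p>As a rough sketch (not the author&#8217;s actual code, which is linked above), generating one embedding with the SentenceTransformers library and preparing it for the <em>vector(1024)<\/em> column could look like this. The model name comes from the link above; the vector type accepts a JSON-style array of floats:<\/p>

```python
import json

# In the real project, the embedding comes from SentenceTransformers:
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
#   embedding = model.encode("cpu performance").tolist()  # 1024 floats
# A dummy vector is used here so the sketch runs without downloading the model.
embedding = [0.0] * 1024

# The number of values must match the column definition: vector(1024).
assert len(embedding) == 1024

# SQL Server's vector type accepts a JSON array string, so a row can be
# bulk-loaded with the embedding serialized like this:
embedding_json = json.dumps(embedding)
row = ("CPU/CPUDelta.sql", 1, "-- script content here", embedding_json)
print(row[3][:12])  # start of the JSON array
```

<p>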
Later, I will show the code I used to generate the embeddings.<\/p>\n<p>The results are then bulk-copied into the Scripts table. This is what the table looks like:<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/Picture4.png\"><img decoding=\"async\" class=\"aligncenter size-large wp-image-5912\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/Picture4-1024x525.png\" alt=\"Rows of the Scripts table, including the embeddings column\" width=\"1024\" height=\"525\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/Picture4-1024x525.png 1024w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/Picture4-300x154.png 300w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/Picture4-768x394.png 768w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/Picture4.png 1034w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/p>\n<p>Now we have a process that keeps the table updated whenever I update the repository. Next, let&#8217;s see how I use that data to search and generate a summary for these scripts.<\/p>\n<p><div class=\"alert alert-success\"><p class=\"alert-divider\"><i class=\"fabric-icon fabric-icon--Lightbulb\"><\/i><strong>Try it in local SQL Server 2025<\/strong><\/p>If you would like to test this on a local SQL Server 2025 instance, I created <a href=\"https:\/\/github.com\/rrg92\/sqlserver-lib\/blob\/main\/SqlLibEmbeddings\/IndexData2025.sql\">this script in my git repo<\/a>. 
It uses new SQL Server 2025 features, such as the <a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/relational-databases\/system-stored-procedures\/sp-invoke-external-rest-endpoint-transact-sql?view=sql-server-ver17&amp;tabs=request-headers\">sp_invoke_external_rest_endpoint<\/a> stored procedure and <a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/t-sql\/statements\/create-external-model-transact-sql?view=sql-server-ver17\">CREATE EXTERNAL MODEL<\/a>, to connect to the GitHub API, list files, and generate embeddings using the same embedding function provided by my space. Just follow the comments.<\/div><\/p>\n<h2>Searching Data<\/h2>\n<p>Now that we have a table with script content and embeddings, we can use the new SQL Server <a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/t-sql\/functions\/vector-distance-transact-sql?view=sql-server-ver17\">VECTOR_DISTANCE<\/a> function to find the most relevant data. In short, the complete process for finding the data is (you can check all that code in file <a href=\"https:\/\/huggingface.co\/spaces\/rrg92\/sqlserver-lib-assistant\/blob\/main\/app.py\">app.py<\/a>):<\/p>\n<ol>\n<li>I use the Gradio Python library to build a <a href=\"https:\/\/huggingface.co\/spaces\/rrg92\/sqlserver-lib-assistant\/blob\/main\/app.py#L397\">Chat UI<\/a>, where the user will send messages<\/li>\n<li>When a user sends a message, <a href=\"https:\/\/huggingface.co\/spaces\/rrg92\/sqlserver-lib-assistant\/blob\/main\/app.py#L213\">I translate the message to English<\/a> using an LLM and <a href=\"https:\/\/huggingface.co\/spaces\/rrg92\/sqlserver-lib-assistant\/blob\/main\/app.py#L106\">generate embeddings of the translated message<\/a> (using the SentenceTransformers library), the same method used to index the data. 
In my tests, I found that using translated English messages yielded better results for the model I am using.<\/li>\n<li>Next, I connect to the Azure SQL database using the <a href=\"https:\/\/pypi.org\/project\/pymssql\/\">pymssql<\/a> library and <a href=\"https:\/\/huggingface.co\/spaces\/rrg92\/sqlserver-lib-assistant\/blob\/main\/app.py#L133\">execute an ad-hoc query<\/a> to retrieve the most relevant scripts by comparing the embeddings generated in the previous step.<\/li>\n<li>Then, I use an AI model with the SentenceTransformers library <a href=\"https:\/\/huggingface.co\/spaces\/rrg92\/sqlserver-lib-assistant\/blob\/main\/app.py#L243\">to rerank the results<\/a>, giving a more precise determination of which script is related to the user&#8217;s question.<\/li>\n<li>Finally, I pass the <a href=\"https:\/\/huggingface.co\/spaces\/rrg92\/sqlserver-lib-assistant\/blob\/main\/app.py#L284\">reranked results to an LLM<\/a>, which can better explain and present them in a chat interface.<\/li>\n<\/ol>\n<p>Let&#8217;s look at some parts so you can understand how SQL Server helps.<\/p>\n<p>Here is the search function. 
That function will receive the text input from the user in its first argument.<\/p>\n<p><figure id=\"attachment_5916\" aria-labelledby=\"figcaption_attachment_5916\" class=\"wp-caption aligncenter\" ><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/SEarchFunction.png\"><img decoding=\"async\" class=\"wp-image-5916 size-full\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/SEarchFunction.png\" alt=\"Screenshot of a Python function named search that generates embeddings from text, builds a SQL query using SQL Server\u2019s vector_distance function with cosine similarity, retrieves the top results from a Scripts table including path, script content, content length, and similarity, then executes the query and returns the results.\" width=\"768\" height=\"735\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/SEarchFunction.png 768w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/SEarchFunction-300x287.png 300w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/SEarchFunction-24x24.png 24w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/a><figcaption id=\"figcaption_attachment_5916\" class=\"wp-caption-text\">The <a href=\"https:\/\/huggingface.co\/spaces\/rrg92\/sqlserver-lib-assistant\/blob\/main\/app.py#L102\"><em>search<\/em><\/a> function, defined in app.py<\/figcaption><\/figure><\/p>\n<p>First, we generate embeddings from the translated user text, as shown in line 107. This is the same function that was used to index the data (Gradio exposes it via a REST API, and <a href=\"https:\/\/github.com\/rrg92\/powershai\">powershai<\/a> understands that API). 
The <a href=\"https:\/\/huggingface.co\/spaces\/rrg92\/sqlserver-lib-assistant\/blob\/main\/app.py#L55\"><em>embed<\/em><\/a> function, defined in line 55, uses an object from the SentenceTransformers library, which loads the AI model called <a href=\"https:\/\/huggingface.co\/mixedbread-ai\/mxbai-embed-large-v1\">mixedbread-ai\/mxbai-embed-large-v1<\/a> and exposes it as a simple method. Note that the function is decorated with <em>@spaces.GPU:<\/em>\u00a0this is what allows the function to use the high-end GPUs provided by the Hugging Face infrastructure, so our function runs very fast.<\/p>\n<p><figure id=\"attachment_5918\" aria-labelledby=\"figcaption_attachment_5918\" class=\"wp-caption aligncenter\" ><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/embed-function.png\"><img decoding=\"async\" class=\"wp-image-5918 size-full\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/embed-function.png\" alt=\"Screenshot of a Python function named embed decorated with @spaces.GPU. The function takes text as input, uses Embedder.encode to generate an embedding, and returns the embedding as a list.\" width=\"427\" height=\"128\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/embed-function.png 427w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/embed-function-300x90.png 300w\" sizes=\"(max-width: 427px) 100vw, 427px\" \/><\/a><figcaption id=\"figcaption_attachment_5918\" class=\"wp-caption-text\">The <em>embed<\/em> function<\/figcaption><\/figure><\/p>\n<p>Back to the query: now that we have the embedding from the user text, we can start finding scripts related to it. We need to run a query against our script database to find the rows that best match the user query, using that generated embedding. 
The query is an ad-hoc query, defined directly in the body of the <em>search<\/em> function:<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/query-search.png\"><img decoding=\"async\" class=\"aligncenter wp-image-5919 size-full\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/query-search.png\" alt=\"Python multi-line SQL query string. Declares a search vector, selects top results including RelPath, similarity, ScriptContent, content length, and CosDistance. Contains a nested subquery using vector_distance with cosine similarity on embeddings.\" width=\"951\" height=\"520\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/query-search.png 951w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/query-search-300x164.png 300w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/query-search-768x420.png 768w\" sizes=\"(max-width: 951px) 100vw, 951px\" \/><\/a><\/p>\n<p>Notice that we don&#8217;t filter using common filters like the LIKE operator or full-text search. In other words, we aren&#8217;t filtering text here but searching for the best match using only numbers. These numbers are the embeddings of our search text, stored in the <em>@search<\/em> variable (built with simple string concatenation), and the embeddings stored for each script. Remember, embeddings are generated by an AI model and are an array of floating-point numbers representing the meaning of the text. 
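<\/p>\n<p>Paraphrased from the screenshot above, the query looks roughly like the following sketch (column names as shown in the images, but not copied verbatim from app.py). The app builds it as a Python string and runs it with pymssql:<\/p>

```python
# Sketch of the ad-hoc query from the search function. The embedding list is
# concatenated into the T-SQL text, as in the screenshot; the TOP value and
# aliases here are illustrative. A dummy embedding keeps the sketch runnable.
embedding = [0.1] * 1024
vector_literal = "[" + ",".join(str(v) for v in embedding) + "]"

query = f"""
DECLARE @search vector(1024) = '{vector_literal}';

SELECT TOP 10
     RelPath
    ,1 - CosDistance AS Similaridade
    ,ChunkContent   AS ScriptContent
    ,LEN(ChunkContent) AS ContentLength
    ,CosDistance
FROM (
    SELECT *,
        vector_distance('cosine', embeddings, @search) AS CosDistance
    FROM Scripts
) AS s
ORDER BY CosDistance;
"""

# In app.py this is executed with pymssql, roughly:
#   import pymssql
#   conn = pymssql.connect(server, user, password, database)
#   cursor = conn.cursor(as_dict=True)
#   cursor.execute(query)
#   results = cursor.fetchall()
print(len(query))
```

<p>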
<a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/ai-in-sql-server-2025-embeddings\/\">This post of mine<\/a> and <a href=\"https:\/\/www.youtube.com\/watch?v=uAdnL87bv4E\">this video from MVP Deborah Melkin<\/a> provide a good introduction to help you understand how it works.<\/p>\n<p>Let&#8217;s break this down, starting with the innermost query, highlighted in red in the image above. That query uses a new function called <a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/t-sql\/functions\/vector-distance-transact-sql?view=sql-server-ver17\">VECTOR_DISTANCE<\/a>. This function essentially compares two embeddings using a mathematical operation (no AI involved here). In this case, we&#8217;re using the Cosine Distance operation. It returns a value between 0 and 2, where 0 means the two input embeddings are identical and 2 means they are opposite. So, to find the best matching script for our search, we take the rows whose distance is closest to 0; in other words, we order by the result of that function.<\/p>\n<p>For example, let&#8217;s generate the embeddings for the text &#8216;<em>cpu performance<\/em>&#8217; using our embed function and paste them into the script open in SSMS. It looks like the following:<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance.png\"><img decoding=\"async\" class=\"aligncenter wp-image-5921 size-large\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-1024x701.png\" alt=\"SQL Server vector search query returning results from Scripts table. 
Displays script results with columns like id, relative path, script content, etc.\" width=\"1024\" height=\"701\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-1024x701.png 1024w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-300x205.png 300w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-768x526.png 768w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance.png 1289w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/p>\n<p>I didn&#8217;t include the ORDER BY clause so you can see the raw results of the script. Look at the CosDistance column. Some rows are closer to 0 than others. However, if we add the ORDER BY, we find the best matches.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-orderby.png\"><img decoding=\"async\" class=\"aligncenter wp-image-5922 size-full\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-orderby.png\" alt=\"SQL Server query declaring a vector and using vector_distance with cosine similarity to rank embeddings. 
Results table lists script files with embeddings, chunk numbers, and cosine distance values, ordered by CosDistance column\" width=\"698\" height=\"520\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-orderby.png 698w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-orderby-300x223.png 300w\" sizes=\"(max-width: 698px) 100vw, 698px\" \/><\/a><\/p>\n<p>Now, row 1 contains the result closest to 0, a script called <code>CPU\/CPUDelta.sql<\/code>. If you compare the previous results with the sorted results, you will see that the latter surfaces more scripts related to our <strong><em>cpu performance<\/em><\/strong> text.<\/p>\n<h2>Finding the best possible matches<\/h2>\n<p>Looking back at the original query, you&#8217;ll notice that the ORDER BY is actually in the outermost query. I&#8217;ve kept it there for readability, and the innermost query is used only to generate the CosDistance column, which I use to calculate the <em>Similaridade<\/em> column (similarity in pt-BR): simply 1 minus the cosine distance, which gives a value in the range of -1 to 1, where -1 means opposite and 1 means equal. I also added other columns that I return to Python, such as the path and the script content.<\/p>\n<p>Here is a complete sample of what our search function would return if the search text was <em><strong>cpu performance<\/strong>:<\/em><\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-orderby-top.png\"><img decoding=\"async\" class=\"aligncenter wp-image-5923 size-full\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-orderby-top.png\" alt=\"SQL Server query using a declared vector for semantic search. 
It selects top 10 results, computing similarity as 1 minus cosine distance, returning RelPath, ScriptContent, content length, and CosDistance. Results table shows SQL script files with similarity scores, content length, and cosine distance.\" width=\"942\" height=\"602\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-orderby-top.png 942w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-orderby-top-300x192.png 300w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/cpu-performance-orderby-top-768x491.png 768w\" sizes=\"(max-width: 942px) 100vw, 942px\" \/><\/a><\/p>\n<p><div class=\"alert alert-info\"><p class=\"alert-divider\"><i class=\"fabric-icon fabric-icon--Info\"><\/i><strong>Vector Index<\/strong><\/p>Experienced readers may note that this query can be very inefficient for tables with millions or billions of rows, and, in fact, it is.<\/p>\n<p>Using an ORDER BY\u00a0with the result of a function can lead to a table scan, causing high memory usage, tempdb pressure, high CPU, and high I\/O, especially if this query is run by multiple sessions at the same time. It is a performance killer. For those cases, SQL Server 2025 also brings <a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/relational-databases\/vectors\/vectors-sql-server?view=sql-server-ver17#approximate-vector-index-and-vector-search-approximate-nearest-neighbors\">vector indexes<\/a>, which can help find the best match in a much more efficient way. In my case, because I have only a few rows and this is a simple test, I didn\u2019t worry about using one for now. But it\u2019s definitely a cool feature you should know about! 
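<\/p>\n<p>As a rough, hedged sketch based on the SQL Server 2025 vector index documentation (the index name here is made up, and the exact syntax should be checked against the linked docs), creating such an index could look like this:<\/p>

```python
# Illustrative DDL only: syntax as described in the SQL Server 2025 vector
# index docs, not taken from the author's repo. Verify before relying on it.
create_index_sql = """
CREATE VECTOR INDEX vi_scripts_embeddings
ON dbo.Scripts (embeddings)
WITH (METRIC = 'cosine', TYPE = 'diskann');
"""
# Once the index exists, approximate nearest-neighbor lookups can use the
# VECTOR_SEARCH function instead of ordering the whole table by
# VECTOR_DISTANCE; see the linked documentation for its exact signature.
print(create_index_sql.strip().splitlines()[0])
```

<p>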
Your business will thank you!<\/div><\/p>\n<p>This result is converted to JSON by Python, and before asking Google Gemini to use it to generate an answer, we perform a process called reranking, which is carried out by another AI model using the SentenceTransformers library.<\/p>\n<p><figure id=\"attachment_5924\" aria-labelledby=\"figcaption_attachment_5924\" class=\"wp-caption aligncenter\" ><a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/rerank.png\"><img decoding=\"async\" class=\"wp-image-5924 size-large\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/rerank-1024x355.png\" alt=\"Python code snippet showing rerank logic. It creates a doclist from ScriptContent fields of FoundScripts, then loops through a rerank function to assign ranks to documents, sorting results by rank score.\" width=\"1024\" height=\"355\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/rerank-1024x355.png 1024w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/rerank-300x104.png 300w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/rerank-768x266.png 768w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2025\/08\/rerank.png 1176w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><figcaption id=\"figcaption_attachment_5924\" class=\"wp-caption-text\">Rerank after embeddings query<\/figcaption><\/figure><\/p>\n<p>The reranker takes only the matching results from the query and reorders them according to the input text. Essentially, we&#8217;re asking the AI model: &#8220;Given these matches I found in my SQL Server, can you generate a score for how each relates to CPU performance?&#8221; The reranker model is slower than the embedding model, which is why we don&#8217;t rerank all scripts at once, only a subset. 
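<\/p>\n<p>The rerank step can be sketched as follows. The actual app uses a SentenceTransformers model; the CrossEncoder model name in the comment is my assumption (the post does not name the reranker), and a trivial word-overlap scorer stands in so the sketch runs without downloading a model:<\/p>

```python
# Sketch of reranking: score each (query, document) pair, then sort by score.
# With SentenceTransformers, a cross-encoder reranker could be used like this
# (model name is an assumption, not confirmed by the post):
#   from sentence_transformers import CrossEncoder
#   reranker = CrossEncoder("mixedbread-ai/mxbai-rerank-large-v1")
#   scores = reranker.predict([(query, d) for d in doclist])
def toy_score(query: str, doc: str) -> float:
    # Trivial stand-in scorer: word overlap between the query and a document.
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

query = "cpu performance"
doclist = [
    "-- lists index fragmentation",
    "-- shows cpu usage delta per database",
    "-- cpu and scheduler performance counters",
]
scores = [toy_score(query, d) for d in doclist]
reranked = sorted(zip(doclist, scores), key=lambda x: x[1], reverse=True)
print(reranked[0][0])  # the document scored most relevant to the query
```

<p>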
We&#8217;re combining the power of embeddings to find good candidate matches with a more precise model to rank them, making our search efficient and much more likely to recommend the right scripts. I could filter the top 10 down to the top 3, 5, etc., after reranking, but I&#8217;ve simply chosen to keep all results returned by SQL and let Google Gemini decide which to include, since the reranked results go into the prompt it uses to generate an answer.<\/p>\n<p>So, embeddings are not a perfect matching system on their own. They are good at narrowing many data points down to the most promising candidates, and combined with reranking models, they produce powerful results.<\/p>\n<h2>Conclusions<\/h2>\n<p>This was a small example of how SQL Server 2025 and Azure SQL Database can play a key role in AI solutions. Thanks to the vector data type and VECTOR_DISTANCE, we can easily add AI support to our T-SQL solutions and efficiently find good matches for a given text. I hope this post provided some useful examples and clarity.<\/p>\n<p>Thanks for reading this far, and remember that all the code is completely public, so you can try it yourself using these links:<\/p>\n<p><a href=\"https:\/\/github.com\/rrg92\/sqlserver-lib\/tree\/main\">The SQL Server Lib GitHub repo<\/a><\/p>\n<p><a href=\"https:\/\/huggingface.co\/spaces\/rrg92\/sqlserver-lib-assistant\/blob\/main\/app.py\">The Hugging Face Space<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I recently created a GitHub repository to share the T-SQL scripts I&#8217;ve accumulated over 10 years as a DBA. However, I&#8217;ve always struggled to find the right script when I need it. 
For years, I relied on simple shell commands like find or just my memory, which could take 10-30 minutes, if I was lucky [&hellip;]<\/p>\n","protected":false},"author":197551,"featured_media":5902,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1,672,619,615],"tags":[590,441,407],"class_list":["post-5898","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-azure-sql","category-sql-server-2025","category-t-sql","category-vectors","tag-ai","tag-powershell","tag-python"],"acf":[],"blog_post_summary":"<p>I recently created a GitHub repository to share the T-SQL scripts I&#8217;ve accumulated over 10 years as a DBA. However, I&#8217;ve always struggled to find the right script when I need it. For years, I relied on simple shell commands like find or just my memory, which could take 10-30 minutes, if I was lucky [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/5898","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/users\/197551"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/comments?post=5898"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/5898\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media\/5902"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media?parent=5898"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com
\/azure-sql\/wp-json\/wp\/v2\/categories?post=5898"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/tags?post=5898"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}