{"id":5847,"date":"2025-08-21T18:45:58","date_gmt":"2025-08-22T01:45:58","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/azure-sql\/?p=5847"},"modified":"2025-09-30T13:28:16","modified_gmt":"2025-09-30T20:28:16","slug":"create-embeddings-in-sql-server-2025-rc0-with-a-local-onnx-model-on-windows","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/azure-sql\/create-embeddings-in-sql-server-2025-rc0-with-a-local-onnx-model-on-windows\/","title":{"rendered":"Create embeddings in SQL Server 2025 RC0 with a local ONNX model on Windows"},"content":{"rendered":"<div>\n<div><span style=\"font-size: 14pt\">With the release of <a href=\"https:\/\/aka.ms\/getsqlserver2025.\">SQL Server 2025 RC0<\/a>, we have enabled the ability to use a <a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/t-sql\/statements\/create-external-model-transact-sql?view=sql-server-ver17#example-with-a-local-onnx-runtime\">local ONNX model on the server for embeddings<\/a>. This allows you to use these models without having any network traffic leaving the local environment.<\/span><\/div>\n<h2><span style=\"font-size: 18pt\">Getting Started<\/span><\/h2>\n<div>\n<p><span style=\"font-size: 14pt\">This example guides you through setting up SQL Server 2025 on Windows with an ONNX runtime to enable local AI-powered text embedding generation. 
<\/span><span style=\"font-size: 14pt\"><a href=\"https:\/\/onnxruntime.ai\/\">ONNX Runtime<\/a>\u00a0is an open-source inference engine that allows you to run machine learning models locally, making it ideal for integrating AI capabilities into SQL Server environments.<\/span><\/p>\n<p><span style=\"font-size: 14pt\"><div class=\"alert alert-info\"><p class=\"alert-divider\"><i class=\"fabric-icon fabric-icon--Info\"><\/i><strong>Important<\/strong><\/p>This feature requires that <\/span><a style=\"font-size: 14pt\" href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/machine-learning\/install\/sql-machine-learning-services-windows-install?view=sql-server-ver17\">SQL Server Machine Learning Services<\/a><span style=\"font-size: 14pt\">\u00a0is installed.<\/div><\/span><\/p>\n<\/div>\n<h2><span style=\"font-size: 18pt\">Step 1: Enable developer preview features on SQL Server 2025<\/span><\/h2>\n<div><span style=\"font-size: 14pt\">Run the following SQL command to enable SQL Server 2025 preview features in the database you would like to use for this example:<\/span><\/div>\n<pre><span style=\"font-size: 14pt\">ALTER DATABASE SCOPED CONFIGURATION<\/span>\r\n<span style=\"font-size: 14pt\">SET PREVIEW_FEATURES = ON;<\/span><\/pre>\n<h2><span style=\"font-size: 18pt\">Step 2: Enable the local AI runtime on SQL Server 2025<\/span><\/h2>\n<div><span style=\"font-size: 14pt\">Enable external AI runtimes by running the following SQL:<\/span><\/div>\n<pre><span style=\"font-size: 14pt\">EXEC sp_configure 'external AI runtimes enabled', 1;<\/span>\r\n<span style=\"font-size: 14pt\">RECONFIGURE WITH OVERRIDE;<\/span><\/pre>\n<h2><span style=\"font-size: 18pt\">Step 3: Set up the ONNX runtime library<\/span><\/h2>\n<div>\n<p><span style=\"font-size: 14pt\">Create a directory on the SQL Server to hold the ONNX runtime library files. 
In this example,\u00a0<span style=\"font-family: terminal, monaco, monospace\">C:\\onnx_runtime<\/span>\u00a0is used.<\/span><\/p>\n<p><span style=\"font-size: 14pt\">You can use the following PowerShell commands to create the directory:<\/span><\/p>\n<\/div>\n<pre><span style=\"font-size: 14pt\">cd C:\\<\/span>\r\n<span style=\"font-size: 14pt\">mkdir onnx_runtime<\/span><\/pre>\n<div>\n<p><span style=\"font-size: 14pt\">Next, <a href=\"https:\/\/github.com\/microsoft\/onnxruntime\/releases\">download the ONNX Runtime<\/a> (version \u2265 1.19) that is appropriate for your operating system.<\/span><\/p>\n<p><span style=\"font-size: 14pt\">After unzipping the download, copy the\u00a0<strong>onnxruntime.dll<\/strong>\u00a0(located in the lib directory) to the\u00a0<span style=\"font-family: terminal, monaco, monospace\">C:\\onnx_runtime<\/span>\u00a0directory that was created.<\/span><\/p>\n<\/div>\n<h2><span style=\"font-size: 18pt\">Step 4: Set up the tokenization library<\/span><\/h2>\n<div>\n<p><span style=\"font-size: 14pt\">Download and build\u00a0<a href=\"https:\/\/github.com\/mlc-ai\/tokenizers-cpp\/tree\/main\">the\u00a0tokenizers-cpp\u00a0library\u00a0from GitHub<\/a>. 
Once the dll is created, place the tokenizer in the\u00a0<span style=\"font-family: terminal, monaco, monospace\">C:\\onnx_runtime<\/span>\u00a0directory.<\/span><\/p>\n<p><span style=\"font-size: 14pt\"><div class=\"alert alert-info\"><p class=\"alert-divider\"><i class=\"fabric-icon fabric-icon--Info\"><\/i><strong>Important<\/strong><\/p>Ensure the created dll is named <strong>tokenizers_cpp.dll<\/strong>.<\/div><\/span><\/p>\n<\/div>\n<h3><span style=\"font-size: 14pt\">Easy Button<\/span><\/h3>\n<div><span style=\"font-size: 14pt\">To make this process easy to get you started, our engineering team has <a href=\"https:\/\/github.com\/PARTHSQL\/tokenizers-cpp\/releases\/download\/v0.1.1\/tokenizers_cpp-release.dll\">created the file<\/a> for you for the model that will be downloaded in the next step.<\/span><\/div>\n<div>\n<p><span style=\"font-size: 14pt\">You can download it <a href=\"https:\/\/github.com\/PARTHSQL\/tokenizers-cpp\/releases\/download\/v0.1.1\/tokenizers_cpp-release.dll\">here<\/a>.<\/span><\/p>\n<p><span style=\"font-size: 14pt\"><em>Just be sure to rename it to <strong>tokenizers_cpp.dll<\/strong>.<\/em><\/span><\/p>\n<\/div>\n<h2><span style=\"font-size: 18pt\">Step 5: Download the ONNX model<\/span><\/h2>\n<div><span style=\"font-size: 14pt\">Start by creating the\u00a0model\u00a0directory in\u00a0<span style=\"font-family: terminal, monaco, monospace\">C:\\onnx_runtime\\<\/span>.<\/span><\/div>\n<pre><span style=\"font-size: 14pt\">cd C:\\onnx_runtime<\/span>\r\n<span style=\"font-size: 14pt\">mkdir model<\/span><\/pre>\n<div>\n<p><span style=\"font-size: 14pt\">This example uses the <a href=\"https:\/\/huggingface.co\/nsense\/all-MiniLM-L6-v2-onnx\">all-MiniLM-L6-v2-onnx model from Hugging Face<\/a>, which <a href=\"https:\/\/huggingface.co\/nsense\/all-MiniLM-L6-v2-onnx\">can be downloaded here<\/a>.<\/span><\/p>\n<p><span style=\"font-size: 14pt\">Clone the repository into the <span style=\"font-family: terminal, monaco, 
monospace\">C:\\onnx_runtime\\model<\/span> directory with the following git command:<\/span><\/p>\n<p><em><span style=\"font-size: 14pt\">If not installed, you can download git from the following <a href=\"https:\/\/git-scm.com\/downloads\">download link<\/a>\u00a0or via winget (<span style=\"font-family: terminal, monaco, monospace\">winget install Microsoft.Git<\/span>)<\/span><\/em><\/p>\n<\/div>\n<\/div>\n<div>\n<pre><span style=\"font-size: 14pt\">cd C:\\onnx_runtime\\model<\/span>\r\n<span style=\"font-size: 14pt\">git clone https:\/\/huggingface.co\/nsense\/all-MiniLM-L6-v2-onnx<\/span><\/pre>\n<h2><span style=\"font-size: 18pt\">Step 6: Set directory permissions<\/span><\/h2>\n<div><span style=\"font-size: 14pt\">Use the following PowerShell script to provide the <strong>MSSQLLaunchpad<\/strong> user access to the <strong>ONNX runtime directory<\/strong>:<\/span><\/div>\n<pre><span style=\"font-size: 14pt\">$AIExtPath = \"C:\\onnx_runtime\";<\/span>\r\n<span style=\"font-size: 14pt\">$Acl = Get-Acl -Path $AIExtPath<\/span>\r\n<span style=\"font-size: 14pt\">$AccessRule = New-Object System.Security.AccessControl.FileSystemAccessRule('MSSQLLaunchpad', \"FullControl\", \"ContainerInherit,ObjectInherit\", \"None\",\"Allow\")<\/span>\r\n<span style=\"font-size: 14pt\">$Acl.AddAccessRule($AccessRule)<\/span>\r\n<span style=\"font-size: 14pt\">Set-Acl -Path $AIExtPath -AclObject $Acl<\/span><\/pre>\n<h2><span style=\"font-size: 18pt\">Step 7: Create the external model<\/span><\/h2>\n<div>\n<p><span style=\"font-size: 14pt\">Run the following SQL to register your ONNX model as an <strong>external model object<\/strong>:<\/span><\/p>\n<p><span style=\"font-size: 14pt\"><i>The &#8216;<strong>PARAMETERS<\/strong>&#8216; value used here is a placeholder needed for SQL Server 2025 RC 0.<\/i><\/span><\/p>\n<\/div>\n<pre><span style=\"font-size: 14pt\"><span style=\"font-size: 14pt;font-family: Consolas, Monaco, monospace\">CREATE EXTERNAL MODEL 
myLocalOnnxModel<\/span>\r\nWITH (\r\n<\/span><span style=\"font-size: 14pt\">LOCATION = 'C:\\onnx_runtime\\model\\all-MiniLM-L6-v2-onnx',\r\n<\/span><span style=\"font-size: 14pt\">API_FORMAT = 'ONNX Runtime',\r\n<\/span><span style=\"font-size: 14pt\">MODEL_TYPE = EMBEDDINGS,\r\n<\/span><span style=\"font-size: 14pt\">MODEL = 'allMiniLM',\r\n<\/span><span style=\"font-size: 14pt\">PARAMETERS = '{\"valid\":\"JSON\"}',\r\n<\/span><span style=\"font-size: 14pt\">LOCAL_RUNTIME_PATH = 'C:\\onnx_runtime\\'\r\n<\/span><span style=\"font-size: 14pt\">);<\/span><\/pre>\n<div>\n<p><span style=\"font-size: 14pt\"><div class=\"alert alert-info\"><p class=\"alert-divider\"><i class=\"fabric-icon fabric-icon--Info\"><\/i><strong>Important<\/strong><\/p><\/span><span style=\"font-size: 14pt\"><em>LOCATION<\/em> should point to the directory containing the model.onnx and tokenizer.json files.<\/span><\/p>\n<p><span style=\"font-size: 14pt\"><em>LOCAL_RUNTIME_PATH<\/em> should point to the directory containing the onnxruntime.dll and tokenizers_cpp.dll files.\u00a0<\/span><span style=\"font-size: 14pt\"><\/div><\/span><\/p>\n<\/div>\n<h2><span style=\"font-size: 18pt\">Step 8: Generate embeddings<\/span><\/h2>\n<div><span style=\"font-size: 14pt\">Use the\u00a0<a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/t-sql\/functions\/ai-generate-embeddings-transact-sql?view=sql-server-ver17\"><strong>ai_generate_embeddings<\/strong><\/a>\u00a0function to test the model by running the following SQL:<\/span><\/div>\n<pre><span style=\"font-size: 14pt\">SELECT ai_generate_embeddings (N'Test Text' USE MODEL myLocalOnnxModel);<\/span><\/pre>\n<div>\n<p><span style=\"font-size: 14pt\">This command launches the\u00a0<strong>AIRuntimeHost<\/strong>, loads the required DLLs, and processes the input text.<\/span><\/p>\n<p><span style=\"font-size: 14pt\">The result from the SQL statement is an array of embeddings:<\/span><\/p>\n<p><span style=\"font-family: terminal, monaco, monospace;font-size: 
14pt\">[0.320098,0.568766,0.154386,0.205526,-0.027379,-0.149689,-0.022946,-0.385856,-0.039183&#8230;]<\/span><\/p>\n<\/div>\n<h2><span style=\"font-size: 18pt\">Enable XEvent telemetry<\/span><\/h2>\n<div><span style=\"font-size: 14pt\">Run the following SQL to enable telemetry for troubleshooting.<\/span><\/div>\n<pre><span style=\"font-size: 14pt\">CREATE EVENT SESSION newevt<\/span>\r\n<span style=\"font-size: 14pt\">ON SERVER<\/span>\r\n<span style=\"font-size: 14pt\">ADD EVENT ai_generate_embeddings_airuntime_trace<\/span>\r\n<span style=\"font-size: 14pt\">(<\/span>\r\n<span style=\"font-size: 14pt\">ACTION (sqlserver.sql_text, sqlserver.session_id)<\/span>\r\n<span style=\"font-size: 14pt\">)<\/span>\r\n<span style=\"font-size: 14pt\">ADD TARGET package0.ring_buffer<\/span>\r\n<span style=\"font-size: 14pt\">WITH (MAX_MEMORY = 4096 KB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY = 30 SECONDS, TRACK_CAUSALITY = ON, STARTUP_STATE = OFF);<\/span>\r\n<span style=\"font-size: 14pt\">GO<\/span>\r\n<span style=\"font-size: 14pt\">ALTER EVENT SESSION newevt ON SERVER STATE = START;<\/span>\r\n<span style=\"font-size: 14pt\">GO<\/span><\/pre>\n<div><span style=\"font-size: 14pt\">Next, use this SQL query to see the captured telemetry:<\/span><\/div>\n<pre><span style=\"font-size: 14pt\">SELECT<\/span>\r\n<span style=\"font-size: 14pt\">event_data.value('(@name)[1]', 'varchar(100)') AS event_name,<\/span>\r\n<span style=\"font-size: 14pt\">event_data.value('(@timestamp)[1]', 'datetime2') AS [timestamp],<\/span>\r\n<span style=\"font-size: 14pt\">event_data.value('(data[@name=\"model_name\"]\/value)[1]', 'nvarchar(200)') AS model_name,<\/span>\r\n<span style=\"font-size: 14pt\">event_data.value('(data[@name=\"phase_name\"]\/value)[1]', 'nvarchar(100)') AS phase,<\/span>\r\n<span style=\"font-size: 14pt\">event_data.value('(data[@name=\"message\"]\/value)[1]', 'nvarchar(max)') AS message,<\/span>\r\n<span style=\"font-size: 
14pt\">event_data.value('(data[@name=\"request_id\"]\/value)[1]', 'nvarchar(max)') AS session_id,<\/span>\r\n<span style=\"font-size: 14pt\">event_data.value('(data[@name=\"error_code\"]\/value)[1]', 'bigint') AS error_code<\/span>\r\n<span style=\"font-size: 14pt\">FROM (<\/span>\r\n<span style=\"font-size: 14pt\">SELECT CAST(target_data AS XML) AS target_data<\/span>\r\n<span style=\"font-size: 14pt\">FROM sys.dm_xe_sessions AS s<\/span>\r\n<span style=\"font-size: 14pt\">JOIN sys.dm_xe_session_targets AS t<\/span>\r\n<span style=\"font-size: 14pt\">ON s.address = t.event_session_address<\/span>\r\n<span style=\"font-size: 14pt\">WHERE s.name = 'newevt'<\/span>\r\n<span style=\"font-size: 14pt\">AND t.target_name = 'ring_buffer'<\/span>\r\n<span style=\"font-size: 14pt\">) AS data<\/span>\r\n<span style=\"font-size: 14pt\">CROSS APPLY target_data.nodes('\/\/RingBufferTarget\/event') AS XEvent(event_data);<\/span><\/pre>\n<h2><span style=\"font-size: 18pt\">Clean up<\/span><\/h2>\n<div><span style=\"font-size: 14pt\">To remove the external model object, run the following SQL:<\/span><\/div>\n<pre><span style=\"font-size: 14pt\">DROP EXTERNAL MODEL myLocalOnnxModel;<\/span><\/pre>\n<div><span style=\"font-size: 14pt\">To remove the directory permissions, run the following PowerShell commands:<\/span><\/div>\n<pre><span style=\"font-size: 14pt\">$Acl.RemoveAccessRule($AccessRule)<\/span>\r\n<span style=\"font-size: 14pt\">Set-Acl -Path $AIExtPath -AclObject $Acl<\/span><\/pre>\n<div><span style=\"font-size: 14pt\">Finally, delete the\u00a0<span style=\"font-family: terminal, monaco, monospace\">C:\\onnx_runtime<\/span>\u00a0directory.<\/span><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>With the release of SQL Server 2025 RC0, we have enabled the ability to use a local ONNX model on the server for embeddings. This allows you to use these models without having any network traffic leaving the local environment. 
Getting Started This example guides you through setting up SQL Server 2025 on Windows with [&hellip;]<\/p>\n","protected":false},"author":95874,"featured_media":5883,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[601,1,619],"tags":[590,409,602,529,510,588,465,469,30],"class_list":["post-5847","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-azure-sql","category-t-sql","tag-ai","tag-api","tag-azure-openai","tag-azure-sql","tag-azure-sql-database","tag-azure-sql-db","tag-azuresql","tag-azuresqldb","tag-developers"],"acf":[],"blog_post_summary":"<p>With the release of SQL Server 2025 RC0, we have enabled the ability to use a local ONNX model on the server for embeddings. This allows you to use these models without having any network traffic leaving the local environment. Getting Started This example guides you through setting up SQL Server 2025 on Windows with 
[&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/5847","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/users\/95874"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/comments?post=5847"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/5847\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media\/5883"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media?parent=5847"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/categories?post=5847"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/tags?post=5847"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}