August 21st, 2025

Create embeddings in SQL Server 2025 RC0 with a local ONNX model on Windows

Brian Spendolini
Senior Product Manager, Azure SQL Database
With the release of SQL Server 2025 RC0, we have enabled the ability to use a local ONNX model on the server for embeddings. This allows you to use these models without having any network traffic leaving the local environment.

Getting Started

This example guides you through setting up SQL Server 2025 on Windows with an ONNX runtime to enable local AI-powered text embedding generation. ONNX Runtime is an open-source inference engine that allows you to run machine learning models locally, making it ideal for integrating AI capabilities into SQL Server environments.

Important

This feature requires that SQL Server Machine Learning Services is installed.

Step 1: Enable developer preview features on SQL Server 2025

Run the following SQL command to enable SQL Server 2025 preview features in the database you would like to use for this example:
ALTER DATABASE SCOPED CONFIGURATION
SET PREVIEW_FEATURES = ON;
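
To confirm the setting, you can query the sys.database_scoped_configurations catalog view; the option should show up with a value of 1:

SELECT [name], [value]
FROM sys.database_scoped_configurations
WHERE [name] = 'PREVIEW_FEATURES';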

Step 2: Enable the local AI runtime on SQL Server 2025

Enable external AI runtimes by running the following SQL:
EXEC sp_configure 'external AI runtimes enabled', 1;
RECONFIGURE WITH OVERRIDE;
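
You can verify the server-level option with sys.configurations:

SELECT [name], [value], [value_in_use]
FROM sys.configurations
WHERE [name] = 'external AI runtimes enabled';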

Step 3: Set up the ONNX runtime library

Create a directory on the SQL Server to hold the ONNX runtime library files. In this example, C:\onnx_runtime is used.

You can use the following PowerShell commands to create the directory:

cd C:\
mkdir onnx_runtime

Next, download the ONNX Runtime (version ≥ 1.19) that is appropriate for your operating system.

After unzipping the download, copy the onnxruntime.dll (located in the lib directory) to the C:\onnx_runtime directory that was created.
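
If you prefer to script the download, the following PowerShell sketch pulls an ONNX Runtime release from GitHub, extracts it, and copies the dll. The version (1.19.2) and the win-x64 package name are assumptions, so adjust them for your environment:

$version = "1.19.2"
$zip = "onnxruntime-win-x64-$version.zip"
# Download and extract the release package (adjust version and architecture as needed)
Invoke-WebRequest -Uri "https://github.com/microsoft/onnxruntime/releases/download/v$version/$zip" -OutFile "C:\onnx_runtime\$zip"
Expand-Archive -Path "C:\onnx_runtime\$zip" -DestinationPath "C:\onnx_runtime\extracted"
# Copy onnxruntime.dll from the extracted lib directory into C:\onnx_runtime
Copy-Item -Path "C:\onnx_runtime\extracted\onnxruntime-win-x64-$version\lib\onnxruntime.dll" -Destination "C:\onnx_runtime\onnxruntime.dll"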

Step 4: Set up the tokenization library

Download and build the tokenizers-cpp library from GitHub. Once the dll is created, place it in the C:\onnx_runtime directory.

Important

Ensure the created dll is named tokenizers_cpp.dll.

Easy Button

To get you started quickly, our engineering team has created this file for you, already built for the model that will be downloaded in the next step.

You can download it here.

Just be sure to rename it to tokenizers_cpp.dll.
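
For example, assuming the file is saved as downloaded-tokenizer.dll (a placeholder name; yours may differ), you can rename it and confirm both libraries are in place with PowerShell:

# Rename the downloaded library and list the DLLs in the runtime directory
Rename-Item -Path "C:\onnx_runtime\downloaded-tokenizer.dll" -NewName "tokenizers_cpp.dll"
Get-ChildItem -Path "C:\onnx_runtime" -Filter *.dll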

Step 5: Download the ONNX model

Start by creating the model directory in C:\onnx_runtime\.
cd C:\onnx_runtime
mkdir model

This example uses the all-MiniLM-L6-v2-onnx model from Hugging Face, which can be downloaded here.

Clone the repository into the C:\onnx_runtime\model directory with the following git command:

If git is not installed, you can download it from the following download link or install it via winget (winget install Microsoft.Git).

cd C:\onnx_runtime\model
git clone https://huggingface.co/nsense/all-MiniLM-L6-v2-onnx
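
After the clone completes, you can confirm that the files the external model will need (see the note in Step 7) are present:

Test-Path "C:\onnx_runtime\model\all-MiniLM-L6-v2-onnx\model.onnx"
Test-Path "C:\onnx_runtime\model\all-MiniLM-L6-v2-onnx\tokenizer.json"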

Step 6: Set directory permissions

Use the following PowerShell script to provide the MSSQLLaunchpad user access to the ONNX runtime directory:
$AIExtPath = "C:\onnx_runtime";
$Acl = Get-Acl -Path $AIExtPath
$AccessRule = New-Object System.Security.AccessControl.FileSystemAccessRule("MSSQLLaunchpad", "FullControl", "ContainerInherit,ObjectInherit", "None","Allow")
$Acl.AddAccessRule($AccessRule)
Set-Acl -Path $AIExtPath -AclObject $Acl
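
To verify the rule was applied, list the access entries for the MSSQLLaunchpad account:

(Get-Acl -Path "C:\onnx_runtime").Access | Where-Object { $_.IdentityReference -like "*MSSQLLaunchpad*" }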

Step 7: Create the external model

Run the following SQL to register your ONNX model as an external model object:

The 'PARAMETERS' value used here is a placeholder needed for SQL Server 2025 RC0.

CREATE EXTERNAL MODEL myLocalOnnxModel
WITH (
LOCATION = 'C:\onnx_runtime\model\all-MiniLM-L6-v2-onnx',
API_FORMAT = 'ONNX Runtime',
MODEL_TYPE = EMBEDDINGS,
MODEL = 'allMiniLM',
PARAMETERS = '{"valid":"JSON"}',
LOCAL_RUNTIME_PATH = 'C:\onnx_runtime\'
);

Important

LOCATION should point to the directory containing model.onnx and tokenizer.json files.

LOCAL_RUNTIME_PATH should point to the directory containing the onnxruntime.dll and tokenizers_cpp.dll files.
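
Once created, the model should appear in the catalog. As a quick check (assuming the sys.external_models catalog view in SQL Server 2025):

SELECT * FROM sys.external_models;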

Step 8: Generate embeddings

Use the ai_generate_embeddings function to test the model by running the following SQL:
SELECT ai_generate_embeddings (N'Test Text' USE MODEL myLocalOnnxModel);

This command launches the AIRuntimeHost, loads the required DLLs, and processes the input text.

The result from the SQL statement is an array of embeddings:

[0.320098,0.568766,0.154386,0.205526,-0.027379,-0.149689,-0.022946,-0.385856,-0.039183…]
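
Beyond a single test string, a typical next step is to store embeddings in a vector column and query by similarity. The following is a minimal sketch, assuming a hypothetical dbo.documents table; all-MiniLM-L6-v2 produces 384-dimensional embeddings, and VECTOR_DISTANCE computes cosine distance:

-- Hypothetical table holding text alongside its embedding (384 dimensions for all-MiniLM-L6-v2)
CREATE TABLE dbo.documents
(
    id INT IDENTITY PRIMARY KEY,
    content NVARCHAR(MAX),
    embedding VECTOR(384)
);

-- Embed a row of text with the local model
INSERT INTO dbo.documents (content, embedding)
SELECT N'SQL Server 2025 can run local ONNX embedding models.',
       ai_generate_embeddings(N'SQL Server 2025 can run local ONNX embedding models.' USE MODEL myLocalOnnxModel);

-- Embed a search phrase and rank rows by cosine distance
DECLARE @q VECTOR(384);
SELECT @q = ai_generate_embeddings(N'local embedding models' USE MODEL myLocalOnnxModel);

SELECT TOP (5) id, content, VECTOR_DISTANCE('cosine', embedding, @q) AS distance
FROM dbo.documents
ORDER BY distance;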

Enable XEvent telemetry

Run the following SQL to enable telemetry for troubleshooting.
CREATE EVENT SESSION newevt
ON SERVER
ADD EVENT ai_generate_embeddings_airuntime_trace
(
ACTION (sqlserver.sql_text, sqlserver.session_id)
)
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY = 4096 KB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY = 30 SECONDS, TRACK_CAUSALITY = ON, STARTUP_STATE = OFF);
GO
ALTER EVENT SESSION newevt ON SERVER STATE = START;
GO
Next, use this SQL query to see the captured telemetry:
SELECT
event_data.value('(@name)[1]', 'varchar(100)') AS event_name,
event_data.value('(@timestamp)[1]', 'datetime2') AS [timestamp],
event_data.value('(data[@name="model_name"]/value)[1]', 'nvarchar(200)') AS model_name,
event_data.value('(data[@name="phase_name"]/value)[1]', 'nvarchar(100)') AS phase,
event_data.value('(data[@name="message"]/value)[1]', 'nvarchar(max)') AS message,
event_data.value('(data[@name="request_id"]/value)[1]', 'nvarchar(max)') AS session_id,
event_data.value('(data[@name="error_code"]/value)[1]', 'bigint') AS error_code
FROM (
SELECT CAST(target_data AS XML) AS target_data
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON s.address = t.event_session_address
WHERE s.name = 'newevt'
AND t.target_name = 'ring_buffer'
) AS data
CROSS APPLY target_data.nodes('//RingBufferTarget/event') AS XEvent(event_data);
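
When you are finished troubleshooting, stop and drop the event session:

ALTER EVENT SESSION newevt ON SERVER STATE = STOP;
GO
DROP EVENT SESSION newevt ON SERVER;
GO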

Clean up

To remove the external model object, run the following SQL:
DROP EXTERNAL MODEL myLocalOnnxModel;
To remove the directory permissions, run the following PowerShell commands:
$Acl.RemoveAccessRule($AccessRule)
Set-Acl -Path $AIExtPath -AclObject $Acl
Finally, delete the C:\onnx_runtime directory.
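
For example, with PowerShell:

Remove-Item -Path "C:\onnx_runtime" -Recurse -Force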
