July 31st, 2025

Enhancing Analytical Queries with Semantic Kernel and In-Memory SQL Processing


Large language models (LLMs) like Azure OpenAI’s GPT models have significantly changed how users interact with data-driven applications. While these models excel at understanding language, they can fall short on precise, structured computations. In this post, we’ll explore how we built an Industry Copilot by integrating Semantic Kernel with an ephemeral SQL computation plugin to deliver reliable, dynamic analytical insights from drilling report data.

Introducing the Copilot and Business Context

Drilling engineers work with extensive datasets called drilling reports: detailed operational logs that record critical information such as depth, operational phases, durations, encountered problems, and tags identifying notable events. Engineers frequently need precise answers to analytical queries like:

  • “How long did each operation and each phase take?”
  • “What tags were most frequently encountered during the operation?”

Historically, drilling engineers manually analyzed these reports or relied on customized reporting tools. Our Industry Copilot seeks to simplify this by enabling engineers to pose questions in natural language directly to an intelligent assistant, which provides accurate, real-time analytical insights.

Traditional Query Patterns vs. In-Context Memory Queries

Database Queries and Limitations

Analytical queries may be directly executed against a database using dynamic SQL, where the SQL query is either fully or partially generated by the LLM. While powerful and flexible, dynamically executing SQL against operational databases introduces risks such as:

  • SQL Injection Vulnerabilities: Maliciously crafted inputs can exploit inadequate sanitization practices.
  • Resource Overconsumption: Complex queries may strain database resources or degrade application performance.
  • Complex Row-Level Access Control: Ensuring secure, granular data access often requires intricate configurations, which can impact performance.
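To make the injection risk concrete, here is a minimal, generic sketch (in Python with the built-in sqlite3 module, not the Copilot’s code) of how unsanitized input alters a query, and how parameterization neutralizes it. The table and values are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (Name TEXT)")
conn.executemany("INSERT INTO Users VALUES (?)", [("alice",), ("bob",)])

# Unsafe: untrusted input concatenated straight into the SQL text.
user_input = "nobody' OR '1'='1"
unsafe_sql = f"SELECT Name FROM Users WHERE Name = '{user_input}'"
leaked = conn.execute(unsafe_sql).fetchall()  # the injected OR clause matches every row

# Safe: a parameterized query treats the input as a plain value, not SQL.
safe = conn.execute("SELECT Name FROM Users WHERE Name = ?", (user_input,)).fetchall()
print(len(leaked), len(safe))  # 2 0
```

Parameterization helps with literal values, but an LLM that generates whole SQL statements cannot be parameterized away, which is why constraining what the SQL can reach matters.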

In-Context Queries Using Semantic Kernel

To mitigate these risks, we adopted a novel approach leveraging Semantic Kernel, an open-source SDK designed to integrate LLMs seamlessly into applications. Semantic Kernel helps us orchestrate LLM workflows, combining their reasoning capabilities with reliable structured computation.

In this approach, relevant drilling report data is first retrieved into the chat context based on explicit business rules or prior user intent. This subset of data then serves as input for subsequent analytical queries. This method ensures:

  • Controlled Data Scope: Only relevant data pre-selected by explicit business logic is used, reducing risk and improving performance.
  • Reduced Security Risk: By querying ephemeral, in-memory datasets, we significantly reduce SQL injection risks.
  • High Efficiency: Queries are executed against compact datasets loaded directly into memory, vastly improving response times.
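The core idea can be sketched in a few lines (shown here in Python with the stdlib sqlite3 module as a stand-in for the C# implementation; the sample rows are invented): the pre-selected context data is loaded into a private in-memory database, and the LLM-generated query runs only against that small, scoped dataset.

```python
import sqlite3

# Hypothetical subset of drilling-report rows already retrieved into the chat context.
context_rows = [
    ("1", "INTCSG1", "DRL", 22.0),
    ("2", "COND1", "DRL", 24.0),
]

# Ephemeral, private in-memory database: created per request, discarded afterwards.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DrillingReports (Id TEXT PRIMARY KEY, Phase TEXT, OperationCode TEXT, Hours REAL)"
)
conn.executemany("INSERT INTO DrillingReports VALUES (?, ?, ?, ?)", context_rows)

# An LLM-generated analytical query now touches only this scoped dataset.
result = conn.execute(
    "SELECT Phase, SUM(Hours) FROM DrillingReports GROUP BY Phase ORDER BY Phase"
).fetchall()
print(result)  # [('COND1', 24.0), ('INTCSG1', 22.0)]
conn.close()
```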

Implementing the SQL Computation Plugin

Semantic Kernel Plugin

We introduced a new Semantic Kernel plugin called SqlPlugin to manage ephemeral in-memory SQLite databases for analytical queries:

```csharp
public class SqlPlugin
{
    [KernelFunction("execute_sql_query")]
    [Description("Executes a SQL query on drilling reports data available in chat memory.")]
    [return: Description("Output of the SQL query as a list of dictionaries.")]
    public async Task<IList<IDictionary<string, object?>>> ExecuteSqlAsync(
        [Description(
            $"The SQL query to execute on the DDR table. Must be in SQLite dialect. The schema is: {DdrTableSchema}")]
        string query,
        Kernel kernel,
        CancellationToken cancellationToken)
    {
        // Attempt to retrieve stored chat history from the kernel's data storage.
        var chatHistory = RetrieveChatHistory(kernel.Data);

        return await ExecuteQueryAsync(query, chatHistory, cancellationToken);
    }
}
```

Schema Definition and Data Loading

The data schema for drilling reports is explicitly provided to the LLM via the plugin’s annotations, ensuring accurate query generation:

```sql
CREATE TABLE DrillingReports (
    Id TEXT PRIMARY KEY,
    Depth REAL,
    Comment TEXT,
    Phase TEXT,
    Unit TEXT,
    Start DATETIME,
    End DATETIME,
    Hours REAL,
    OperationCode TEXT,
    OperationSubCode TEXT
);

CREATE TABLE DrillingReportTags (
    DrillingReport_Id TEXT,
    Tag TEXT NOT NULL,
    FOREIGN KEY (DrillingReport_Id) REFERENCES DrillingReports(Id)
);
```

Drilling report data retrieved into the chat memory might look like:

```json
[
    {
        "Id": "1",
        "Depth": 1503,
        "Comment": "Encountered hard formation",
        "Phase": "INTCSG1",
        "Unit": "m",
        "Timeline": {"Start": "2025-01-14T10:00Z", "End": "2025-01-15T08:00Z", "Hours": 22},
        "Operation": {"Code": "DRL", "SubCode": "HD"},
        "Tags": ["harddrilling"]
    },
    {
        "Id": "2",
        "Depth": 1600,
        "Comment": "Drilling slowed by boulders",
        "Phase": "COND1",
        "Unit": "m",
        "Timeline": {"Start": "2025-01-15T08:00Z", "End": "2025-01-16T08:00Z", "Hours": 24},
        "Operation": {"Code": "DRL", "SubCode": "BD"},
        "Tags": ["boulders", "harddrilling"]
    }
]
```
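The data-loading step (whose implementation the post omits) has to flatten the nested Timeline, Operation, and Tags fields into the flat columns of the two tables. A rough sketch of that mapping, in Python with the stdlib sqlite3 module purely as an illustration, using the two sample reports above:

```python
import sqlite3

# The two sample reports from the chat memory, as Python literals.
reports = [
    {"Id": "1", "Depth": 1503, "Comment": "Encountered hard formation",
     "Phase": "INTCSG1", "Unit": "m",
     "Timeline": {"Start": "2025-01-14T10:00Z", "End": "2025-01-15T08:00Z", "Hours": 22},
     "Operation": {"Code": "DRL", "SubCode": "HD"}, "Tags": ["harddrilling"]},
    {"Id": "2", "Depth": 1600, "Comment": "Drilling slowed by boulders",
     "Phase": "COND1", "Unit": "m",
     "Timeline": {"Start": "2025-01-15T08:00Z", "End": "2025-01-16T08:00Z", "Hours": 24},
     "Operation": {"Code": "DRL", "SubCode": "BD"}, "Tags": ["boulders", "harddrilling"]},
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DrillingReports (
    Id TEXT PRIMARY KEY, Depth REAL, Comment TEXT, Phase TEXT, Unit TEXT,
    Start DATETIME, End DATETIME, Hours REAL,
    OperationCode TEXT, OperationSubCode TEXT);
CREATE TABLE DrillingReportTags (
    DrillingReport_Id TEXT, Tag TEXT NOT NULL,
    FOREIGN KEY (DrillingReport_Id) REFERENCES DrillingReports(Id));
""")

for r in reports:
    # Nested objects become flat columns; tags fan out into the child table.
    conn.execute(
        "INSERT INTO DrillingReports VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (r["Id"], r["Depth"], r["Comment"], r["Phase"], r["Unit"],
         r["Timeline"]["Start"], r["Timeline"]["End"], r["Timeline"]["Hours"],
         r["Operation"]["Code"], r["Operation"]["SubCode"]))
    conn.executemany("INSERT INTO DrillingReportTags VALUES (?, ?)",
                     [(r["Id"], tag) for tag in r["Tags"]])
```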

Real-Life Example: Analytical Query Execution

Let’s illustrate the workflow with an example.

  1. The user asks a question requiring analytical processing, such as:

    “How long did each operation and each phase take?”

  2. The system internally generates queries similar to:
    SELECT Phase, SUM(Hours) AS TotalHours FROM DrillingReports GROUP BY Phase;
    SELECT OperationCode, SUM(Hours) AS TotalHours FROM DrillingReports GROUP BY OperationCode;
  3. SqlPlugin executes the SQL and returns structured results:
    [
       {"Phase": "INTCSG1", "TotalHours": 22},
       {"Phase": "COND1", "TotalHours": 24}
    ]
    [
       {"OperationCode": "DRL", "TotalHours": 46}
    ]
  4. The LLM formulates this into a conversational response:

    Phase INTCSG1 took 22 hours, and Phase COND1 took 24 hours. Operation code DRL totaled 46 hours.

Handling Complex Analytical Queries

For simple math, Semantic Kernel provides a dedicated Math Plugin. However, answering questions about the drilling reports often involves analytical logic beyond simple arithmetic, such as aggregations, joins, and filters. Our approach reliably handles these complexities by allowing the LLM to translate natural language directly into precise SQL queries, leveraging structured schema information provided explicitly through plugin annotations.
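For instance, the second sample question (“What tags were most frequently encountered during the operation?”) requires a join and an aggregation across both tables. A sketch, in Python/SQLite, of the kind of query the LLM might generate against the sample data (the exact SQL is illustrative, not captured from the system):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DrillingReports (Id TEXT PRIMARY KEY, Hours REAL);
CREATE TABLE DrillingReportTags (DrillingReport_Id TEXT, Tag TEXT NOT NULL);
INSERT INTO DrillingReports VALUES ('1', 22), ('2', 24);
INSERT INTO DrillingReportTags VALUES
    ('1', 'harddrilling'), ('2', 'boulders'), ('2', 'harddrilling');
""")

# Tag frequency, plus the hours of the operations each tag appeared in.
rows = conn.execute("""
    SELECT t.Tag, COUNT(*) AS Occurrences, SUM(r.Hours) AS HoursInvolved
    FROM DrillingReportTags t
    JOIN DrillingReports r ON r.Id = t.DrillingReport_Id
    GROUP BY t.Tag
    ORDER BY Occurrences DESC, t.Tag
""").fetchall()
print(rows)  # [('harddrilling', 2, 46.0), ('boulders', 1, 24.0)]
```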

Safeguards and Limitations

Recognizing the potential for abuse or resource exhaustion from dynamic SQL execution, we implemented safeguards. These measures ensure that our solution remains secure, efficient, and reliable—even under heavy use.

  • Query Timeout (5 seconds): Automatically terminates excessively slow queries.
  • Result Size Limits: Prevents overly large outputs to avoid memory exhaustion.
  • Read-only Mode: SQLite databases are locked into read-only mode, preventing modification of data or schema.

Additionally, our solution runs each query in a transient in-memory database scoped to an individual user request, maintaining strong isolation between concurrent requests and keeping memory usage bounded and short-lived.
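The read-only lock and the timeout mechanism are straightforward to sketch with SQLite primitives. Shown here in Python for illustration; the post’s C# implementation uses `PRAGMA query_only` and `sqlite3_interrupt` for the same purposes:

```python
import sqlite3
import threading

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DrillingReports (Id TEXT, Hours REAL)")
conn.execute("INSERT INTO DrillingReports VALUES ('1', 22)")

# Read-only mode: once the data is loaded, lock the database against writes.
conn.execute("PRAGMA query_only = 1")
try:
    conn.execute("DELETE FROM DrillingReports")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True
print(write_blocked)  # True

# Query timeout: a timer on another thread interrupts a runaway query.
# Connection.interrupt() aborts whatever statement is currently executing.
timer = threading.Timer(5.0, conn.interrupt)
timer.start()
rows = conn.execute("SELECT * FROM DrillingReports").fetchall()
timer.cancel()  # the query finished well within the limit; disarm the timer
```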

See the Appendix for more details about the implemented safeguards.

Conclusion: Benefits of Integrating Semantic Kernel and SQL Plugins

Our innovative approach using Semantic Kernel and plugins on ephemeral databases has significantly enhanced the analytical capabilities of the Copilot:

  • Accuracy: SQL execution ensures deterministic and precise analytical results.
  • Performance: Efficient in-memory queries accelerate user interactions.
  • Security: Reduced attack surface compared to traditional dynamic database queries.
  • Scalability and Extensibility: Easily adapted to new analytical scenarios and additional data schemas.

Through this structured integration, we’ve empowered drilling engineers to engage intuitively with their data, driving productivity and enhancing operational insights.

Appendix: Simplified Version of the Implementation

    /// <summary>
    /// Executes a SQL query on an ephemeral in-memory database populated from chat
    /// history data.
    /// </summary>
    public async Task<IList<IDictionary<string, object?>>> ExecuteQueryAsync(
        string query,
        ChatHistory history,
        CancellationToken cancellationToken)
    {
        // SQLite in-memory database: temporary and private to this connection,
        // and disposed together with it when the method completes.
        await using var conn = new SqliteConnection("Data Source=:memory:;Mode=Memory;Cache=Private");

        // Open the SQLite connection asynchronously, allowing it to handle queries.
        await conn.OpenAsync(cancellationToken);

        // Limit the total number of database pages (affects memory consumption).
        await ExecuteSqlNonQueryAsync(cancellationToken, conn, "PRAGMA max_page_count = 5000");

        // Adjust the cache size for better performance (negative values indicate memory size in KB).
        await ExecuteSqlNonQueryAsync(cancellationToken, conn, "PRAGMA cache_size = -2000");

        // Disable memory-mapped I/O, which is not beneficial for an in-memory database.
        await ExecuteSqlNonQueryAsync(cancellationToken, conn, "PRAGMA mmap_size = 0");

        // Store temporary tables and indexes in memory instead of on disk.
        await ExecuteSqlNonQueryAsync(cancellationToken, conn, "PRAGMA temp_store = MEMORY");

        // Set an upper limit for SQLite’s heap memory usage (20MB in this case).
        SQLitePCL.raw.sqlite3_hard_heap_limit64(20 * 1024 * 1024);

        // Create tables
        await ExecuteSqlNonQueryAsync(cancellationToken, conn, DdrTableSchema);

        // Load data from chat history (implementation omitted)
        await LoadDataFromChatHistory(conn, history, cancellationToken);

        // Enforce read-only mode, preventing write operations.
        await ExecuteSqlNonQueryAsync(cancellationToken, conn, "PRAGMA query_only = 1");

        // Execute the provided SQL query on the DDR table and return results
        return await ExecuteQueryAndRetrieveResultsAsync(conn, query, cancellationToken);
    }

    /// <summary>
    /// Executes a SQL query on the in-memory database and retrieves the results.
    /// Implements a query timeout mechanism to prevent excessively long execution.
    /// </summary>
    private async Task<IList<IDictionary<string, object?>>> ExecuteQueryAndRetrieveResultsAsync(
        SqliteConnection connection, string query, CancellationToken cancellationToken)
    {
        var results = new List<IDictionary<string, object?>>();
        int totalChars = 0;

        // Create a new cancellation token source to enforce query timeout.
        using var cts = new CancellationTokenSource();
        using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, cts.Token);
        var linkedToken = linkedCts.Token;

        // Background task to interrupt the query if it exceeds the timeout duration.
        var interruptTask = Task.Run(async () =>
            {
                try
                {
                    // Wait for the query timeout duration before triggering cancellation
                    await Task.Delay(TimeSpan.FromSeconds(QueryTimeoutSeconds), linkedToken);

                    // If timeout occurs, send an interrupt signal to SQLite
                    if (connection.Handle != null)
                    {
                        // Forces ongoing SQL query to stop execution as soon as possible
                        sqlite3_interrupt(connection.Handle);
                    }
                }
                catch (TaskCanceledException)
                {
                    // This exception occurs when cancellationToken is already canceled before delay completes
                    // Safe to ignore because it means execution finished within the allowed time
                }
            },
            // `CancellationToken.None` keeps this timeout task independent of external
            // cancellation: it must keep counting down QueryTimeoutSeconds even if the
            // parent operation is canceled, so a running SQL query is always interrupted
            // once it exceeds the limit. The `finally` block below cancels `cts` when
            // execution completes, which unwinds the Task.Delay and ends this task cleanly.
            CancellationToken.None);

        try
        {
            await using var command = connection.CreateCommand();
            command.CommandText = query;
            await using var reader = await command.ExecuteReaderAsync(cancellationToken);

            await foreach (var row in ReadQueryResultsAsync(reader, cancellationToken))
            {
                string serializedRow = JsonSerializer.Serialize(row);
                totalChars += serializedRow.Length;

                if (totalChars > MaxQueryResultSize)
                {
                    throw LogPluginException(
                        $"SQL execution failed: SQL query result exceeded the size limit of {MaxQueryResultSize} characters.");
                }

                results.Add(row);
            }
        }
        catch (SqliteException e) when (e.SqliteErrorCode == 9) // SQLITE_INTERRUPT
        {
            throw LogPluginException("SQL execution failed: SQL query time exceeded execution time limit", e);
        }
        catch (Exception e)
        {
            throw LogPluginException($"SQL execution failed: {e.Message}", e);
        }
        finally
        {
            // Ensure that the timeout task is properly canceled and cleaned up
            await cts.CancelAsync();
            await interruptTask;
        }

        return results;
    }