February 12th, 2024

Using AI for Content Moderation with Azure SQL Database

Brian Spendolini
Senior Product Manager, Azure SQL Database

In my previous posts, I have covered using Azure OpenAI services with Azure SQL Database via REST endpoints. This post with further explore that pathway by creating stored procedures you can use to encapsulate the REST calls and combine them with custom business logic for AI content moderation with Azure SQL Database.

A Quick Introduction

The services to being used in this post are the new Azure AI Content Safety service and the Azure AI Language service. Azure AI Content Safety is a solution designed to detect harmful user-generated and AI-generated content in applications and services. It can detect such content from either text or images. Azure AI Language is a service that provides Natural Language Processing (NLP) features for understanding and analyzing text. Here we can extract and redact sensitive information in documents or text. 

Using the Services

Both services have REST endpoints we will be contacting from the database using External REST Endpoint Invocation (lovingly referred to as EREI). The call to the endpoints will be encapsulated inside stored procedures so that we can use them with other objects and services such as Data API Builder or Azure SQL bindings for Azure Functions.

READ ME!

In the following examples, you will need to provide the procedures with the names of your AI Language and AI Content Safety services.

Azure AI Language

First up is Azure AI Language. In this sample, we are using the Analyze Text REST endpoint and asking it to look for and redact any personally identifiable information (PII). The endpoint requires the request body to contain some parameters for this type of analysis. Here is an example for reference:

{
    "kind": "PiiEntityRecognition",
    "analysisInput":
    {
        "documents":
        [
            {
                "id":"1",
                "language": "en",
                "text": "abcdef@abcd.com, this is my phone is 6657789887, and my IP: 255.255.255.255 127.0.0.1 fluffybunny@bunny.net, My Addresses are 1 Microsoft Way, Redmond, WA 98052, SSN 543-55-6654, 123 zoo street chickenhouse, AZ 55664"
            }
        ]
    }
}

The attribute “kind” is set to PiiEntityRecognition tell the endpoint how to analyze the text. Inside the analysisInput, is the document. Here you provide the text and the language of text (which is optional, the service can automatically discover the language).
The endpoint is in the structure of
{Endpoint}/language/:analyze-text?api-version=2023-04-01
Where the Endpoint is your Azure AI Language endpoint which is set on creation of the service.

Create the Stored Procedure

Putting this all together into a stored procedure looks like the following:
CREATE PROCEDURE check4pii @cogserver nvarchar(100), @cogkey nvarchar(100), @message nvarchar(max)
AS

declare @url nvarchar(4000) = N'https://'+ @cogserver +'.cognitiveservices.azure.com/language/:analyze-text?api-version=2023-04-01';
declare @headers nvarchar(300) = N'{"Ocp-Apim-Subscription-Key":"'+ @cogkey +'"}';
declare @payload nvarchar(max) = N'{
                                        "kind": "PiiEntityRecognition",
                                        "analysisInput":
                                        {
                                            "documents":
                                            [
                                                {
                                                    "id":"1",
                                                    "language": "en",
                                                    "text": "'+ @message +'"
                                                }
                                            ]
                                        }
                                    }';
declare @ret int, @response nvarchar(max);

exec @ret = sp_invoke_external_rest_endpoint
@url = @url,
@method = 'POST',
@headers = @headers,
@payload = @payload,
@timeout = 230,
@response = @response output;

SELECT A.[value] as "Redacted Text"
FROM OPENJSON(@response,'$.result.results.documents') AS D
CROSS APPLY OPENJSON([value]) as A
where A.[key] = 'redactedText'

select JSON_VALUE(B.[value],'$.category') as "PII Category",
JSON_VALUE(B.[value],'$.text') as "PII Value",
CONVERT(FLOAT,JSON_VALUE(B.[value],'$.confidenceScore'))*100 as "Confidence Score"
from OPENJSON(
(
    SELECT A.[value]
    FROM OPENJSON(@response,'$.result.results.documents') AS D
    CROSS APPLY OPENJSON([value]
    ) AS A
    where A.[key] = 'entities'
), '$') AS B

GO
Breaking this down, we first declare the pieces that are going to make up the REST request body and header. The next section calls the endpoint via sp_invoke_external_rest_endpoint. Once we receive the response, we pull out the redacted text as well as the PII it found and the classification and confidence score as a percent.
The response message looks like the following for reference:
"result": {
"kind": "PiiEntityRecognitionResults",
"results": {
    "documents": [
    {
    "redactedText": "my email is ***************, this my phone is **********, and my IP: *************** My address ********************************** and SSN ***********",
    "id": "1",
    "entities": [
        {
        "text": "abcdef@abcd.com",
        "category": "Email",
        "offset": 12,
        "length": 15,
        "confidenceScore": 0.8
        },
        {
        "text": "6657789887",
        "category": "EUTaxIdentificationNumber",
        "offset": 46,
        "length": 10,
        "confidenceScore": 0.93
        },
        {
        "text": "255.255.255.255",
        "category": "IPAddress",
        "offset": 69,
        "length": 15,
        "confidenceScore": 0.8
        },
        {
        "text": "1 Microsoft Way, Redmond, WA 98052",
        "category": "Address",
        "offset": 96,
        "length": 34,
        "confidenceScore": 1.0
        },
        {
        "text": "543-55-6654",
        "category": "USSocialSecurityNumber",
        "offset": 139,
        "length": 11,
        "confidenceScore": 0.85
        }
And this is an example of running the stored procedure:
EXEC check4pii @cogserver='cogserver',
               @cogkey='12345678901234567890',
               @message='my email is abcdef@abcd.com, this my phone is 6657789887, and my IP: 255.255.255.255 My address 1 Microsoft Way, Redmond, WA 98052 and SSN 543-55-6654';

with the results being:

Image Screenshot 2024 02 09 at 10 01 18 AM

Azure Content Safety

Using Azure Content Safety, AI content moderation with Azure SQL Database can be furthered by analyzing text for hate, self-harm, sexual connotations, and violence. The service also allows for specialized block lists where you provide a set of word/phrases to watch for.
The REST endpoint requires only the text to be analyzed for a payload, but can also contain blocklists and specific filters to use. Here is an example:
{
    "text": "string",
    "categories": [
        "Hate"
    ],
    "blocklistNames": [
        "string"
    ],
    "haltOnBlocklistHit": true,
    "outputType": "FourSeverityLevels"
}
and as before, this is an example response:
"result": {
    "blocklistsMatch": [],
    "categoriesAnalysis": [
        {
        "category": "Hate",
        "severity": 0
        },
        {
        "category": "SelfHarm",
        "severity": 0
        },
        {
        "category": "Sexual",
        "severity": 0
        },
        {
        "category": "Violence",
        "severity": 4
        }
    ]
}
Next, a stored procedure can be created that takes in similar values to the AI Language example:
CREATE PROCEDURE analyzeText @safetyserver nvarchar(100), @safetykey nvarchar(100), @message nvarchar(max)
AS
declare @url nvarchar(4000) = N'https://'+ @safetyserver +'.cognitiveservices.azure.com/contentsafety/text:analyze?api-version=2023-10-01';
declare @headers nvarchar(300) = N'{"Ocp-Apim-Subscription-Key":"'+ @safetykey +'"}';
declare @payload nvarchar(max) = N'{
"text": "'+ @message +'"
}';

declare @ret int, @response nvarchar(max);
exec @ret = sp_invoke_external_rest_endpoint
@url = @url,
@method = 'POST',
@headers = @headers,
@payload = @payload,
@timeout = 230,
@response = @response output;

SELECT JSON_VALUE(D.[value],'$.category') as "Category",
JSON_VALUE(D.[value],'$.severity') as "Severity"
FROM OPENJSON(@response,'$.result.categoriesAnalysis') AS D

GO
And again, looking at the example piece by piece, the declare section begins with setting the values for the request header and payload. The select statements parses the JSON response and pulls out what the service discovered and a severity. Here is an example of the results. First, call the procedure:
EXEC analyzeText @safetyserver='safetyserver',
                 @safetykey='12345678901234567890',
                 @message='I am going to kill all the ants in my house';
and the result set:
Image Screenshot 2024 02 09 at 11 40 12 AM

Summary

The post went through using two of Azure’s AI services, AI Language and AI Content Safety. With the stored procedures created, business logic can be added for your scenarios where you could use content moderation.
If you would like to learn more, here are some resources to help.

Author

Brian Spendolini
Senior Product Manager, Azure SQL Database

0 comments

Discussion are closed.