December 9th, 2021

Using custom models with Azure Cognitive Service for Language

Deyaaeldeen Almahallawi
Software Engineer

We’re delighted to announce that the Language service now supports multilingual custom models for named entity recognition and for both single- and multi-labeled classification. Support for custom models for analyzing text has been added to the Text Analytics .NET, Java, JavaScript/TypeScript, and Python client libraries, starting in version 5.2.0 Beta 2.

Code samples in this blog post will be provided in JavaScript.

Azure Cognitive Service for Language with custom models

Named entity recognition (NER)

Cognitive Service for Language already supports NER through a pre-built predictive model that identifies and categorizes named entities, such as person, location, and organization.

Custom NER goes further, enabling you to build your own AI models that extract domain-specific entities from unstructured text, such as legal documents. To train such a custom model, you tag short phrases and assign them to categories of your choice. For more information, see the Custom NER documentation.

Classification

Cognitive Service for Language offers the following custom text classification features:

  • Single-labeled classification: Each input document will be assigned exactly one label. A model that classifies movies based on their genres could only assign one genre per document. For example, the model could classify a movie as “Romance”.
  • Multi-labeled classification: Each input document will be assigned at least one label. The movie genres model in the previous example could assign multiple genres to each input document. For example, the model could classify a movie as both “Romance” and “Comedy”.

For more information, see the custom text classification documentation.

Train custom models

The service offers a web portal, Language Studio, which makes it easy to train and deploy your custom models. From the portal, you can tag entities and labels in the dataset your model will be trained on. To get started with Language Studio, follow the NER and classification quickstart guides. The example below demonstrates custom NER.

Use custom models

Once your custom models are deployed, you can call them from the Text Analytics client library by specifying their project and deployment names, which you can find in your Language Studio account.

An example

This section walks through a JavaScript example that uses custom models with the Text Analytics client library. Samples are also available for the other client libraries (.NET, Java, and Python).

Before starting, familiarize yourself with Cognitive Service for Language. Make sure you’ve followed at least one of the NER and classification quickstart guides to train your custom models.

The first step is to create a client with your resource endpoint and your preferred authentication method. This example uses an API key, but you could also use the @azure/identity package for other forms of authentication, such as Azure Active Directory.

  const { TextAnalyticsClient, AzureKeyCredential } = require("@azure/ai-text-analytics");
  const client = new TextAnalyticsClient(endpoint, new AzureKeyCredential(apiKey));
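
If you prefer Azure Active Directory authentication, a minimal sketch using DefaultAzureCredential from @azure/identity could look like the following (this assumes your Language resource is configured to allow Azure Active Directory access for your identity):

  // A sketch of Azure Active Directory authentication via @azure/identity.
  // Assumes the Language resource grants AAD access to the signed-in identity.
  const { DefaultAzureCredential } = require("@azure/identity");

  const aadClient = new TextAnalyticsClient(endpoint, new DefaultAzureCredential());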

Next, let’s define a couple of documents to run the models on:

  const documents = [
    "The restaurant menu includes steak",
    "Redmond is about 15 miles east of Seattle"
  ];

From the client, we’ll call the beginAnalyzeActions method, which supports various actions, including the ones that use custom models. The input actions are passed as an object with a property for each action type; the ones relevant here are recognizeCustomEntitiesActions, singleCategoryClassifyActions, and multiCategoryClassifyActions. To use any of them, you’ll need the project and deployment names for your models. For example, the following code snippet defines a sample custom NER action:

  const actions = {
    recognizeCustomEntitiesActions: [
      {
        projectName: "<project name>",
        deploymentName: "<deployment name>"
      }
    ]
  };
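
The classification actions follow the same shape. As a sketch, you could instead define an actions object that runs the custom NER action together with single- and multi-labeled classification actions (all project and deployment names below are placeholders for your own deployments):

  const actions = {
    recognizeCustomEntitiesActions: [
      { projectName: "<NER project name>", deploymentName: "<NER deployment name>" }
    ],
    singleCategoryClassifyActions: [
      { projectName: "<single-label project name>", deploymentName: "<single-label deployment name>" }
    ],
    multiCategoryClassifyActions: [
      { projectName: "<multi-label project name>", deploymentName: "<multi-label deployment name>" }
    ]
  };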

Once the actions are defined, apply them to the input documents by calling beginAnalyzeActions:

  const poller = await client.beginAnalyzeActions(documents, actions);

The operation could take a few seconds to complete, depending on the number of documents and actions used. The beginAnalyzeActions method returns a poller object, which you can use to check the status of the operation and access the results when they’re ready.
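
If you want to observe the operation while it runs, the poller also exposes onProgress and getOperationState; the short sketch below logs a message on each poll (the exact fields available on the operation state may vary across library versions):

  // Optional: log a message each time the poller checks the operation status.
  poller.onProgress(() => {
    const state = poller.getOperationState();
    console.log(`Operation ${state.isCompleted ? "completed" : "still running"}...`);
  });

When you’re ready for the final result, wait for the operation to complete: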

  const result = await poller.pollUntilDone();

Finally, let’s print the results of our sample action:

  for await (const page of result) {
    const customEntitiesAction = page.recognizeCustomEntitiesResults[0];

    if (!customEntitiesAction.error) {
      for (const doc of customEntitiesAction.results) {
        console.log(`- Document ${doc.id}`);

        if (!doc.error) {
          console.log("\tEntities:");

          for (const entity of doc.entities) {
            console.log(`\t- Entity ${entity.text} of type ${entity.category}`);
          }
        } else {
          console.error("\tError:", doc.error);
        }
      }
    }
  }

And here’s some sample output:

- Document 0
        Entities:
        - Entity steak of type served_dish
- Document 1
        Entities:
        - Entity Redmond of type city
        - Entity miles east of type spatial_relation
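
If you also included the classification actions sketched earlier, their results are returned on the same result pages under singleCategoryClassifyResults and multiCategoryClassifyResults. The following is a best-effort sketch of reading multi-labeled classification results; the property names follow the 5.2.0 Beta 2 API, and in practice you would read them in the same loop over result pages as the entities above, since the paged result is an async iterator:

  // Sketch: read multi-labeled classification results from the result pages.
  for await (const page of result) {
    const multiClassifyAction = page.multiCategoryClassifyResults[0];

    if (!multiClassifyAction.error) {
      for (const doc of multiClassifyAction.results) {
        if (!doc.error) {
          for (const classification of doc.classifications) {
            console.log(`- Document ${doc.id}: ${classification.category} (score: ${classification.confidenceScore})`);
          }
        }
      }
    }
  }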

Summary

This article introduced the Text Analytics client library features for custom models: custom named entity recognition and custom single- and multi-labeled classification.

For more information about each language’s client library, see the Text Analytics documentation for .NET, Java, JavaScript/TypeScript, and Python.

Author

Deyaaeldeen Almahallawi
Software Engineer

Deyaa is a software engineer working on Azure SDKs ranging from Event Hubs to Schema Registry to Text Analytics. He is passionate about library design and developer tools.
