New features in the Azure Form Recognizer client libraries

We’re pleased to announce a stable release of the Azure Form Recognizer (now known as Document Intelligence) libraries for .NET, Python, Java, and JavaScript/TypeScript.

Highlighted features

The new, stable release of the Form Recognizer client libraries targets the 2023-07-31 service version and includes many new features and quality improvements. For a complete list of what’s new, see What’s new in Form Recognizer?. This blog post highlights the following features:

Build a custom classification model for document splitting and classification
Add-on recognition capabilities
Support for new prebuilt models

Library availability

The new Form Recognizer libraries can be downloaded from each language’s preferred package manager.

Language	Package	Command	Project	Get started
.NET	NuGet	`dotnet add package Azure.AI.FormRecognizer`	link	link
Python	PyPI	`pip install azure-ai-formrecognizer`	link	link
Java	Maven	Add to POM.xml file	link	link
JavaScript/TypeScript	npm	`npm install @azure/ai-form-recognizer`	link	link

Document classification

One of the most significant improvements in this version of the Form Recognizer library is support for building a custom classification model for document splitting and classification. With a custom-built classification model, users can now analyze a single- or multi-file document to identify whether any of the trained document types are contained within an input file.

The following samples illustrate how a single custom classification model can be built to analyze input documents related to a loan application package using the Form Recognizer library for Java.

Build a classification model

The following code builds a custom classification model trained to analyze a loan application package containing a loan application form, payslip, and bank statement.

Java

DocumentModelAdministrationClient client = new DocumentModelAdministrationClientBuilder()
  .credential(new AzureKeyCredential("{key}"))
  .endpoint("https://{endpoint}.cognitiveservices.azure.com/")
  .buildClient();

// Provide source for training the model
ContentSource loanApplnFormSource = new BlobContentSource("{SAS URL to your container}");
ContentSource payslipSource = new BlobContentSource("{SAS URL to your container}");
ContentSource bankStatementSource = new BlobContentSource("{SAS URL to your container}");

HashMap<String, ClassifierDocumentTypeDetails> docTypes = new HashMap<>();
docTypes.put("loan application form", new ClassifierDocumentTypeDetails(loanApplnFormSource));
docTypes.put("payslip", new ClassifierDocumentTypeDetails(payslipSource));
docTypes.put("bank statement", new ClassifierDocumentTypeDetails(bankStatementSource));

/**
 * Alternatively, if you have a flat list of files to train the model, you can use the
 * BlobFileListContentSource type to train the model.
 */
ContentSource loanApplnFormListSource 
  = new BlobFileListContentSource("{SAS URL to your container}", "Loan-Application-Documents.jsonl");

HashMap<String, ClassifierDocumentTypeDetails> fileListDocTypes = new HashMap<>();
fileListDocTypes.put("loan application form", new ClassifierDocumentTypeDetails(loanApplnFormListSource));

// Build a custom classifier document model
SyncPoller<OperationResult, DocumentClassifierDetails> buildOperationPoller
  = client.beginBuildDocumentClassifier(docTypes);

DocumentClassifierDetails documentClassifierDetails = buildOperationPoller.getFinalResult();

// Get the custom built classifier ID 
System.out.printf("Classifier ID: %s%n", documentClassifierDetails.getClassifierId());

Find a similar example in other languages here:

.NET – Build a document classifier
Python – Build a document classifier
JavaScript – Build a document classifier

Note: Users can also build a classification model using Document Intelligence Studio.

Analyze a document using classification model

Once the custom classification model is built, it can be used to identify the page ranges of the individual documents (loan application forms, payslips, or bank statements) within an input file.

For example, the following code classifies a single document containing a mix of loan application forms and payslips.

Java

// File URL to analyze
String documentUrl = "{URL to the sample document}";
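// Here, client is a DocumentAnalysisClient built with the same credential
// and endpoint as the administration client used to build the classifier.
SyncPoller<OperationResult, AnalyzeResult> syncPoller
  = client.beginClassifyDocumentFromUrl(documentClassifierDetails.getClassifierId(),
      documentUrl);
AnalyzeResult analyzeResult = syncPoller.getFinalResult();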

// Notice the classified documents under each doc type
analyzeResult.getDocuments()
  .forEach(analyzedDocument -> System.out.printf("Doc Type: %s%n", analyzedDocument.getDocType()));

// Get identified/classified page/page ranges
analyzeResult.getPages().forEach(documentPage -> {
  System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
      documentPage.getWidth(),
      documentPage.getHeight(),
      documentPage.getUnit());

  // lines
  documentPage.getLines().forEach(documentLine ->
      System.out.printf("Line '%s' is within a bounding box %s.%n",
          documentLine.getContent(),
          documentLine.getBoundingPolygon().toString()));

  // words
  documentPage.getWords().forEach(documentWord ->
      System.out.printf("Word '%s' has a confidence score of %.2f.%n",
          documentWord.getContent(),
          documentWord.getConfidence()));
});

Find the preceding example in other languages here:

.NET – Classify a document
Python – Classify a document
JavaScript – Classify a document
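
After classification, it is often useful to know which pages belong to which document type. The following Python sketch groups page numbers by document type. It does not call the service: `Region` and `ClassifiedDocument` are hypothetical stand-ins that assume each classified document exposes a `doc_type` and a list of bounding regions with a `page_number`, and the sample data is invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Region:
    """Hypothetical stand-in for a bounding region; only the page number is used."""
    page_number: int


@dataclass
class ClassifiedDocument:
    """Hypothetical stand-in for one classified document in an analysis result."""
    doc_type: str
    confidence: float
    bounding_regions: List[Region] = field(default_factory=list)


def pages_by_doc_type(documents: List[ClassifiedDocument]) -> Dict[str, List[int]]:
    """Map each document type to the sorted, de-duplicated pages it covers."""
    pages = defaultdict(set)
    for doc in documents:
        for region in doc.bounding_regions:
            pages[doc.doc_type].add(region.page_number)
    return {doc_type: sorted(numbers) for doc_type, numbers in pages.items()}


# Invented sample: a four-page file with one loan application form and two payslips.
sample = [
    ClassifiedDocument("loan application form", 0.97, [Region(1), Region(2)]),
    ClassifiedDocument("payslip", 0.91, [Region(3)]),
    ClassifiedDocument("payslip", 0.88, [Region(4)]),
]

for doc_type, page_numbers in pages_by_doc_type(sample).items():
    print(f"{doc_type}: pages {page_numbers}")
```

With a real analysis result, the same function could be applied to the documents collection, provided the objects expose the attributes assumed above.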

Add-on recognition capabilities

Form Recognizer now supports more sophisticated analysis capabilities. These optional capabilities can be enabled and disabled depending on the scenario of the document extraction. The following add-on capabilities are available for service version 2023-07-31 and later releases:

ocr.barcode – Extract barcodes detected in the document layout.
ocr.highResolution – Recognize small text in large documents.
ocr.formula – Detect formulas, such as mathematical equations.
ocr.font – Recognize font-related properties of the extracted text.

Users can use the add-on capabilities by including the DocumentAnalysisFeature object in the analysis request.

Barcode recognition

Barcodes in documents can now be detected using the barcode recognition feature of the library. Examples include healthcare and procurement-related documents in which critical information, such as a patient ID, is encoded in a barcode. The detected barcodes are represented in the barcodes collection as a top-level property under DocumentPage. Each object describes the:

Barcode type (QRCode, UPCA, etc.).
Decoded value (general string representing URL, number, or other data).
Bounding polygon.
Span of the extracted content in which the decoded barcode value is embedded.
Overall extraction confidence.

The following code accesses the first barcode on the first page of the analysis result.

Java

DocumentBarcode barcode =
    analyzeResult.getPages().get(0).getBarcodes().get(0);
System.out.printf("Barcode kind: '%s'", barcode.getKind());

// Output:
// Barcode kind: 'Code39'

.NET

DocumentBarcode barcode = analyzeResult.Pages[0].Barcodes[0];

Console.WriteLine($"Barcode kind: '{barcode.Kind}'");

// Output:
// Barcode kind: 'Code39'

Python

print(f"Barcode kind: {result.pages[0].barcodes[0].kind}") # "Code39"

JavaScript

const [barcode1, barcode2] = analyzeResult.pages?.[0].barcodes as DocumentBarcode[];
console.log(barcode1.kind); // "Code39"
console.log(barcode1.value); // "D589992-X"
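
The samples above index the first barcode directly, which fails when a page contains no barcodes. The following Python sketch shows one way to iterate over every page and keep only barcodes above a confidence threshold. `Barcode` and `Page` are hypothetical stand-ins for the result types, and the 0.8 threshold is an arbitrary value chosen for illustration, not a recommendation from the service.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Barcode:
    """Hypothetical stand-in for a detected barcode."""
    kind: str
    value: str
    confidence: float


@dataclass
class Page:
    """Hypothetical stand-in for an analyzed page; the barcode list may be empty."""
    page_number: int
    barcodes: List[Barcode] = field(default_factory=list)


def confident_barcodes(
    pages: List[Page], min_confidence: float = 0.8
) -> Iterator[Tuple[int, Barcode]]:
    """Yield (page number, barcode) pairs at or above the confidence threshold."""
    for page in pages:
        for barcode in page.barcodes or []:
            if barcode.confidence >= min_confidence:
                yield page.page_number, barcode


# Invented sample data for illustration.
sample_pages = [
    Page(1, [Barcode("Code39", "D589992-X", 0.95)]),
    Page(2),  # no barcodes on this page
    Page(3, [Barcode("QRCode", "https://example.com", 0.42)]),
]

for page_number, barcode in confident_barcodes(sample_pages):
    print(f"Page {page_number}: {barcode.kind} -> {barcode.value}")
```

Only the barcode on page 1 passes the default threshold in this sample; lowering `min_confidence` would include the one on page 3 as well.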

High-resolution recognition

With this add-on feature, users can extract content from complex documents that mix graphical and structural elements and contain text in varying fonts, sizes, and orientations.

For example, the following code includes the high-resolution recognition add-on feature when analyzing a document:

Java

SyncPoller<OperationResult, AnalyzeResult> syncPoller
  = client.beginAnalyzeDocumentFromUrl("prebuilt-layout", "sourceUrl", 
      new AnalyzeDocumentOptions()
        .setDocumentAnalysisFeatures(Collections.singletonList(DocumentAnalysisFeature.OCR_HIGH_RESOLUTION)));
AnalyzeResult analyzeResult = syncPoller.getFinalResult();

.NET

var documentUri = new Uri("source-url");
var options = new AnalyzeDocumentOptions
{
    Features = { DocumentAnalysisFeature.OcrHighResolution }
};

AnalyzeDocumentOperation operation = client.AnalyzeDocumentFromUri(
    WaitUntil.Completed, "prebuilt-layout", documentUri, options);
AnalyzeResult analyzeResult = operation.Value;

Python

poller = document_analysis_client.begin_analyze_document(
    "prebuilt-layout",
    document=document_to_analyze,
    features=[AnalysisFeature.OCR_HIGH_RESOLUTION],
)
result = poller.result()

JavaScript

const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-layout", "source-url", {
  features: [FormRecognizerFeature.OcrHighResolution],
});
const analyzeResult = await poller.pollUntilDone();

Detect formulas

Formulas are often found in scientific documents and can now be detected with this add-on feature. The detected formulas are represented in the formulas collection as a top-level property under DocumentPage. Each object describes the formula kind (inline or display), its LaTeX representation as the value, and its polygon coordinates.

Java

DocumentFormula formula = analyzeResult.getPages().get(0).getFormulas().get(0);
System.out.printf("Formula kind: '%s' %n", formula.getKind());
System.out.printf("Formula value: '%s'", formula.getValue());

// Output:
// Formula kind: 'inline'
// Formula value: 'a+b=c'

.NET

DocumentFormula formula = analyzeResult.Pages[0].Formulas[0];

Console.WriteLine($"Formula kind: '{formula.Kind}'");
Console.WriteLine($"Formula value: '{formula.Value}'");

// Output:
// Formula kind: 'inline'
// Formula value: 'a+b=c'

Python

formula = result.pages[0].formulas[0]
print(f"Formula kind: {formula.kind}") # Formula kind: inline
print(f"Formula value: {formula.value}") # Formula value: a+b=c

JavaScript

const [formula1, formula2] = analyzeResult.pages?.[0].formulas as DocumentFormula[];
console.log(formula1.kind); // "inline"
console.log(formula1.value); // "a+b=c"
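
Because each formula carries a kind and a LaTeX value, the two can be combined when re-rendering extracted content. The following Python sketch wraps a formula value in delimiters that match its kind. `Formula` is a hypothetical stand-in for the result type, and the choice of `$...$` and `$$...$$` delimiters is an assumption about the target renderer, not something the service prescribes.

```python
from dataclasses import dataclass


@dataclass
class Formula:
    """Hypothetical stand-in for a detected formula."""
    kind: str   # "inline" or "display"
    value: str  # LaTeX source of the formula


def to_latex_snippet(formula: Formula) -> str:
    """Wrap the LaTeX value in delimiters that match the formula kind."""
    if formula.kind == "display":
        # Display formulas go on their own lines.
        return f"$$\n{formula.value}\n$$"
    # Anything else is treated as inline.
    return f"${formula.value}$"


# Invented sample data for illustration.
sample_formulas = [
    Formula("inline", "a+b=c"),
    Formula("display", r"\frac{a}{b}=c"),
]

for item in sample_formulas:
    print(to_latex_snippet(item))
```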

Font extraction

This add-on feature detects font properties of the extracted text in the document. The detected styles are exposed in the top-level styles collection on AnalyzeResult. Each DocumentStyle provides font-related properties for the extracted text, such as similarFontFamily (the visually most similar font within a supported set of fonts), fontStyle, fontWeight, color, and backgroundColor.

The following code samples read font properties from a result that was analyzed with the DocumentAnalysisFeature.STYLE_FONT feature enabled:

Java

DocumentStyle documentStyle = analyzeResult.getStyles().get(0);
System.out.printf("Font style: '%s' %n", documentStyle.getFontStyle());
System.out.printf("Font background color: '%s'", documentStyle.getBackgroundColor());

// Output:
// Font style: 'italic'
// Font background color: '#0000FF'

.NET

DocumentStyle documentStyle = analyzeResult.Styles[0];

Console.WriteLine($"Font style: '{documentStyle.FontStyle}'");
Console.WriteLine($"Font background color: '{documentStyle.BackgroundColor}'");

// Output:
// Font style: 'italic'
// Font background color: '#0000FF'

Python

for style in result.styles:
    if style.font_style:
        print(f"Font style: '{style.font_style}'")  # Font style: 'italic'
    if style.background_color:
        print(f"Background color: '{style.background_color}'")  # Background color: '#0000FF'

JavaScript

const [style1, style2] = analyzeResult.styles as DocumentStyle[];
console.log(style1.fontStyle);       // "italic"
console.log(style2.backgroundColor); // "#0000FF"
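
A style is most useful when it can be tied back to the text it applies to. The following Python sketch assumes that each style carries a list of spans (an offset and a length) that index into the full extracted content string, and slices out the matching text. `Span` and `Style` are hypothetical stand-ins, the sample content is invented, and the sketch assumes the offsets count Python string characters.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """Hypothetical stand-in for a span: a start offset and a length into the content."""
    offset: int
    length: int


@dataclass
class Style:
    """Hypothetical stand-in for a detected style and the spans it applies to."""
    font_style: Optional[str] = None
    background_color: Optional[str] = None
    spans: List[Span] = field(default_factory=list)


def styled_text(content: str, style: Style) -> List[str]:
    """Return the substrings of the content that the style applies to."""
    return [content[s.offset:s.offset + s.length] for s in style.spans]


# Invented sample data for illustration.
content = "Total due: 42.00 USD"
styles = [
    Style(font_style="italic", spans=[Span(0, 9)]),
    Style(background_color="#0000FF", spans=[Span(11, 5)]),
]

for style in styles:
    label = style.font_style or style.background_color
    print(f"{label}: {styled_text(content, style)}")
```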

Support for new prebuilt models

New prebuilt models are now supported with Form Recognizer libraries to analyze:

Contracts (prebuilt-contract)
Tax forms (prebuilt-tax.us.1098, prebuilt-tax.us.1098E, prebuilt-tax.us.1098T)
Health insurance cards (prebuilt-healthInsuranceCard.us)

Prebuilt models offer the convenience of extracting fields from a document without having to build a model. To find more information about models, including a list of supported prebuilt models, see Form Recognizer models.

The following code analyzes a health insurance card using a prebuilt model provided by the service:

Java

// client is a synchronous DocumentAnalysisClient, so the call returns a SyncPoller directly
SyncPoller<OperationResult, AnalyzeResult> syncPoller
  = client.beginAnalyzeDocumentFromUrl("prebuilt-healthInsuranceCard.us", "URL to health document");

AnalyzeResult analyzeResult = syncPoller.getFinalResult();

for (int i = 0; i < analyzeResult.getDocuments().size(); i++) {
  System.out.printf("--------Analyzing health care card %d--------%n", i);
  AnalyzedDocument analyzedHealthCard = analyzeResult.getDocuments().get(i);
  Map<String, DocumentField> healthCardFields = analyzedHealthCard.getFields();
  System.out.printf("Health care insurer: '%s'%n", healthCardFields.get("Insurer").getValueAsString());
  System.out.println("--------Member details --------");
  DocumentField memberDocumentField = healthCardFields.get("Member");
  if (memberDocumentField != null) { 
    if (DocumentFieldType.MAP == memberDocumentField.getType()) {
      memberDocumentField.getValueAsMap().forEach((key, documentField) -> {
        if ("Member.Name".equals(key)) {
          if (DocumentFieldType.STRING == documentField.getType()) {
            String name = documentField.getValueAsString();
            System.out.printf("\tMember Name: %s, confidence: %.2f%n", name, documentField.getConfidence());
          }
        }
        if ("Member.BirthDate".equals(key)) {
          if (DocumentFieldType.DATE == documentField.getType()) {
            LocalDate birthDate = documentField.getValueAsDate();
            System.out.printf("\tMember birth date: %s, confidence: %.2f%n",
              birthDate, documentField.getConfidence());
          }
        }
      });
    }
  }
}

Find a similar example in other languages here:

.NET – Analyze a document with a prebuilt model – ‘prebuilt-invoice’
Python – Analyze with prebuilt model – ‘prebuilt-tax.us.W-2’
JavaScript – Analyze with prebuilt model – ‘prebuilt-receipt’
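
A prebuilt model does not always return every field, and some fields come back with low confidence. The following Python sketch shows a defensive way to read a field by name. `Field` is a hypothetical stand-in for an extracted field, and the 0.5 threshold, the default text, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Field:
    """Hypothetical stand-in for an extracted field with a value and a confidence."""
    value: Any
    confidence: float


def field_text(
    fields: Dict[str, Field],
    name: str,
    min_confidence: float = 0.5,
    default: str = "(not found)",
) -> str:
    """Return the field value as text, or the default when missing or low-confidence."""
    found = fields.get(name)
    if found is None or found.value is None or found.confidence < min_confidence:
        return default
    return str(found.value)


# Invented sample data for illustration.
card_fields = {
    "Insurer": Field("Contoso Health", 0.99),
    "Member": Field(None, 0.10),
}

print(field_text(card_fields, "Insurer"))
print(field_text(card_fields, "Member"))
print(field_text(card_fields, "MissingField"))
```

Returning a default instead of raising keeps a reporting loop running when a card is missing a field; callers that need to distinguish the two cases could return `None` instead.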

Learn more

To learn more and to try the new features, see these links to our official documentation:

Give us your feedback

We appreciate your feedback and encourage you to share your thoughts with us. We thrive on improvement and would welcome any suggestions you may have. Let’s work together to make our experience even better!

You can reach out to us by filing issues in the language-specific GitHub repository:

Include the “[Form Recognizer]” string in the issue title so it gets routed to the right people.

References

Official Form Recognizer Service documentation
