We’re excited to announce the latest stable release for the Azure Form Recognizer SDKs for .NET, Python, Java, and JS. Form Recognizer is an Azure Applied AI Service that uses machine-learning models to extract information from documents. The Form Recognizer client libraries strive to provide an intuitive and simple way for people to use the service.
The following table shows the latest stable release for each SDK:
SDK | Latest stable version |
---|---|
Azure Form Recognizer SDK – .NET | 4.0.0 |
Azure Form Recognizer SDK – Python | 3.2.0 |
Azure Form Recognizer SDK – Java | 4.0.0 |
Azure Form Recognizer SDK – JS | 4.0.0 |
This release includes important changes and updates in each SDK. The most notable is the introduction of two new clients, the DocumentAnalysisClient and the DocumentModelAdministrationClient. The SDKs support the latest version of the service through these new clients. The previous FormRecognizerClient and FormTrainingClient are still supported for older versions of the service. If you’re using these clients and wish to migrate to the latest clients to use new service features, see the table below with the migration guides for each language:
NOTE: The JavaScript SDK doesn’t support the old clients when upgrading to the latest package version. The migration guide includes recommendations for how to handle this situation.
SDK | Migration guide |
---|---|
Azure Form Recognizer SDK – .NET | Migration guide |
Azure Form Recognizer SDK – Python | Migration guide |
Azure Form Recognizer SDK – Java | Migration guide |
Azure Form Recognizer SDK – JS | Migration guide |
Key features
In this release, the Azure SDK team introduced two new clients in each SDK, the DocumentAnalysisClient and the DocumentModelAdministrationClient. These clients aim to improve the methods and responses used to interact with the Form Recognizer service and must be used with the latest stable service API version, 2022-08-31, or later (NOTE: these new clients can’t be used with older API versions).
DocumentAnalysisClient
The DocumentAnalysisClient provides two document analysis methods (one for stream inputs and one for URL inputs) that can be used to analyze documents with both prebuilt and custom models. Both methods accept a model ID parameter that specifies the model to use for the analysis request. Many prebuilt models that have been trained by the service are enabled in each Form Recognizer resource. As part of the latest stable release of the service, new prebuilt document analysis models have been added. These models differ in the features and elements they extract. For example, the "prebuilt-document" model analyzes documents and returns key-value pairs, tables, and pages, among other elements. To find more information about models, including a list of supported prebuilt models, see the Form Recognizer models page.
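As a rough sketch of how the model ID parameter and the two input styles fit together, the Python helper below routes a request to the URL-based or the stream-based method depending on what it is given. The helper name start_analysis and its URL check are illustrative assumptions and not part of the SDK. begin_analyze_document and begin_analyze_document_from_url are the Python client's two analysis methods, and client is assumed to be a DocumentAnalysisClient.

```python
from typing import Any, BinaryIO, Union


def start_analysis(client: Any, model_id: str, source: Union[str, bytes, BinaryIO]):
    """Start an analysis with the given model ID and return the poller.

    `client` is assumed to be a DocumentAnalysisClient (Python SDK).
    `start_analysis` itself is a hypothetical helper, not an SDK function.
    """
    # A string starting with http(s):// is treated as a document URL.
    if isinstance(source, str) and source.lower().startswith(("http://", "https://")):
        return client.begin_analyze_document_from_url(model_id, source)
    # Anything else is assumed to be bytes or an open binary stream.
    return client.begin_analyze_document(model_id, source)
```

With this sketch, passing "prebuilt-document" and a document URL would use the URL method, while passing a file opened in binary mode would use the stream method. Swapping in a custom model's ID leaves the rest of the call unchanged.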
Example: Instantiate DocumentAnalysisClient
The examples below show how to instantiate a client using an AzureKeyCredential. The clients also support Azure Identity credentials, which cover multiple authentication scenarios. Use the links below to find more information about Azure Identity and supported credentials for each SDK:
- Azure Identity SDK – .NET
- Azure Identity SDK – Python
- Azure Identity SDK – Java
- Azure Identity SDK – JS
.NET
var endpoint = new Uri("<endpoint>");
var credential = new AzureKeyCredential("<apiKey>");
var client = new DocumentAnalysisClient(endpoint, credential);
Python
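from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import DocumentAnalysisClient
document_analysis_client = DocumentAnalysisClient(
    endpoint="<endpoint>", credential=AzureKeyCredential("<api_key>")
)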
Java
DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
.credential(new AzureKeyCredential("{key}"))
.endpoint("https://{endpoint}.cognitiveservices.azure.com/")
.buildClient();
JS
const endpoint = "<endpoint>";
const credential = new AzureKeyCredential("<api key>");
const client = new DocumentAnalysisClient(endpoint, credential);
Example: Analyze a receipt using a prebuilt model
Below is an example of analyzing a receipt using the "prebuilt-receipt" model provided by the service and extracting fields specific to a receipt.
.NET
using var stream = new FileStream("<path_to_receipt>", FileMode.Open);
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-receipt", stream);
AnalyzeResult result = operation.Value;
for (int i = 0; i < result.Documents.Count; i++)
{
AnalyzedDocument receipt = result.Documents[i];
Console.WriteLine($"--------Analysis of receipt #{i + 1}--------");
Console.WriteLine($"Receipt type: {receipt.DocumentType}");
if (receipt.Fields.TryGetValue("MerchantName", out DocumentField merchantName))
{
if (merchantName.FieldType == DocumentFieldType.String)
{
Console.WriteLine($"Merchant Name: {merchantName.Value.AsString()} has confidence: {merchantName.Confidence}");
}
}
if (receipt.Fields.TryGetValue("TotalTax", out DocumentField tax))
{
if (tax.FieldType == DocumentFieldType.Double)
{
Console.WriteLine($"Total tax: {tax.Value.AsDouble()} has confidence: {tax.Confidence}");
}
}
if (receipt.Fields.TryGetValue("Total", out DocumentField total))
{
if (total.FieldType == DocumentFieldType.Double)
{
Console.WriteLine($"Total: {total.Value.AsDouble()} has confidence: {total.Confidence}");
}
}
}
Python
with open("<path_to_receipt>", "rb") as f:
    poller = document_analysis_client.begin_analyze_document(
        "prebuilt-receipt", document=f
    )
receipts = poller.result()
for idx, receipt in enumerate(receipts.documents):
    print("--------Analysis of receipt #{}--------".format(idx + 1))
    print("Receipt type: {}".format(receipt.doc_type or "N/A"))
    merchant_name = receipt.fields.get("MerchantName")
    if merchant_name:
        print(
            "Merchant Name: {} has confidence: {}".format(
                merchant_name.value, merchant_name.confidence
            )
        )
    tax = receipt.fields.get("TotalTax")
    if tax:
        print("Total tax: {} has confidence: {}".format(tax.value, tax.confidence))
    total = receipt.fields.get("Total")
    if total:
        print("Total: {} has confidence: {}".format(total.value, total.confidence))
Java
File sourceFile = new File("<path_to_receipt>");
Path filePath = sourceFile.toPath();
BinaryData fileData = BinaryData.fromFile(filePath);
SyncPoller<OperationResult, AnalyzeResult> analyzeReceiptPoller
= client.beginAnalyzeDocument("prebuilt-receipt", fileData);
AnalyzeResult analyzeResult = analyzeReceiptPoller.getFinalResult();
for (int i = 0; i < analyzeResult.getDocuments().size(); i++) {
AnalyzedDocument analyzedReceipt = analyzeResult.getDocuments().get(i);
Map<String, DocumentField> receiptFields = analyzedReceipt.getFields();
System.out.printf("----------- Analyzing receipt info %d -----------%n", i);
DocumentField merchantNameField = receiptFields.get("MerchantName");
if (merchantNameField != null) {
if (DocumentFieldType.STRING == merchantNameField.getType()) {
String merchantName = merchantNameField.getValueAsString();
System.out.printf("Merchant Name: %s, confidence: %.2f%n",
merchantName, merchantNameField.getConfidence());
}
}
DocumentField totalTaxField = receiptFields.get("TotalTax");
if (totalTaxField != null) {
if (DocumentFieldType.DOUBLE == totalTaxField.getType()) {
Double totalTax = totalTaxField.getValueAsDouble();
System.out.printf("Total tax: %.2f, confidence: %.2f%n",
totalTax, totalTaxField.getConfidence());
}
}
DocumentField totalField = receiptFields.get("Total");
if (totalField != null) {
if (DocumentFieldType.DOUBLE == totalField.getType()) {
Double total = totalField.getValueAsDouble();
System.out.printf("Total: %.2f, confidence: %.2f%n",
total, totalField.getConfidence());
}
}
}
JS
const poller = await client.beginAnalyzeDocument(
"prebuilt-receipt",
fs.createReadStream("<receipt file path>")
);
const { documents } = await poller.pollUntilDone();
for (const receipt of documents ?? []) {
console.log(`- Receipt Type: ${receipt.docType}`);
const merchantNameField = receipt.fields["MerchantName"];
if (merchantNameField) {
console.log(
` Merchant Name: ${merchantNameField.value} (confidence: ${merchantNameField.confidence})`
);
}
const taxField = receipt.fields["TotalTax"];
if (taxField) {
console.log(
` Tax: ${taxField.value} (confidence: ${taxField.confidence})`
);
}
const totalField = receipt.fields["Total"];
if (totalField) {
console.log(
` Total: ${totalField.value} (confidence: ${totalField.confidence})`
);
}
}
The document analysis methods return an AnalyzeResult model that is populated with the extracted data, such as the content, pages, paragraphs, tables, languages, styles, and key-value pairs, at the top level of the result. The fields on AnalyzeResult may or may not be populated, depending on the model used for analysis. For a full table describing the data returned per model, see the Model data extraction table.
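Because some of these fields can be empty or missing for a given model, it can help to check what came back before using it. The Python sketch below counts the items in each top-level collection and treats a missing or None collection as empty. The helper name summarize_result and the stand-in result object are assumptions for illustration; a real result would come from poller.result().

```python
from types import SimpleNamespace
from typing import Any, Dict

# Top-level collections named in the text above (Python attribute names).
TOP_LEVEL_COLLECTIONS = (
    "pages", "paragraphs", "tables", "languages",
    "styles", "key_value_pairs", "documents",
)


def summarize_result(result: Any) -> Dict[str, int]:
    """Count the items in each top-level collection, treating None as empty."""
    summary = {}
    for name in TOP_LEVEL_COLLECTIONS:
        # Depending on the model, a collection may be missing, None, or empty.
        items = getattr(result, name, None) or []
        summary[name] = len(items)
    return summary


# Stand-in object for illustration only, not a real AnalyzeResult.
fake_result = SimpleNamespace(pages=[object()], tables=None, styles=[])
print(summarize_result(fake_result))
```

A summary like this is only a convenience for exploring results; the Model data extraction table remains the reference for what each model returns.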
Example: Search for handwritten content using spans
Another feature of the new result structure is that many models have a span or spans field that indicates the offset and length where an item is found in the text content of the document. Smaller elements, like a document word, have a single span, while larger components, like a document style, can have a list of spans that they cover within the text content of the document. Below is an example from each SDK that searches the concatenated content of the document to find specific text sections that are handwritten.
.NET
using var stream = new FileStream("<path_to_documents>", FileMode.Open);
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-document", stream);
AnalyzeResult result = operation.Value;
foreach (DocumentStyle style in result.Styles)
{
if (style.IsHandwritten == true)
{
Console.WriteLine("Document contains handwritten content:");
foreach (DocumentSpan span in style.Spans)
{
Console.WriteLine($" {result.Content.Substring(span.Index, span.Length)}");
}
}
}
Python
with open("<path_to_document>", "rb") as f:
    poller = document_analysis_client.begin_analyze_document(
        "prebuilt-document", document=f
    )
result = poller.result()
for style in result.styles:
    if style.is_handwritten:
        print("Document contains handwritten content: ")
        print(",".join([result.content[span.offset:span.offset + span.length] for span in style.spans]))
Java
File sourceFile = new File("<path_to_receipt>");
Path filePath = sourceFile.toPath();
BinaryData fileData = BinaryData.fromFile(filePath);
SyncPoller<OperationResult, AnalyzeResult> analyzePoller =
client.beginAnalyzeDocument("prebuilt-document", fileData);
AnalyzeResult analyzeResult = analyzePoller.getFinalResult();
analyzeResult.getStyles()
.stream().filter(DocumentStyle::isHandwritten)
.forEach(documentStyle -> documentStyle.getSpans()
.stream()
.map(documentSpan -> analyzeResult.getContent().substring(documentSpan.getOffset(),
documentSpan.getOffset() + documentSpan.getLength()))
.forEach(System.out::println));
JS
const poller = await client.beginAnalyzeDocument(
"prebuilt-document",
fs.createReadStream("<path to file>")
);
const { styles, content } = await poller.pollUntilDone();
const handwrittenStyles = styles?.filter((style) => style.isHandwritten) ?? [];
if (handwrittenStyles.length > 0) {
console.log("Document contains handwritten text:");
for (const style of handwrittenStyles) {
const slices = style.spans.map((span) =>
content.slice(span.offset, span.offset + span.length)
);
console.log(`- ${slices.join(",")}`);
}
}
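The span arithmetic used in the examples above can be tried without calling the service. The Python sketch below uses a stand-in Span class and made-up content to show that each span selects the characters from offset up to, but not including, offset plus length. The Span class, the slice_spans helper, and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """Stand-in for a document span: where a slice starts and how long it is."""
    offset: int
    length: int


def slice_spans(content: str, spans: List[Span]) -> List[str]:
    """Return the substring of `content` covered by each span."""
    return [content[s.offset:s.offset + s.length] for s in spans]


# Made-up content and spans for illustration; real values come from the service.
content = "Invoice 42 approved by J. Doe"
handwritten_spans = [Span(offset=8, length=2), Span(offset=23, length=6)]
print(slice_spans(content, handwritten_spans))  # prints ['42', 'J. Doe']
```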
DocumentModelAdministrationClient
The DocumentModelAdministrationClient provides methods for building, composing, copying, getting, and deleting document models in your Form Recognizer resource, as well as methods to get resource information and to list and get operations.
Example: Instantiate DocumentModelAdministrationClient
.NET
var endpoint = new Uri("<endpoint>");
var credential = new AzureKeyCredential("<apiKey>");
var client = new DocumentModelAdministrationClient(endpoint, credential);
Python
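from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import DocumentModelAdministrationClient
document_model_admin_client = DocumentModelAdministrationClient(
    "<endpoint>", AzureKeyCredential("<api_key>")
)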
Java
DocumentModelAdministrationClient client = new DocumentModelAdministrationClientBuilder()
.credential(new AzureKeyCredential("{key}"))
.endpoint("https://{endpoint}.cognitiveservices.azure.com/")
.buildClient();
JS
const endpoint = "<endpoint>";
const credential = new AzureKeyCredential("<api key>");
const client = new DocumentModelAdministrationClient(endpoint, credential);
Example: Build a custom document model
Below is an example of how to build a custom document model. To build a custom model, you must provide a set of labeled training files that the machine-learning algorithm uses to create the model. These training files can be created and labeled through Form Recognizer Studio. For more information about how to create your training dataset, see Building a training dataset. Another important update in the latest release is the addition of build modes used to build custom models. Each build mode specifies a different machine-learning algorithm to use when creating a custom model. To get more information about each build mode, see the documentation on Custom model types. A custom model is built for the document type in its training files and only supports that document type for analysis.
NOTE: The service provides composed model creation in order to group several custom models under one model ID. With a composed model, the service performs a classification step when a document is sent to determine which custom model is the best option for analysis. An example of a composed model is the prebuilt model for analyzing identity documents, prebuilt-idDocument, which supports analyzing U.S. driver’s licenses, U.S. state IDs, social security cards, permanent resident cards, and international passports. For more information, read about Composed custom models.
.NET
Uri blobContainerUri = new Uri("<blobContainerUri>");
BuildDocumentModelOperation operation = await client.BuildDocumentModelAsync(WaitUntil.Completed, blobContainerUri, DocumentBuildMode.Template);
DocumentModelDetails model = operation.Value;
Console.WriteLine($"Model ID: {model.ModelId}");
Console.WriteLine($"Description: {model.Description}");
Console.WriteLine($"Model created on: {model.CreatedOn}");
Console.WriteLine($"Document types the model can recognize:");
foreach (KeyValuePair<string, DocumentTypeDetails> docTypeKvp in model.DocumentTypes)
{
string name = docTypeKvp.Key;
DocumentTypeDetails docType = docTypeKvp.Value;
Console.WriteLine($"Document type: '{name}' built with '{docType.BuildMode}' mode which has the following fields:");
foreach (KeyValuePair<string, DocumentFieldSchema> fieldKvp in docType.FieldSchema)
{
string fieldName = fieldKvp.Key;
DocumentFieldSchema field = fieldKvp.Value;
float confidence = docType.FieldConfidence[fieldName];
Console.WriteLine($"Field: '{fieldName}' has type '{field.Type}' and confidence score {confidence}");
}
}
Python
from azure.ai.formrecognizer import ModelBuildMode
poller = document_model_admin_client.begin_build_document_model(
    ModelBuildMode.TEMPLATE, blob_container_url="<container_sas_url>", description="my model description"
)
model = poller.result()
print("Model ID: {}".format(model.model_id))
print("Description: {}".format(model.description))
print("Model created on: {}\n".format(model.created_on))
print("Doc types the model can recognize:")
for name, doc_type in model.doc_types.items():
    print("\nDoc Type: '{}' built with '{}' mode which has the following fields:".format(name, doc_type.build_mode))
    for field_name, field in doc_type.field_schema.items():
        print("Field: '{}' has type '{}' and confidence score {}".format(
            field_name, field["type"], doc_type.field_confidence[field_name]
        ))
Java
// Build custom document analysis model
String blobContainerUrl = "{SAS_URL_of_your_container_in_blob_storage}";
// The shared access signature (SAS) Url of your Azure Blob Storage container with your forms.
String prefix = "{blob_name_prefix}";
SyncPoller<OperationResult, DocumentModelDetails> buildOperationPoller =
client.beginBuildDocumentModel(blobContainerUrl,
DocumentModelBuildMode.TEMPLATE,
prefix,
new BuildDocumentModelOptions()
.setModelId("custom-model-id")
.setDescription("model desc"),
Context.NONE);
DocumentModelDetails documentModelDetails = buildOperationPoller.getFinalResult();
// Model Info
System.out.printf("Model ID: %s%n", documentModelDetails.getModelId());
System.out.printf("Model Description: %s%n", documentModelDetails.getDescription());
System.out.printf("Model created on: %s%n%n", documentModelDetails.getCreatedOn());
System.out.println("Doc types the model can recognize:");
documentModelDetails.getDocumentTypes().forEach((name, documentTypeDetails) -> {
System.out.printf("\nDoc Type: %s built with %s mode which has the following fields:",
name,
documentTypeDetails.getBuildMode());
documentTypeDetails.getFieldSchema().forEach((fieldName, documentFieldSchema) ->
System.out.printf("Field: %s has type %s and confidence score %.2f%n",
fieldName,
documentFieldSchema.getType(),
documentTypeDetails.getFieldConfidence().get(fieldName)));
});
JS
const poller = await client.beginBuildDocumentModel(
"<model id>",
"<training container SAS URL>",
DocumentModelBuildMode.Template,
{
description: "an example model",
}
);
const model = await poller.pollUntilDone();
console.log("Model ID:", model.modelId);
console.log("Description:", model.description);
console.log("Created:", model.createdOn);
console.log("Document Types:");
for (const [
docType,
{ description, fieldSchema: schema, buildMode, fieldConfidence },
] of Object.entries(model.docTypes ?? {})) {
console.log(`- Name: "${docType}"`);
console.log(` Build mode: ${buildMode}`);
console.log(` Description: "${description ?? "<no description>"}"`);
// For simplicity, this example will only show top-level field names
console.log(" Fields:");
for (const [fieldName, fieldSchema] of Object.entries(schema)) {
console.log(` - "${fieldName}" (${fieldSchema.type})`);
console.log(
` Description: ${fieldSchema.description ?? "<no description>"}`
);
console.log(
` Confidence: ${fieldConfidence?.[fieldName] ?? "<unknown>"}`
);
}
}
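The NOTE earlier in this section mentions composed models. As a hedged sketch, the Python client's begin_compose_document_model method takes a list of custom model IDs and returns a poller, much like the build call. The helper name compose_models and the placeholder model IDs are assumptions for illustration, and admin_client is assumed to be a DocumentModelAdministrationClient.

```python
from typing import Any, List


def compose_models(admin_client: Any, component_model_ids: List[str], description: str):
    """Group several custom models under one composed model and return its details.

    `admin_client` is assumed to be a DocumentModelAdministrationClient.
    `compose_models` is a hypothetical helper, not an SDK function.
    """
    poller = admin_client.begin_compose_document_model(
        component_model_ids, description=description
    )
    # Composing is a long-running operation; result() blocks until it finishes.
    return poller.result()
```

A call might look like compose_models(document_model_admin_client, ["<model_id_1>", "<model_id_2>"], "my composed model"), where both IDs refer to custom models that already exist in the resource.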
Example: Analyze a document with your custom model
Once you’ve built your custom model, you can use it to analyze the custom document types it was trained on.
.NET
string modelId = "<model_id>";
using var stream = new FileStream("<path_to_documents>", FileMode.Open);
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, modelId, stream);
AnalyzeResult result = operation.Value;
for (int i = 0; i < result.Documents.Count; i++)
{
AnalyzedDocument document = result.Documents[i];
Console.WriteLine($"--------Analyzing document #{i + 1}--------");
Console.WriteLine($"Document has type: {document.DocumentType}");
Console.WriteLine($"Document has confidence: {document.Confidence}");
Console.WriteLine($"Document was analyzed by model with ID {result.ModelId}");
foreach (DocumentField field in document.Fields.Values)
{
Console.WriteLine($"......found field of type '{field.FieldType}' with content '{field.Content}' and with confidence {field.Confidence}");
}
}
for (int i = 0; i < result.Tables.Count; i++)
{
DocumentTable table = result.Tables[i];
Console.WriteLine($"Table {i + 1} can be found on page:");
foreach (BoundingRegion region in table.BoundingRegions)
{
Console.WriteLine($"...{region.PageNumber}");
}
foreach (DocumentTableCell cell in table.Cells)
{
Console.WriteLine($"...Cell[{cell.RowIndex}][{cell.ColumnIndex}] has content '{cell.Content}'");
}
}
Python
# Make sure your document's type is included in the list of document types the custom model can analyze
with open("<path_to_document>", "rb") as f:
    poller = document_analysis_client.begin_analyze_document(
        model_id="<model_id>", document=f
    )
result = poller.result()
for idx, document in enumerate(result.documents):
    print("--------Analyzing document #{}--------".format(idx + 1))
    print("Document has type {}".format(document.doc_type))
    print("Document has confidence {}".format(document.confidence))
    print("Document was analyzed by model with ID {}".format(result.model_id))
    for name, field in document.fields.items():
        field_value = field.value if field.value else field.content
        print("......found field of type '{}' with value '{}' and with confidence {}".format(field.value_type, field_value, field.confidence))
for i, table in enumerate(result.tables):
    print("\nTable {} can be found on page:".format(i + 1))
    for region in table.bounding_regions:
        print("...{}".format(region.page_number))
    for cell in table.cells:
        print(
            "...Cell[{}][{}] has content '{}'".format(
                cell.row_index, cell.column_index, cell.content
            )
        )
Java
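// Path to a local file; beginAnalyzeDocumentFromUrl is the variant for documents at a URL.
String documentPath = "{document-path}";
String modelId = "{custom-built-model-ID}";
BinaryData documentData = BinaryData.fromFile(new File(documentPath).toPath());
SyncPoller<OperationResult, AnalyzeResult> analyzeDocumentPoller =
client.beginAnalyzeDocument(modelId, documentData);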
AnalyzeResult analyzeResult = analyzeDocumentPoller.getFinalResult();
for (int i = 0; i < analyzeResult.getDocuments().size(); i++) {
final AnalyzedDocument analyzedDocument = analyzeResult.getDocuments().get(i);
System.out.printf("----------- Analyzing custom document %d -----------%n", i);
System.out.printf("Analyzed document has doc type %s with confidence : %.2f%n",
analyzedDocument.getDocType(), analyzedDocument.getConfidence());
}
// tables
List<DocumentTable> tables = analyzeResult.getTables();
for (int i = 0; i < tables.size(); i++) {
DocumentTable documentTable = tables.get(i);
System.out.printf("Table %d has %d rows and %d columns.%n", i, documentTable.getRowCount(),
documentTable.getColumnCount());
documentTable.getCells().forEach(documentTableCell -> {
System.out.printf("Cell '%s', has row index %d and column index %d.%n",
documentTableCell.getContent(),
documentTableCell.getRowIndex(), documentTableCell.getColumnIndex());
});
System.out.println();
}
JS
const poller = await client.beginAnalyzeDocument(
"<model id>",
fs.createReadStream("<path to file>")
);
const { modelId, documents, tables, pages } = await poller.pollUntilDone();
console.log(`Results from model: "${modelId}"`);
console.log("Documents:");
for (const document of documents ?? []) {
console.log(
`- Document Type: ${document.docType} (confidence: ${document.confidence})`
);
console.log(" Fields:");
for (const [name, field] of Object.entries(document.fields)) {
console.log(
` - ${name} (${field.kind}): ${
field.value ?? "<unknown value>"
} (confidence: ${field.confidence})`
);
}
}
console.log("Tables:");
for (const table of tables ?? []) {
console.log(
`- Table (${table.rowCount}x${table.columnCount}, ${table.cells.length} cells):`
);
console.log(" Bounding Regions:");
for (const region of table.boundingRegions ?? []) {
const pageUnits = pages?.[region.pageNumber - 1]?.unit ?? "<unknown>";
console.log(
` - Page ${region.pageNumber} (unit: ${pageUnits}), [${region.polygon
?.map(({ x, y }) => `(${x},${y})`)
.join(",")}]`
);
}
console.log(" Cells:");
for (const cell of table.cells) {
console.log(
` - Cell (${cell.rowIndex},${cell.columnIndex}): "${cell.content}"`
);
}
}
Example: List the document models within your Form Recognizer resource
You can list the models that exist within your Form Recognizer resource. The list includes the prebuilt models that exist in the resource.
.NET
await foreach (DocumentModelSummary model in client.GetDocumentModelsAsync())
{
Console.WriteLine($"- Model ID: {model.ModelId}");
Console.WriteLine($" Description: {model.Description}");
Console.WriteLine($" Created on: {model.CreatedOn}");
}
Python
models = document_model_admin_client.list_document_models()
print("We have the following 'ready' models:")
print("Model ID | Description | Created on")
for model in models:
    print("{} | {} | {}".format(model.model_id, model.description, model.created_on))
Java
System.out.println("We have the following models in the account:");
client.listDocumentModels().forEach(documentModelInfo -> {
System.out.printf("Model ID: %s%n", documentModelInfo.getModelId());
System.out.printf("Model Description: %s%n", documentModelInfo.getDescription());
System.out.printf("Model Created on: %s%n", documentModelInfo.getCreatedOn());
});
JS
for await (const modelSummary of client.listDocumentModels()) {
console.log("- ID:", modelSummary.modelId);
console.log(" Created:", modelSummary.createdOn);
console.log(" Description: ", modelSummary.description || "<none>");
}
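Since the listing mixes prebuilt and custom models, you may want to separate the two. The Python sketch below splits model summaries by ID prefix. It assumes that prebuilt model IDs start with "prebuilt-", as the IDs used in this post do; the helper name split_models and the stand-in summaries are illustrative, and real summaries would come from list_document_models().

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Tuple


def split_models(models: Iterable[Any]) -> Tuple[List[str], List[str]]:
    """Split model summaries into (prebuilt_ids, custom_ids) by ID prefix."""
    prebuilt, custom = [], []
    for model in models:
        # Assumption: prebuilt model IDs start with "prebuilt-".
        target = prebuilt if model.model_id.startswith("prebuilt-") else custom
        target.append(model.model_id)
    return prebuilt, custom


# Stand-in summaries for illustration only.
sample = [
    SimpleNamespace(model_id="prebuilt-receipt"),
    SimpleNamespace(model_id="my-custom-model"),
    SimpleNamespace(model_id="prebuilt-document"),
]
print(split_models(sample))
```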
Summary
The Azure Form Recognizer SDKs have released new stable versions that feature two new clients, the DocumentAnalysisClient and the DocumentModelAdministrationClient. The new clients provide methods to use and interact with the latest stable API version of the Form Recognizer service. These SDKs aim to improve your interaction with the service. If you’re currently using the FormRecognizerClient or FormTrainingClient and want to migrate to the latest versions, take a look at the migration guides linked above.
Additionally, see the samples for each SDK below:
SDK | Samples |
---|---|
Azure Form Recognizer SDK – .NET | Samples |
Azure Form Recognizer SDK – Python | Samples |
Azure Form Recognizer SDK – Java | Samples |
Azure Form Recognizer SDK – JS | Samples |
Feedback
The Azure SDK team is excited for you to try the client libraries and encourages feedback and questions. Feel free to reach out with feedback, questions, and/or issues about the client library you’re using. The list below links to the repository for each Form Recognizer SDK: