Comprehensive Document Translation Solution

Sreedhar Mallangi

Richard Posada

Ted Shelton

Background and Use Cases

Many of our customers must translate documents from a variety of languages into a common language to accomplish their missions. These documents come in a variety of formats, with unique page layouts and styles, and they may contain images with embedded text that is essential to a complete understanding by the reader. In many scenarios, large numbers of documents must be translated quickly and securely.

Below are some common use cases for government organizations requiring document translation:

  • Intelligence and Security: Translating foreign documents and communications to monitor threats and understand global dynamics.
  • International Cooperation and Alliances: Translating treaties, agreements, and training materials in support of global military alliances.
  • Local Engagement and Stability Operations: Translation in support of humanitarian, disaster relief, and local engagement.
  • Technical and Equipment Manuals: Translation required to ensure correct use and maintenance of diverse technologies and equipment. While international support often includes financial and equipment aid, a significant challenge arises when equipment manuals are not in the recipient’s native language. This impedes the effective and timely use of the equipment, highlighting the critical need for document translation to ensure the success of missions.
  • Government Communications: Translating official communications, public service announcements, and information about public health, safety, and welfare ensures that all members of a diverse population have access to important information.
  • Immigration Services: Translating documents related to immigration, visas, and citizenship services helps streamline the process for both applicants and the authorities.

The Challenge

The native Azure Document Translation service is feature-rich: it can translate complex documents across a multitude of languages while preserving the original document structure and data format. However, it does not translate text embedded in images within digital documents. Text inside images is often critical to an accurate and complete understanding, which makes this a “must have” capability for our customers.

Our challenge was to combine the accuracy of translating digital, text-only documents with the completeness of translating scanned documents, where text inside images is translated as well.

The Solution

Our Comprehensive Document Translation Solution solves this problem through a “Hybrid Translation” approach. The Hybrid Translation process splits the digital PDF into two files. One file is a digital document that contains all the text-only pages. The other file is a scanned document that contains all the pages that have images, including images with embedded text. The solution then translates the two files separately. This gives us the most accurate translation and layout for the text-only digital pages and the completeness of scanned-document translation for the pages with images.
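The splitting step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the solution's actual code: the per-page flag `has_images` and the names `SplitPlan` and `plan_split` are hypothetical, and how a page is detected as containing images depends on the PDF library in use. The sketch only shows how page indices would be partitioned between the two files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitPlan:
    """Zero-based page indices assigned to each intermediate file."""
    digital_pages: list[int]  # text-only pages, kept as a digital PDF
    scanned_pages: list[int]  # pages with images, converted to a scanned PDF


def plan_split(has_images: list[bool]) -> SplitPlan:
    """Partition pages by whether they contain images.

    has_images[i] is a hypothetical flag for page i; computing it is
    outside the scope of this sketch.
    """
    digital = [i for i, flag in enumerate(has_images) if not flag]
    scanned = [i for i, flag in enumerate(has_images) if flag]
    return SplitPlan(digital_pages=digital, scanned_pages=scanned)
```

For a five-page document where pages 1, 3, and 4 contain images, `plan_split([False, True, False, True, True])` assigns pages 0 and 2 to the digital file and the rest to the scanned file. Keeping the original indices is what makes it possible to restore the page order later.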

After both files are translated, the solution “stitches” the complete document back together in the correct page order, taking each page from whichever translated file, digital or scanned, provides the best and most accurate translation of it.
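The stitching step amounts to an ordered merge keyed on the original page number. The sketch below is an assumption-laden illustration rather than the solution's implementation: `stitch_pages` is a hypothetical name, translated pages are represented by placeholder strings instead of real PDF page objects, and the sketch assumes every original page ends up in exactly one of the two translated files.

```python
def stitch_pages(
    digital: dict[int, str],
    scanned: dict[int, str],
    page_count: int,
) -> list[str]:
    """Reassemble translated pages in their original order.

    `digital` and `scanned` map an original zero-based page index to the
    translated page taken from the corresponding file.
    """
    # Assumption: the split is a partition, so no page appears in both files.
    overlap = digital.keys() & scanned.keys()
    if overlap:
        raise ValueError(f"pages present in both files: {sorted(overlap)}")

    merged = {**digital, **scanned}
    missing = [i for i in range(page_count) if i not in merged]
    if missing:
        raise ValueError(f"pages missing from both files: {missing}")

    return [merged[i] for i in range(page_count)]
```

Failing loudly on a missing or duplicated page is a design choice for the sketch: a silently dropped page would be hard to notice in a translated document.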

For flexible application, the solution provides these options:

  1. Scanned-Only Translation: Converts a document to a scanned version and translates that version to ensure a complete translation, including images with text.
  2. Hybrid Translation: Provides the best-quality, complete translation by combining digital page translation and scanned page translation as needed.
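The difference between the two options comes down to which pages are routed through the scanned path. The following sketch assumes a hypothetical `TranslationMode` enum and a hypothetical per-page `has_images` flag; neither name comes from the solution's code.

```python
from enum import Enum


class TranslationMode(Enum):
    SCANNED_ONLY = "scanned-only"
    HYBRID = "hybrid"


def pages_to_scan(mode: TranslationMode, has_images: list[bool]) -> list[int]:
    """Return the zero-based indices of pages translated as scanned pages."""
    if mode is TranslationMode.SCANNED_ONLY:
        # Every page is converted to a scanned page before translation.
        return list(range(len(has_images)))
    # Hybrid: only pages that contain images take the scanned path;
    # the remaining pages are translated as digital text.
    return [i for i, flag in enumerate(has_images) if flag]
```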

The Comprehensive Document Translation Solution is built on several Azure services and capabilities that allow for a fast, secure, and scalable translation process. Each service has its own pricing and scaling options, so the total cost of the solution depends on the options selected.

The core functionality of the solution is an Azure Functions application, consisting of three functions. The functions are written in Python and utilize open-source libraries for PDF conversion, splitting and processing. The documents, both original and translated, securely reside in Azure Storage and are only accessible to users and services with correctly configured access control. The language translation is provided by Azure AI Translator, a cloud-based neural machine translation service (part of the Azure AI family of services).

Taken together, the solution uses the following Azure services: Document Translation, Storage, Functions, and Event Grid.

Note: Many customers require secure network enclaves to translate their documents, and this solution integrates easily with most security policies and network architectures.

Supported File Types

The Comprehensive Document Translation Solution supports image files (BMP, PNG, JPG) and the file types documented here: https://learn.microsoft.com/en-us/azure/ai-services/translator/document-translation/overview#supported-document-formats

Image files are converted to PDF before translation.

Note: Currently, Microsoft Office documents (such as Word, Excel, and PowerPoint files) that contain images with embedded text must be converted to PDF before they are loaded into the solution.
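The image-to-PDF rule can be expressed as a simple extension check. This is a hedged sketch only: `needs_pdf_conversion` and `IMAGE_EXTENSIONS` are hypothetical names, and the extension set contains just the three image formats named above. Office documents are not handled here because, as noted, they must be converted to PDF before being loaded.

```python
from pathlib import Path

# Image formats listed as supported; variants such as ".jpeg" are not
# confirmed by the source and are deliberately left out.
IMAGE_EXTENSIONS = {".bmp", ".png", ".jpg"}


def needs_pdf_conversion(filename: str) -> bool:
    """Return True if the file is an image that is converted to PDF first."""
    return Path(filename).suffix.lower() in IMAGE_EXTENSIONS
```

With this check, `needs_pdf_conversion("scan.PNG")` is true, while a file that is already a PDF, such as `manual.pdf`, skips the conversion step.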

Code Repository

The Comprehensive Document Translation Solution is an open-source project made available by the US Regulated Industries of Microsoft. The source code and instructions are available here: GitHub Repository

Additional Contributors: Jose Alanis, Joshua Donnelly, Elliott Fields, Krishna Doss Mohan, Krishnakumar Muthukrishnan
