Building an Image Classification Pipeline Using Serverless Architecture
The International Committee of the Red Cross (ICRC) Trace-the-Face program uses photos to match up missing family members who have been separated due to migration and conflict. We have been working with the ICRC to update and extend Trace-the-Face to perform automatic face detection and matching using machine learning, via Microsoft Cognitive Services Face API.
As we discussed how Trace-the-Face could use computer vision, the ICRC grew increasingly interested and drew up a long list of challenges that computer vision might help solve. Through a series of workshops and discussions, we designed a centralized image classification pipeline that could be integrated across the organization. Over time, the pipeline would build up a rich, tagged image set (including faces found in any image from any source), with the intent of addressing many of these computer vision challenges.
The choice to go serverless gave us the flexibility to add or remove functions as the pipeline grows. The initial focus was general image classification, face detection, and face matching. We also plan to incorporate custom image classification and object detection built with ML frameworks like CNTK and TensorFlow.
Our architecture uses message queuing to move images along the pipeline. Each message contains the information needed to move the image to the next step, including a link to the image blob and the collected properties about the image classification.
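To make the message contract more concrete, here is a minimal sketch of what one queue message might look like, built with System.Text.Json. Only `job_definition` and `image_parameters` appear in the function code in this post; the other property names (`image_url`, `pipeline`, `step`, `classifications`) are assumptions for illustration and may not match the project's actual schema.

```csharp
using System;
using System.Text.Json.Nodes;

public static class MessageSketch
{
    // Builds a hypothetical pipeline message. Property names other than
    // job_definition and image_parameters are illustrative assumptions.
    public static JsonObject CreateMessage(string imageUrl, string[] pipeline)
    {
        var steps = new JsonArray();
        foreach (var queueName in pipeline)
        {
            steps.Add(queueName);
        }

        return new JsonObject
        {
            ["job_definition"] = new JsonObject
            {
                ["image_url"] = imageUrl,                 // link to the image blob
                ["pipeline"] = steps,                     // ordered input queue names
                ["step"] = 0,                             // index of the current step
                ["image_parameters"] = new JsonObject()   // per-image options
            },
            // Each step would add its results here as the message moves along.
            ["classifications"] = new JsonObject()
        };
    }
}
```

Calling `MessageSketch.CreateMessage(url, queues).ToJsonString()` would produce the string that gets placed on the first queue.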
Each step in the pipeline is an Azure Function; building the pipeline involved creating each Azure Function, and then adding its input queue name into the pipeline definition.
If you set the steps post-deployment, you can simply edit them in your Function App’s environment variables.
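One plausible way to keep the pipeline definition in an environment variable is as a comma-separated list of input queue names. The sketch below assumes that format and a hypothetical setting name, `PIPELINE_STEPS`; the project's actual setting name and format may differ.

```csharp
using System;
using System.Linq;

public static class PipelineDefinition
{
    // Hypothetical setting name, used here only for illustration.
    public const string SettingName = "PIPELINE_STEPS";

    // Parses a comma-separated list of queue names, ignoring blanks and whitespace.
    public static string[] Parse(string definition)
    {
        return (definition ?? string.Empty)
            .Split(',')
            .Select(s => s.Trim())
            .Where(s => s.Length > 0)
            .ToArray();
    }

    // Returns the queue that follows currentQueue, or null when currentQueue
    // is the last step or is not part of the pipeline.
    public static string NextQueue(string[] steps, string currentQueue)
    {
        int index = Array.IndexOf(steps, currentQueue);
        return index >= 0 && index < steps.Length - 1 ? steps[index + 1] : null;
    }

    // Reads the definition from the Function App's environment variables.
    public static string[] Load()
    {
        return Parse(Environment.GetEnvironmentVariable(SettingName));
    }
}
```

With this layout, reordering or removing a step only requires editing the setting's value; no function code changes.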
private const string ClassifierName = "generalclassification";
// Queue-triggered entry point: hands the message to the shared pipeline helper.
public static void Run(string inputMsg, TraceWriter log)
{
    PipelineHelper.Process(GeneralClassificationFunction, ClassifierName, inputMsg, log);
}
public static dynamic GeneralClassificationFunction(dynamic inputJson, string imageUrl, TraceWriter log)
{
    var parameters = inputJson.job_definition.image_parameters;
    var response = ComputerVisionFunctions.AnalyzeImageAsync(imageUrl, log).Result;
    // ... (the rest of the method is not shown in this excerpt)
}
The code includes calls to helper methods that allow the function to take part in the pipeline in a well-defined way. They ensure adherence to the message contract and then advance the message to the next defined queue. These methods are defined in the GitHub repo, if you want to see what they do in the context of the pipeline.
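As a rough illustration of what such a helper might do, the sketch below validates a message, runs one classifier, records its result, and forwards the message. It is not the project's `PipelineHelper`: the `image_url` and `classifications` field names, the `nextQueue` parameter, and the `enqueue` delegate (standing in for a real queue client) are all assumptions made to keep the example self-contained.

```csharp
using System;
using System.Text.Json.Nodes;

public static class PipelineHelperSketch
{
    // Hypothetical stand-in for the project's helper, not its actual code.
    public static void Process(
        Func<JsonNode, string, JsonNode> classifier,
        string classifierName,
        string inputMsg,
        string nextQueue,
        Action<string, string> enqueue)
    {
        JsonNode message = JsonNode.Parse(inputMsg);

        // Contract check: the message must carry a link to the image (assumed field name).
        string imageUrl = message?["job_definition"]?["image_url"]?.GetValue<string>();
        if (string.IsNullOrEmpty(imageUrl))
        {
            throw new ArgumentException("Message is missing job_definition.image_url.");
        }

        // Run this step's classifier against the image.
        JsonNode result = classifier(message, imageUrl);

        // Store the result under the classifier's name, creating the container if needed.
        var results = message["classifications"] as JsonObject;
        if (results == null)
        {
            results = new JsonObject();
            message["classifications"] = results;
        }
        results[classifierName] = result;

        // Advance the message to the next queue, if there is one.
        if (nextQueue != null)
        {
            enqueue(nextQueue, message.ToJsonString());
        }
    }
}
```

Passing the queue write in as a delegate keeps the sketch free of any storage SDK and makes the forwarding step easy to exercise in a unit test.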
The first parameter of the function’s Run method, string inputMsg, is bound to an Azure Functions QueueTrigger. The framework watches the Azure Storage Queue named in the function.json file and triggers the function when a new message arrives.
Here is a sample of a function definition for a QueueTrigger from the documentation:
{
    "name": "<The name used to identify the trigger data in your code>",
    "queueName": "<Name of queue to poll>",
    "connection": "<Name of app setting - see below>"
}
To add a custom step to the pipeline:
- Create a new Azure Storage Queue in the attached Storage Account for your Function App
- Add a new function to the Function App with a QueueTrigger
- Set the name field to “inputMsg” to match our Run method
- Set the queueName field to the name of the input queue that you created in the first step
- Set the connection field to the name of the app setting, defined in appsettings.json, that holds the Storage Account connection string
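The binding settings from the steps above can be sketched as a small generator that produces a function.json body. The `type` and `direction` values are the standard ones for a queue trigger; the queue name `customstep` and the setting name `AzureWebJobsStorage` in the usage note are placeholder assumptions, so substitute your own values.

```csharp
using System.Text.Json;
using System.Text.Json.Nodes;

public static class FunctionJsonSketch
{
    // Builds the contents of a function.json with a single queue trigger binding.
    public static string Build(string queueName, string connectionSetting)
    {
        var binding = new JsonObject
        {
            ["type"] = "queueTrigger",
            ["direction"] = "in",
            ["name"] = "inputMsg",              // must match the Run method's parameter name
            ["queueName"] = queueName,          // the input queue created for this step
            ["connection"] = connectionSetting  // app setting that holds the connection string
        };

        var root = new JsonObject
        {
            ["bindings"] = new JsonArray(binding)
        };

        return root.ToJsonString(new JsonSerializerOptions { WriteIndented = true });
    }
}
```

For example, `FunctionJsonSketch.Build("customstep", "AzureWebJobsStorage")` returns JSON you could compare against the function.json of the new step.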
Keep in mind that you can perform any action on an image in this step of the pipeline. For example, we’ve had partners use a step to combine multiple images into a grid for submission to the Cognitive Services Face API in order to save on API calls.
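If you try the grid approach, you need to place each image in the grid and later map any detected face back to the image it came from. The sketch below shows only that layout arithmetic, assuming square tiles of a fixed size laid out row by row; the class and method names are hypothetical, and the actual image composition and the Face API call are left out.

```csharp
using System;

public static class ImageGridSketch
{
    // Number of columns for a near-square grid holding imageCount tiles.
    public static int Columns(int imageCount)
    {
        if (imageCount <= 0) throw new ArgumentOutOfRangeException(nameof(imageCount));
        return (int)Math.Ceiling(Math.Sqrt(imageCount));
    }

    // Top-left pixel offset of the tile at the given index, filling rows left to right.
    public static (int X, int Y) TileOrigin(int index, int columns, int tileSize)
    {
        return ((index % columns) * tileSize, (index / columns) * tileSize);
    }

    // Maps a point in the combined image (for example, the top-left corner of a
    // detected face rectangle) back to the index of the tile that contains it.
    public static int TileIndexAt(int x, int y, int columns, int tileSize)
    {
        return (y / tileSize) * columns + (x / tileSize);
    }
}
```

A face that straddles a tile boundary would need extra handling, so padding between tiles may be worth considering.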
Through our collaboration with the ICRC, we created Imaginem, a tidy, reusable framework for processing images using message queues. It detects faces, searches for similar or matching faces, identifies objects, and extracts text from images. The results can be read from the JSON object or retrieved from the SQL database.
Imaginem could be reused in many scenarios that require facial recognition and matching. We welcome your feedback and PRs!
Note: This code was built using preview tools for Azure Functions in Visual Studio 2015. The Azure Functions team recommends using Visual Studio 2017.