Built-in model pre-processing with ONNX

Craig Dunn

Hello Android developers,

Previously we looked at how to pre-process image inputs for an ONNX model using Kotlin. It’s useful to understand this process because the principles apply to any model that you wish to use. On the other hand, having to write boilerplate code for input processing can be tedious – it also means there’s more code that could be buggy and require testing.
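
For context, here's a condensed sketch of the kind of manual pre-processing boilerplate that approach involves: resizing a bitmap and packing normalized RGB values into a float tensor. The bitmapToTensor name, the 224x224 size, and the [0, 1] normalization are illustrative assumptions, not the exact code from that post:

import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import android.graphics.Bitmap
import java.nio.FloatBuffer

// Manual pre-processing boilerplate: resize a Bitmap and convert it
// to a normalized NCHW float tensor (assumes a 224x224 model input)
fun bitmapToTensor(bitmap: Bitmap, ortEnv: OrtEnvironment): OnnxTensor {
    val size = 224
    val resized = Bitmap.createScaledBitmap(bitmap, size, size, true)
    val pixels = IntArray(size * size)
    resized.getPixels(pixels, 0, size, 0, 0, size, size)
    val buffer = FloatBuffer.allocate(3 * size * size)
    // write one channel plane at a time (R, then G, then B), scaled to [0, 1]
    for (channel in 0 until 3) {
        val shift = 16 - channel * 8 // ARGB int packing: R=16, G=8, B=0
        for (pixel in pixels) {
            buffer.put(((pixel shr shift) and 0xFF) / 255f)
        }
    }
    buffer.rewind()
    return OnnxTensor.createTensor(ortEnv, buffer, longArrayOf(1, 3, size.toLong(), size.toLong()))
}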

The ONNXRuntime-Extensions project simplifies this by including custom operators for common pre- and post-processing of model inputs. This means you can build, test, and deploy models that include the necessary input pre-processing and output post-processing, making it even easier to incorporate into application projects like Android.

The super resolution sample demonstrates how to add pre- and post-processing to a model and run it with just a few lines of code.

Add pre-processing to the model

Instead of having to write the input image resizing and other manipulations in Kotlin, the pre- and post-processing steps can be added to the model itself, using operators available in newer versions of the ONNX runtime (e.g. ORT 1.14/opset 18) or in the onnxruntime_extensions library.

The super-resolution sample is a great demonstration of this feature as both the input and output are images, so including the processing steps in the model greatly reduces platform-specific code.

The steps to add the pre- and post-processing to the super-resolution model are in the Prepare the model section of the sample documentation. The Python script that creates the updated model can be viewed at superresolution_e2e.py.

The superresolution_e2e script is explained step-by-step on pytorch.org.

When you run the script as instructed, it produces two models in ONNX format: the basic pytorch_superresolution.onnx model, and a second version that includes the additional processing, pytorch_superresolution_with_pre_and_post_processing.onnx. The second model, with the processing steps built in, can be called from an Android app with fewer lines of code than the previous sample we discussed.

Comparing the models

Before we dive into the Android sample app, the difference between the two models can be observed using the online tool Netron.app. Netron helps to visualize various neural network, deep learning and machine learning models.

The first model – requiring native pre-processing and post-processing of the input and output – shows the following:

Figure 1: superresolution model converted to ONNX and viewed in Netron

This is a relatively simple model. Understanding the graph isn't important for now; what matters is comparing it with the model that includes the pre- and post-processing operations, shown below:

Figure 2: superresolution model with input and output processing, converted to ONNX and viewed in Netron

The additional operations that format and resize the image bytes are chained before and after the original model. Once again, the details of the graph aren't important; it's included here to illustrate that extra operations have been packaged into the model so it can be consumed with less native code on any supported platform.

Integrating with Android

The Android sample that hosts this model is available on GitHub. The initialization step in MainActivity.kt is similar to the image classifier sample, with the addition of the sessionOptions object, which registers the ONNX Runtime Extensions with the session. Without this registration, the session can't resolve the custom pre- and post-processing operators in the updated model and will fail to load it.
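
To make the extensions available, the app needs the onnxruntime-extensions Android package alongside the runtime itself. The Gradle (Kotlin DSL) coordinates below reflect the published Android packages, but the versions are placeholders; check the sample's build.gradle for the exact dependencies:

  dependencies {
      // versions are illustrative placeholders; see the sample for current ones
      implementation("com.microsoft.onnxruntime:onnxruntime-android:latest.release")
      implementation("com.microsoft.onnxruntime:onnxruntime-extensions-android:latest.release")
  }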

This code snippet highlights the key lines for creating the ONNX runtime environment class:

  // property on MainActivity: create the ONNX Runtime environment
  val ortEnv: OrtEnvironment = OrtEnvironment.getEnvironment()

  // in onCreate: register the extensions' custom operators, then create the session
  val sessionOptions: OrtSession.SessionOptions = OrtSession.SessionOptions()
  sessionOptions.registerCustomOpLibrary(OrtxPackage.getLibraryPath())
  ortSession = ortEnv.createSession(readModel(), sessionOptions) // the model is in raw resources

  // in performSuperResolution: run the model against the input image
  val superResPerformer = SuperResPerformer()
  val result = superResPerformer.upscale(readInputImage(), ortEnv, ortSession)
  // result.outputBitmap contains the output image!
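
The readModel() and readInputImage() helpers aren't shown in the snippet above. As a minimal sketch of readModel(), assuming the .onnx file is bundled under res/raw (the resource name here is a hypothetical placeholder):

  // Inside MainActivity: load the model bytes from a raw resource
  // (R.raw.super_resolution_model is a placeholder name; use whatever
  // name the .onnx file has under res/raw in your project)
  private fun readModel(): ByteArray {
      return resources.openRawResource(R.raw.super_resolution_model).readBytes()
  }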

The superResPerformer.upscale function is shown below.

The model-running code in SuperResPerformer.kt is much shorter than in the image classifier sample, because no native processing is required (the ImageUtil.kt helper class is no longer needed). Raw image bytes are used to create the input tensor, and image bytes are emitted as the result. ortSession.run is the key function: it takes the inputTensor and returns the resulting upscaled image:

import ai.onnxruntime.OnnxJavaType
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtSession
import java.io.InputStream
import java.nio.ByteBuffer
import java.util.Collections

fun upscale(inputStream: InputStream, ortEnv: OrtEnvironment, ortSession: OrtSession): Result {
    val result = Result()
    // Step 1: read the raw (encoded) image bytes from the input stream
    val rawImageBytes = inputStream.readBytes()
    // Step 2: wrap the bytes in a 1-D uint8 ORT tensor
    val shape = longArrayOf(rawImageBytes.size.toLong())
    val inputTensor = OnnxTensor.createTensor(
        ortEnv,
        ByteBuffer.wrap(rawImageBytes),
        shape,
        OnnxJavaType.UINT8
    )
    inputTensor.use {
        // Step 3: run the inference session; pre- and post-processing
        // happen inside the model itself
        val output = ortSession.run(Collections.singletonMap("image", inputTensor))
        // Step 4: the first output holds the encoded upscaled image bytes
        output.use {
            val rawOutput = output.get(0).value as ByteArray
            val outputImageBitmap = byteArrayToBitmap(rawOutput)
            // Step 5: set output result
            result.outputBitmap = outputImageBitmap
        }
    }
    return result
}
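
The byteArrayToBitmap helper converts the model's output into a displayable Bitmap. Because the post-processing baked into the model emits encoded image bytes, a minimal sketch only needs Android's BitmapFactory (shown here as an assumption of how the helper works, not the sample's exact code):

import android.graphics.Bitmap
import android.graphics.BitmapFactory

// Decode the encoded image bytes emitted by the model into a Bitmap
private fun byteArrayToBitmap(data: ByteArray): Bitmap {
    return BitmapFactory.decodeByteArray(data, 0, data.size)
}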

A screenshot from the Android super resolution demo is shown below:


Figure 3: Android demo “superresolution” that improves resolution of an image

This particular upscaling model is relatively simple, but the concepts behind packaging models to make them easier to consume can be applied to other models with pre- and post-processing requirements.

Resources and feedback

More information about the ONNX Runtime is available at onnxruntime.ai and also on YouTube. You can read more about Netron.app as well as find the source and downloads on GitHub.

If you have any questions about applying machine learning, or would like to tell us about your apps, use the feedback forum or message us on Twitter @surfaceduodev.

There won’t be a livestream this week, but check out the archives on YouTube. We’ll see you online again soon!
