Are you ready to revolutionize the way you optimize your AI models? Say hello to Olive (ONNX Live), the advanced model optimization tool that integrates seamlessly with DirectML for hardware acceleration across the breadth of the Windows ecosystem.
With Olive, you can easily incorporate cutting-edge techniques like model compression, optimization, and compilation, all in one powerful tool. And the best part? You don’t need to be an expert in optimizing models for underlying GPUs or NPUs – Olive does all the heavy lifting for you to get the best possible performance with DirectML!
In our Stable Diffusion tests, we saw an over 6x increase in image generation speed after optimizing with Olive for DirectML!
Olive and DirectML in Practice
The Olive workflow consists of configuring passes to optimize a model for one or more metrics. Olive then executes each pass to find the best candidate model. Our recommended passes for GPU optimization with DirectML are as follows:
Generic, non-transformer models:
- Convert to ONNX

Transformer models:
- Convert to ONNX
- OrtTransformersOptimization with the following parameters:
  - "use_gpu": true
  - "float16": true
  - "optimization_options": see the example below
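
To make this concrete, here is a minimal sketch of how those passes might be expressed as an Olive workflow config, written as a Python dict. The pass types (OnnxConversion, OrtTransformersOptimization) are Olive's built-in pass names; the model path and the specific fusion toggles under optimization_options are placeholders, and the exact config schema can vary between Olive versions, so treat this as a starting point rather than a drop-in config:

```python
# Sketch of an Olive workflow config for a transformer model targeting DirectML.
# "OnnxConversion" and "OrtTransformersOptimization" are Olive's built-in pass
# names; the model path and the fusion toggles are illustrative placeholders.
olive_config = {
    "input_model": {
        "type": "PyTorchModel",
        "config": {"model_path": "path/to/your/model"},  # placeholder path
    },
    "passes": {
        # Pass 1: convert the source model to ONNX.
        "conversion": {"type": "OnnxConversion"},
        # Pass 2: ONNX Runtime transformer optimizations, run on GPU in FP16.
        "transformers_optimization": {
            "type": "OrtTransformersOptimization",
            "config": {
                "use_gpu": True,
                "float16": True,
                # Illustrative fusion options; see the Stable Diffusion sample
                # in the Olive repository for values tuned to a real pipeline.
                "optimization_options": {
                    "enable_gelu": True,
                    "enable_layer_norm": True,
                },
            },
        },
    },
}
```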
For configuring multi-model pipelines (e.g., Stable Diffusion), see our sample in the Olive repository. To learn more about configuring Olive passes, visit: Configuring Pass — Olive documentation (microsoft.github.io).
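
With a config in hand, running the workflow is a single call. Here is a minimal sketch, assuming the olive.workflows.run entry point used in the Olive samples and a config like the one above saved as config.json:

```python
import json

# Olive's Python entry point; the CLI equivalent is:
#   python -m olive.workflows.run --config config.json
from olive.workflows import run as olive_run

# Load the workflow config and run it. Olive executes each configured pass
# in order and writes the best candidate model to the output directory.
with open("config.json") as f:
    config = json.load(f)

olive_run(config)
```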
With Olive, you’ll be able to take your AI models to the next level. Say goodbye to complicated optimization processes and hello to a streamlined, efficient workflow. To get started, check out our Olive & DirectML samples, and stay tuned for additional DirectML samples covering techniques like quantization.