At Inspire this year, we talked about how developers will be able to run Llama 2 on Windows with DirectML and the ONNX Runtime, and we've been hard at work making this a reality.
We now have a sample showing our progress with Llama 2 7B!
See https://github.com/microsoft/Olive/tree/main/examples/directml/llama_v2
This sample relies on ...
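As a rough sketch of what loading a model like this looks like, the snippet below creates an ONNX Runtime inference session that prefers the DirectML execution provider and falls back to CPU. The model path `llama2.onnx` and the `onnxruntime-directml` package are assumptions for illustration; the Olive sample linked above is the authoritative walkthrough.

```python
# Minimal sketch: load an exported Llama 2 ONNX model with DirectML.
# Assumptions (not from the sample): the model file "llama2.onnx" exists
# locally and onnxruntime-directml is installed.
import os

def pick_providers(available):
    """Prefer the DirectML execution provider when present, else CPU."""
    preferred = ["DmlExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]

if os.path.exists("llama2.onnx"):
    import onnxruntime as ort  # pip install onnxruntime-directml
    session = ort.InferenceSession(
        "llama2.onnx",
        providers=pick_providers(ort.get_available_providers()),
    )
```

On a machine without a DirectML-capable GPU, the same code silently runs on CPU, which keeps the sketch portable.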