May 2, 2025
Running RAG with ONNX Runtime GenAI for On-Prem Windows
Exploring how to run a RAG pipeline efficiently with small language models (SLMs) and guardrails on Windows, with inference completing in under 5 seconds using ONNX Runtime GenAI.