{"id":16214,"date":"2025-05-23T01:23:18","date_gmt":"2025-05-23T08:23:18","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/ise\/?p=16214"},"modified":"2025-05-23T01:32:43","modified_gmt":"2025-05-23T08:32:43","slug":"durable-functions-for-rag-indexing","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/ise\/durable-functions-for-rag-indexing\/","title":{"rendered":"Durable Functions for Indexing in RAG: A Practical Python Approach"},"content":{"rendered":"<h1>Durable Functions for Indexing in RAG: A Practical Python Approach<\/h1>\n<p>Have you ever tried building an indexing pipeline for a <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/search\/retrieval-augmented-generation-overview\">Retrieval-Augmented\nGeneration<\/a> (RAG) app and struggled to\nchoose between <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/search\/search-what-is-data-import\">\u201cpush\u201d and \u201cpull\u201d<\/a>? Both have their\nadvantages, but each comes with its own challenges. In this article, we\u2019ll introduce a hybrid approach that combines the best of\nboth while keeping overhead low\u2014using <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/azure-functions\/durable\/durable-functions-overview\">Azure Durable\nFunctions<\/a> (in Python) to get the\nright balance. We&#8217;ll look at the strengths and weaknesses of existing methods before diving into how Durable Functions can\naddress common issues like scaling, state management, and retries\u2014without adding unnecessary complexity.<\/p>\n<h2>Setup, Deployment, and Prerequisites<\/h2>\n<p>Before you dive in, note that all setup details are provided in our <a href=\"https:\/\/github.com\/Azure-Samples\/indexadillo\">sample\nrepository<\/a>. 
We use <strong><a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/developer\/azure-developer-cli\/\">azd<\/a><\/strong> and <strong><a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/azure-resource-manager\/bicep\/overview?tabs=bicep\">Bicep<\/a><\/strong> to set up the entire\ninfrastructure\u2014including <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/search\/search-what-is-azure-search\">AI Search<\/a>, <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/document-intelligence\/overview?view=doc-intel-4.0.0\">Azure\nDocument Intelligence<\/a>,\nOpenAI embeddings, and more. A <a href=\"https:\/\/code.visualstudio.com\/docs\/devcontainers\/containers\">dev container<\/a> is provided to simplify the setup steps significantly, so you don\u2019t need to worry about specific package versions or\nmanual configurations.<\/p>\n<hr \/>\n<h2>Why Indexing Matters for RAG<\/h2>\n<p>Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by grounding responses in external enterprise\ndata. <strong>The accuracy of the generated output depends entirely on the quality of the retrieved data\u2014garbage in, garbage out.<\/strong>\nWhile prompt engineering can fine-tune model behavior, it cannot compensate for irrelevant, outdated, or poorly structured data\nin the index.<\/p>\n<p>That\u2019s why a <strong>robust indexing pipeline<\/strong> is critical. Instead of relying solely on prompt tuning, focus on ensuring <strong>the\nright data is retrieved at the right time<\/strong>. A well-structured RAG pipeline ensures that:<\/p>\n<ol>\n<li><strong>Enterprise documents<\/strong> (like PDFs or reports) are ingested properly.  <\/li>\n<li><strong>Content is extracted<\/strong> cleanly and structured for retrieval.  
<\/li>\n<li><strong>Chunks are meaningful and contextually relevant.<\/strong>  <\/li>\n<li><strong>Embeddings are high-quality, ensuring accurate retrieval.<\/strong>  <\/li>\n<li><strong>The index is continuously refined<\/strong> to reflect the latest ground truth.  <\/li>\n<\/ol>\n<p>This is where <strong>Azure Durable Functions<\/strong> come into play. By automating document ingestion, processing, and indexing, you ensure\nthe most <strong>relevant and up-to-date<\/strong> information is always available for retrieval. This leads to more reliable, fact-based\nresponses from the LLM, reducing the risk of hallucinations or outdated results.<\/p>\n<p>By investing in indexing, rather than overfitting prompts to imperfect data, you create a <strong>scalable, adaptable<\/strong> RAG system\nthat remains robust as your enterprise data evolves.<\/p>\n<hr \/>\n<h2>Push vs. Pull: The Usual Approaches<\/h2>\n<h3>Push Method<\/h3>\n<ul>\n<li><strong>How it works:<\/strong> You write code or scripts that directly send your documents to AI Search.<\/li>\n<li><strong>Pros:<\/strong>\n<ul>\n<li>Full control over the process.<\/li>\n<li>Ability to handle custom processing as needed.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Cons:<\/strong>\n<ul>\n<li>You must implement retries and error handling yourself.<\/li>\n<li>Scaling can be tricky if you have a large volume of documents.<\/li>\n<li>It can be hard to track the status of each document if something fails.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h3>Pull Method<\/h3>\n<ul>\n<li><strong>How it works:<\/strong> AI Search uses a built-in indexer that pulls documents from a data source (like Blob Storage) on a schedule\nor trigger.<\/li>\n<li><strong>Pros:<\/strong>\n<ul>\n<li>Automatic retries and production-ready features.<\/li>\n<li>Less code to write.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Cons:<\/strong>\n<ul>\n<li>Limited configuration options, which can be frustrating.<\/li>\n<li>Debugging is harder because logs can be 
minimal.<\/li>\n<li>Extending beyond the built-in features is challenging.<\/li>\n<li><a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/search\/search-limits-quotas-capacity#indexer-limits\">Service limits<\/a> that might\nconstrain what you can do.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>My team and I tried both approaches. Push sometimes feels too bare-bones, while pull can be overly rigid. So, we started wondering if there\u2019s a\nway to combine the flexibility of push with the production-ready capabilities of pull.<\/p>\n<hr \/>\n<h2>Enter Azure Durable Functions (in Python)<\/h2>\n<p>Durable Functions enable you to write stateful workflows in code. In Python, you can use the <code>azure-functions<\/code> and\n<code>azure-durable-functions<\/code> libraries to do things like:<\/p>\n<ul>\n<li><strong>Track State:<\/strong> Durable Functions remember where they left off for each document, so you know exactly what\u2019s happening.<\/li>\n<li><strong>Scale:<\/strong> They can fan out and process multiple documents in parallel, then fan back in when done.<\/li>\n<li><strong>Retry:<\/strong> If a step fails, you can retry it without losing your entire workflow.<\/li>\n<\/ul>\n<p>This approach brings together the flexibility of push (it\u2019s your own Python code) with many of the production-ready features you\nusually only get with the pull method. You can find an example of indexing with the push method using regular Azure Functions in\n<a href=\"https:\/\/devblogs.microsoft.com\/ise\/unlock-ai-search-potential-the-case-for-azure-functions-in-data-ingestion\/\">this previous\narticle<\/a>.<\/p>\n<hr \/>\n<h2>A Sample Python Workflow<\/h2>\n<blockquote>\n<p><strong>Note:<\/strong> The following code snippets are illustrative. They\u2019re meant to help you grasp the overall workflow, not serve as\nfully executable code.<\/p>\n<\/blockquote>\n<ol>\n<li>\n<p><strong>Trigger the Orchestrator<\/strong>\nAn HTTP call or a blob event can start the indexing. 
For example, uploading a new file to Blob Storage can trigger the\nworkflow automatically.<\/p>\n<\/li>\n<li>\n<p><strong>List Documents in Blob Storage<\/strong>\nThe orchestrator function lists the files in the relevant container in batches.<\/p>\n<blockquote>\n<p><strong>Important:<\/strong> Since Durable Functions replay your orchestrator code to maintain state consistency, ensure that this\nstep is replay-safe. In practice, do the listing in an activity function, whose recorded result is reused on replay, and make sure it returns the next batch of files consistently if the workflow is restarted.<\/p>\n<\/blockquote>\n<\/li>\n<li>\n<p><strong>Ensure the Index Exists<\/strong>\nCheck if your index is present in AI Search. If not, create it.<\/p>\n<\/li>\n<li>\n<p><strong>Fan Out to Index Each Document<\/strong>\nA \u201csub-orchestrator\u201d is launched for each document. This sub-orchestrator handles document cracking, chunking, embedding,\nand uploading to AI Search. Running them in parallel lets you process many documents quickly.<\/p>\n<\/li>\n<li>\n<p><strong>Handle Failures Gracefully<\/strong>\nIf one document fails, it won\u2019t bring down the entire pipeline. 
You\u2019ll see exactly which file failed and why, so you can\nretry it when ready.<\/p>\n<\/li>\n<\/ol>\n<h3>Python Code Snippet Highlights<\/h3>\n<p><strong>Orchestrator<\/strong> (Main function, reduced to show significant pieces):<\/p>\n<pre><code class=\"language-python\">from azure.durable_functions import DurableOrchestrationContext\nfrom application.app import app\nimport os\n\n@app.function_name(name=\"index\")\n@app.orchestration_trigger(context_name=\"context\")\ndef index(context: DurableOrchestrationContext):\n    while True:\n        # List blobs from Blob Storage in batches (ensure replay-safe code)\n        blob_list_result = yield context.call_activity(\"list_blobs_batch\")\n        if not blob_list_result[\"blob_names\"]:\n            break\n\n        # Ensure that the index exists\n        yield context.call_activity(\"ensure_index_exists\")\n\n        task_list = []\n        for blob_name in blob_list_result[\"blob_names\"]:\n            task_list.append(\n                context.call_sub_orchestrator(\n                    \"index_document\",\n                    {\"blob_url\": blob_name},\n                    instance_id=context.new_uuid()\n                )\n            )\n        yield context.task_all(task_list)<\/code><\/pre>\n<p><strong>Sub-Orchestrator:<\/strong><\/p>\n<pre><code class=\"language-python\">@app.function_name(name=\"index_document\")\n@app.orchestration_trigger(context_name=\"context\")\ndef index_document(context: DurableOrchestrationContext):\n    data = context.get_input()\n\n    # Document cracking (extract text)\n    document = yield context.call_activity(\"document_cracking\", data[\"blob_url\"])\n\n    # Chunking\n    chunks = yield context.call_activity(\"chunking\", document)\n\n    # Create embeddings\n    chunks_with_embeddings = yield context.call_activity(\"embedding\", chunks)\n\n    # Upload chunks and embeddings to AI Search\n    yield context.call_activity(\"add_documents\", {\n        \"chunks\": 
chunks_with_embeddings\n    })<\/code><\/pre>\n<p>Each activity (for example, <code>document_cracking<\/code> or <code>chunking<\/code>) is a separate function that handles one piece of work. Durable Functions\nautomatically track input and output for every step, so you can see the status of each document.<\/p>\n<h3>Visualizing the Workflow<\/h3>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/ise\/wp-content\/uploads\/sites\/55\/2025\/05\/workflow.png\" alt=\"Visual representation of the workflow\" \/><\/p>\n<h2>Seeing the State in Real Time<\/h2>\n<p>One of the biggest advantages of Durable Functions is built-in state tracking. When a run starts, it generates an instance ID,\nwhich you can use to check progress and see which steps have completed, which failed, and what\u2019s still in progress. You can\nretrieve this information through the <code>\/status\/:id<\/code> endpoint or directly from the Azure Storage account that serves as the persistence\nlayer for the durable function. 
This makes debugging much easier; if a chunking step fails on a specific file, you\u2019ll know\nexactly which file it was and why it failed.<\/p>\n<pre><code class=\"language-json\">{\n  \"name\": \"index_document\",\n  \"instanceId\": \"283f3d47-6b75-5f18-b6d9-5e6dg8caab59\",\n  \"createdTime\": \"2025-02-14T09:01:33.000000Z\",\n  \"lastUpdatedTime\": \"2025-02-14T09:02:10.000000Z\",\n  \"output\": null,\n  \"input\": \"\\\"{\\\\\\\"blob_url\\\\\\\": \\\\\\\"example.blob.core.windows.net\/source\/Agentic%20Frameworks%20Research.pdf\\\\\\\", \\\\\\\"index_name\\\\\\\": \\\\\\\"other-index\\\\\\\"}\\\"\",\n  \"runtimeStatus\": \"Completed\",\n  \"customStatus\": null,\n  \"history\": null,\n  \"historyEvents\": [\n    {\n      \"EventType\": \"ExecutionStarted\",\n      \"Input\": \"\\\"{\\\\\\\"blob_url\\\\\\\": \\\\\\\"example.blob.core.windows.net\/source\/Agentic%20Frameworks%20Research.pdf\\\\\\\", \\\\\\\"index_name\\\\\\\": \\\\\\\"other-index\\\\\\\"}\\\"\",\n      \"Timestamp\": \"2025-02-14T09:01:33.6014815Z\",\n      \"FunctionName\": \"index_document\"\n    },\n    {\n      \"EventType\": \"TaskCompleted\",\n      \"Timestamp\": \"2025-02-14T09:01:46.345134Z\",\n      \"FunctionName\": \"document_cracking\"\n    },\n    {\n      \"EventType\": \"TaskCompleted\",\n      \"Timestamp\": \"2025-02-14T09:01:47.8006035Z\",\n      \"FunctionName\": \"chunking\"\n    },\n    {\n      \"EventType\": \"TaskCompleted\",\n      \"Timestamp\": \"2025-02-14T09:01:51.468373Z\",\n      \"FunctionName\": \"embedding\"\n    },\n    {\n      \"EventType\": \"TaskCompleted\",\n      \"Timestamp\": \"2025-02-14T09:02:10.2548588Z\",\n      \"FunctionName\": \"add_documents\"\n    },\n    {\n      \"EventType\": \"ExecutionCompleted\",\n      \"OrchestrationStatus\": \"Completed\",\n      \"Timestamp\": \"2025-02-14T09:02:10.4357025Z\"\n    }\n  ]\n}<\/code><\/pre>\n<p>This JSON shows a successful run. 
Every step of every document is recorded and can be inspected without the overhead of an external\ntracking system. If you add or change steps in the workflow, the new steps are tracked automatically as well.<\/p>\n<hr \/>\n<h2>Pros and Cons of Using Durable Functions<\/h2>\n<h3>Pros<\/h3>\n<ul>\n<li><strong>Flexibility:<\/strong> Write Python code without strict configuration limitations.<\/li>\n<li><strong>State Management:<\/strong> Automatically stores the input\/output of each step, so you know what\u2019s happening.<\/li>\n<li><strong>Scalability:<\/strong> Fan out to handle multiple documents in parallel.<\/li>\n<li><strong>Retries:<\/strong> Built-in retry mechanisms mean you don\u2019t have to code extensive retry logic.<\/li>\n<\/ul>\n<h3>Cons<\/h3>\n<ul>\n<li><strong>Learning Curve:<\/strong> Durable Functions can be new to some teams, especially if you\u2019re used to simple Azure Functions.<\/li>\n<li><strong>Service Limits:<\/strong> If your document-cracking or embedding service has rate limits, it could still be overwhelmed if you scale too fast.<\/li>\n<li><strong>Extra Overhead:<\/strong> You need a function app and some storage for orchestration history.<\/li>\n<li><strong>Missing Integrations:<\/strong> More code is required to integrate other inputs (like SharePoint or databases) compared to the pull\nmethod.<\/li>\n<\/ul>\n<hr \/>\n<h2>Conclusion<\/h2>\n<p>If you\u2019re frustrated by the push or pull methods for indexing in a RAG application, Azure Durable Functions (in Python) might be\na breath of fresh air. You get clear visibility into each document\u2019s progress, scalability is built-in, and Durable Functions\nhandle the tricky parts of state management and retries. You still keep full control over how you chunk, embed, or otherwise\nprocess your documents, while benefiting from production-ready features.<\/p>\n<p>I\u2019ve shared a <a href=\"https:\/\/github.com\/Azure-Samples\/indexadillo\">sample repository<\/a> that demonstrates this approach. 
Feel free to\nclone it, explore, and adapt it to your own use case. It\u2019s still early, so if you spot any issues or have ideas for improvement,\nplease let me know. And if your scenario doesn\u2019t quite fit Durable Functions, there are many other ways to tackle indexing\u2014just\nreach out and we can chat.<\/p>\n<p><strong>Happy indexing!<\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Azure Durable Functions streamline RAG indexing by combining push flexibility with pull reliability for scalable,<\/p>\n","protected":false},"author":118100,"featured_media":16223,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1,3451],"tags":[3579,3599,3542,3553],"class_list":["post-16214","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cse","category-ise","tag-ai-search","tag-durable-functions","tag-llm","tag-rag"],"acf":[],"blog_post_summary":"<p>Azure Durable Functions streamline RAG indexing by combining push flexibility with pull reliability for 
scalable,<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/posts\/16214","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/users\/118100"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/comments?post=16214"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/posts\/16214\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/media\/16223"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/media?parent=16214"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/categories?post=16214"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/ise\/wp-json\/wp\/v2\/tags?post=16214"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}