{"id":125,"date":"2016-12-20T12:45:58","date_gmt":"2016-12-20T04:45:58","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/seteplia\/?p=125"},"modified":"2019-06-11T22:44:34","modified_gmt":"2019-06-12T05:44:34","slug":"dissecting-the-actionblock-a-short-story-about-a-nasty-deadlock","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/premier-developer\/dissecting-the-actionblock-a-short-story-about-a-nasty-deadlock\/","title":{"rendered":"Dissecting the ActionBlock: a Short Story About a Nasty Deadlock"},"content":{"rendered":"<p>I think almost every project in the real world uses some form of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Producer%E2%80%93consumer_problem\">producer-consumer queue<\/a>. The idea behind this problem is very simple. Application needs to decouple consumers of some data from the logic that processes it. Consider, for instance, the thread pool from the CLR: application can schedule some work using <b>ThreadPool.QueueUserWorkItem<\/b> and the thread pool will do its best to maximize application throughput by using optimal number of threads that will process the input data.<\/p>\n<p>But using the thread pool is not always possible and\/or appropriate. Even though you can control minimum and maximum number of threads in the thread pool, this configuration will affect the entire application but not some specific parts.<\/p>\n<p>There are numerous possible solutions to producer-consumer problem. You can implement entirely custom solution with the application logic intermixed with low level threading aspects. It can be something semi-custom, like a list of tasks that deals with shared <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/dd267312%28v=vs.110%29.aspx?f=255&amp;MSPPError=-2147217396\">BlockingCollection<\/a>. 
Or it could be a simple solution based on an existing component, like <a href="https://msdn.microsoft.com/en-us/library/hh194684(v=vs.110).aspx">ActionBlock&lt;T&gt;</a> from TPL Dataflow.</p>
<p>Today we're going to explore the internals of <b>ActionBlock</b>, discuss the design decisions its authors made, and learn why you need to know them in order to avoid some weird behavior. Ready? Let's go.</p>
<p>In my current project we have several cases where we need to solve the producer-consumer problem. Here is one of them: we have a parser and a custom interpreter for a TypeScript-like language. Without digging too deep into the details, you can assume that we just need to parse a set of files and build what is called a 'transitive closure' of all their dependencies.</p>
<p>Roughly, the logic is the following:</p>
<ol>
<li>Parse the file.</li>
<li>Analyze the file to understand what its dependencies are.</li>
<li>Resolve the dependencies (i.e. figure out which TypeScript files are required by this one by analyzing 'import * from', 'require', etc.).</li>
<li>Schedule all the dependencies for parsing.</li>
</ol>
<p>Pretty simple, right? Actually it is, and here you can see how the logic could be implemented using TPL Dataflow and <b>ActionBlock&lt;T&gt;</b>:</p>
<pre class="lang:default decode:true ">private static Task&lt;ParsedFile&gt; ParseFileAsync(string path)
{
    Console.WriteLine($"Parsing '{path}'. Thread Id - {Thread.CurrentThread.ManagedThreadId}");
    Thread.Sleep(10);

    return Task.FromResult(
        new ParsedFile()
        {
            FileName = path,
            Dependencies = GetFileDependencies(path),
        });
}

static void Main(string[] args)
{
    long numberOfProcessedFiles = 0;
    ActionBlock&lt;string&gt; actionBlock = null;
    Func&lt;string, Task&gt; processFile = async path =&gt;
    {
        Interlocked.Increment(ref numberOfProcessedFiles);
        ParsedFile parsedFile = await ParseFileAsync(path);

        foreach (var dependency in parsedFile.Dependencies)
        {
            Console.WriteLine($"Sending '{dependency}' to the queue... Thread Id - {Thread.CurrentThread.ManagedThreadId}");
            await actionBlock.SendAsync(dependency);
        }

        if (actionBlock.InputCount == 0)
        {
            // This is a marker that this is the last file and there
            // is nothing else to process
            actionBlock.Complete();
        }
    };

    actionBlock = new ActionBlock&lt;string&gt;(processFile);

    actionBlock.SendAsync("FooBar.ts").GetAwaiter().GetResult();
    Console.WriteLine("Waiting for the action block to finish...");

    actionBlock.Completion.GetAwaiter().GetResult();
    Console.WriteLine($"Done. Processed {numberOfProcessedFiles}");
    Console.ReadLine();
}</pre>
<p>Let's discuss what is going on here. For the sake of simplicity, all the logic resides in the <b>Main</b> method. <b>numberOfProcessedFiles</b> is used to check that the logic is correct and that we're not missing files because of race conditions. The main part is the <b>processFile</b> delegate that is provided to the <b>ActionBlock</b> during construction. 
This delegate acts as 'consumer' and 'producer' at the same time: it gets a <b>path</b> from the queue, parses the file, schedules new items by calling <b>actionBlock.SendAsync</b>, and then, if the input queue is empty, signals that all the files were processed by calling <b>actionBlock.Complete()</b> (*). The <b>Main</b> method then creates the <b>ActionBlock</b>, initiates the process by sending the first file, and waits for completion.</p>
<p><b>ParseFileAsync</b> fakes the parsing process and computes the dependencies using the following logic: file 'foo.ts' depends on 'fo.ts', which depends on 'f.ts', and so on. Basically, each file depends on the same file with a shorter name. This is a very naïve rule, but it helps to show the process.</p>
<p><b>ActionBlock</b> provides an easy-to-use API and handles concurrency for you. The only caveat here is that the default 'degree of parallelism' is 1, and you override the default by passing <a href="https://msdn.microsoft.com/en-us/library/system.threading.tasks.dataflow.executiondataflowblockoptions(v=vs.110).aspx"><b>ExecutionDataflowBlockOptions</b></a> during construction of the <b>ActionBlock</b>. If the <a href="https://msdn.microsoft.com/en-us/library/system.threading.tasks.dataflow.executiondataflowblockoptions.maxdegreeofparallelism(v=vs.110).aspx"><b>MaxDegreeOfParallelism</b></a> property is greater than 1, the <b>ActionBlock</b> will call the callback from different threads (actually, from different tasks) to process multiple incoming items at the same time.</p>
<h3>Post vs. SendAsync, or what to use when</h3>
<p>Everyone who has tried to solve the producer-consumer problem has faced a dilemma: what to do when the incoming flow exceeds the ability to process it? How do you 'throttle back'? Just keep every incoming object in memory and grow the queue indefinitely? Throw an exception? 
Return 'false'? Block the 'Enqueue' method while the queue is full?</p>
<p>To solve this problem, the <b>ActionBlock</b> authors decided to use the following well-known pattern:</p>
<ol>
<li>The client of an action block may provide a queue size (in the constructor).</li>
<li>When the queue is full, the <a href="https://msdn.microsoft.com/en-us/library/hh462696(v=vs.110).aspx"><b>Post</b></a> method returns <b>false</b>, and the <a href="https://msdn.microsoft.com/en-us/library/hh194681(v=vs.110).aspx"><b>SendAsync</b></a> method "blocks" until the queue gets a free spot (**).</li>
</ol>
<p>In our first example, the code doesn't specify any limit for the queue. This means that if the producer pushes really hard, the application can end up with an <b>OutOfMemoryException</b>. Let's change that. And for the sake of this example, let's set the queue size to some ridiculously low number, like … 1.</p>
<pre class="lang:default decode:true ">actionBlock = new ActionBlock&lt;string&gt;(processFile,
    new ExecutionDataflowBlockOptions() { BoundedCapacity = 1 });</pre>
<p>Now, if we run this code, we'll get … a deadlock!</p>
<p><a href="https://devblogs.microsoft.com/wp-content/uploads/sites/31/2019/06/image211.png"><img decoding="async" title="image" src="https://devblogs.microsoft.com/wp-content/uploads/sites/31/2019/06/image_thumb155.png" alt="image" width="640" height="160" border="0" /></a></p>
<h3>Deadlock</h3>
<p>Let's think about a producer-consumer queue from the design point of view. Suppose you're building your own custom implementation of a queue that takes a callback for processing elements. You need to decide whether the queue should be bounded or not. If the queue is bounded, you will end up with a set of methods similar to the one <b>ActionBlock</b> has: a synchronous API for adding elements that returns <b>false</b> when the queue is full, and an asynchronous version that returns a task. When the queue is full, a client of the queue can decide what to do: handle the 'overflow' manually using the synchronous version, or 'await' the task from the asynchronous version.</p>
<p>Then you need to decide when a given callback should be called. You may end up with the following logic: check the queue size; if the queue is not empty, pick an element from it, call the callback, wait for it to finish, and then remove the item from the queue. (A real implementation would be way more complicated, because it has to consider all the possible race conditions.) You may remove the item from the queue <i>before</i> calling the callback, but as we'll see in a moment, that won't eliminate the possibility of a deadlock.</p>
<p>This design is clean and simple, but it can cause a problem very easily. Suppose the queue is full and is calling back a function to process an element. 
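That stuck state is easy to observe with a real <b>ActionBlock</b>. The sketch below is my own illustration, not code from the post: instead of deadlocking forever, the callback gives its inner <b>SendAsync</b> one second and reports whether it completed. With <b>BoundedCapacity</b> = 1, the item currently being processed still occupies the block's only capacity slot, so the inner <b>SendAsync</b> cannot finish while the callback runs:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BoundedBlockDeadlockDemo
{
    // Returns true if SendAsync issued from inside the callback fails to
    // complete within the timeout -- the stuck state described above.
    public static bool SendAsyncGetsStuck()
    {
        ActionBlock<int> block = null;
        var stuck = new TaskCompletionSource<bool>();

        block = new ActionBlock<int>(async item =>
        {
            if (item != 1) return; // only the first item reschedules work

            // The item being processed still counts against the block's
            // single capacity slot, so this SendAsync cannot be accepted
            // until the callback completes. A real callback that awaited
            // it right here would never complete: a deadlock.
            Task<bool> send = block.SendAsync(item + 1);

            // Instead of deadlocking forever, wait one second and report.
            Task winner = await Task.WhenAny(send, Task.Delay(1000));
            stuck.TrySetResult(winner != send);
        },
        new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

        block.Post(1); // accepted: the queue was empty
        return stuck.Task.Result;
    }

    public static void Main() =>
        Console.WriteLine(SendAsyncGetsStuck()); // prints True
}
```

Once the callback returns, the slot frees up and the pending <b>SendAsync</b> completes, which is exactly why awaiting it inside the callback can never make progress.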
And instead of quickly processing the element, this callback tries to schedule another item by awaiting <b>SendAsync</b>:</p>
<p><a href="https://devblogs.microsoft.com/wp-content/uploads/sites/31/2019/06/image212.png"><img decoding="async" title="image" src="https://devblogs.microsoft.com/wp-content/uploads/sites/31/2019/06/image_thumb156.png" alt="image" width="640" height="466" border="0" /></a></p>
<p>The queue is full, and it can't accept a new element because the callback is still in progress. But the callback itself is stuck awaiting <b>SendAsync</b>. A classic deadlock!</p>
<p>OK. We're getting a deadlock because the <b>ActionBlock</b> removes elements only <i>after</i> the callback finishes. Let's consider an alternative: what if the <b>ActionBlock</b> removed an item <i>before</i> calling the callback?</p>
<p>Actually, a deadlock would still be possible. Consider an <b>ActionBlock</b> with a bounded size of 1 and a degree of parallelism of 2.</p>
<ul>
<li>Thread T1 adds an element to the queue. The <b>ActionBlock</b> removes the item and calls the callback.</li>
<li>Thread T2 adds an element to the queue. The <b>ActionBlock</b> removes the item and calls the callback.</li>
<li>Thread T1 adds another element to the queue. The <b>ActionBlock</b> can't call the callback, because the degree of parallelism is 2. <b>The queue is full</b>.</li>
<li>Callback 1 tries to add an item to the queue using '<b>await SendAsync</b>', but gets stuck because the queue is full.</li>
<li>Callback 2 tries to add an item to the queue using '<b>await SendAsync</b>', but gets stuck because the queue is full.</li>
</ul>
<p><a href="https://devblogs.microsoft.com/wp-content/uploads/sites/31/2019/06/image213.png"><img decoding="async" title="image" src="https://devblogs.microsoft.com/wp-content/uploads/sites/31/2019/06/image_thumb157.png" alt="image" width="576" height="480" border="0" /></a></p>
<p>This means that removing elements up front won't solve the problem. In fact, it arguably makes things worse: the probability of a deadlock becomes lower (now you need N callbacks that schedule additional work, where N is the degree of parallelism), so the bug is even harder to reproduce and debug. Another drawback of this approach is more subtle: <b>ActionBlock</b> is not a general-purpose producer-consumer queue. The class implements <b>ITargetBlock</b> and can be used in more complicated dataflows. For instance, you may link one <b>BufferBlock</b> to more than one target action block to process items in parallel. With the existing behavior, when an action block is full it won't accept more elements from the source, leaving other blocks in the chain the ability to process the item immediately instead.</p>
<p>If the item were removed from the queue before calling the callback, the effective 'size' of the queue would be 'BoundedCapacity' + 'MaxDegreeOfParallelism', which is way harder to reason about.</p>
<h3>How to solve the problem?</h3>
<p>UPDATE: in the original post, I proposed a solution based on calling <b>SendAsync</b> inside <b>Task.Run</b>. 
But that solution just uses a different TPL queue, and if that queue were full, the same issue would occur.</p>
<p>I don't think there is a solution to this problem. If you need to bound the queue's capacity and the callback can schedule more work, then the <b>ActionBlock</b> is just not the right tool. In that case, you need to give up the bounded capacity, or implement the producer-consumer pattern using different building blocks, for instance a <b>BlockingCollection</b> with a manually controlled set of workers.</p>
<h3>Degree of Parallelism</h3>
<p>Unlike the primitives from the TPL, all the blocks from TPL Dataflow are single-threaded by default. This means that an <b>ActionBlock</b> will process items one by one with one thread, a <b>TransformBlock</b> will transform items one by one with one thread, and so on. The reason is simplicity: it is much easier to reason about dataflow graphs when there is no concurrency involved.</p>
<p>To enable parallelism, you should provide an instance of <a href="https://msdn.microsoft.com/en-us/library/system.threading.tasks.dataflow.executiondataflowblockoptions(v=vs.110).aspx"><b>ExecutionDataflowBlockOptions</b></a> in the constructor with the <a href="https://msdn.microsoft.com/en-us/library/system.threading.tasks.dataflow.executiondataflowblockoptions.maxdegreeofparallelism(v=vs.110).aspx"><b>MaxDegreeOfParallelism</b></a> property set to a value greater than 1. By the way, setting this property to -1 enables 'unbounded' mode, in which the <b>ActionBlock</b> creates as many tasks as possible, their number limited only by the <a href="https://msdn.microsoft.com/en-us/library/system.threading.tasks.dataflow.dataflowblockoptions.taskscheduler(v=vs.110).aspx"><b>TaskScheduler</b></a> that you may also provide at construction time.</p>
<h3>Conclusion</h3>
<p>Designing an easy-to-use component is hard. 
Designing an easy-to-use component that deals with concurrency for you is even harder. The best way to use such a component correctly is to know how it is implemented and what restrictions its design team had in mind.</p>
<p><b>ActionBlock&lt;T&gt;</b> is a great type that drastically simplifies the most common producer-consumer scenarios. But even in this case, in order to use it correctly, you should know some key aspects of TPL Dataflow, like the default degree of parallelism, the behavior of bounded blocks, and the idea of work-item ownership.</p>
<hr />
<p>(*) This example is not 100% thread-safe, and a full-blown implementation should not rely on <b>actionBlock.InputCount</b>. Can you spot the issue?</p>
<p>(**) The <b>Post</b> method returns <b>false</b> in one of two cases: the queue is full, or the queue is completed. This can be confusing, because a single value indicates two different conditions. The <b>SendAsync</b> method, on the other hand, is different: the returned <b>Task&lt;bool&gt;</b> won't complete while the queue is full, and <b>task.Result</b> will be <b>false</b> if the queue can't accept elements anymore.</p>