October 2nd, 2011

Don’t Forget To Complete Your Tasks

Stephen Toub - MSFT
Partner Software Engineer

“Don’t forget to complete your tasks.”  That guidance may sound trivial and silly, but I recently saw it as a source of a bug in software written by some very smart folks, and thus thought this would be a good opportunity to remind folks of the imperative.

Tasks represent a promise.  If you hand one out, someone else may  wait for that task to complete before doing some subsequent action.  It doesn’t matter whether they wait synchronously (e.g. with Wait) or asynchronously (e.g. with ContinueWith or await)… that consumer is expecting the task to eventually complete, and it’s your duty to ensure that happens.

Of course, in the majority of cases, this is done automatically for you.  When you use Task.Run, the implementation of Run ensures the returned task will be completed, typically when the supplied delegate finishes its execution.  When you use Task.Factory.FromAsync, the implementation of FromAsync ensures the returned task will be completed once the wrapped operation completes.  When you use the new async keyword to write an async method, the compiler-generated code ensures the task is completed when the method’s execution completes.  And so on.  For the most part, you don’t need to be concerned with this guidance, because the tasks you rely on are generated by code you don’t own, and it’s that code’s responsibility to eventually complete the task it gave to you.

There are cases, however, where it is your responsibility as a developer.  Most importantly, the TaskCompletionSource<TResult> type enables you as a developer control over a task’s lifecycle.  It’s then up to you as the developer to ensure that the task is completed by using one of the Set* or TrySet* methods on the completion source.  As a trivial example, consider the following (buggy) method:

static void BuggyMethod()
{
    var tcs = new TaskCompletionSource<bool>();
    var t = tcs.Task;
    t.Wait();
    Console.WriteLine(“Will never get here.”);
}

This method will never complete.  It will wait forever for the task ‘t’ to complete, but ‘t’ never will, since the only way to complete the task is via the TaskCompletionSource<bool> that’s local to this method, and obviously no one’s calling any of its completion methods.  The same problem manifests of course for async methods:

static async Task BuggyMethodAsync()
{
    var tcs = new TaskCompletionSource<bool>();
    var t = tcs.Task;
    await t;
    Console.WriteLine(“Will never get here.”);
}

This will behave just as it was coded to, asynchronously waiting forever for the task to complete.

Now, you might look at these examples and think “man, what silly examples, why would anyone ever do that?”.  I recently came across an example that’s only slightly more complicated.  Consider a serial work queue like the following naïve implementation:

static class Worker {
    static readonly BlockingCollection<Action> m_actions =
        new BlockingCollection<Action>();

    static Worker() {
        Task.Run(() =>
        {
            foreach (var action in m_actions.GetConsumingEnumerable())
            {
                try { action(); } catch (Exception e) { Debug.WriteLine(e); }
            }
        });
    }

    public static void Enqueue(Action action) {
        m_actions.Add(action);
    }

    public static void Clear() {
        Action dumped;
        while (m_actions.TryTake(out dumped)) ;
    }
}

The idea here is that we have a dedicated worker which sits in a loop pulling delegates off of a queue and executing them serially one after the other.  The type also supports emptying the queue, for cases where you no longer care about previously queued work.  Now, imagine our previous silly example was expanded slightly to use this Worker:

static async Task BuggyMethodAsync()
{
    var tcs = new TaskCompletionSource<bool>();
    var t = tcs.Task;
    Worker.Enqueue(() =>
    {
        … // do work here
        tcs.SetResult(true);
    });
    await t;
    Console.WriteLine(“May never get here.”);
}

Here we’re queuing some work that’ll be executing in the worker, and we’re waiting for that work to complete.  Unfortunately, if someone else comes along and decides to Clear the work queue, our TaskCompletionSource will never complete, and we’ll be right back in the situation of waiting forever.  This of course makes sense, and the issues stems from a fundamental disconnect between code that was queuing work and expecting every queued action to eventually run, and code which was clearing the queue and expecting that it no longer cared about those old items and no one else should either.  Really what should have happened here is that the clearing code should have ensured the TaskCompletionSource<bool> got canceled, such as by the Worker.Enqueue method accepting two delegates, one for the Action to run, and one for a cancellation notification callback. The calling method could have supplied a cancellation callback which would invoke TrySetCanceled on the completion source.

So, the moral of this story is, always complete your tasks.  Unless you explicitly know otherwise, you need to assume that someone is waiting on the task your holding.  And if you don’t complete it, they could end up waiting for a very, very long time.

Author

Stephen Toub - MSFT
Partner Software Engineer

Stephen Toub is a developer on the .NET team at Microsoft.

3 comments

Discussion is closed. Login to edit/delete existing comments.

  • Code DJ

    Hi Stephen, I am hitting this scenario with .net core where sometimes the tasks never complete. The tool is scheduled to run nightly. On most days it completes as expected but on somedays it continues to run forever. Exactly the scenario you have listed above. Can you please take a look and tell me what I am doing wrong? Thank you. https://gist.github.com/Code-DJ/8eb3f9c2f80a4fe9f4585099c9b9ca23

    Read more
    • Stephen Toub

      There are no tasks anywhere in that example.

      • Code DJ

        Oops sorry. I used to have it with Tasks which had the same “occasionally runs forever” issue so I changed it to Thread. I debugged by giving it trivial work, the example works with multiple threads so the problem is elsewhere. Thank you for looking at the code.