Psychic Debugging of Async Methods

Stephen Toub - MSFT

These days it’s not uncommon for me to receive an email or read a forum post from someone concerned about a problem they’re experiencing with an async method they’ve written, and they’re seeking help debugging the issue.  Sometimes plenty of information about the bug is conveyed, but other times the communication is void of anything more than a root problem statement.  That’s when I engage my powers of psychic debugging to suggest what the root cause might be, without actually knowing more about the developer’s codebase. 

Here are four of the more common issues I’ve heard raised, along with their likely culprits.  If you experience one of these problems, look at one of these causes first, as there’s a very good chance it’s to blame.

1. “I converted my synchronous method to be asynchronous using ‘async’, but it still runs synchronously.”

As explained in the Async/Await FAQ, marking a method as ‘async’ does not force the method to run asynchronously, e.g. it doesn’t queue a work item to run the method, it doesn’t spawn a new thread to run the method, etc.  Marking a method as ‘async’ really just tells the compiler to allow usage of ‘await’ inside the body of the method, and to handle completion/results/exceptions from the method in a special way (i.e. putting them into a task that’s returned from the method call).  When you invoke a method that’s marked as ‘async’, the method is still invoked synchronously, and it continues to run synchronously until either a) the method completes, or b) the method awaits an awaitable (e.g. a task) that’s not yet complete.  If the method doesn’t contain any awaits (which will generate a compiler warning) or if it only ever awaits awaitables that are already completed, the method will run to completion synchronously, and the task returned to the caller will be completed by the time it’s handed back.  This is by design.
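To make that concrete, here is a minimal sketch (the class and method names are hypothetical, not from any library) showing that an async method with no awaits, or one that only awaits an already-completed task, runs on the caller's thread and hands back a task that is already complete:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SyncAsyncDemo
{
    // Hypothetical method: marked async but contains no awaits
    // (the compiler reports warning CS1998 for this).
    public static async Task<int> NoAwaitAsync()
    {
        Thread.Sleep(100); // stands in for synchronous work; blocks the caller
        return Environment.CurrentManagedThreadId;
    }

    // Hypothetical method: only awaits a task that is already completed.
    public static async Task<int> CompletedAwaitAsync()
    {
        await Task.FromResult(0); // already complete, so the method does not yield
        return Environment.CurrentManagedThreadId;
    }

    public static void Main()
    {
        int caller = Environment.CurrentManagedThreadId;

        Task<int> t1 = NoAwaitAsync();
        Console.WriteLine(t1.IsCompleted);      // True
        Console.WriteLine(t1.Result == caller); // True: ran on the calling thread

        Task<int> t2 = CompletedAwaitAsync();
        Console.WriteLine(t2.IsCompleted);      // True
        Console.WriteLine(t2.Result == caller); // True
    }
}
```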

If your goal in using ‘async’ is just to offload some synchronous work to another thread, you can instead use Task.Run to do this rather than using the async/await language features, e.g. if you have the synchronous method:

int DoWork()
{
    …
}

instead of converting that to the following (which isn’t doing what you wanted, as the method will run entirely synchronously due to a lack of awaits):

async Task<int> DoWorkAsync()
{
    … // same code as in the synchronous method
}

you can convert it to:

Task<int> DoWorkAsync()
{
    return Task.Run(() =>
    {
        … // same code as in the synchronous method
    });
}

Task.Run will run the supplied delegate on a ThreadPool thread, immediately returning a Task that represents the eventual completion of that work.

(This is a fine thing to do inside of your application, such as for offloading some compute-intensive work from your UI thread to the ThreadPool.  For predominantly philosophical reasons, however, I don’t recommend exposing such asynchronous wrappers publicly for others to consume.)
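As a small, self-contained sketch of the Task.Run conversion described above: the body of DoWork here is an assumption (a trivial sum standing in for compute-intensive work), and the class name is hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class OffloadDemo
{
    // Hypothetical compute-bound synchronous method.
    public static int DoWork()
    {
        int sum = 0;
        for (int i = 1; i <= 1000; i++) sum += i;
        return sum; // 500500
    }

    // Offloads DoWork to the ThreadPool and immediately returns a Task<int>
    // representing the eventual result.
    public static Task<int> DoWorkAsync()
    {
        return Task.Run(() => DoWork());
    }

    public static async Task Main()
    {
        Task<int> pending = DoWorkAsync(); // returns without waiting for the work
        int result = await pending;
        Console.WriteLine(result);         // 500500
    }
}
```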

2. “I made my method asynchronous using ‘async’ and ‘await’, but I can’t await it.”

My psychic powers tell me that you have a void-returning synchronous method, and that when you applied the ‘async’ keyword, you didn’t change the return type from ‘void’ to ‘Task’ (and subsequently ignored the warning and friendly recommendation the compiler made when you tried to await the method’s result: “error CS4008: ‘MethodAsync’ does not return a Task and cannot be awaited. Consider changing it to return Task.”)

Methods marked as ‘async’ can return void, Task, or Task<T>, but void should really only be reserved for top-level entry points to your program, e.g. UI event handlers.  If you have a synchronous library method that you’re converting to be asynchronous with async/await, and if you think it should stay returning void, think again: unless you have a really good reason, asynchronous operations exposed from libraries should return Task or Task<T>, not void.
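A minimal sketch of the fix, assuming a hypothetical SaveAsync method that used to return void (the names and the Task.Delay stand-in are illustrative only). Once the method returns Task, callers can await it and observe its exceptions:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ReturnTypeDemo
{
    public static readonly List<string> Log = new List<string>();

    // Before (cannot be awaited by callers):
    //     public static async void SaveAsync(string item) { ... }

    // After: returning Task gives callers something to await.
    public static async Task SaveAsync(string item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        await Task.Delay(50); // stands in for real asynchronous I/O
        Log.Add(item);
    }

    public static async Task Main()
    {
        await SaveAsync("first");
        Console.WriteLine(Log.Contains("first")); // True

        try
        {
            await SaveAsync(null); // the exception is stored in the returned Task
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("caught"); // caught
        }
    }
}
```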

3. “My async method never completes.”

The problem statement here is that an async method returns a Task that never completes.  Often the description also includes a statement that one of the awaits inside of the async method never completed.  This behavior is typically due to one of two things, or variations off of these:

A) The caller had a non-null SynchronizationContext, ConfigureAwait(false) was not used on the await of a task, and the SynchronizationContext is blocked or no longer pumping.

If there is a current SynchronizationContext when a task is awaited, and if that task has not yet completed by the time it’s awaited, by default that SynchronizationContext will be used to run the continuation (the code that comes next after the await) once the task completes.  As this continuation represents part of the async method, the async method can’t be considered complete (and the Task returned to represent this async method’s invocation can’t complete) until that continuation is run.  And when/where/how that continuation is run is entirely up to the SynchronizationContext to which the continuation is Post’d.

Often this SynchronizationContext will be a single-threaded context, meaning that only one thread at a time is able to process queued work (in the case of a GUI context, like DispatcherSynchronizationContext for WPF, WinRTSynchronizationContext for Windows Store apps, or WindowsFormsSynchronizationContext, that one thread is likely to be a dedicated “UI thread”; in the case of a server environment, like AspNetSynchronizationContext for ASP.NET, that thread is just whatever thread is currently the one allowed to process queued work for a given HTTP request). 

If that one thread gets blocked, no other thread will be allowed to process the queued work, which means the aforementioned continuation won’t be run, which means the async method’s Task won’t complete.  And, if it’s that Task that’s being waited on to block the thread, then we have ourselves a case of deadlock.  Look for places where you’re synchronously blocking, e.g. using Task.Wait or Task.WaitAll or other such forms of waiting (you shouldn’t be doing such synchronous blocking on such precious resources as a UI thread); there’s a good chance that’s where you’ll find your culprit.

Of course, there are reasons other than explicit blocking that a UI thread might not be processing.  For example, the message pump that had been running on that thread might have shut down, e.g. in WPF you were inside of a call to Dispatcher.PushFrame which was told to exit by setting the frame’s Continue property to false.  As with the deadlock example, if the context isn’t able to run the continuation, the method will never complete.

Now, as an implementer of library code that uses await, you can help protect against such issues by using “await task.ConfigureAwait(false)” instead of just “await task” whenever you await tasks.  This tells the system not to force the continuation back to the original SynchronizationContext.  And since most libraries are agnostic to the environment in which they run, most library-based asynchronous methods shouldn’t need to be SynchronizationContext-aware. 

This good citizenry on your part will also pay dividends to you, in that it’ll often make your code faster.  If your method is invoked when there’s a current SynchronizationContext, all of your method’s continuations will be forced back to that context, incurring costs in the form of thread hops and object allocations and the like, all of which are very likely unnecessary.  Using “.ConfigureAwait(false)” will help avoid those costs. 

So, if the code in your method doesn’t need to run the continuations back on the original SynchronizationContext, go the extra step of specifying ConfigureAwait(false) whenever you await a task in that method; you’ll be happy you did.
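The difference can be observed with a small experiment. The sketch below uses a hypothetical CountingContext (not a framework type) that counts how many continuations are posted to it and then runs them on the ThreadPool, so that blocking in the demo cannot deadlock. With a plain await the continuation goes through the context; with ConfigureAwait(false) it does not.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical context for illustration: counts posted continuations.
public sealed class CountingContext : SynchronizationContext
{
    private int _postCount;
    public int PostCount => Volatile.Read(ref _postCount);

    public override void Post(SendOrPostCallback d, object state)
    {
        Interlocked.Increment(ref _postCount);
        ThreadPool.QueueUserWorkItem(_ => d(state)); // run it so nothing hangs
    }
}

public static class ConfigureAwaitDemo
{
    public static async Task CapturesContextAsync()
    {
        await Task.Delay(50); // continuation is posted to the current context
    }

    public static async Task IgnoresContextAsync()
    {
        await Task.Delay(50).ConfigureAwait(false); // continuation bypasses the context
    }

    // Returns the post count observed after each call.
    public static (int afterCapture, int afterIgnore) Run()
    {
        var previous = SynchronizationContext.Current;
        var ctx = new CountingContext();
        SynchronizationContext.SetSynchronizationContext(ctx);
        try
        {
            // Blocking is safe here only because this demo context
            // runs posted work on ThreadPool threads.
            CapturesContextAsync().Wait();
            int afterCapture = ctx.PostCount;
            IgnoresContextAsync().Wait();
            return (afterCapture, ctx.PostCount);
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    public static void Main()
    {
        var (afterCapture, afterIgnore) = Run();
        Console.WriteLine($"{afterCapture}, {afterIgnore}"); // 1, 1
    }
}
```

The count stays at 1 after the second call because the ConfigureAwait(false) continuation never touches the context.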

B) An async method used in this method never completes.

If you have a synchronous method that enters an infinite loop or blocks indefinitely or some other such operation that never completes, the invocation of the synchronous method will never return to its caller, e.g.

public void MethodThatNeverCompletes()
{
    …
    var tcs = new TaskCompletionSource<int>();
    tcs.Task.Wait(); // the Task is never completed, so this blocks forever
    …
}

Similarly, a direct async port of this method will result in the returned Task never completing:

public async Task MethodThatNeverCompletesAsync()
{
    …
    var tcs = new TaskCompletionSource<int>();
    await tcs.Task; // the Task is never completed, so this awaits forever
    …
}

And as a result, since the Task returned from MethodThatNeverCompletesAsync will never complete, any awaits on the Task returned from this method will also never complete, e.g.

await MethodThatNeverCompletesAsync(); // awaits forever

This may seem like a silly example, but I’ve seen issues related to this a non-trivial number of times.  The moral of the story is: don’t forget to complete your tasks.

Of course, the issues are almost never as simple as I’ve demonstrated above.  Often they occur when a developer has forgotten to complete a TaskCompletionSource task on a failure or cancellation path, e.g.

public Task<int> SomeLibraryMethodAsync()
{
    var tcs = new TaskCompletionSource<int>();
    BeginLibraryMethod(ar =>
    {
        try
        {
            int result = EndLibraryMethod(ar);
            tcs.SetResult(result);
        }
        catch(Exception exc) {} // oops! Forgot tcs.SetException(exc);
    });
    return tcs.Task;
}

Here the developer should have completed the task even if an error occurred, and in neglecting to do so, anyone awaiting the Task<int> returned from SomeLibraryMethodAsync will end up awaiting forever in the event of an exception.
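A corrected version might look like the following sketch. BeginLibraryMethod and EndLibraryMethod are hypothetical stand-ins written only so the example runs by itself; this End method always throws in order to exercise the failure path.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LibraryDemo
{
    // Hypothetical stand-in for the Begin method: invokes the callback on the ThreadPool.
    static void BeginLibraryMethod(AsyncCallback callback)
    {
        ThreadPool.QueueUserWorkItem(_ => callback(null));
    }

    // Hypothetical stand-in for the End method: always fails.
    static int EndLibraryMethod(IAsyncResult ar)
    {
        throw new InvalidOperationException("simulated failure");
    }

    public static Task<int> SomeLibraryMethodAsync()
    {
        var tcs = new TaskCompletionSource<int>();
        BeginLibraryMethod(ar =>
        {
            try
            {
                int result = EndLibraryMethod(ar);
                tcs.SetResult(result);
            }
            catch (Exception exc)
            {
                tcs.SetException(exc); // complete the task on the failure path too
            }
        });
        return tcs.Task;
    }

    public static async Task Main()
    {
        try
        {
            await SomeLibraryMethodAsync();
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message); // simulated failure
        }
    }
}
```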

Examples can get much more complicated than this. For example, one system I saw involved code that was maintaining a queue of waiters, and when one part of the system requested cancellation, all of the waiters were dropped, e.g.

private readonly object m_syncObj = new object();
private readonly Queue<T> m_data = new Queue<T>();
private Queue<TaskCompletionSource<T>> m_waiters =
    new Queue<TaskCompletionSource<T>>();

public void Add(T data)
{
    TaskCompletionSource<T> next = null;
    lock(m_syncObj)
    {
        if (m_waiters.Count > 0)
            next = m_waiters.Dequeue();
        else
            m_data.Enqueue(data);
    }
    if (next != null)
        next.SetResult(data);
}

public Task<T> WaitAsync()
{
    lock(m_syncObj)
    {
        if (m_data.Count > 0)
            return Task.FromResult(m_data.Dequeue());

        var tcs = new TaskCompletionSource<T>();
        m_waiters.Enqueue(tcs);
        return tcs.Task;
    }
}

public void Reset()
{
    lock(m_syncObj)
    {
        m_data.Clear();
        m_waiters.Clear(); // uh oh!
    }
}

This implementation might look ok, but there’s a dastardly bug lurking: calling Reset will simply dump the TaskCompletionSource instances for anyone currently waiting for data, which will likely cause anyone waiting on the associated tasks to wait forever.  The developer of this code probably should have iterated through each of the waiters and explicitly canceled or faulted them, rather than leaving each incomplete, e.g.

public void Reset()
{
    Queue<TaskCompletionSource<T>> oldWaiters;
    lock(m_syncObj)
    {
        m_data.Clear();
        oldWaiters = m_waiters;
        m_waiters = new Queue<TaskCompletionSource<T>>();
    }
    foreach(var waiter in oldWaiters)
        waiter.SetCanceled();
}

Long story short: if you ever write or review code that just drops a stored reference to a TaskCompletionSource without first completing it, ask yourself whether there might be a bug lurking there.

4. “The Task returned from my async method completes before the method is done.”

This symptom is often caused by one of three things, or variations off of these:

A) An async lambda is passed to a method accepting an Action.

I see this one bite people over and over again.  For a thorough explanation, see the post on pitfalls to avoid when passing around async lambdas.

In short, whenever you pass an async lambda or an async anonymous method to a method, be sure to take a look at the signature of the method you’re calling, and in particular to the type of the delegate parameter to which you’re passing the async lambda/anonymous method.  If the delegate type is a Func<Task>, Func<Task<T>>, or some other delegate type returning a Task or a Task<T>, be happy!  If, however, the delegate type is an Action, or an Action<T>, or any other void-returning delegate type, be afraid, be very very afraid!

The whole point of an asynchronous method returning a Task or some other future/promise type is that it gives the caller an object that represents the eventual completion of that operation.  Just the method returning to its synchronous caller is insufficient to signal the asynchronous operation’s completion: the returned task must also complete.  In the case of a void-returning asynchronous method, there is no such object handed back, and thus there isn’t a good way for a caller to know that the asynchronous operation has completed (in making this statement, I’m largely ignoring the advanced possibility of using a custom SynchronizationContext for its OperationStarted/OperationCompleted methods, which will be invoked by the infrastructure backing an “async void” method… I feel comfortable ignoring that because I don’t consider such an ambient mechanism to be “a good way” for the caller to know).  Without such an indication, many callers will erroneously assume that the method’s synchronous return does in fact mean the operation has completed.

For example, consider a recent misuse I saw: passing an async lambda to Parallel.ForEach.  Here’s the basic overload of Parallel.ForEach (there are longer overloads, but they don’t matter to this discussion):

public static ParallelLoopResult ForEach<TSource>(
    IEnumerable<TSource> source,
    Action<TSource> body);

This overload takes two arguments: an enumerable of data to iterate through, and an action to invoke for each element of data.  The caller of the Action<TSource> in this case is the ForEach implementation, and the ForEach implementation won’t return to its caller until it’s invoked the body delegate for each element of data (I’m ignoring for this discussion the possibility of an exception).  That’s all fine and dandy when a synchronous implementation is passed as the body, but consider the following:

var sw = Stopwatch.StartNew();
Parallel.ForEach(Enumerable.Range(0, 1000), async i => // uh oh!
{
    await Task.Delay(TimeSpan.FromSeconds(1));
});
Console.WriteLine(sw.Elapsed);

You might expect this to take ~17 minutes to run (1000 iterations, each of which should be waiting for a second), but it actually completes almost immediately, outputting a sub-second time.  Why?  Because the delegate is void-returning, so the async lambda is being mapped into an “async void” method, one that returns void.  And as soon as the body of the delegate hits an await for an awaitable that’s not yet completed, it returns to its caller.  Its caller, in this case, is the ForEach code, which has little choice but to assume that the delegate has completed its work.  As such, the ForEach very quickly enumerates through all 1000 elements, launching the async method for each, but not waiting for each to complete.  (In this particular case of Parallel.ForEach, the developer could have implemented a ForEachAsync, one that used a Func<TSource,Task> for the delegate rather than an Action<TSource>.)
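One very simple way such a ForEachAsync could look is sketched below. This is an assumption for illustration, not a framework API: it starts the body for every element and returns a task that completes only when all of the returned tasks have completed. It does no throttling or partitioning, which a production version would likely want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncLoops
{
    // Hypothetical helper: takes a Func<TSource, Task> so each iteration
    // hands back a Task that the loop can wait on.
    public static Task ForEachAsync<TSource>(
        IEnumerable<TSource> source, Func<TSource, Task> body)
    {
        // Invoke body for each element and complete when every returned task has.
        return Task.WhenAll(source.Select(body));
    }

    // Runs 'count' iterations and reports how many finished
    // by the time the await on ForEachAsync completed.
    public static async Task<int> CountCompletedAsync(int count, TimeSpan delay)
    {
        int completed = 0;
        await ForEachAsync(Enumerable.Range(0, count), async i =>
        {
            await Task.Delay(delay);
            Interlocked.Increment(ref completed);
        });
        return completed;
    }

    public static async Task Main()
    {
        int done = await CountCompletedAsync(1000, TimeSpan.FromSeconds(1));
        Console.WriteLine(done); // 1000: every iteration finished before the await completed
    }
}
```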

B) A Task<Task> or a Task<Task<T>> is used as if it were just a Task.

Consider this code:

static async Task ReproAsync()
{
    var sw = Stopwatch.StartNew();
    await Task.Factory.StartNew(async delegate
    {
        await Task.Delay(TimeSpan.FromSeconds(1000));
    });
    Console.WriteLine(sw.Elapsed);
}

How much time do you think this outputs, a value close to 1000 seconds, or a value under 1 second?  If you guessed the latter, congratulations.  If you guessed the former, sorry, read on.  This example is an interesting variation on the previous discussion about passing around async lambdas / async anonymous methods. 

The StartNew method has multiple overloads, including ones that accept an Action and ones that accept a Func<TResult>.  The overload that accepts an Action returns a Task, and the overload that accepts a Func<TResult> returns a Task<TResult>.  Makes sense.  Now, the C# compiler is going to use its overload resolution logic to pick the best matching overload given the supplied arguments, and it’s going to end up matching the above to Func<TResult>, where TResult is a Task.  Great, so our async delegate is getting mapped to a Func<Task>… all’s well, right?  Wrong.  As far as StartNew is concerned, that Func<Task> is just a Func<TResult>, and StartNew isn’t going to give any special treatment to a particular TResult type, i.e. it’ll treat a TResult of Task just as if it were an Int32 or a String or an Object or a TimeSpan or anything else.  And that means that StartNew will be returning a Task<Task>.  When the above code is awaiting the result of the StartNew, it’s awaiting that outer task to complete, not the inner Task that was returned from the invocation of the asynchronous delegate.  To do this correctly, the code should either be awaiting the outer task to get the inner task, and then awaiting that inner task, or it should be using the Unwrap method, or it should be using Task.Run instead of Task.Factory.StartNew.
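Those three options can be sketched side by side. The class and method names here are hypothetical; each method returns whether the delegate's body had finished by the time its await completed.

```csharp
using System;
using System.Threading.Tasks;

public static class UnwrapDemo
{
    // Option 1: await the outer task to get the inner task, then await the inner task.
    public static async Task<bool> AwaitTwiceAsync()
    {
        bool finished = false;
        Task<Task> outer = Task.Factory.StartNew(async () =>
        {
            await Task.Delay(200);
            finished = true;
        });
        Task inner = await outer; // completes once the delegate reaches its first await
        await inner;              // completes once the delegate's work is done
        return finished;
    }

    // Option 2: Unwrap produces a Task that represents the inner task.
    public static async Task<bool> UnwrapAsync()
    {
        bool finished = false;
        await Task.Factory.StartNew(async () =>
        {
            await Task.Delay(200);
            finished = true;
        }).Unwrap();
        return finished;
    }

    // Option 3: Task.Run has overloads for Task-returning delegates and unwraps for you.
    public static async Task<bool> TaskRunAsync()
    {
        bool finished = false;
        await Task.Run(async () =>
        {
            await Task.Delay(200);
            finished = true;
        });
        return finished;
    }

    public static async Task Main()
    {
        Console.WriteLine(await AwaitTwiceAsync()); // True
        Console.WriteLine(await UnwrapAsync());     // True
        Console.WriteLine(await TaskRunAsync());    // True
    }
}
```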

Mindbending?  If so, read this post on Task.Run vs Task.Factory.StartNew, and it should shed more light on the subject.

C) An awaitable object isn’t awaited in a method marked with the async keyword.

If you have the following method:

public async Task PauseOneSecondAsync() // buggy
{
    Task.Delay(1000);
}

The compiler will issue a warning on the call to Task.Delay:

warning CS4014: Because this call is not awaited, execution of the current method continues before the call is completed. Consider applying the ‘await’ operator to the result of the call.

That’s because invoking a Task-returning method (or other methods that return other future-like or promise-like types) just initiates the asynchronous operation, but doesn’t wait or block for it to complete.  Rather, the returned object is the representation for the eventual completion of the operation, and to prevent forward progress until that operation has completed, the returned object must be awaited.

Of course, this warning can be suppressed, either at the project level, more localized with a “#pragma warning disable”, or more commonly just by doing something with the result of the call, including storing it into a variable (even if that variable is then never used):

public async Task PauseOneSecondAsync() // buggy
{
    var t = Task.Delay(1000); // no warning
}

The compiler won’t issue any warning in this case, because storing the Task returned from Task.Delay into a local suppresses it, and yet this method will still complete before the developer intended it to, because of the missing await.  The fix is to await the task:

public async Task PauseOneSecondAsync()
{
    await Task.Delay(1000);
}

(As a complete aside about performance, if you actually wanted to implement PauseOneSecondAsync, it doesn’t need to be marked as ‘async’ and use ‘await’.  As described in the post “when at last you await”, this could be rewritten as:

public Task PauseOneSecondAsync()
{
    return Task.Delay(1000);
}

That code will be identical functionally but with better performance characteristics.)
