Astute users of the Task Parallel Library might have noticed three new options available across TaskCreationOptions and TaskContinuationOptions in .NET 4.5: DenyChildAttach, HideScheduler, and (on TaskContinuationOptions) LazyCancellation. I wanted to take a few minutes to share more about what these are and why we added them.
DenyChildAttach
As a reminder, when a Task is created with TaskCreationOptions.AttachedToParent or TaskContinuationOptions.AttachedToParent, the creation code looks to see what task is currently running on the current thread (this Task’s Id is available from the static Task.CurrentId property, which will return null if there isn’t one). If it finds there is one, the Task being created registers with that parent Task as a child, leading to two additional behaviors: the parent Task won’t transition to a completed state until all of its children have completed as well, and any exceptions from faulted children will propagate up to the parent Task (unless the parent Task observes those exceptions before it completes). This parent/child relationship and hierarchy is visible in Visual Studio’s Parallel Tasks window.
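To make the "parent waits for its children" behavior concrete, here is a minimal console sketch. The class and method names are made up for illustration, and a ManualResetEventSlim stands in for real work in the child:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class AttachedChildDemo
{
    // Returns true if the parent was still incomplete while its attached child was blocked.
    public static bool ParentWaitsForChild()
    {
        using (var gate = new ManualResetEventSlim(false))
        {
            Task parent = Task.Factory.StartNew(() =>
            {
                // The child registers with the task currently running on this thread.
                Task.Factory.StartNew(() => gate.Wait(), TaskCreationOptions.AttachedToParent);
            });

            // The parent's delegate returns right away, but the parent cannot complete
            // until the attached child does, so this bounded wait times out.
            bool completedEarly = parent.Wait(TimeSpan.FromMilliseconds(200));

            gate.Set();     // let the child finish
            parent.Wait();  // now the parent can complete
            return !completedEarly;
        }
    }

    static void Main()
    {
        Console.WriteLine(ParentWaitsForChild());
    }
}
```

Note that the parent here is created with Task.Factory.StartNew and default options, which is what allows the child to attach.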
If you’re responsible for all of the code in your solution, you have control over whether the tasks you create try to attach to a parent task. But what if your code creates some Tasks, and from those Tasks calls out to code you don’t own? The code you call might use AttachedToParent and attach children to your tasks. Did you expect that? Is your code reliable against that? Have you done all of the necessary testing to ensure it?
For this situation, we introduced DenyChildAttach. When a task uses AttachedToParent but finds there is no current Task, it just doesn’t attach to anything, behaving as if AttachedToParent wasn’t supplied. If there is a parent task, but that parent task was created with DenyChildAttach, the same thing happens: the task using AttachedToParent won’t see a parent and thus won’t attach to anything, even though technically there was a task to which it could have been attached. It’s a sleight of hand or Jedi mind trick: “this is not the parent you’re looking for.”
With 20/20 hindsight, if we could do .NET 4 over again, I personally would have chosen to make both sides of the equation opt-in. Today, the child task gets to opt in to being a child by specifying AttachedToParent, but the parent must opt out if it doesn’t want to be one. In retrospect, I think it would have been better if both sides had the choice to opt in, with the parent specifying a mythical flag like AllowsChildren to opt in rather than DenyChildAttach to opt out. Nevertheless, this is just a question of default. You’ll notice that the new Task.Run method internally specifies DenyChildAttach when creating its Tasks, in effect making this the default for the API we expect to become the most common way of launching tasks. If you want explicit control over the TaskCreationOptions used, you can instead use the existing Task.Factory.StartNew method, which becomes the more advanced mechanism and allows you to control the options, object state, scheduler, and so on.
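As a rough illustration of both points, the sketch below starts a parent either with an explicit DenyChildAttach flag or via Task.Run, and has the parent's body try to attach a child that stays blocked. The helper and class names are hypothetical; only the TPL calls are real:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class DenyChildAttachDemo
{
    // Starts a parent via startParent; the parent's body tries to attach a blocked child.
    // Returns true if the parent completed while that child was still blocked.
    static bool ParentCompletesWithoutChild(Func<Action, Task> startParent)
    {
        using (var gate = new ManualResetEventSlim(false))
        {
            Task child = null;
            Task parent = startParent(() =>
            {
                child = Task.Factory.StartNew(() => gate.Wait(), TaskCreationOptions.AttachedToParent);
            });

            // If the child could not attach, the parent finishes as soon as its body returns.
            bool completed = parent.Wait(TimeSpan.FromSeconds(5));

            gate.Set();    // release the child
            parent.Wait();
            child.Wait();
            return completed;
        }
    }

    // Parent explicitly opts out of having children.
    public static bool WithDenyChildAttach() =>
        ParentCompletesWithoutChild(body => Task.Factory.StartNew(body, TaskCreationOptions.DenyChildAttach));

    // Task.Run specifies DenyChildAttach on the caller's behalf.
    public static bool WithTaskRun() =>
        ParentCompletesWithoutChild(body => Task.Run(body));

    static void Main()
    {
        Console.WriteLine(WithDenyChildAttach());
        Console.WriteLine(WithTaskRun());
    }
}
```

In both cases the child still runs to completion; it simply isn't tracked by the task that created it.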
HideScheduler
With code written in .NET 4, we saw this pattern to be relatively common:
private void button_Click(…)
{
    … // #1 on the UI thread
    Task.Factory.StartNew(() =>
    {
        … // #2 long-running work, so offloaded to non-UI thread
    }).ContinueWith(t =>
    {
        … // #3 back on the UI thread
    }, TaskScheduler.FromCurrentSynchronizationContext());
}
In other words, Tasks and continuations became a way to offload some work from the UI thread, and then run some follow-up work back on the UI thread. This was accomplished by using the TaskScheduler.FromCurrentSynchronizationContext method, which looks up SynchronizationContext.Current and constructs a new TaskScheduler instance around it: when you schedule a Task to this TaskScheduler, the scheduler will then pass the task along to the SynchronizationContext to be invoked.
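To see that hand-off outside of a UI framework, one option is a console sketch that installs a custom SynchronizationContext and counts how often the scheduler posts to it. CountingContext is a hypothetical stand-in for a real UI context; its base Post simply queues to the ThreadPool rather than marshaling to a UI thread:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical context that counts Post calls, then defers to the base implementation.
sealed class CountingContext : SynchronizationContext
{
    public int Posts;

    public override void Post(SendOrPostCallback d, object state)
    {
        Interlocked.Increment(ref Posts);
        base.Post(d, state);
    }
}

static class SyncContextSchedulerDemo
{
    // Schedules one task to a scheduler wrapping CountingContext and
    // returns how many times the context's Post method was invoked.
    public static int PostsForOneTask()
    {
        SynchronizationContext previous = SynchronizationContext.Current;
        var context = new CountingContext();
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            // Captures SynchronizationContext.Current, i.e. our CountingContext.
            TaskScheduler scheduler = TaskScheduler.FromCurrentSynchronizationContext();

            Task.Factory.StartNew(() => { }, CancellationToken.None,
                TaskCreationOptions.None, scheduler).Wait();

            return Volatile.Read(ref context.Posts);
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    static void Main()
    {
        Console.WriteLine(PostsForOneTask());
    }
}
```

Queuing the task results in a single Post to the captured context, which is the mechanism a UI framework's context uses to get the work onto the UI thread.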
That’s all well and good, but it’s important to keep in mind the behavior of the Task-related APIs introduced in .NET 4 when no TaskScheduler is explicitly provided. The TaskFactory class has a bunch of overloaded methods (e.g. StartNew), and when you construct a TaskFactory class, you have the option to provide a TaskScheduler. Then, when you call one of its methods (like StartNew) that doesn’t take a TaskScheduler, the scheduler that was provided to the TaskFactory’s constructor is used. If no scheduler was provided to the TaskFactory, then if you call an overload that doesn’t take a TaskScheduler, the TaskFactory ends up using TaskScheduler.Current at the time the call is made (TaskScheduler.Current returns the scheduler associated with whatever Task is currently running on that thread, or if there is no such task, it returns TaskScheduler.Default, which represents the ThreadPool). Now, the TaskFactory returned from Task.Factory is constructed without a specific scheduler, so for example when you write Task.Factory.StartNew(Action), you’re telling TPL to create a Task for that Action and schedule it to TaskScheduler.Current.
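A small sketch of that inheritance, assuming a ConcurrentExclusiveSchedulerPair as a stand-in for any non-default scheduler (the class and method names are invented for the example):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class CurrentSchedulerDemo
{
    // Returns true if a task started without an explicit scheduler, from inside
    // a task running on a custom scheduler, also ran on that custom scheduler.
    public static bool NestedTaskInheritsScheduler()
    {
        // Stand-in for any non-default scheduler (UI, limited concurrency, and so on).
        TaskScheduler custom = new ConcurrentExclusiveSchedulerPair().ConcurrentScheduler;

        Task<bool> nested = Task.Factory.StartNew(
            // No scheduler given here, so TaskScheduler.Current (custom) is used.
            () => Task.Factory.StartNew(() => TaskScheduler.Current == custom),
            CancellationToken.None, TaskCreationOptions.None, custom).Unwrap();

        return nested.Result;
    }

    static void Main()
    {
        Console.WriteLine(NestedTaskInheritsScheduler());
    }
}
```

The outer task returns the inner task rather than blocking on it, and Unwrap exposes the inner result, so the example never waits on a task from inside the custom scheduler.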
In many situations, that’s the right behavior. For example, let’s say you’re implementing a recursive divide-and-conquer problem, where you have a task that’s supposed to process some chunk of work, and it in turn subdivides its work and schedules tasks to process those chunks. If that task was running on a scheduler representing a particular pool of threads, or if it was running on a scheduler that had a concurrency limit, and so on, you’d typically want those tasks it then created to also run on the same scheduler.
However, it turns out that in other situations, it’s not the right behavior. One such situation is like the one I showed previously. Imagine now that your code looked like this:
private void button_Click(…)
{
    … // #1 on the UI thread
    Task.Factory.StartNew(() =>
    {
        … // #2 long-running work, so offloaded to non-UI thread
    }).ContinueWith(t =>
    {
        … // #3 back on the UI thread
        Task.Factory.StartNew(() =>
        {
            … // #4 compute-intensive work we want offloaded to non-UI thread (bug!)
        });
    }, TaskScheduler.FromCurrentSynchronizationContext());
}
This seems logical: we do some work on the UI thread, then we offload some work to the background, when that work completes we hop back to the UI thread, and then we kick off another task to run in the background. Unfortunately, this is buggy. Because the continuation was scheduled to TaskScheduler.FromCurrentSynchronizationContext, that scheduler is TaskScheduler.Current during the execution of the continuation. And in that continuation we’re calling Task.Factory.StartNew using an overload that doesn’t accept a TaskScheduler. Which means that this compute-intensive work is actually going to be scheduled back to the UI thread! Ugh.
There are of course already solutions to this. For example, if you own all of this code, you could explicitly specify TaskScheduler.Default (the ThreadPool scheduler) when calling StartNew, or you could change the structure of the code so that the StartNew became a continuation off of the continuation, e.g.
private void button_Click(…)
{
    … // #1 on the UI thread
    Task.Factory.StartNew(() =>
    {
        … // #2 long-running work, so offloaded to non-UI thread
    }).ContinueWith(t =>
    {
        … // #3 back on the UI thread
    }, TaskScheduler.FromCurrentSynchronizationContext()).ContinueWith(t =>
    {
        … // #4 compute-intensive work we want offloaded to non-UI thread
    });
}
But neither of those solutions is relevant if the code inside of the continuation is code you don’t own, e.g. if you’re calling out to some 3rd party code which might unsuspectingly use Task.Factory.StartNew without specifying a scheduler and inadvertently end up running its code on the UI thread. This is why in production library code I write, I always explicitly specify the scheduler I want to use.
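For reference, explicitly specifying the scheduler means using the StartNew overload that also takes a CancellationToken and TaskCreationOptions. A minimal sketch, again assuming a ConcurrentExclusiveSchedulerPair as a stand-in for the UI scheduler and using invented names:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ExplicitSchedulerDemo
{
    // Returns true if the inner task ran on the default (ThreadPool) scheduler
    // even though it was created from a task running on a custom scheduler.
    public static bool InnerRunsOnDefault()
    {
        // Stand-in for the scheduler returned by FromCurrentSynchronizationContext.
        TaskScheduler custom = new ConcurrentExclusiveSchedulerPair().ConcurrentScheduler;

        Task<bool> inner = Task.Factory.StartNew(
            () => Task.Factory.StartNew(
                () => TaskScheduler.Current == TaskScheduler.Default,
                CancellationToken.None,
                TaskCreationOptions.None,
                TaskScheduler.Default),   // explicit: don't rely on TaskScheduler.Current
            CancellationToken.None, TaskCreationOptions.None, custom).Unwrap();

        return inner.Result;
    }

    static void Main()
    {
        Console.WriteLine(InnerRunsOnDefault());
    }
}
```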
For .NET 4.5, we introduced the TaskCreationOptions.HideScheduler and TaskContinuationOptions.HideScheduler values. When supplied to a Task, this makes it so that in the body of that Task, TaskScheduler.Current returns TaskScheduler.Default, even if the Task is running on a different scheduler: in other words, it hides it, making it look like there isn’t a Task running, and thus TaskScheduler.Default is returned. This option helps to make your code more reliable if you find yourself calling out to code you don’t own. Again with our initial example, I can now specify HideScheduler, and my bug will be fixed:
private void button_Click(…)
{
    … // #1 on the UI thread
    Task.Factory.StartNew(() =>
    {
        … // #2 long-running work, so offloaded to non-UI thread
    }).ContinueWith(t =>
    {
        … // #3 back on the UI thread
        Task.Factory.StartNew(() =>
        {
            … // #4 compute-intensive work, now offloaded to a non-UI thread as intended
        });
    }, CancellationToken.None,
    TaskContinuationOptions.HideScheduler,
    TaskScheduler.FromCurrentSynchronizationContext());
}
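The effect of HideScheduler can also be checked in a console app. The sketch below assumes a ConcurrentExclusiveSchedulerPair as a stand-in for the UI scheduler; the class and method names are hypothetical:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class HideSchedulerDemo
{
    static TaskScheduler NewCustomScheduler() =>
        new ConcurrentExclusiveSchedulerPair().ConcurrentScheduler;

    // Inside a HideScheduler task, TaskScheduler.Current reports Default
    // even though the task itself was queued to a custom scheduler.
    public static bool OuterSeesDefault()
    {
        return Task.Factory.StartNew(
            () => TaskScheduler.Current == TaskScheduler.Default,
            CancellationToken.None, TaskCreationOptions.HideScheduler,
            NewCustomScheduler()).Result;
    }

    // A nested StartNew without an explicit scheduler therefore targets Default.
    public static bool NestedRunsOnDefault()
    {
        Task<bool> nested = Task.Factory.StartNew(
            () => Task.Factory.StartNew(() => TaskScheduler.Current == TaskScheduler.Default),
            CancellationToken.None, TaskCreationOptions.HideScheduler,
            NewCustomScheduler()).Unwrap();

        return nested.Result;
    }

    static void Main()
    {
        Console.WriteLine(OuterSeesDefault());
        Console.WriteLine(NestedRunsOnDefault());
    }
}
```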
One additional thing to note is around the new Task.Run method, which is really just a simple wrapper around Task.Factory.StartNew. We expect Task.Run to become the most common method for launching new tasks, with developers falling back to using Task.Factory.StartNew directly only for more advanced situations where they need more fine-grained control, e.g. over which scheduler is targeted. I already noted that Task.Run specifies DenyChildAttach, so that no tasks created within a Task.Run task can attach to it. Additionally, Task.Run always specifies TaskScheduler.Default, so that Task.Run always uses the ThreadPool and ignores TaskScheduler.Current. So, even without HideScheduler, if I’d used Task.Run(Action) instead of Task.Factory.StartNew(Action) in my initially buggy code, it would have been fine.
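The same kind of check works for Task.Run. In this sketch (names invented, ConcurrentExclusiveSchedulerPair again standing in for a UI scheduler), the outer task runs on a custom scheduler without HideScheduler, yet the Task.Run work still lands on the default scheduler:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class TaskRunSchedulerDemo
{
    // Returns true if work launched with Task.Run from inside a task on a
    // custom scheduler ran on TaskScheduler.Default.
    public static bool TaskRunIgnoresCurrentScheduler()
    {
        TaskScheduler custom = new ConcurrentExclusiveSchedulerPair().ConcurrentScheduler;

        Task<bool> inner = Task.Factory.StartNew(
            // TaskScheduler.Current is `custom` here, but Task.Run doesn't consult it.
            () => Task.Run(() => TaskScheduler.Current == TaskScheduler.Default),
            CancellationToken.None, TaskCreationOptions.None, custom).Unwrap();

        return inner.Result;
    }

    static void Main()
    {
        Console.WriteLine(TaskRunIgnoresCurrentScheduler());
    }
}
```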
LazyCancellation
Consider the following code:
Task a = Task.Run(…);
Task b = a.ContinueWith(…, cancellationToken);
The ContinueWith method will create Task ‘b’ such that ‘b’ will be scheduled when ‘a’ completes. However, because a CancellationToken was provided to ContinueWith, if cancellation is requested before Task ‘a’ completes, then Task ‘b’ will just immediately transition to the Canceled state. So far so good… there’s no point in doing any work for ‘b’ if we know the user wants to cancel it. Might as well be aggressive about it.
But now consider a slightly more complicated variation:
Task a = Task.Run(…);
Task b = a.ContinueWith(…, cancellationToken);
Task c = b.ContinueWith(…);
Here there’s a second continuation, off of Task ‘b’, resulting in Task ‘c’. When Task ‘b’ completes, regardless of what state ‘b’ completes in (RanToCompletion, Faulted, or Canceled), Task ‘c’ will be scheduled. Now consider the following situation: Task ‘a’ starts running. Then a cancellation request comes in before ‘a’ finishes, so ‘b’ transitions to Canceled as we’d expect. Now that ‘b’ is completed, Task ‘c’ gets scheduled, again as we’d expect. However, this now means that Task ‘a’ and Task ‘c’ could be running concurrently. In many situations, that’s fine. But if you’d constructed your chain of continuations under the notion that no two tasks in the chain could ever run concurrently, you’d be sorely disappointed.
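Here is one way to observe that overlap in a console app. It is only a sketch: a ManualResetEventSlim keeps ‘a’ running, and ‘c’ reports whether ‘a’ was still incomplete when ‘c’ executed. The class and method names are made up:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class EagerCancellationDemo
{
    // Returns true if 'b' was canceled and 'c' ran while 'a' was still running.
    public static bool ContinuationOverlapsAntecedent()
    {
        using (var gate = new ManualResetEventSlim(false))
        using (var cts = new CancellationTokenSource())
        {
            Task a = Task.Run(() => gate.Wait());               // 'a' stays blocked until the gate opens
            Task b = a.ContinueWith(_ => { }, cts.Token);        // default (eager) cancellation
            Task<bool> c = b.ContinueWith(_ => !a.IsCompleted);  // did we run while 'a' was still going?

            cts.Cancel();                 // 'b' is canceled without waiting for 'a'
            bool overlapped = c.Result;   // 'c' runs as soon as 'b' is done

            gate.Set();                   // finally let 'a' finish
            a.Wait();
            return overlapped && b.IsCanceled;
        }
    }

    static void Main()
    {
        Console.WriteLine(ContinuationOverlapsAntecedent());
    }
}
```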
Enter LazyCancellation. By specifying this flag on a continuation that has a CancellationToken, you’re telling TPL to ignore that CancellationToken until the antecedent has already completed. In other words, the cancellation check is lazy: rather than the continuation doing the work to register with the token to be notified of a cancellation request, it instead doesn’t do anything, and then only when the antecedent completes and the continuation is about to run does it poll the token and potentially transition to Canceled. In our previous example, if we did want to avoid ‘a’ and ‘c’ potentially running concurrently, we could have instead written:
Task a = Task.Run(…);
Task b = a.ContinueWith(…, cancellationToken,
TaskContinuationOptions.LazyCancellation, TaskScheduler.Default);
Task c = b.ContinueWith(…);
Here, even if cancellation is requested early, ‘b’ won’t transition to Canceled until ‘a’ completes, such that ‘c’ won’t be able to start until ‘a’ has completed, and all would be right in the world again.
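A runnable variant of the snippet above, using the same kind of gate to hold ‘a’ open, might look like the following. As before, this is a sketch with invented names rather than production code:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class LazyCancellationDemo
{
    // Returns true if 'b' stayed pending until 'a' finished, then ended up Canceled,
    // and 'c' only ran once 'a' had completed.
    public static bool ChainStaysSequential()
    {
        using (var gate = new ManualResetEventSlim(false))
        using (var cts = new CancellationTokenSource())
        {
            Task a = Task.Run(() => gate.Wait());
            Task b = a.ContinueWith(_ => { }, cts.Token,
                TaskContinuationOptions.LazyCancellation, TaskScheduler.Default);
            Task<bool> c = b.ContinueWith(_ => a.IsCompleted);

            cts.Cancel();                              // requested early, while 'a' is still running
            bool bPendingAfterCancel = !b.IsCompleted; // 'b' ignores the token for now

            gate.Set();                                // let 'a' finish
            bool cSawACompleted = c.Result;            // 'c' could only start after 'a' and 'b'

            return bPendingAfterCancel && cSawACompleted && b.IsCanceled;
        }
    }

    static void Main()
    {
        Console.WriteLine(ChainStaysSequential());
    }
}
```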