July 9th, 2009

Parallel.Invoke() vs. Explicit Task Management

 

 

Parallel Extensions offers a large variety of APIs supporting parallelism.

This post focuses on how to choose between two of the new Parallel Extensions concepts: parallelism achieved with Parallel.Invoke() and parallelism achieved through explicit use of Tasks.

 

Suppose that you wanted the two actions below to be executed in parallel:

 

Action hello = () => { Console.Write("Hello"); };
Action world = () => { Console.Write("World"); };

 

Should you use Tasks or Parallel.Invoke() to achieve the desired parallelism?

Parallelism through Parallel.Invoke()

The code to execute our actions in parallel with Parallel.Invoke() looks like this:

 

Parallel.Invoke(hello, world);

 

This is simple and easy, and we get the desired result: either “HelloWorld” or “WorldHello” output to the console, depending on the order in which the parallel actions were scheduled.
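To make the blocking behavior concrete, here is a minimal, self-contained sketch. The class name InvokeDemo, the method RunBoth and the completion counter are illustrative additions, not part of the original snippet; they only exist to show that both actions have finished by the time Parallel.Invoke() returns.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InvokeDemo
{
    // Runs both actions with Parallel.Invoke() and returns how many of them
    // had finished by the time Invoke() returned (hypothetical helper).
    public static int RunBoth()
    {
        int completed = 0;

        Action hello = () => { Console.Write("Hello"); Interlocked.Increment(ref completed); };
        Action world = () => { Console.Write("World"); Interlocked.Increment(ref completed); };

        // Blocks the calling thread until both actions have finished.
        Parallel.Invoke(hello, world);

        return completed;
    }

    public static void Main()
    {
        int done = RunBoth();
        Console.WriteLine();
        Console.WriteLine("Actions completed when Invoke returned: " + done);
    }
}
```

The first line of output is "HelloWorld" or "WorldHello", as described above; the counter is always 2 because Parallel.Invoke() does not return until every action has run.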

Parallelism through Explicit Task Management

For convenience, creating and starting a Task can be done with a single method:

 

Task.Factory.StartNew(hello);
Task.Factory.StartNew(world);

 

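If we run this code sequence, something surprising can happen: nothing is displayed! The reason is that the Tasks are scheduled on ThreadPool threads, which are background threads, so the process can exit as soon as the main thread finishes, even if the Tasks are still running or have not yet started. In addition, if one of the actions throws an exception, nothing observes it; the exception is rethrown only on the finalizer thread, once the Task is garbage collected. (From .NET 4.5 onward the default changed: an unobserved exception still raises TaskScheduler.UnobservedTaskException but no longer brings down the process.) We can solve both of these problems by performing an explicit Wait() on the Tasks: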

 

Task taskHello = Task.Factory.StartNew(hello);
Task taskWorld = Task.Factory.StartNew(world);
Task.WaitAll(taskHello, taskWorld);

 

Running this sample now, the result is the same as in the Parallel.Invoke case:  either “HelloWorld” or “WorldHello” is printed to the console.
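Waiting also addresses the exception problem mentioned earlier. The following is a small sketch of how a failure surfaces through Task.WaitAll(); the failing action, its message text, and the names WaitAllDemo and RunAndCollectErrors are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WaitAllDemo
{
    // Starts one succeeding and one failing task, waits for both, and returns
    // the messages of any exceptions surfaced by WaitAll (hypothetical helper).
    public static string[] RunAndCollectErrors()
    {
        Action hello = () => { Console.Write("Hello"); };
        // Illustrative failure standing in for a real bug in the action.
        Action world = () => { throw new InvalidOperationException("world failed"); };

        Task taskHello = Task.Factory.StartNew(hello);
        Task taskWorld = Task.Factory.StartNew(world);

        try
        {
            // Waits for both tasks; exceptions thrown inside them are
            // wrapped in a single AggregateException.
            Task.WaitAll(taskHello, taskWorld);
            return new string[0];
        }
        catch (AggregateException ae)
        {
            return ae.InnerExceptions.Select(e => e.Message).ToArray();
        }
    }

    public static void Main()
    {
        foreach (string message in RunAndCollectErrors())
            Console.WriteLine("\nObserved: " + message);
    }
}
```

Because the exception is observed at the WaitAll() call, it is handled on the waiting thread instead of being left for the finalizer.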

An Example: Tree Traversal

Let’s see now how we can implement a binary tree traversal using Parallel.Invoke():

 

ConcurrentBag<int> _dataStorage = new ConcurrentBag<int>();

void TreeTraversal(Tree<int> node)
{
    if (node == null)
        return;

    // Collect one action per non-empty subtree.
    var actions = new List<Action>();
    if (node._left != null)
        actions.Add(() => TreeTraversal(node._left));
    if (node._right != null)
        actions.Add(() => TreeTraversal(node._right));

    // Blocks until both subtrees have been traversed.
    Parallel.Invoke(actions.ToArray());
    _dataStorage.Add(node._data);
}

 

And let’s look at a similar version using explicit Task management:

 

ConcurrentBag<int> _dataStorage = new ConcurrentBag<int>();

void TreeTraversal(Tree<int> node)
{
    if (node == null)
        return;

    // Start one task per non-empty subtree.
    var tasks = new List<Task>();
    if (node._left != null)
        tasks.Add(Task.Factory.StartNew(
            () => TreeTraversal(node._left)));
    if (node._right != null)
        tasks.Add(Task.Factory.StartNew(
            () => TreeTraversal(node._right)));

    // Runs while the subtree tasks are still in flight.
    _dataStorage.Add(node._data);

    Task.WaitAll(tasks.ToArray());
}

 

Note that the syntax is more concise in the Parallel.Invoke() version, and the Parallel.Invoke() call takes care of waiting and exception handling.  The flip side is that the explicit task management version allows for more control.  For example, the _dataStorage.Add(node._data) operation can run in parallel with the processing of the left and right branches when explicit task management is used, whereas in the Parallel.Invoke() version it must wait for the left and right processing to complete.
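The post does not show how Tree<T> is defined, so the two traversals cannot be run as written. The sketch below fills that gap with a hypothetical Tree<T> that has public _data, _left and _right fields, plus illustrative helpers (Build, Collect) and a storage parameter in place of the _dataStorage field. It is one possible way to try both versions side by side, not the author's actual test code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical node type: the post does not show how Tree<T> is defined.
public class Tree<T>
{
    public T _data;
    public Tree<T> _left;
    public Tree<T> _right;
}

public static class TraversalDemo
{
    // Builds a balanced binary tree holding the values lo..hi (inclusive).
    public static Tree<int> Build(int lo, int hi)
    {
        if (lo > hi) return null;
        int mid = lo + (hi - lo) / 2;
        return new Tree<int>
        {
            _data = mid,
            _left = Build(lo, mid - 1),
            _right = Build(mid + 1, hi)
        };
    }

    // Parallel.Invoke() version: the node's own value is stored only after
    // both subtrees have been processed.
    public static void VisitWithInvoke(Tree<int> node, ConcurrentBag<int> storage)
    {
        if (node == null) return;
        var actions = new List<Action>();
        if (node._left != null) actions.Add(() => VisitWithInvoke(node._left, storage));
        if (node._right != null) actions.Add(() => VisitWithInvoke(node._right, storage));
        Parallel.Invoke(actions.ToArray());
        storage.Add(node._data);
    }

    // Explicit Task version: the node's own value is stored while the
    // subtree tasks may still be running.
    public static void VisitWithTasks(Tree<int> node, ConcurrentBag<int> storage)
    {
        if (node == null) return;
        var tasks = new List<Task>();
        if (node._left != null)
            tasks.Add(Task.Factory.StartNew(() => VisitWithTasks(node._left, storage)));
        if (node._right != null)
            tasks.Add(Task.Factory.StartNew(() => VisitWithTasks(node._right, storage)));
        storage.Add(node._data);
        Task.WaitAll(tasks.ToArray());
    }

    // Runs one of the traversals and returns the visited values in sorted
    // order, since the order of insertion into the bag is not deterministic.
    public static int[] Collect(Action<Tree<int>, ConcurrentBag<int>> visit, Tree<int> root)
    {
        var storage = new ConcurrentBag<int>();
        visit(root, storage);
        return storage.OrderBy(x => x).ToArray();
    }

    public static void Main()
    {
        Tree<int> root = Build(1, 7);
        Console.WriteLine(string.Join(",", Collect(VisitWithInvoke, root)));
        Console.WriteLine(string.Join(",", Collect(VisitWithTasks, root)));
    }
}
```

Both traversals visit every node exactly once, so after sorting they yield the same values; only the moment at which each node's own value is added differs.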

Conclusions

Parallel.Invoke() is a higher-level mechanism for providing parallelism, and allows for more concise code than one would typically get from using explicit task management.

However, if the coder is interested in more control, perhaps for more complicated scenarios, then explicit task management is probably the way to go.
