# Recursion and Concurrency

Stephen Toub - MSFT

When teaching recursion in an introductory computer science course, one of the most common examples used involves a tree data structure.  Trees are useful in this regard as they are simple and recursive in nature, with a tree’s children also being trees, and allow for teaching different kinds of traversals (in-order, pre-order, post-order, level-order, and so forth).  But when these introductory concepts meet multithreading, the concepts are no longer so simple, at least not with the tools available in mainstream languages today like C#, C++, and Java. Consider a simple Tree data structure:

class Tree<T>
{
public Tree<T> Left, Right; // children
public T Data; // data for this node
}

Let’s say we want to execute an Action<T> for each datum stored in the tree, and let’s assume I don’t care about order (introducing parallelism could be questionable if I did).  That’s straightforward to do sequentially and recursively:

public static void Process<T>(Tree<T> tree, Action<T> action)
{
if (tree == null) return;

// Process the current node, then the left, then the right

action(tree.Data);
Process(tree.Left, action);
Process(tree.Right, action);
}

I could also do so without recursion by maintaining an explicit stack (or queue, or some other data structure with different ordering guarantees):

public static void Process<T>(Tree<T> tree, Action<T> action)
{
if (tree == null) return;
var toExplore = new Stack<Tree<T>>();

toExplore.Push(tree);
while (toExplore.Count > 0)
{
// Grab the next node, process it, and push its children

var current = toExplore.Pop();
action(current.Data);
if (current.Left != null)

toExplore.Push(current.Left);
if (current.Right != null)

toExplore.Push(current.Right);
}
}

Now, let’s assume the action we’re performing on each node of the tree is independent, relatively expensive and/or that the tree is relatively large, and as such we want to process the tree in parallel (we’re of course also assuming that the action delegate is thread-safe), meaning that we want multiple threads each running the action delegate on distinct tree nodes.  How do we do this with what we have in .NET today?

There are multiple approaches, some more valid than others.  The first thing someone might try is to follow the original recursive implementation but using the ThreadPool, which could look something like this:

public static void Process<T>(Tree<T> tree, Action<T> action)
{
if (tree == null) return;

// Use an event to prevent this method from
// returning until its children have completed

using (var mre = new ManualResetEvent(false))
{
// Process the left child asynchronously

{
Process(tree.Left, action);
mre.Set();
});

// Process current node and right child synchronously
action(tree.Data);
Process(tree.Right, action);

// Wait for the left child
mre.WaitOne();
}
}

The idea behind this implementation is to, given a node, spin up a work item to process that node’s left child in parallel with the current node, and then process the current node’s data as well as its right child.  Of course, I could be losing out on some parallelism here as I delay processing of the right child until I’m done processing the current data.  So we modify it slightly:

public static void Process<T>(Tree<T> tree, Action<T> action)
{
if (tree == null) return;

// Use an event to wait for the children
using (var mre = new ManualResetEvent(false))
{
int count = 2;

// Process the left child asynchronously
{
Process(tree.Left, action);
if (Interlocked.Decrement(ref count) == 0)
mre.Set();
});

// Process the right child asynchronously
{
Process(tree.Right, action);
if (Interlocked.Decrement(ref count) == 0)
mre.Set();
});

// Process the current node synchronously
action(tree.Data);

// Wait for the children
mre.WaitOne();
}
}

Obviously, we need a better implementation. Next up, we can walk the tree sequentially, queuing up a work item in the pool for each node in the tree.  Here we take that approach based on the recursive implementation of a tree walk:

public static void Process<T>(Tree<T> tree, Action<T> action)
{
if (tree == null) return;

// Use an event to wait for all of the nodes to complete
using (var mre = new ManualResetEvent(false))
{
int count = 1;

// Recursive delegate to walk the tree
Action<Tree<T>> processNode = null;
processNode = node =>
{
if (node == null) return;

// Asynchronously run the action on the current node

Interlocked.Increment(ref count);
{
action(node.Data);
if (Interlocked.Decrement(ref count) == 0)
mre.Set();
});

// Process the children

processNode(node.Left);
processNode(node.Right);
};

// Start off with the root node