April 29th, 2009

What’s new in Beta 1 for Parallel LINQ (PLINQ)?

A number of improvements have been made to Parallel Extensions since the Visual Studio 2010 CTP across the Task Parallel Library (TPL), Parallel LINQ (PLINQ), and our coordination data structures.  You can find the latest on TPL (1 2 3) and the data structures (link) on this blog.  Here are the big changes to PLINQ since that CTP:

  • With- Operators Pattern
  • Execution Mode
  • Cancellation
  • Custom Partitioning
  • Refactoring
  • Merge Options
  • AsMerged Renamed Back to AsSequential
  • Binary Operators Now Require AsParallel on Both Sides
  • Performance Improvements
  • Removed Seldom Used Operators

With- Operators Pattern

To parameterize PLINQ, we have added a set of controls beginning with With-.  As we began to add controls to PLINQ, it became clear that pretty soon the number of overloads on AsParallel, Range, and Repeat would increase dramatically.  This pattern resolves that and increases readability. Compare:

OLD: e.AsParallel(4)

NEW: e.AsParallel().WithDegreeOfParallelism(4)

The With- methods are currently:

  • WithDegreeOfParallelism
  • WithExecutionMode (see Execution Mode below)
  • WithCancellation  (see Cancellation below)
  • WithMergeOptions (see Merge Options below)

The latter 3 have their own sections below.

Execution Mode

PLINQ does its best to ensure that it will never behave much worse than LINQ-to-Objects.  For example, if a LINQ-to-Objects query needs a constant amount of memory to execute, PLINQ also aims to use only a constant amount of memory on the same query.  Or, if a LINQ-to-Objects query has low overhead per element processed, the PLINQ query should also have low overhead per element.  If PLINQ cannot meet these constraints by executing the query in parallel, it will execute the query sequentially.

Note that while the goal will be to never do much worse than LINQ to Objects, it is very unlikely that we will be able to achieve that goal for all queries. One suggestion that comes up regularly is to use a smart algorithm that does some sort of time measurements and switches back and forth between parallel and sequential execution. This is significantly more problematic in practice than it may sound, and we have no plans in this direction for V1.

PLINQ makes this decision solely based on query shape.  It does not look into size of the data set or execution time of the delegates.  If you believe that you will not see speedup in your query, consider comparing the versions of the query with and without AsParallel.  Queries that will execute sequentially in the conservative mode include:

  • Queries contain indexed Select, indexed Where, indexed SelectMany, or ElementAt in a position where the indices are no longer in the original order. Index ordering is senstive to operators that change ordering (e.g. OrderBy) and operators that remove elements (e.g. Where).
  • Queries that contain operators Take, TakeWhile, Skip, SkipWhile when indices are not in the original order (see above bullet).
  • Queries that contain Zip, SequenceEquals, unless one of the data sources has an originally ordered index and the other data source is indexible (i.e. an array or IList<T>).
  • Queries that contain Concat, unless it is applied to indexible data sources.
  • Queries that contain Reverse, unless applied to an indexible data source.

A certain class of operators require the data to be partitioned with matching data, this is best achieved through hash-partitioning. Our other partitioning types (chunk, range, and striped) would not organize this data correctly for matching. These operators include Join, GroupBy, GroupJoin, Distinct, Union, Intersect, and Except. As for hash-partitioning operators From the data that we have, it seems that our hash-partitioning operators do not get much of a speedup, but do not inhibit it too drastically either. So, at least presently, hash-partitioning operators will not cause a query to execute sequentially (in the conservative mode). This allows for speedup when these operators combine with other operators.

The ParallelExecutionMode enum is added to the System.Linq namespace. It has Default and ForceParallelism values. If you believe that your query can in fact get speedup, but that we’re currently dropping back to sequential, write your query as

e.AsParallel().WithExecutionMode(ParallelExecutionMode.ForceParallelism)….

to override PLINQ’s conservative behavior.

Over time, we will likely be able to support more and more query shapes, so please let us know which ones are important to you so that we can consider them for perf tuning.

Cancellation

Just as Tasks, Parallel, BlockingCollection, Barrier, CountdownEvent, ManualResetEventSlim, and SemaphoreSlim are plumbed for cancellation, so is PLINQ.  Cancellation patterns have been added to the .NET Framework 4.0, and our team has rebuilt our system on top of them to provide an elegant, common pattern to exit operations.  A dedicated topic on cancellation is coming soon, though we’ll jump in right here with how PLINQ uses the types.

Preparie a query to allow cancellation by using WithCancellation on a query. Here’s an example usage pattern with Cancellation split between two query executions:

var cts = new CancellationTokenSource();

var q = a.AsParallel().WithCancellation(cts.Token).Where(x=>Filter(x)).Select(x=>DoWork(x);

— separate thread —

foreach (var e in q) { … }  // Statement 1

— separate thread —

var l = q.ToList(); // Statement 2

— separate thread —

cts.Cancel(); // this will attempt to cancel any in-flight queries,

// including both statements 1 and 2

Note on Dispose – Disposing the enumerator, as is done automatically by the foreach pattern, will also cancel the plinq work occurring in the background.

Custom Partitioning

There are some data types and data sources that PLINQ will not know the best way to partition.  PLINQ optimizes for general cases and common cases, but sometimes you want control over how the data can best be partitioned in PLINQ.

So, the new Partitioner<TSource> abstract class, the new OrderablePartitioner<TSource> abstract class, and the new Partitioner factory classes have been created to allow you greater control over partitioning for increased performance.    For PLINQ, there are overload extension methods of AsParallel that will hook these types into PLINQ.

This is a weighty topic and will have its own future post. 

Renamings and Refactorings

Because there was never an intent or reason for others to extend IParallelEnumerable, IParallelEnumerable<TSource>, nor IParallelOrderedEnumerable<TSource>, these have been changed from interfaces to abstract classes which cannot be extended.  This avoids the confusion on how they’re supposed to be used.

OLD –> NEW

IParallelEnumerable –>  ParallelQuery

IParallelEnumerable<T> –> ParallelQuery<TSource>

IParallelOrderedEnumerable<T> –> OrderedParallelQuery<TSource>

While we were at it, we improved type names in generics, changing T –> TSource, and we ensured that all of the right extension methods were available for the non-generic ParallelQuery class.

ParallelMergeOptions

Handling of ParallelMergeOptions has been moved out of AsMerged.  Merge buffering is now specified via the  WithMergeOptions method.

AsMerged Renamed Back to AsSequential

In the last CTP we renamed AsSequential to AsMerged when we added an overload to take MergeOptions. Now, we renamed it back to AsSequential so that it would better pair with AsParallel.

Binary Operators Now Require AsParallel on Both Sides

We have changed the handling of the operators in LINQ which task 2 data sources to ensure that ordering expectations are more explicit.  There are a number of operators in LINQ which take 2 data sources. Here are those operators with overload counts in parentheses ():  Zip (1), Join (2), GroupJoin (2), Concat (1), SequenceEqual (2), Union (2), Intersect (2), Except (2).

Some of these, such as SequenceEqual make no sense without ordering.  Others do make sense sense without ordering—and in fact, would be preferred this way—and some could go either way.  Rather than make all kinds of mystical decisions without involving you, steps have been taken for developers to make a decision.

Previously, queries taking two input sequences could be parallelized as follows:

a.AsParallel().AsOrdered().Zip(b, (x, y) => x*y);

These operators, of the form Operator(this ParallelQuery<T>, Enumerable<T>, …), now throw a NotSupportedException at runtime and are marked as obsolete*.

Now, to parallelize these queries, developers must choose:

a.AsParallel().AsOrdered().Zip(b.AsParallel(), (x, y) => x*y);

OR

a.AsParallel().AsOrdered().Zip(b.AsParallel().AsOrdered(), (x, y) => x*y);

* You may wonder why these operators are marked as obsolete during their introductory release. We did this to maintain parity with LINQ-to-Objects and other LINQ providers as much as possible. One of our primary design goals is to “Just add AsParallel” and you can see speedup. In this case, we wanted to have the same methods as in LINQ-to-Objects, but use the obosolete methods to encourage the addition of AsParallel on the second source as well.

Performance Improvements

A number of performance improvements have been undertaken, including:

1. Order-preserving pipelining merges – Previously, just putting AsOrdered on a query forced the entire query to execute before a single element is yielded.  This is now optimized so that elements from the query can be yielded as they are produced with Default (AutoBuffered) and NotBuffered values of MergeOptions.

2. Improved partitioning fairness for data sources that don’t implement IList<T>

3. Better performance of some queries over IList<T> or an array

4. Chunk partitioning size tuning – Chunk partitioning is the most common partitioning scheme for queries over data sources other that IList<T> and arrays (i.e. non-indexible data sources). The sizes of these chunks now grow as more and more chunks are accessed. This is to balance between the cases of a) queries with small data sets but expensive delegates in the queries and b) queries with large data sets but inexpensive delegates in the queries.

5. Removal of likely false sharing cases, which have had up to 6x improvements in some cases

Removed Seldom Used Operators

A few PLINQ-specific operators that don’t really have use cases or existed simply for performance, yet did not show gains over existing LINQ operators, have been removed.  Reduced surface area allows us to focus more on the important stuff for you and lowers the count of concepts you need to know.   If you don’t notice which ones, good!

I hope that you enjoy using the Beta and send us lots of feedback!

Topics
PLINQ

Author

0 comments

Discussion are closed.