November 29th, 2007

Welcome to the Parallel Extensions team blog!

Stephen Toub - MSFT
Partner Software Engineer

Software is headed for a fundamental change.  Over the last 30 years, developers have relied on exponential growth in computing power in order to dream big.  Your cool new application is too slow today?  No problem, just wait two years and everyone will have computers that run twice as fast.  But as Herb Sutter wrote, “the free lunch is over” (if you haven’t read Herb’s excellent article, we recommend doing so).

Whereas the average PC clock speed increased more than 10x between 1993 and 1999, the average processor speed in the last four years hasn’t even doubled.  Instead, CPU manufacturers and designers are gently shifting the industry towards multi-core processors.  The average PC on the market today is a dual-core machine.  Next year, expect the average to be quad-core.  Two years after that, eight-core.  And so forth.  Unfortunately, most software out there today is inherently single-threaded and sequential in nature and will not take advantage of multiple processors, aside from the improved experience of multitasking between programs.  If, for example, in the next 10 years we experience a 100-fold improvement in CPU performance that comes entirely from additional cores, yet applications are still written without concurrency in mind so that they use just one processor, those applications will utilize at most 1% of the available computing power.

The key to performance improvement in the future is to write programs that naturally scale up to multiple processors.  Unfortunately, it is still very hard to write parallel algorithms that actually take advantage of such architectures, and it’s very hard to write concurrent code that scales dynamically (without recompilation) as more parallelism becomes available in the hardware.  In fact, most applications use just a single core and see no speed improvements when run on a multi-core machine.  Even those that use threads today are usually motivated to do so by responsiveness and I/O, rather than by better performance.  We need to write our programs in a new way.

That’s where we come in.  Welcome to the Parallel Extensions team blog.  As part of the Parallel Computing Platform group at Microsoft, it’s our team’s primary goal and responsibility to enable all of you developers out there using .NET to successfully build concurrent applications.  This includes significantly advancing the state of the art for building concurrent managed applications, and laying the groundwork needed for the CLR and the .NET Framework to be the runtime and library of choice for apps in a massively parallel world.

So, how are we doing this?  We’re starting in a few ways.  

In our first release of Parallel Extensions to the .NET Framework, we’re providing declarative data parallelism support through Parallel Language Integrated Query (PLINQ).  To understand what PLINQ is, you first have to understand what LINQ is.  In the .NET Framework 3.5 and Visual Studio 2008, Language Integrated Query (LINQ) gives C# and Visual Basic programmers query syntax in the languages, with the capability to retrieve information from a wide array of data source types (databases, XML, in-memory data, etc.) through both language syntax and library APIs.  At the library level, these operations are exposed through the .NET Standard Query Operators, such that any .NET language can take advantage of them.

At a high level, PLINQ is an implementation of the Standard Query Operators that uses clever parallel execution techniques underneath the simple LINQ programming model.  PLINQ uses query analysis to determine what sort of algorithms and degrees of parallelism are possible and appropriate, and classic data parallelism techniques to execute them—such as filtering, mapping, reductions, loop tiling, sorts, and more—so that you don’t have to.  For a great introduction to PLINQ, see the PLINQ article in the October 2007 issue of MSDN Magazine.  Note that while this article is written about a prerelease version of PLINQ, very little at the API level should differ between now and when we ship, since most of PLINQ must adhere to the standard LINQ APIs.  For those of you interested in reading about PLINQ in other languages, you can read the same article in English, Spanish, French, German, Italian, Russian, Portuguese, Korean, Japanese, Simplified Chinese, and Traditional Chinese (thanks to MSDN Magazine for providing these translations).
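To make the "same query, parallel execution" idea concrete, here is a minimal sketch rather than a definitive example.  It assumes the API shape that later shipped in the .NET Framework 4, where a LINQ to Objects query opts into PLINQ by calling AsParallel(); the CTP may differ in details.  The class name PlinqSketch and the helper IsPrime are hypothetical names invented for illustration.

```csharp
using System;
using System.Linq;

static class PlinqSketch
{
    // Hypothetical helper: a deliberately simple CPU-bound predicate.
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; (long)d * d <= n; d++)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Ordinary sequential LINQ to Objects query.
    public static int CountPrimesSequential(int upperBound)
    {
        return Enumerable.Range(0, upperBound)
                         .Where(n => IsPrime(n))
                         .Count();
    }

    // The same query, opted into parallel execution with AsParallel()
    // (assumption: the .NET Framework 4 API surface).
    public static int CountPrimesParallel(int upperBound)
    {
        return Enumerable.Range(0, upperBound)
                         .AsParallel()
                         .Where(n => IsPrime(n))
                         .Count();
    }

    static void Main()
    {
        int sequential = CountPrimesSequential(100000);
        int parallel = CountPrimesParallel(100000);
        Console.WriteLine("Counts agree: " + (sequential == parallel));
    }
}
```

The only difference between the two queries is the AsParallel() call; the filter and the count are written exactly as they would be in sequential LINQ.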

Next, but just as important, we are providing support for imperative data and task parallelism with the Task Parallel Library (TPL).  TPL makes it much easier to write managed code that can automatically use multiple processors.  Whereas PLINQ is focused on data parallelism through declarative queries, TPL is focused on data and task parallelism expressed in a more imperative style of programming, suitable for cases where problems cannot be expressed as queries.   Using the library, you can conveniently express potential parallelism in existing sequential code, where the exposed parallel tasks will dynamically scale to run on all available processors.

For example, consider a bit of code that performs matrix multiplication:

void MultiplyMatrices(int size, double[,] m1, double[,] m2, double[,] result)
{
    for (int i = 0; i < size; i++) {
        for (int j = 0; j < size; j++) {
            result[i, j] = 0;
            for (int k = 0; k < size; k++) {
                result[i, j] += m1[i, k] * m2[k, j];
            }
        }
    }
}

Even for huge matrices, on a multiprocessor computer, this will only utilize one processor.  Using TPL, we can scale it to use all available cores:

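void MultiplyMatrices(int size, double[,] m1, double[,] m2, double[,] result)
{
    // Each iteration of the outer loop writes only to row i of result,
    // so the rows can be computed in parallel without locking.
    Parallel.For(0, size, delegate(int i) {
        for (int j = 0; j < size; j++) {
            result[i, j] = 0;
            for (int k = 0; k < size; k++) {
                result[i, j] += m1[i, k] * m2[k, j];
            }
        }
    });
}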

Notice how little changed between the two code snippets, and yet on a quad-core box, assuming a set of large enough matrices, the latter will run almost four times faster, and on an eight-core machine, almost eight times faster.  Of course, we fully realize that only a subset of the problems out there will scale perfectly with such a minimal change to using one of the Parallel constructs (Do, For, and ForEach), which is why TPL also provides a plethora of functionality focused on individual tasks.  In fact, the Parallel constructs are built on top of this underlying, public infrastructure.
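One way to convince yourself that the parallel version is safe is to run both versions on the same inputs and compare the results.  The following is a sketch under stated assumptions, not part of the library: it uses Parallel.For from the System.Threading.Tasks namespace as it later shipped in the .NET Framework 4 (the CTP may place it elsewhere), and the names MatrixCheck, Fill, and SameValues are hypothetical helpers invented for this check.

```csharp
using System;
using System.Threading.Tasks;

static class MatrixCheck
{
    // Sequential reference implementation.
    public static double[,] MultiplySequential(int size, double[,] m1, double[,] m2)
    {
        var result = new double[size, size];
        for (int i = 0; i < size; i++)
            for (int j = 0; j < size; j++)
            {
                double sum = 0;
                for (int k = 0; k < size; k++) sum += m1[i, k] * m2[k, j];
                result[i, j] = sum;
            }
        return result;
    }

    // Parallel version: only the outer loop is parallelized, so each
    // iteration owns one row of the result and no locking is needed.
    public static double[,] MultiplyParallel(int size, double[,] m1, double[,] m2)
    {
        var result = new double[size, size];
        Parallel.For(0, size, i =>
        {
            for (int j = 0; j < size; j++)
            {
                double sum = 0;
                for (int k = 0; k < size; k++) sum += m1[i, k] * m2[k, j];
                result[i, j] = sum;
            }
        });
        return result;
    }

    // Hypothetical helper: fills a matrix with small deterministic values.
    public static double[,] Fill(int size, int seed)
    {
        var m = new double[size, size];
        for (int i = 0; i < size; i++)
            for (int j = 0; j < size; j++)
                m[i, j] = (i * size + j + seed) % 7;
        return m;
    }

    // Hypothetical helper: element-by-element comparison of two square matrices.
    public static bool SameValues(double[,] a, double[,] b)
    {
        if (a.GetLength(0) != b.GetLength(0) || a.GetLength(1) != b.GetLength(1)) return false;
        for (int i = 0; i < a.GetLength(0); i++)
            for (int j = 0; j < a.GetLength(1); j++)
                if (a[i, j] != b[i, j]) return false;
        return true;
    }

    static void Main()
    {
        const int size = 200;
        double[,] m1 = Fill(size, 1), m2 = Fill(size, 2);
        bool same = SameValues(MultiplySequential(size, m1, m2), MultiplyParallel(size, m1, m2));
        Console.WriteLine("Results match: " + same);
    }
}
```

Because the innermost loop still runs sequentially in both versions, every element is accumulated in the same order, so an exact comparison is appropriate here.  Parallelizing a reduction itself would change the order of floating-point additions and would call for a tolerance instead.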

TPL provides a lot of the functionality developers have been asking for over the years from the .NET ThreadPool, such as separate, isolated thread pools, waiting support, and cancellation support.  It also uses highly scalable work-stealing algorithms that have been proven to better suit higher degrees of parallelism in hardware.  For more information on TPL, read the TPL article in the October 2007 issue of MSDN Magazine, the same issue that has the PLINQ article.  And as with the PLINQ article, thanks to the MSDN Magazine team, you can read it online in English, Spanish, French, German, Italian, Russian, Portuguese, Korean, Japanese, Simplified Chinese, and Traditional Chinese.  Note that as with most prerelease versions, the TPL API will change by the time it’s released in its final version, but the core functionality it provides will be very similar to what’s described in the MSDN Magazine article.  The first community technology preview (CTP) of Parallel Extensions already includes changes since the APIs discussed in the article, and we’ll follow up this post with a list of differences.
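To give a feel for what waiting and cancellation support look like at the task level, here is a rough sketch.  It is written against the task types as they later shipped in the .NET Framework 4 (Task, Task.Factory.StartNew, Task.WaitAll, and CancellationToken), which is an assumption: the CTP API differs, as noted above.  The names TaskSketch, SumRange, and SumInTwoTasks are hypothetical and exist only for illustration, and isolated thread pools are not shown.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class TaskSketch
{
    // Hypothetical unit of work: sums the integers in [from, to),
    // checking for cancellation on every iteration.
    public static long SumRange(int from, int to, CancellationToken token)
    {
        long sum = 0;
        for (int i = from; i < to; i++)
        {
            token.ThrowIfCancellationRequested();
            sum += i;
        }
        return sum;
    }

    // Splits the work into two tasks, waits for both, and combines the results.
    public static long SumInTwoTasks(int n, CancellationToken token)
    {
        int mid = n / 2;
        Task<long> lower = Task.Factory.StartNew(() => SumRange(0, mid, token), token);
        Task<long> upper = Task.Factory.StartNew(() => SumRange(mid, n, token), token);

        // Waiting support: block until both tasks have finished.
        Task.WaitAll(lower, upper);
        return lower.Result + upper.Result;
    }

    static void Main()
    {
        using (var cts = new CancellationTokenSource())
        {
            Console.WriteLine(SumInTwoTasks(1000, cts.Token));
        }
    }
}
```

If the token is canceled before or during the work, the wait throws an AggregateException rather than returning a partial sum, so callers that expect cancellation should catch it.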

The question I imagine to be on many of your minds right now is likely, “when do we get bits?”  I’m happy to say that the answer is, “now!”  This morning, we’ve released the first CTP of Parallel Extensions to the .NET Framework, and you can download the bits from MSDN.  This is a very early look at what we’re working on, and we encourage you to try it out and provide as much feedback as you possibly can.  We’re providing this CTP early on because we need your feedback to know if this is the right solution, so please help us.  Post to our Microsoft Connect site, comment on our blog, and so on.  We look forward to hearing from you!

The first CTP contains previews of PLINQ and TPL.  Future releases will include additional functionality, such as scalable thread-safe collections, coordination and synchronization primitives, and more. This is just the beginning of great things to come.

Enjoy!
The Parallel Extensions Team

Stephen Toub is a developer on the .NET team at Microsoft.
