Debugging, Profiling and Analyzing Parallel Applications

Any time a programming model is introduced, developers need robust tooling support for learning, writing, debugging and optimizing their code to make use of it. This is particularly true for parallel programming, which adds a set of new variables to the equation.

Visual Studio 2010 has made great strides in the parallel debugging experience. Many features are also available as add-ins for Visual Studio 2008. Here’s a brief tour of the parallel programming, debugging, and diagnostic features available in Visual Studio 2008 and upcoming in Visual Studio 2010.

Debugging

Although Visual Studio 2005 had a simple built-in debugger for MPI programs, it did not provide a full “F5” experience. The new add-in for Visual Studio 2008, which is also integrated into Visual Studio 2010, lets you select a cluster head node, choose how many cores you want, and press F5 to debug your MPI program.

Debugging MPI programs

In addition to the great core work that the debugger team has done, Allinea, a leader in parallel debugging technologies, has ported their environment to Visual Studio. Allinea’s add-in streamlines MPI-specific debugging even further, with rank-based context switching; group-wise step, pause, and run; a parallel stack view; and lamination. Below is Allinea’s MPI debugging environment:

Allinea

Service Oriented Architecture Debugging

One of the key new programming models introduced in Windows HPC Server 2008 was Cluster SOA, built on WCF with advanced scheduling and load balancing provided by HPC’s scheduler/broker. Until now, debugging Cluster SOA was limited to basic WCF/.NET-style debugging with no cluster integration. In Visual Studio 2010, an add-in for Cluster SOA enables the SOA Settings tab, allowing you to choose a head node, debug nodes and services, deploy runtime libraries, and clean up automatically. Here’s a peek at the new SOA debugger in Visual Studio 2010:

SOA Debugging

Profiling

Integrated MPI-aware profiling was not available in Windows Server HPC 1.0. With Windows HPC Server 2008, tools such as XPerf enabled MPI profiling as well as system-level profiling and troubleshooting. But even XPerf really didn’t know much about the details of MPI message traffic, and no message traffic viewers existed. Since then, Vampir, the premier MPI message traffic viewer, has been ported to Windows and fully integrated with ETW. Vampir allows you to troubleshoot message ordering and delays. Various open source HPC tools are available as well, such as JumpShot, a free Java-based MPI message viewer.

Often, the built-in VS Profiler can offer insight into performance issues. In Visual Studio 2010, this capability has been fully integrated with the HPC job scheduler to help analyze the behavior of a particular MPI rank or node. The Visual Studio MPI profiler shows line-level profile information, including a temperature view of execution, side by side with the source view:

Visual Studio MPI Profiler

The profiler also shows a comparison report across multiple runs or builds so you can easily see the effect of your changes.

Comparison Report
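
To make the idea of a comparison report concrete, here is a minimal Python sketch of what such a report boils down to: per-function timings from two runs, lined up and ranked by how much they changed. This is only an illustration, not how the Visual Studio profiler computes or stores its data. The function names and timings are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, float, float, float]  # (function, before_ms, after_ms, delta_ms)


def compare_runs(baseline: Dict[str, float], candidate: Dict[str, float]) -> List[Row]:
    """Line up per-function timings from two runs, largest change first."""
    rows: List[Row] = []
    for name in sorted(set(baseline) | set(candidate)):
        before = baseline.get(name, 0.0)  # function absent from a run counts as 0 ms
        after = candidate.get(name, 0.0)
        rows.append((name, before, after, after - before))
    # Rank by the size of the change, regardless of direction.
    rows.sort(key=lambda row: abs(row[3]), reverse=True)
    return rows


# Hypothetical timings in milliseconds for two builds of the same MPI program.
run_a = {"exchange_halo": 120.0, "compute_step": 300.0}
run_b = {"exchange_halo": 80.0, "compute_step": 310.0, "pack_buffers": 5.0}

for name, before, after, delta in compare_runs(run_a, run_b):
    print(f"{name:15s} {before:8.1f} {after:8.1f} {delta:+8.1f}")
```

Ranking by absolute change puts both regressions and improvements at the top, which is usually what you want to see first after a change.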

MPI Runtime Analysis

Beyond debuggers and profilers, sometimes you need specialized analysis tools to help with the complexities of large-scale parallel programs. HLRS/ZIH at Stuttgart, a leading institute in Germany, has ported Marmot, their dedicated MPI analysis tool, to Visual Studio 2008. Marmot can check the validity of parameters passed to MPI calls and detect irreproducibility, deadlocks, and incorrect management of resources. Below is Marmot in action:

Marmot
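
To give a feel for the deadlock class of problem, here is a minimal Python sketch of one textbook way to spot it: record which rank each blocked rank is waiting on, then look for a cycle. This is an illustration only and is not a description of Marmot's actual algorithm. The `waits_for` mapping and the function name are hypothetical.

```python
from typing import Dict, List, Optional


def find_deadlock_cycle(waits_for: Dict[int, int]) -> Optional[List[int]]:
    """Return the ranks forming a wait cycle, or None if no cycle exists.

    waits_for maps each blocked rank to the single rank it is waiting on
    (a simplifying assumption: one outstanding blocking call per rank).
    """
    for start in waits_for:
        path: List[int] = []
        seen_at: Dict[int, int] = {}
        rank = start
        # Follow the chain of waits until it ends or revisits a rank.
        while rank in waits_for and rank not in seen_at:
            seen_at[rank] = len(path)
            path.append(rank)
            rank = waits_for[rank]
        if rank in seen_at:
            return path[seen_at[rank]:]  # only the ranks inside the cycle
    return None


# Hypothetical case: ranks 0 and 1 each post a blocking receive from the
# other before sending, so neither can make progress.
print(find_deadlock_cycle({0: 1, 1: 0}))

# Rank 1 waits on rank 0, but rank 0 is not blocked, so there is no cycle.
print(find_deadlock_cycle({1: 0}))
```

The classic trigger is two ranks that each post a blocking receive from the other before either one sends, which is exactly the first case above.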

From Printf to Integrated Profiling and Debugging

In a world where printf-style debugging was the norm not long ago, state-of-the-art debugging and profiling tools have taken a major step forward.

From within Visual Studio, you can debug and profile native as well as high-performance MPI and Cluster SOA applications that scale from hundreds to thousands of cores. You can use XPerf and ETW to get a truly holistic view of the application in the context of the whole system. The new multi-core profiling and debugging tools introduced in Visual Studio 2010 can also be used effectively at the node level on a cluster.

Visual Studio is becoming a rich and productive environment for writing parallel programs of all types. To find out more about Windows HPC programming models, visit the Windows HPC Server Developer Resource Center. You can find a suite of samples that use various parallel programming models on the CodePlex Parallel Dwarfs site.

Namaste!