Improved OpenMP Support for C++ in Visual Studio

Bran

As devices with multiple cores and processors became ubiquitous, programming languages adapted to provide developers with control over how tasks are divided across processors. The OpenMP application program interface for C, C++, and Fortran was originally developed in the 1990s for this purpose, and today the standard continues to evolve to support new scenarios, such as off-loading to additional devices and providing more fine-grained control over which threads execute which tasks.

Microsoft Visual Studio has supported the OpenMP 2.0 standard since 2005. In the initial release of Visual Studio 2019 we added the -openmp:experimental switch to enable minimal support for the OpenMP SIMD directive first introduced in the OpenMP 4.0 standard.
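As a rough illustration of the SIMD directive mentioned above, the sketch below asks the compiler to vectorize a simple loop. The function name scale_in_place is hypothetical, and the example assumes a compiler that either honors #pragma omp simd (for MSVC, under -openmp:experimental) or ignores it and runs the loop as ordinary scalar code.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: multiplies every element of `values` by `factor`.
// The simd directive requests vectorization of the loop; a compiler that
// does not recognize the pragma simply compiles a scalar loop.
void scale_in_place(std::vector<float>& values, float factor) {
    // A signed int index is used, so the element count must fit in an int.
    assert(values.size() <=
           static_cast<std::size_t>(std::numeric_limits<int>::max()));
    const int n = static_cast<int>(values.size());
    float* data = values.data();
#pragma omp simd
    for (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}
```

The result is the same whether or not the loop is vectorized; only the generated code differs.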

Our OpenMP Plans

Starting with Visual Studio 2019 version 16.9 we have begun adding experimental support for newer versions of the OpenMP standard in a more systematic way. As a first step, we added the option to generate code compatible with LLVM’s OpenMP runtime library (libomp) on the x64 architecture. Going forward, support for additional OpenMP features will leverage LLVM’s OpenMP runtime. When we find issues in the LLVM OpenMP runtime on Windows, we will fix them in the version of libomp we ship and contribute fixes back to the LLVM community once they have been tested.

Our next step for OpenMP support will be to add the features introduced in the OpenMP 3.1 standard on the x86 and arm64 architectures alongside x64. After that, we will add support for the pragmas and clauses added in the OpenMP 4.5 standard that do not involve offloading. Which features we add after that will depend on user feedback. We would love to hear which specific OpenMP features you would like to see, so we can prioritize them.

New -openmp:llvm switch

A program can be compiled to target the LLVM OpenMP runtime by using the new experimental CL switch -openmp:llvm instead of -openmp. In Visual Studio 2019 version 16.9 the -openmp:llvm switch only works on the x64 architecture. The new switch currently supports all the same OpenMP 2.0 directives as -openmp, and it also supports unsigned integer indices in parallel for loops as specified by the OpenMP 3.0 standard. Support for more directives will be added in future releases. The -openmp:llvm switch is compatible with all the SIMD directives supported by the -openmp:experimental switch.
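To make the switch concrete, here is a minimal sketch of a program that could be built with it. The file name team_size.cpp and the function team_size are hypothetical, and /EHsc is simply a common choice for C++ builds rather than something the switch requires. The _OPENMP guards let the same code build without OpenMP, in which case it reports a single thread.

```cpp
// Possible build command from an x64 Native Tools command prompt,
// assuming the file is saved as team_size.cpp:
//   cl /EHsc -openmp:llvm team_size.cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Returns the number of threads in the team that ran the parallel region,
// or 1 when the code is compiled without OpenMP support.
int team_size() {
    int threads = 1;
#pragma omp parallel
    {
#ifdef _OPENMP
        // Exactly one thread records the team size; the implicit barrier at
        // the end of the single construct makes the write visible to all.
#pragma omp single
        threads = omp_get_num_threads();
#endif
    }
    assert(threads >= 1);
    return threads;
}
```

Calling team_size() from main and printing the result is a quick way to confirm that the executable found the libomp DLL and actually ran with more than one thread.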

Compiling an executable with the -openmp:llvm switch automatically adds a dynamic link to the appropriate libomp DLL. In order for the executable to run, it will need access to either libomp140d.x86_64.dll (if compiled with /DEBUG) or libomp140.x86_64.dll. These DLLs can be found in the Visual Studio installation directory under the Program Files or Program Files (x86) directory at VC\Redist\MSVC\<version>\debug_nonredist\x64\Microsoft.VC142.OpenMP.LLVM and will be automatically included in the PATH if the executable is run from an x64 NativeTools command prompt.

As the -openmp:llvm switch is still experimental, both the release and debug versions of the runtime still have asserts enabled, which makes detecting incorrect behavior easier but will affect performance. The DLLs were compiled with CMAKE_BUILD_TYPE=RelWithDebInfo and LLVM_ENABLE_ASSERTIONS=ON. Future versions of the libomp DLLs may not be backwards compatible and the current version of these DLLs is not redistributable.

The -openmp:llvm switch is not compatible with /clr or /ZW.

Improvements with -openmp:llvm

Using the -openmp:llvm switch enables a few correctness fixes. In Visual Studio 2019 version 16.9 Preview 3 the lastprivate clause in #pragma omp sections is now correctly handled. When used with sections, the lastprivate clause guarantees that on exiting a sections block the variables listed in the clause will be set equal to the private version of that variable from the lexically last section. For example, after executing the following code the value of x will be 6.

int x = 0;
#pragma omp parallel sections lastprivate(x)
{
   #pragma omp section
   x = 4;
   #pragma omp section
   x = 6;
}
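The same snippet can be wrapped in a function so the result is easy to check. This is only a sketch: the names value_after_sections and check_lastprivate are made up for illustration.

```cpp
#include <cassert>

// Returns the value x holds after the sections construct completes.
int value_after_sections() {
    int x = 0;
#pragma omp parallel sections lastprivate(x)
    {
#pragma omp section
        x = 4;
#pragma omp section
        x = 6;  // lexically last section: its private x is copied back out
    }
    return x;
}

// Small self-check a caller could run after building with OpenMP enabled.
void check_lastprivate() {
    assert(value_after_sections() == 6);
}
```

Without OpenMP the pragmas are ignored and the two assignments simply run in order, so the function returns 6 in that case as well.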

Visual Studio 2019 version 16.9 Preview 4 also includes fixes to the optimizer to correctly handle OpenMP constructs. MSVC will now avoid moving writes across an implicit or explicit flush boundary. Take the following code using #pragma omp flush as an example:

x = 7;
#pragma omp flush
if (omp_get_thread_num() == 0) {
    x = 10;
}

In some cases, previous versions of the compiler could incorrectly optimize away the potential double write to x by changing this code to:

#pragma omp flush
x = (omp_get_thread_num() == 0) ? 10 : 7;

However, this optimization does not respect the ordering guaranteed by the #pragma omp flush. In the original code, omp_get_thread_num() returns 0 for exactly one thread in the team, so only that thread writes to x after the flush point, and x ends up as 10. After the optimization, every thread writes to x after the flush point, which creates a race condition, so the optimization was not legal.

The optimizer will also properly recognize that even a variable local to a function can be changed by other threads inside an OpenMP parallel region. For example, in the following code the value of shared in the x > shared test cannot be replaced with -1 because another thread could have written to shared since the initial assignment:

int shared = -1;
#pragma omp parallel
{
    int x = omp_get_thread_num();
    #pragma omp critical
    {
        if (x > shared) {
            shared = x;
        }
    }
}
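As a self-contained variant of this pattern, the sketch below returns the highest thread id seen in the team. The function name highest_thread_id and the non-OpenMP fallback for omp_get_thread_num are assumptions made for illustration. The thread id is kept as a signed int so that the comparison against -1 is a signed comparison.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#else
// Fallback for builds without OpenMP: a single thread with id 0.
static int omp_get_thread_num() { return 0; }
#endif

// Returns the largest thread id observed inside the parallel region.
int highest_thread_id() {
    int shared = -1;  // visible to every thread in the team
#pragma omp parallel
    {
        const int id = omp_get_thread_num();
        // The critical section serializes the read-compare-write, so the
        // value read from `shared` may already have been updated by
        // another thread and must be reloaded rather than assumed to be -1.
#pragma omp critical
        {
            if (id > shared) {
                shared = id;
            }
        }
    }
    assert(shared >= 0);
    return shared;
}
```

With a team of N threads the function should return N - 1. If the compiler wrongly folded shared to -1 inside the test, the result would instead depend on which thread entered the critical section last.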

New Features with -openmp:llvm

In addition to correctness fixes, the new -openmp:llvm switch already supports a few features added in the OpenMP 3.0 standard. Parallel for loops may now use unsigned integers as indices. Limited support for #pragma omp task has been added, but clauses on the task pragma are not guaranteed to work. Due to the many limitations in #pragma omp task at this time, the pragma is only supported under the -openmp:experimental switch.
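For instance, a loop like the one below uses an unsigned index, which the OpenMP 2.0 rules reject but OpenMP 3.0 permits. This is a sketch only; the names sum_of_squares and check_sum_of_squares are hypothetical.

```cpp
#include <cassert>

// Sums i * i for i in [0, count) using an unsigned loop index.
unsigned long long sum_of_squares(unsigned int count) {
    unsigned long long total = 0;
    // Each thread accumulates a private partial sum; the reduction clause
    // combines the partial sums when the loop finishes.
#pragma omp parallel for reduction(+ : total)
    for (unsigned int i = 0; i < count; ++i) {
        total += static_cast<unsigned long long>(i) * i;
    }
    return total;
}

// Small self-check: 0 + 1 + 4 + ... + 81 = 285.
void check_sum_of_squares() {
    assert(sum_of_squares(10) == 285ULL);
}
```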

Feedback

We encourage you to try out this new feature in Visual Studio 2019 version 16.9 Preview. As always, we welcome your feedback. If you encounter a correctness issue in code generated with the -openmp:llvm switch or bugs in the libomp140 DLLs shipped with Visual Studio, please let us know. We can be reached via the comments below, via Twitter (@visualc), or via Developer Community.

10 comments

  • Stefan Jokisch

    The most important feature for me would be an equivalent to Intel’s kmp_set_blocktime() extension. This function sets the amount of time worker threads keep waiting (“spinning”) for more work before they go to sleep.

    This sounds like a minor thing, but it’s vital for processing video images in real time. Since Visual C++ 2010 all CPU time in-between frames is consumed by worker threads waiting for work. This is bad since I still need CPU time for managing the user interface and other tasks.

    To this day I link vcomp.lib from C++ 2005. I fear this will no longer be possible once modern OMP features have been implemented.

  • Eske Hjalmer Bergishagen Christiansen

    Could be nice to also get GPU offloading into this.
    This fails to compile with ‘target’: expected an OpenMP directive name.
    Was hoping for a bit more. 🙁

    #include <cmath>
    #include <iostream>
    
    int main() {
        int n = 1000000000;
        double total = 0;
        #pragma omp target teams distribute parallel for map(tofrom: total) map(to: n) reduction(+:total)
        for (int i = 0; i < n; ++i) {
            total += exp(sin(3.14 * (double)i / 123.45));
        }
        std::cout << "total is " << total << '\n';
    }
    
    • Bran Hagger (Microsoft employee)

      We wish we had more to offer. 🙂

      The target directive was added to the OpenMP standard in version 4.0. We plan to add support for features added in OpenMP 4.0 in the future, but for now we are still working on adding support for the new features added in OpenMP 3.0 and 3.1.

  • simmse

    It is highly important to fully support all of the simd directives and the user-defined reductions indicated in the blog from almost two years ago. That comment also notes that the easiest way to add new features is to fully support OpenMP 4.5+. Since the latest OpenMP standard is now 5.1 as of November 13, 2020, that is the new feature request list. Doing so will promote industry use and increase usage of the Visual Studio C++ compiler. It would be highly beneficial to support the C++ attributes too (contained in the OpenMP 5.1 standard). There are software engineers who refuse to use OpenMP because of the #pragma omp directives.

    • Bran Hagger (Microsoft employee)

      Thank you for the feedback. We are working to meet more recent versions of the OpenMP standard, but we have a lot of catching up to do and it will take some time to add full support for the most recent version. Knowing which features you use the most helps us implement those features sooner rather than later.

      Based on your comment it sounds like the reduction clause on simd directives and the OpenMP 5.1 standard C++ attribute versions of omp directives are the most important features for you. Is that accurate?