June 7th, 2022

MSVC Backend Updates in Visual Studio 2022 version 17.2

Chris Pulido
Software Engineer

In Visual Studio 2022 version 17.2 we have continued to improve the C++ backend with new features, new and improved optimizations, build throughput improvements, and better security. Here is a list of improvements for you to review.

  • OpenMP: The task directive as defined by OpenMP 3.1 is supported for -openmp:llvm, including all the clauses. Note that the compiler does not yet support `task` clauses added in later versions of OpenMP. See more details in OpenMP Task Support for C++ in Visual Studio.
  • Implemented Intel intrinsic functions for the AVX512-FP16 instruction set extension. More information about these functions can be found on the Intel Intrinsics Guide.
  • Implemented Intel intrinsic functions _castf32_u32, _castf64_u64, _castu32_f32, and _castu64_f64 to cast between floating point values and integer values without conversion on x64 and x86. More information about these functions can be found on the Intel Intrinsics Guide.
  • New ARM64 compiler flags: /Zc:arm64-aliased-neon-types- and /Zc:arm64-aliased-neon-types. When you pass /Zc:arm64-aliased-neon-types- to cl.exe, the compiler will treat NEON intrinsic types as distinct types for ARM64 as defined by the Procedure Call Standard for the Arm 64-bit Architecture, which is consistent with Clang and GCC. This flag is opt-in, so ARM64 NEON intrinsic code that compiled with previous versions of MSVC will still compile when you upgrade. /Zc:arm64-aliased-neon-types (without the minus sign at the end) is the default behavior.
    • For example, consider two function declarations, void foo(float32x4_t) and void foo(int32x4_t). By default, MSVC considers these two the same declaration, and attempting to define them both would lead to a multiple definition error. With /Zc:arm64-aliased-neon-types-, MSVC will treat them as two distinct overloads, as Clang and GCC do.
  • New ARM64 compiler flags: /arch:armv8.0 and /arch:armv8.1. These new flags allow the compiler to generate instructions that were introduced and required by the specified architecture extension. /arch:armv8.0 is the current default behavior and is the same as if you didn’t specify it. In 17.2, /arch:armv8.1 allows the _Interlocked* intrinsic functions to use the appropriate atomic instruction that was introduced with the ARMv8.1 extension, FEAT_LSE.
  • New and improved optimizations
    • The C standard library functions log2 and log2f have been implemented as compiler intrinsic functions on x64 and ARM64. This allows the compiler to optimize calls to log2 and log2f under /fp:fast on x64 and ARM64.
    • Improved auto-vectorizer loop recognition. The auto-vectorizer now recognizes the average pattern and more cases of decrementing induction variables.
    • More peephole optimizations for multiple targets.
    • Improved load/store pairing on ARM64.
  • ARM64EC
    • Compiler flags incompatible with the /arm64EC flag are now rejected. This includes all CLR flags, /Gy-, and /Gw-.
    • Added the /MACHINE:ARM64EC flag to link.exe, and removed it from lib.exe. For lib.exe, you should specify /MACHINE:ARM64X.
    • When /arm64EC is passed to cl.exe and cl.exe also invokes link.exe, /MACHINE:ARM64EC will be passed by default to link.exe.
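To make the OpenMP item above more concrete, here is a minimal sketch of the task directive using only constructs that exist in OpenMP 3.1 (task, taskwait, and the shared, firstprivate, and if clauses). The function names fib_task and fib and the cutoff of 20 are illustrative choices, not something taken from the compiler or its documentation. The assumption is that the file is built with -openmp:llvm; without OpenMP enabled, the pragmas are ignored and the code runs serially with the same result.

```cpp
#include <cstdint>

// Recursive Fibonacci in which each recursive call may become an OpenMP task.
// The if clause (hypothetical cutoff of 20) asks the runtime to execute small
// subproblems immediately instead of deferring them as separate tasks.
std::uint64_t fib_task(int n) {
    if (n < 2) {
        return static_cast<std::uint64_t>(n);
    }
    std::uint64_t x = 0;
    std::uint64_t y = 0;

    #pragma omp task shared(x) firstprivate(n) if(n > 20)
    x = fib_task(n - 1);

    #pragma omp task shared(y) firstprivate(n) if(n > 20)
    y = fib_task(n - 2);

    // Wait for both child tasks before combining their results.
    #pragma omp taskwait
    return x + y;
}

// Entry point: one thread of the parallel region creates the root task,
// and the other threads in the team pick up the tasks it spawns.
std::uint64_t fib(int n) {
    std::uint64_t result = 0;
    #pragma omp parallel
    {
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

Calling fib(25) from ordinary serial code is enough to exercise the tasking runtime; the parallel region lives inside fib itself.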

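As a rough illustration of the auto-vectorizer item above, the two loops below show the kind of source shapes it appears to describe. This is a sketch under assumptions: the post does not show the exact patterns the compiler matches, so treating "average pattern" as the rounding average (a + b + 1) >> 1 is a guess, and the function names average_u8 and scale_down are made up for this example. Whether a given build actually vectorizes these loops is best checked in the generated assembly or with the vectorizer report option /Qvec-report:2.

```cpp
#include <cstddef>
#include <cstdint>

// Rounding average of two byte arrays. The sum is computed in int (via
// integer promotion), so it cannot overflow before the shift.
void average_u8(const std::uint8_t* a, const std::uint8_t* b,
                std::uint8_t* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<std::uint8_t>((a[i] + b[i] + 1) >> 1);
    }
}

// A loop whose induction variable counts down from n - 1 to 0.
void scale_down(float* data, int n, float factor) {
    for (int i = n - 1; i >= 0; --i) {
        data[i] *= factor;
    }
}
```
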
Do you want to experience the new improvements of the C++ backend? Please download the latest Visual Studio 2022 and give it a try! Any feedback is welcome. We can be reached via the comments below, Developer Community, and Twitter (@VisualC).

Category
C++

Author

Chris Pulido
Software Engineer

Software engineer in the backend team

1 comment


  • Roger B

    Thanks for the update, it’s been a long time coming.

    That said, after reading a recent reddit thread about this very topic, I hope the above is just a start of what’s to come.

    I’d hope that as backend improvements are committed, said PR’s would have evidence of the improvements in their descriptions. Descriptions that could perhaps just be copy/pasted into an update like this in the future.

    Benchmark timings, assembly diffs of before and after, examples of code size improvements, examples of how the new auto-vectorization can recognize a given loop etc. Those would be very useful to see here. The C# guys, c freakin sharp, are eating much better than us over here still.
