Visual Studio 2022 17.11 brings new optimizations, intrinsics, features, and improvements to the MSVC backend. Check out the highlights below:
- Performance improvements and additional functionality for all architectures:
  - The SLP vectorizer can now recognize when vectors need to be permuted and when elements of a vector are defined by different operations.
  - Updated corruption handling in DIA.
  - Addressed bugs for debug records emitted during /LTCG:INCREMENTAL.
  - /OPT:ICF is now more C++ compliant.
  - Fixed race conditions in the linker while emitting debug information.
  - Added a new dead code elimination phase to reduce binary bloat.
  - Added a strength reduction phase to replace expensive instructions with equivalent cheaper instructions in the lexical loop optimizer.
  - Improved /guard:cf compilation speed for many try-catch regions.
  - Process clang-cl OBJ files faster.
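To illustrate the SLP vectorizer item above, here is a minimal sketch of the kind of straight-line code it describes: four adjacent stores whose inputs are read in a shuffled order and whose lanes use different operations. The function name `mixed_ops` is hypothetical, and whether MSVC actually vectorizes this exact function depends on the compiler version and flags, so treat it as an example of the pattern rather than a guaranteed outcome.

```cpp
#include <array>

// Hypothetical kernel: each output lane is computed independently, but the
// lanes read the inputs in a permuted order and alternate between + and -.
std::array<float, 4> mixed_ops(const std::array<float, 4>& a,
                               const std::array<float, 4>& b) {
    std::array<float, 4> out{};
    out[0] = a[1] + b[1];  // lane 0 reads element 1 (permuted input)
    out[1] = a[0] - b[0];  // lane 1 reads element 0 and uses a different operation
    out[2] = a[3] + b[3];  // same swapped pattern for the upper pair
    out[3] = a[2] - b[2];
    return out;
}
```

Comparing the output of a function like this between an optimized and an unoptimized build is one simple way to check that a vectorized result still matches the scalar one.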
- Performance improvements and additional functionality for ARM64:
  - Fixed pointer parameter alignment during conversion of by-value parameters to by-reference.
  - Support popcnt instructions, __popcnt, __popcnt16, and __popcnt64, thanks to our friends at ARM.
  - The vectorizer now recognizes and uses ARM64 intrinsics when possible.
  - Generate better instructions for VCREATE using FMOV, thanks to our friends at ARM.
  - Add RBIT intrinsics, _bitrev and _bitrev64.
  - Support expanding memcmp when comparison length is constant, thanks to our friends at ARM.
  - Add support for Load-Acquire RCpc instructions v2 (FEAT_LRCPC2), /feature:rcpc2.
  - Add support for CMP, XZR, and Xm in disassembler, thanks to our friends at ARM.
  - Generate better code for bitfield selection and assignment, thanks to our friends at ARM.
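For readers unfamiliar with the popcnt and RBIT intrinsics listed above, the sketch below shows the operations in portable C++. It does not call the MSVC intrinsics. The `ref_` functions are hypothetical stand-ins, and the sketch assumes that the popcnt intrinsics count set bits and that `_bitrev` reverses the bit order of a 32-bit value, as the RBIT instruction does.

```cpp
#include <cstdint>

// Portable reference for a population count: the number of set bits in x.
unsigned ref_popcount64(std::uint64_t x) {
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;  // clear the lowest set bit
        ++count;
    }
    return count;
}

// Portable reference for a 32-bit bit reversal: bit 0 swaps with bit 31,
// bit 1 with bit 30, and so on.
std::uint32_t ref_bitrev32(std::uint32_t x) {
    std::uint32_t result = 0;
    for (int i = 0; i < 32; ++i) {
        result = (result << 1) | (x & 1u);  // move the lowest bit of x into result
        x >>= 1;
    }
    return result;
}
```

Reference functions like these are handy as a scalar baseline when checking that code using the hardware-backed intrinsics produces the same values.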
- Improvements to ARM64EC:
  - Considers hybrid-patchable functions during long-branch optimizations (/OPT:LBR).
  - Now emits virtual functions with __declspec(hybrid_patchable).
  - Fixed imports with fast-forward sequences.
  - Thunk generation in C++ files, using _Arm64XGenerateThunk(), does not require a prototype. Still requires #include <intrin.h>.
  - Fixed pointer parameter alignment during conversion of by-value parameters to by-reference.
  - Sped up delay loading a DLL by enhancing page protection.
- Performance improvements and additional functionality for x86 and x64:
  - Optimized vector reduction for E-Core CPUs, thanks to our friends at Intel.
  - Fix float conversions on x86 for standard compliance, thanks to our friends at Intel.
  - More loops get vectorized, thanks to our friends at AMD.
  - New functionality on x64:
    - Optimize FMA generation for blended code, removing the /favor:ATOM restriction, thanks to our friends at Intel.
    - Add support for USER_MSR intrinsics, urdmsr and uwrmsr, thanks to our friends at Intel.
    - Add instructions in assembler and disassembler, thanks to our friends at Intel:
      - Flexible Return and Event Delivery (FRED), which includes ERETS and ERETU
      - LKGS, which is required for FRED
    - Enable vector loop unroller to be more aggressive, as required, thanks to our friends at AMD.
    - Add support for saturating add and subtract intrinsics:
      - _sat_add_i8
      - _sat_add_i16
      - _sat_add_i32
      - _sat_add_i64
      - _sat_add_u8
      - _sat_add_u16
      - _sat_add_u32
      - _sat_add_u64
      - _sat_sub_i8
      - _sat_sub_i16
      - _sat_sub_i32
      - _sat_sub_i64
      - _sat_sub_u8
      - _sat_sub_u16
      - _sat_sub_u32
      - _sat_sub_u64
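To make the saturating add and subtract intrinsics more concrete, here is a minimal portable sketch of the arithmetic for 8-bit values. It does not call the MSVC intrinsics. The `ref_` functions are hypothetical stand-ins, and the sketch assumes the intrinsics clamp results to the range of their operand type, as the term "saturating" usually implies.

```cpp
#include <algorithm>
#include <cstdint>

// Signed 8-bit saturating add: compute in a wider type, then clamp to [-128, 127].
std::int8_t ref_sat_add_i8(std::int8_t a, std::int8_t b) {
    const int wide = static_cast<int>(a) + static_cast<int>(b);
    return static_cast<std::int8_t>(std::max(-128, std::min(127, wide)));
}

// Signed 8-bit saturating subtract: same idea, clamping the widened difference.
std::int8_t ref_sat_sub_i8(std::int8_t a, std::int8_t b) {
    const int wide = static_cast<int>(a) - static_cast<int>(b);
    return static_cast<std::int8_t>(std::max(-128, std::min(127, wide)));
}

// Unsigned 8-bit saturating add: results above 255 stick at 255 instead of wrapping.
std::uint8_t ref_sat_add_u8(std::uint8_t a, std::uint8_t b) {
    const int wide = static_cast<int>(a) + static_cast<int>(b);
    return static_cast<std::uint8_t>(std::min(255, wide));
}

// Unsigned 8-bit saturating subtract: results below 0 stick at 0 instead of wrapping.
std::uint8_t ref_sat_sub_u8(std::uint8_t a, std::uint8_t b) {
    return a > b ? static_cast<std::uint8_t>(a - b) : std::uint8_t{0};
}
```

For example, `ref_sat_add_i8(100, 100)` yields 127 rather than wrapping to a negative value, and `ref_sat_sub_u8(10, 20)` yields 0. The wider variants in the list presumably follow the same pattern with their own type limits.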
Do you want to experience the new improvements in the C++ backend? Please download the latest Visual Studio 2022 and give it a try! Any feedback is welcome. We can be reached via the comments below, Developer Community, X (@VisualC), or email at visualcpp@microsoft.com.
Stay tuned for more information on updates to the latest Visual Studio.
I noticed that “The SLP vectorizer can now recognize when vectors need to be permuted and when elements of a vector are defined by different operations.” It’s interesting to see these enhancements, but I’ve encountered an issue in version 11.10 where it optimizes a portion of my code, leading to a performance drop by a factor of five. I detected that this is due to the SLP optimization. Could you please advise if there is a way to verify that the optimized result is correct?
Thank you very much for your help.