With Visual Studio 2022 version 17.4 Preview 3, we’ve significantly increased the number of situations where we do copy or move elision and given users more control over whether these transformations are enabled.
What are copy and move elision?
When the return keyword in a C++ function is followed by an expression of non-primitive type, the execution of that return statement copies the result of the expression into the return slot of the calling function. To do this, the copy or move constructor of the non-primitive type is called. Then, as part of exiting the function, destructors for function-local variables are called, likely including any variables named in the expression following the return keyword.
The C++ specification allows the compiler to construct the returned object directly in the return slot of the calling function, eliding the copy or move constructor executed as part of the return. Unlike most other optimizations, this transformation is allowed to have an observable effect on the program’s output – namely, the copy or move constructor and associated destructor are called one less time.
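To make the observable effect concrete, here is a minimal sketch that counts special member function calls. The Counts struct, the g_counts global, and the MakeFoo and ObserveMakeFoo functions are hypothetical names introduced for illustration; they are not part of the compiler or of this article's examples.

```cpp
#include <cassert>

// Hypothetical counters, for illustration only.
struct Counts {
    int constructed = 0;  // default constructions
    int copied = 0;       // copy constructions
    int moved = 0;        // move constructions
    int destroyed = 0;    // destructor calls
};
Counts g_counts;

// Instrumented type: every special member function bumps a counter.
struct Foo {
    Foo() { ++g_counts.constructed; }
    Foo(const Foo&) { ++g_counts.copied; }
    Foo(Foo&&) noexcept { ++g_counts.moved; }
    ~Foo() { ++g_counts.destroyed; }
};

Foo MakeFoo() {
    Foo result;
    return result;  // the move into the return slot may or may not be elided
}

// Calls MakeFoo once and returns the counters seen after the result is destroyed.
Counts ObserveMakeFoo() {
    g_counts = Counts{};
    {
        Foo f = MakeFoo();
        (void)f;
    }
    // Holds with or without elision: every constructed object is destroyed once.
    assert(g_counts.destroyed ==
           g_counts.constructed + g_counts.copied + g_counts.moved);
    return g_counts;
}
```

If the move is elided, the moved counter stays at zero and the destructor runs once; otherwise both are one higher. Only the balance between constructions and destructions is the same in either case.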
Mandatory copy/move elision in Visual Studio
The C++ standard requires copy or move elision when the returned value is initialized as part of the return statement (such as when a function with return type Foo executes return Foo()). The Microsoft Visual C++ compiler always performs copy and move elision for return statements where it is required to do so, regardless of the flags passed to the compiler. This behavior is unchanged.
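One way to see that this elision is mandatory is to return a type that cannot be copied or moved at all. The sketch below assumes the code is compiled as C++17 or later; Pinned, MakePinned, and ReadPinned are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical type for illustration: it can be neither copied nor moved.
struct Pinned {
    int value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

// Accepted under C++17 and later: the returned temporary initializes the
// caller's object directly, so no copy or move constructor is needed.
Pinned MakePinned(int v) {
    return Pinned(v);
}

// By contrast, returning a named local is expected to be rejected for this
// type, because that form still needs an accessible copy or move constructor:
//
//   Pinned MakeNamed(int v) { Pinned p(v); return p; }  // error: deleted constructor

int ReadPinned() {
    Pinned p = MakePinned(42);  // constructed in place, no copy or move
    assert(p.value == 42);
    return p.value;
}
```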
Changes to optional copy/move elision in Visual Studio 17.4 Preview 3
When the returned value is a named variable, the compiler may elide the copy or move but is not required to do so. The standard still requires a copy or move constructor to be defined for the named return variable, even if the compiler elides the constructor in all cases. Prior to Visual Studio 2022 version 17.4 Preview 3, when optimizations were disabled (such as with the /Od compiler flag or for functions marked with #pragma optimize("", off)), the compiler would only perform mandatory copy and move elision. With the /O2 flag, the compiler would perform optional copy or move elision for optimized functions with simple control flow.
Starting with Visual Studio 2022 version 17.4 Preview 3, we are giving developers the option for consistency with the new /Zc:nrvo compiler flag. The /Zc:nrvo flag will be passed by default when code is compiled with the /O2 flag, the /permissive- flag, or when compiling for /std:c++20 or later. When this flag is passed, copy and move elision will be performed wherever possible. We would like to turn /Zc:nrvo on by default in a future release.
Starting with Visual Studio 2022 version 17.4 Preview 3, optional copy/move elision can also be explicitly disabled with the /Zc:nrvo- flag. It is impossible to disable mandatory copy/move elision.
In Visual Studio 2022 version 17.4 Preview 3, we are also increasing the number of places where we do copy/move elision when optional copy/move elision is enabled with the /Zc:nrvo, /O2, /permissive-, or /std:c++20 or later flags.
Scenario | Earlier versions of Visual Studio | Visual Studio 2022 version 17.4 Preview 3 and later |
---|---|---|
Mandatory copy/move elision | Always occurs. | Always occurs. |
Optional copy/move elision for return of named variable in function without loops or exception handling | Occurs under /O2 unless the function has multiple returned symbols with overlapping lifetimes or the type’s copy or move constructor has default arguments. | Does not occur under /Zc:nrvo-. Otherwise, occurs under /O2, /permissive-, /std:c++20 or later, or /Zc:nrvo unless the function has multiple returned symbols with overlapping lifetimes. |
Optional copy/move elision for return of named variable in a loop | Never occurs. | Does not occur under /Zc:nrvo-. Otherwise, occurs under /O2, /permissive-, /std:c++20 or later, or /Zc:nrvo unless the function has multiple returned symbols with overlapping lifetimes. |
Optional copy/move elision for return of named variable in functions with exception handling | Never occurs. | Does not occur under /Zc:nrvo-. Otherwise, occurs under /O2, /permissive-, /std:c++20 or later, or /Zc:nrvo unless the function has multiple returned symbols with overlapping lifetimes. |
Optional copy/move elision for return of named variable when the copy or move constructor has additional default arguments | Never occurs. | Does not occur under /Zc:nrvo-. Otherwise, occurs under /O2, /permissive-, /std:c++20 or later, or /Zc:nrvo unless the function has multiple returned symbols with overlapping lifetimes. |
Optional copy/move elision for throw of a named variable | Never occurs. | Never occurs. |
Examples of optional copy/move elision
The simplest example of optional copy or move elision is a function such as:
    Foo SimpleReturn() {
        Foo result;
        return result;
    }
Earlier versions of the MSVC compiler already elided the copy or move of result into the return slot in this case if the /O2 flag was passed. In Visual Studio 2022 version 17.4 Preview 3, the copy or move is also elided if the /permissive-, /std:c++20 or later, or /Zc:nrvo flags are passed, and retained if the /Zc:nrvo- flag is passed.
Starting with Visual Studio 2022 version 17.4 Preview 3, we now perform copy/move elision in the following additional cases if the /O2, /permissive-, /std:c++20 or later, or /Zc:nrvo flags are passed to the compiler and the /Zc:nrvo- flag is not:
Return inside a loop
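    Foo ReturnInALoop(int iterations) {
        for (int i = 0; i < iterations; ++i) {
            Foo result;
            if (i == (iterations / 2)) {
                return result;
            }
        }
        // Note: this is reached only when iterations <= 0; flowing off the end
        // without returning a value is undefined behavior, so pass iterations >= 1.
    }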
The result object will be properly constructed at the start of each iteration of the loop and destructed at the end of each iteration. On the iteration where result is returned, its destructor will not be called on exit from the function. The function’s caller will destroy the returned object when it goes out of scope in that function.
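The per-iteration behavior described above can be checked with an instrumented Foo. This is a sketch under stated assumptions: the counters and CountLoopConstructions are hypothetical helpers, and the trailing return Foo() is an addition that is not in the original example, included only so that every path returns a value.

```cpp
#include <cassert>

// Hypothetical counters for illustration; not part of the original example.
int g_constructed = 0;    // default constructions
int g_copiedOrMoved = 0;  // copy or move constructions
int g_destroyed = 0;      // destructor calls

struct Foo {
    Foo() { ++g_constructed; }
    Foo(const Foo&) { ++g_copiedOrMoved; }
    Foo(Foo&&) noexcept { ++g_copiedOrMoved; }
    ~Foo() { ++g_destroyed; }
};

Foo ReturnInALoop(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        Foo result;                 // constructed afresh on every iteration
        if (i == (iterations / 2)) {
            return result;          // the copy/move here may be elided
        }
    }                               // a result that is not returned is destroyed here
    return Foo();  // assumption: added so every path returns a value
}

// Calls ReturnInALoop(5) and returns how many Foo objects were default-constructed.
int CountLoopConstructions() {
    g_constructed = g_copiedOrMoved = g_destroyed = 0;
    {
        // Iterations 0 and 1 discard their object; iteration 2 returns its object.
        Foo returned = ReturnInALoop(5);
        (void)returned;
    }
    // Holds whether or not the copy/move was elided.
    assert(g_destroyed == g_constructed + g_copiedOrMoved);
    return g_constructed;
}
```

Whether g_copiedOrMoved ends up at zero depends on the compiler and flags in use, which is why the sketch only checks that constructions and destructions balance.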
Return with exception-handling
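    Foo ReturnInTryCatch() {
        try {
            Foo result;
            return result;
        } catch (...) {}
        // Note: reached only if constructing result throws; a complete program
        // should return a value or rethrow here rather than flow off the end.
    }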
The copy or move of the result object will now be elided if the /O2, /permissive-, /std:c++20 or later, or /Zc:nrvo flags are passed and the /Zc:nrvo- flag is not. We also now properly handle more complex cases such as:
    int n;
    void throwFirstThreeIterations() {
        ++n;
        if (n <= 3) throw n;
    }
    Foo ComplexTryCatch()
    {
    Label1:
        Foo result;
        try {
            throwFirstThreeIterations();
            return result;
        }
        catch(...) {
            goto Label1;
        }
    }
The result object will be constructed in the return slot for the caller function and no copy/move constructor or destructor will be called for it on a successful return. When an exception is thrown, whether or not the result object is destructed is determined by which exception-handling flags are passed to the compiler. By default, no stack-unwinding will occur and therefore no destructors will be called. However, if stack-unwinding exception handling is enabled with the /EHs, /EHa, or /EHr flags, goto Label1 will cause result’s destructor to be called because it jumps to before result is initialized. Either way, when the expression Foo result is reached again, the object will be constructed again in the return slot.
Copy constructors with default arguments
We now properly detect that a copy or move constructor with default arguments is still a copy or move constructor, and therefore can be elided in the cases above. A copy constructor with default parameters will look something like the following:
    #include <cstdio>

    struct StructWithCopyConstructorDefaultParam {
        int X;
        StructWithCopyConstructorDefaultParam(int x) : X(x) {}
        // Still a copy constructor: every parameter after the first has a default argument.
        StructWithCopyConstructorDefaultParam(StructWithCopyConstructorDefaultParam const& original, int defaultParam = 0) :
            X(original.X + defaultParam) {
            printf("Copy constructor called.\n");
        }
    };
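As a rough illustration of why such a constructor counts as a copy constructor, the sketch below uses it both for an ordinary copy and with an explicit second argument. The shortened name S, the g_copyCalls counter, and the helper functions are hypothetical and introduced only for this example.

```cpp
#include <cassert>

int g_copyCalls = 0;  // hypothetical counter, for illustration only

// Shortened stand-in for StructWithCopyConstructorDefaultParam.
struct S {
    int X;
    S(int x) : X(x) {}
    // Callable with a single S argument, so it serves as the copy constructor.
    S(S const& original, int defaultParam = 0) : X(original.X + defaultParam) {
        ++g_copyCalls;
    }
};

S MakeNamed() {
    S result(7);
    return result;  // uses the constructor above unless the copy is elided
}

// Exercises both ways of calling the constructor and returns the final value.
int ExerciseCopies() {
    g_copyCalls = 0;
    S a(1);
    S b = a;           // ordinary copy: defaultParam takes its default of 0
    S c(a, 5);         // same constructor with an explicit second argument
    assert(b.X == 1);
    assert(c.X == 6);
    assert(g_copyCalls == 2);
    S r = MakeNamed(); // may add further calls if the copy is not elided
    return r.X;
}
```

Because defaultParam defaults to 0, the returned value is 7 whether or not the copy out of MakeNamed is elided; only g_copyCalls differs.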
Limitations on NRVO
Although the MSVC compiler now performs copy and move elision in many more situations, it is not always possible to perform copy/move elision. To see why this is true, consider the following function:
    Foo WhichShouldIReturn(bool condition) {
        Foo resultA;
        if (condition) {
            Foo resultB;
            return resultB;
        }
        return resultA;
    }
Copy elision constructs the object to be returned in the return slot, but which object should be constructed in the return slot in this case? For the copy of resultA to be elided at return resultA, it must be constructed in the return slot. However, if condition is true, resultB will need to be constructed in the return slot before resultA is destroyed. There is no way to perform copy elision for both paths.
We currently choose to avoid doing optional copy/move elision on all paths in a function if copy/move elision is impossible on any path. However, changes to inlining decisions, dead code elimination, and other optimizations can change whether copy or move elision is possible. For this reason, it is never safe to write code that depends on certain behavior for copy/move elision of named variables unless all optional copy/move elision is disabled with /Zc:nrvo-.
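One possible restructuring, sketched here under the assumption that resultA is not needed when condition is true, is to declare each returned variable only on the path that returns it so that their lifetimes no longer overlap. Whether a given compiler then elides the copies is not guaranteed, and the function name, counters, and CountConstructions helper are hypothetical.

```cpp
#include <cassert>

// Hypothetical counters for illustration.
int g_constructed = 0;    // default constructions
int g_copiedOrMoved = 0;  // copy or move constructions
int g_destroyed = 0;      // destructor calls

struct Foo {
    Foo() { ++g_constructed; }
    Foo(const Foo&) { ++g_copiedOrMoved; }
    Foo(Foo&&) noexcept { ++g_copiedOrMoved; }
    ~Foo() { ++g_destroyed; }
};

// resultB's lifetime ends before resultA's begins, so at most one of them
// would ever need to occupy the return slot at a time.
Foo NonOverlappingReturns(bool condition) {
    if (condition) {
        Foo resultB;
        return resultB;
    }
    Foo resultA;
    return resultA;
}

// Returns how many Foo objects were default-constructed for one call.
int CountConstructions(bool condition) {
    g_constructed = g_copiedOrMoved = g_destroyed = 0;
    {
        Foo returned = NonOverlappingReturns(condition);
        (void)returned;
    }
    // Holds with or without elision.
    assert(g_destroyed == g_constructed + g_copiedOrMoved);
    return g_constructed;
}
```

Unlike WhichShouldIReturn, this version constructs only one Foo per call, so it is a behavior change and not a drop-in replacement when constructing resultA has side effects that must always happen.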
As long as stack-unwinding exception handling is enabled or no exceptions are thrown, it is still safe to assume that every constructor call has a matching destructor call.
Feedback
We encourage you to try out this update in the latest Visual Studio 2022 version 17.4 Preview. Please let us know what you think and report any issues you encounter. We can be reached via the comments below, via Twitter (@visualC), or via Developer Community.
Hi, could you give a definition of “optimized functions with simple control flow” please?! Or conversely what causes a function not to be deemed having “simple control flow” anymore? Does it have to do with C++ exceptions versus simple if/while/for? Thanks!
Hi Oliver,
I used the term “simple control flow” as a summary. The table gives a more concrete list of cases where we do or don’t do optional copy/move elision on Visual Studio 17.4’s C++ compiler and on earlier versions of Visual Studio. In particular, prior to 17.4 we would not do copy/move elision if there was a loop (while, for, goto an earlier block, etc.) of any kind, or any C++ exceptions.
Optional copy/move elision is also not possible if there are multiple return statements that return different symbols and those symbols have overlapping lifetimes. We bail on cases where we detect possible overlapping lifetimes, but those calculations may not be exact. If there’s a case where you think copy/move elision should be legal but we don’t elide the copy/move, please feel free to open a feedback bug at Developer Community.
We currently do optional copy/move elision before any other optimizations. This provides consistency between /Od and /O2 when using the /Zc:nrvo flag in versions of Visual Studio where it is available. Clang and GCC perform copy/move elision after branch elimination, and so can sometimes perform optional copy/move elision in cases where the code appears to have multiple returns with overlapping lifetimes as long as one of those returns is never taken. This is one reason why it is not safe to rely on the presence or absence of optional copy/move elision for program correctness.
Good improvement. In our code base we return string and vectors a lot which is a good use case for this.
Besides the mandatory case one often doesn’t know why the compiler didn’t apply NRVO. It could help if the compiler gave hints why it didn’t. The last example could probably easily be rewritten and use only one ‘Foo’ variable and return statement if that would trigger NRVO.
Next on the wishlist is more aggressive move, e.g. when named variables go out of scope.
The suggestion of adding a diagnostic for why NRVO is not applied is a good one. There are usually parts of the code base where copy/move elision matters a lot and parts where it is much less important, so we’d need to find a way to scope the information to be useful for a large program.
The current standard doesn’t allow the copy or move to be elided in your last case, unfortunately.
Great to see these kinds of low-level enhancements being made to the compiler. 👍 By the way, in case anybody else wonders what “nrvo” stands for: “Named Return Value Optimization”.
Have you performed any benchmarks on representative code bases to evaluate the effect of these optimizations on generated code size and runtime performance? After all, efficiency is probably the main reason why this optimization is there in the first place, I assume. Thus it would be interesting to know how much of an impact one can expect due to these improvements on real-world applications.
Most of the cases where copy/move elision is critical for performance were already caught by the existing implementation.
The big win for increasing the number of situations where copy/move elision occurs is actually increased consistency and better handling of objects/classes where the move/copy constructor is not intended to be called. When following the RAII pattern, for example, a constructor call (including a copy or move constructor call) may involve acquiring resources. Eliding the copy or move makes it easier to return these types of objects.