The C++ std::move function casts its parameter to an rvalue reference, which enables its contents to be consumed by another operation. But in your excitement about this new expressive capability, take care not to overuse it.
std::string get_name(int id)
{
std::string name = std::to_string(id);
/* assume other calculations happen here */
return std::move(name);
}
You think you are giving the compiler some help by saying “Hey, like, I’m not using my local variable name after this point, so you can just move the string into the return value.”
Unfortunately, your help is actually hurting. Adding a std::move causes the return statement to fail to satisfy the conditions for copy elision (commonly known as Named Return Value Optimization, or NRVO): The thing being returned must be the name of a local variable with the same type as the function return value.
The added std::move prevents NRVO, and the return value is move-constructed from the name variable.
std::string get_name(int id)
{
std::string name = std::to_string(id);
/* assume other calculations happen here */
return name;
}
This time, we return name directly, and the compiler can now elide the copy and put the name variable directly in the return value slot with no copy. (Compilers are permitted but not required to perform this optimization, but in practice, all compilers will do it if all code paths return the same local variable.)
The other half of the overzealous std::move is on the receiving end.
extern void report_name(std::string name);
void sample1()
{
std::string name = std::move(get_name());
}
void sample2()
{
report_name(std::move(get_name()));
}
In these two sample functions, we take the return value from get_name and explicitly std::move it into a new local variable or into a function parameter. This is another case of trying to be helpful and ending up hurting.
Constructing a value (either a local variable or a function parameter) from a matching value of the same type will be elided: The matching value is stored directly into the local variable or parameter without a copy. But adding a std::move prevents this optimization from occurring, and the value will instead be move-constructed.
extern void report_name(std::string name);
void sample1()
{
std::string name = get_name();
}
void sample2()
{
report_name(get_name());
}
What’s particularly exciting is when you combine both mistakes. In that case, you took what would have been a sequence that had no copy or move operations at all and converted it into a sequence that creates two extra temporaries, two extra move operations, and two extra destructions.
#include <memory>
struct S
{
S();
S(S const&);
S(S &&);
~S();
};
extern void consume(S s);
// Bad version
S __declspec(noinline) f1()
{
S s;
return std::move(s);
}
void g1()
{
consume(std::move(f1()));
}
Here’s the compiler output for msvc:
; on entry, rcx says where to put the return value
f1:
mov qword ptr [rsp+8], rcx
push rbx
sub rsp, 48
mov rbx, rcx
; construct local variable s on stack
lea rcx, qword ptr [rsp+64]
call S::S()
; copy local variable to return value
lea rdx, qword ptr [rsp+64]
mov rcx, rbx
call S::S(S &&)
; destruct the local variable s
lea rcx, qword ptr [rsp+64]
call S::~S()
; return the result
mov rax, rbx
add rsp, 48
pop rbx
ret
g1:
sub rsp, 40
; call f1 and store into temporary variable
lea rcx, qword ptr [rsp+56]
call f1()
; copy temporary to outbound parameter
mov rdx, rax
lea rcx, qword ptr [rsp+48]
call S::S(S &&)
; call consume with the outbound parameter
mov rcx, rax
call consume(S)
; clean up the temporary
lea rcx, qword ptr [rsp+56]
call S::~S()
; return
add rsp, 40
ret
Notice that calling g1 resulted in the creation of a total of two extra copies of S, one in f1 and another to hold the return value of f1.
By comparison, if we use copy elision:
// Good version
S __declspec(noinline) f2()
{
S s;
return s;
}
void g2()
{
consume(f2());
}
then the msvc code generation is
; on entry, rcx says where to put the return value
f2:
push rbx
sub rsp, 48
mov rbx, rcx
; construct directly into return value (still in rcx)
call S::S()
; and return it
mov rax, rbx
add rsp, 48
pop rbx
ret
g2:
sub rsp, 40
; put return value of f1 directly into outbound parameter
lea rcx, qword ptr [rsp+48]
call f2()
; call consume with the outbound parameter
mov rcx, eax
call consume(S)
; return
add rsp, 40
ret
You get similar results with gcc, clang, and icc icx.
In gcc, clang, and icx, you can enable the pessimizing-move warning to tell you when you make these mistakes.
Is there some side-effect or other reason I can’t see why the
return std::move(name);case isn’t possible to elide? Or is this just a case of the standards missing an opportunity and compilers being bound to obey the standards?The standard requires elision if the return value is a prvalue ("pure rvalue"), but std::move(foo) is an xvalue ("expiring lvalue"). The standard allows elision in the NVRO case described by Raymond, as well as a few oddball scenarios involving exceptions and coroutines. In all other cases, elision is only permitted under the as-if rule, which requires the compiler to prove that there is no difference in observable behavior - in practice, this means it has to prove that constructing the temporary, calling the copy/move constructor, and calling the temporary's destructor, all do not affect program behavior and may be replaced...
Raymond, now we now that you are six months ahead. So, maybe you can give us a glimpse about the future? Pleeease. 🙂
In contrast, except for trivially copiable types (`Copy`), Rust has (trivial) move semantics by default, and requires you to explicitly opt in to copying (`Clone::clone`). It looks like Rust does have a form of NRVO: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/nrvo/struct.RenameReturnPlace.html
Note, though, that unlike the C++17 and later NRVO, Rust’s one is entirely optional. In C++17 and later, when a set of conditions are met – Prvalue semantics (“guaranteed copy elision”) – the compiler is required to perform NRVO.
In Rust, NRVO is entirely optional; indeed, the optimization you linked has been disabled due to https://github.com/rust-lang/rust/issues/111005 in rustc versions up to 1.74, and has not been re-enabled in nightly yet, since it was unsound in some circumstances.
What I'm getting from this: std::move is a cast. It should be treated with exactly the same suspicion as std::static_cast. Only use it if you can articulate what the cast is formally doing (in particular: Why do you expect the compiler to treat the value differently when it is cast to rvalue reference?), why the use case is not covered by copy elision (which includes, but is not limited to, NVRO), and what will become of the moved-from value (in particular: You should be reasonably confident that the moved-from value is never used again).
It gets even spicier when you combine abuse with abuse: whereas
<code>
will just needlessly create a reference that will inhibit NRVO,
<code>
will actually create a dangling reference, because C++ will not extend the lifetime of a temporary if there's a function call between it and the local reference you're declaring.
The punchline, though, is that this will again create a valid reference:
<code>
You may already know that is just implemented as , and yet substituting the former for the latter can introduce a use-after-free!
Finally, I suspect that NRVO too may be restored if you replace...
Assuming I did it correctly, you don’t get the optimisation if you move or cast the value.
z/qhhj18rKT
Interestingly it seems like MSVC can optimize out the move with
return static_cast<T &&>(...);but neither GCC nor Clang can. And while both GCC and Clang issue a warning aboutreturn std::move(...);, they don’t forreturn static_cast<T &&>(...);.(godbolt.org can’t execute MSVC’s output, but you can see it in the disassembly.)
CppCon 2018: Arthur O’Dwyer “Return Value Optimization: Harder Than It Looks”
Offtopic: Raymond’s full interview with Dave drops on YouTube tomorrow…