The C++ std::move function casts its parameter to an rvalue reference, which enables its contents to be consumed by another operation. But in your excitement about this new expressive capability, take care not to overuse it.
std::string get_name(int id)
{
    std::string name = std::to_string(id);
    /* assume other calculations happen here */
    return std::move(name);
}
You think you are giving the compiler some help by saying “Hey, like, I’m not using my local variable name after this point, so you can just move the string into the return value.”
Unfortunately, your help is actually hurting. Adding a std::move causes the return statement to fail to satisfy the conditions for copy elision (commonly known as Named Return Value Optimization, or NRVO): The thing being returned must be the name of a local variable with the same type as the function return value.
The added std::move prevents NRVO, and the return value is move-constructed from the name variable.
std::string get_name(int id)
{
std::string name = std::to_string(id);
/* assume other calculations happen here */
return name;
}
This time, we return name directly, and the compiler can now elide the copy and put the name variable directly in the return value slot. (Compilers are permitted but not required to perform this optimization, but in practice, optimizing compilers generally do it if all code paths return the same local variable.)
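To see the difference between the two return styles for yourself, here is a minimal sketch. The Tracker type and the helper names are hypothetical and exist only for this illustration; the sketch assumes a C++17 compiler, so that initializing the result from the function call adds no move of its own.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type: counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(Tracker const&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Pessimized: the returned expression is no longer the name of a local
// variable, so NRVO cannot apply and the move constructor must run.
Tracker make_with_move() {
    Tracker t;
    return std::move(t);
}

// Plain return: eligible for NRVO. If the compiler declines to elide,
// the local is still moved (not copied) into the return value.
Tracker make_plain() {
    Tracker t;
    return t;
}

// Returns how many move constructions happened while producing one value.
template <typename F>
int count_moves(F make) {
    Tracker::moves = 0;
    Tracker result = make();  // initializing from a prvalue adds no move in C++17
    (void)result;
    return Tracker::moves;
}
```

Calling count_moves(make_with_move) reports one move. Calling count_moves(make_plain) reports zero when the compiler performs NRVO, and at most one when it does not.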
The other half of the overzealous std::move
is on the receiving end.
extern void report_name(std::string name);

void sample1()
{
    std::string name = std::move(get_name());
}

void sample2()
{
    report_name(std::move(get_name()));
}
In these two sample functions, we take the return value from get_name
and explicitly std::move
it into a new local variable or into a function parameter. This is another case of trying to be helpful and ending up hurting.
Constructing a value (either a local variable or a function parameter) from a matching value of the same type will be elided: The matching value is stored directly into the local variable or parameter without a copy. But adding a std::move
prevents this optimization from occurring, and the value will instead be move-constructed.
extern void report_name(std::string name);

void sample1()
{
    std::string name = get_name();
}

void sample2()
{
    report_name(get_name());
}
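The cost on the receiving end can be measured the same way. The following is a sketch under the assumption of a C++17 compiler, where initializing from a function's return value is guaranteed not to copy or move. Probe, make_probe, and consume_probe are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type: counts move constructions.
struct Probe {
    static int moves;
    Probe() = default;
    Probe(Probe const&) = default;
    Probe(Probe&&) noexcept { ++moves; }
};
int Probe::moves = 0;

// Returns a prvalue, so callers can receive it with no copy or move.
Probe make_probe() { return Probe{}; }

// Takes its parameter by value, like report_name in the article.
void consume_probe(Probe) {}

// Wrapping the call in std::move forces a temporary to be created and
// then moved from, once for the local and once for the parameter.
int moves_with_std_move() {
    Probe::moves = 0;
    Probe local = std::move(make_probe());
    consume_probe(std::move(make_probe()));
    (void)local;
    return Probe::moves;
}

// Using the return value directly lets it be constructed in place.
int moves_without_std_move() {
    Probe::moves = 0;
    Probe local = make_probe();
    consume_probe(make_probe());
    (void)local;
    return Probe::moves;
}
```

With these definitions, moves_with_std_move() counts two moves and moves_without_std_move() counts none.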
What’s particularly exciting is when you combine both mistakes. In that case, you took what would have been a sequence that had no copy or move operations at all and converted it into a sequence that creates two extra temporaries, two extra move operations, and two extra destructions.
#include <utility> // std::move

struct S
{
    S();
    S(S const&);
    S(S &&);
    ~S();
};

extern void consume(S s);

// Bad version
S __declspec(noinline) f1()
{
    S s;
    return std::move(s);
}

void g1()
{
    consume(std::move(f1()));
}
Here’s the compiler output for msvc:
; on entry, rcx says where to put the return value
f1:
    mov     qword ptr [rsp+8], rcx
    push    rbx
    sub     rsp, 48
    mov     rbx, rcx

    ; construct local variable s on stack
    lea     rcx, qword ptr [rsp+64]
    call    S::S()

    ; move local variable to return value
    lea     rdx, qword ptr [rsp+64]
    mov     rcx, rbx
    call    S::S(S &&)

    ; destruct the local variable s
    lea     rcx, qword ptr [rsp+64]
    call    S::~S()

    ; return the result
    mov     rax, rbx
    add     rsp, 48
    pop     rbx
    ret

g1:
    sub     rsp, 40

    ; call f1 and store into temporary variable
    lea     rcx, qword ptr [rsp+56]
    call    f1()

    ; move temporary to outbound parameter
    mov     rdx, rax
    lea     rcx, qword ptr [rsp+48]
    call    S::S(S &&)

    ; call consume with the outbound parameter
    mov     rcx, rax
    call    consume(S)

    ; clean up the temporary
    lea     rcx, qword ptr [rsp+56]
    call    S::~S()

    ; return
    add     rsp, 40
    ret
Notice that calling g1 resulted in the creation of a total of two extra copies of S, one in f1 and another to hold the return value of f1.
By comparison, if we use copy elision:
// Good version
S __declspec(noinline) f2()
{
    S s;
    return s;
}

void g2()
{
    consume(f2());
}
then the msvc code generation is
; on entry, rcx says where to put the return value
f2:
    push    rbx
    sub     rsp, 48
    mov     rbx, rcx

    ; construct directly into return value (still in rcx)
    call    S::S()

    ; and return it
    mov     rax, rbx
    add     rsp, 48
    pop     rbx
    ret

g2:
    sub     rsp, 40

    ; put return value of f2 directly into outbound parameter
    lea     rcx, qword ptr [rsp+48]
    call    f2()

    ; call consume with the outbound parameter
    mov     rcx, rax
    call    consume(S)

    ; return
    add     rsp, 40
    ret
You get similar results with gcc, clang, and icx.
In gcc, clang, and icx, you can enable the pessimizing-move warning (-Wpessimizing-move) to tell you when you make these mistakes.
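As a rough sketch of how that looks in practice, the snippet below pairs a function that should be flagged with its fixed counterpart. The file name example.cpp and the function name get_name_fixed are assumptions made for this illustration, and the exact diagnostic text and coverage vary by compiler and version.

```cpp
// Build with the warning enabled, for example (file name is hypothetical):
//   g++     -std=c++17 -Wpessimizing-move -c example.cpp
//   clang++ -std=c++17 -Wpessimizing-move -c example.cpp
#include <cassert>
#include <string>
#include <utility>

std::string get_name(int id)
{
    std::string name = std::to_string(id);
    return std::move(name); // expected to be flagged: std::move on a returned local
}

std::string get_name_fixed(int id)
{
    std::string name = std::to_string(id);
    return name;            // nothing to flag: eligible for NRVO
}
```

Both functions return the same string; the warning is only about the extra move that the first one forces.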
Is there some side-effect or other reason I can’t see why the
return std::move(name);
case isn’t possible to elide? Or is this just a case of the standards missing an opportunity and compilers being bound to obey the standards?
The standard requires elision if the return value is a prvalue ("pure rvalue"), but std::move(foo) is an xvalue ("expiring value"). The standard allows elision in the NRVO case described by Raymond, as well as a few oddball scenarios involving exceptions and coroutines. In all other cases, elision is only permitted under the as-if rule, which requires the compiler to prove that there is no difference in observable behavior - in practice, this means it has to prove that constructing the temporary, calling the copy/move constructor, and calling the temporary's destructor, all do not affect program behavior and may be replaced...
Raymond, now we know that you are six months ahead. So, maybe you can give us a glimpse about the future? Pleeease. 🙂
In contrast, except for trivially copiable types (`Copy`), Rust has (trivial) move semantics by default, and requires you to explicitly opt in to copying (`Clone::clone`). It looks like Rust does have a form of NRVO: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/nrvo/struct.RenameReturnPlace.html
Note, though, that the two languages differ in what is guaranteed. In C++17 and later, when a set of conditions is met, namely prvalue semantics (“guaranteed copy elision”), the compiler is required to elide the copy; NRVO for a named local variable remains optional.
In Rust, NRVO is entirely optional; indeed, the optimization you linked has been disabled due to https://github.com/rust-lang/rust/issues/111005 in rustc versions up to 1.74, and has not been re-enabled in nightly yet, since it was unsound in some circumstances.
What I'm getting from this: std::move is a cast. It should be treated with exactly the same suspicion as static_cast. Only use it if you can articulate what the cast is formally doing (in particular: Why do you expect the compiler to treat the value differently when it is cast to rvalue reference?), why the use case is not covered by copy elision (which includes, but is not limited to, NRVO), and what will become of the moved-from value (in particular: You should be reasonably confident that the moved-from value is never used again).
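The "std::move is a cast" point can be checked directly. The sketch below uses hypothetical function names and shows that the cast alone moves nothing; the move happens only when something consumes the resulting rvalue reference.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move applied to a std::string lvalue yields std::string&&,
// the same type that static_cast<std::string&&> would produce.
static_assert(
    std::is_same<decltype(std::move(std::declval<std::string&>())),
                 std::string&&>::value,
    "std::move only changes the value category");

// Binding the result to a reference consumes nothing:
// the source object keeps its contents.
bool cast_alone_moves_nothing()
{
    std::string source = "hello";
    std::string&& ref = std::move(source); // just a reference to source
    return &ref == &source && source == "hello";
}

// The contents are transferred only when a move constructor (or move
// assignment) receives the rvalue reference.
std::string consume_after_cast()
{
    std::string source = "hello";
    std::string target = std::move(source); // move constructor runs here
    return target; // source is now in a valid but unspecified state
}
```

After consume_after_cast runs, nothing should depend on the old contents of source, which is the "what will become of the moved-from value" question from the comment above.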
It gets even spicier when you combine abuse with abuse: whereas
<code>
will just needlessly create a reference that will inhibit NRVO,
<code>
will actually create a dangling reference, because C++ will not extend the lifetime of a temporary if there's a function call between it and the local reference you're declaring.
The punchline, though, is that this will again create a valid reference:
<code>
You may already know that is just implemented as , and yet substituting the former for the latter can introduce a use-after-free!
Finally, I suspect that NRVO too may be restored if you replace...
Assuming I did it correctly, you don’t get the optimisation if you move or cast the value.
z/qhhj18rKT
Interestingly it seems like MSVC can optimize out the move with return static_cast<T &&>(...) but neither GCC nor Clang can. And while both GCC and Clang issue a warning about return std::move(...), they don’t for return static_cast<T &&>(...). (godbolt.org can’t execute MSVC’s output, but you can see it in the disassembly.)
CppCon 2018: Arthur O’Dwyer “Return Value Optimization: Harder Than It Looks”
Offtopic: Raymond’s full interview with Dave drops on YouTube tomorrow…