On harmful overuse of `std::move`

Raymond Chen

The C++ std::move function casts its parameter to an rvalue reference, which enables its contents to be consumed by another operation. But in your excitement about this new expressive capability, take care not to overuse it.

std::string get_name(int id)
{
    std::string name = std::to_string(id);
    /* assume other calculations happen here */
    return std::move(name);
}

You think you are giving the compiler some help by saying “Hey, like, I’m not using my local variable name after this point, so you can just move the string into the return value.”

Unfortunately, your help is actually hurting. Adding a std::move causes the return statement to fail to satisfy the conditions for copy elision (commonly known as Named Return Value Optimization, or NRVO): The thing being returned must be the name of a local variable with the same type as the function return value.

The added std::move prevents NRVO, and the return value is move-constructed from the name variable.

std::string get_name(int id)
{
    std::string name = std::to_string(id);
    /* assume other calculations happen here */
    return name;
}

This time, we return name directly, and the compiler can now elide the copy and put the name variable directly in the return value slot with no copy. (Compilers are permitted but not required to perform this optimization, but in practice, all compilers will do it if all code paths return the same local variable.)

The other half of the overzealous std::move is on the receiving end.

extern void report_name(std::string name);

void sample1()
{
    std::string name = std::move(get_name());
}

void sample2()
{
    report_name(std::move(get_name()));
}

In these two sample functions, we take the return value from get_name and explicitly std::move it into a new local variable or into a function parameter. This is another case of trying to be helpful and ending up hurting.

Constructing a value (either a local variable or a function parameter) from a matching value of the same type will be elided: The matching value is stored directly into the local variable or parameter without a copy. But adding a std::move prevents this optimization from occurring, and the value will instead be move-constructed.

extern void report_name(std::string name);

void sample1()
{
    std::string name = get_name();
}

void sample2()
{
    report_name(get_name());
}

What’s particularly exciting is when you combine both mistakes. In that case, you took what would have been a sequence that had no copy or move operations at all and converted it into a sequence that creates two extra temporaries, two extra move operations, and two extra destructions.

#include <memory>

struct S
{
    S();
    S(S const&);
    S(S &&);
    ~S();
};

extern void consume(S s);

// Bad version
S __declspec(noinline) f1()
{
    S s;
    return std::move(s);
}

void g1()
{
    consume(std::move(f1()));
}

Here’s the compiler output for msvc:

; on entry, rcx says where to put the return value
f1:
    mov     qword ptr [rsp+8], rcx
    push    rbx
    sub     rsp, 48
    mov     rbx, rcx

    ; construct local variable s on stack
    lea     rcx, qword ptr [rsp+64]
    call    S::S()

    ; copy local variable to return value
    lea     rdx, qword ptr [rsp+64]
    mov     rcx, rbx
    call    S::S(S &&)

    ; destruct the local variable s
    lea     rcx, qword ptr [rsp+64]
    call    S::~S()

    ; return the result
    mov     rax, rbx
    add     rsp, 48
    pop     rbx
    ret

g1:
    sub     rsp, 40

    ; call f1 and store into temporary variable
    lea     rcx, qword ptr [rsp+56]
    call    f1()

    ; copy temporary to outbound parameter
    mov     rdx, rax
    lea     rcx, qword ptr [rsp+48]
    call    S::S(S &&)

    ; call consume with the outbound parameter
    mov     rcx, rax
    call    consume(S)

    ; clean up the temporary
    lea     rcx, qword ptr [rsp+56]
    call    S::~S()

    ; return
    add     rsp, 40
    ret

Notice that calling g1 resulted in the creation of a total of two extra copies of S, one in f1 and another to hold the return value of f1.

By comparison, if we use copy elision:

// Good version
S __declspec(noinline) f2()
{
    S s;
    return s;
}

void g2()
{
    consume(f2());
}

then the msvc code generation is

; on entry, rcx says where to put the return value
f2:
    push    rbx
    sub     rsp, 48
    mov     rbx, rcx

    ; construct directly into return value (still in rcx)
    call    S::S()

    ; and return it
    mov     rax, rbx
    add     rsp, 48
    pop     rbx
    ret

g2:
    sub     rsp, 40

    ; put return value of f1 directly into outbound parameter
    lea     rcx, qword ptr [rsp+48]
    call    f2()

    ; call consume with the outbound parameter
    mov     rcx, eax
    call    consume(S)

    ; return
    add     rsp, 40
    ret

You get similar results with gcc, clang, and icc icx.

In gcc, clang, and icx, you can enable the pessimizing-move warning to tell you when you make these mistakes.

Author

Raymond Chen

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

10 comments

Discussion is closed. Login to edit/delete existing comments.

Jonathan Duncan November 26, 2023 · Edited

Is there some side-effect or other reason I can’t see why the return std::move(name); case isn’t possible to elide? Or is this just a case of the standards missing an opportunity and compilers being bound to obey the standards?
- Kevin Norris November 26, 2023 · Edited
  
  The standard requires elision if the return value is a prvalue ("pure rvalue"), but std::move(foo) is an xvalue ("expiring lvalue"). The standard allows elision in the NVRO case described by Raymond, as well as a few oddball scenarios involving exceptions and coroutines. In all other cases, elision is only permitted under the as-if rule, which requires the compiler to prove that there is no difference in observable behavior - in practice, this means it has to prove that constructing the temporary, calling the copy/move constructor, and calling the temporary's destructor, all do not affect program behavior and may be replaced...
  Read more
  The standard requires elision if the return value is a prvalue (“pure rvalue”), but std::move(foo) is an xvalue (“expiring lvalue”). The standard allows elision in the NVRO case described by Raymond, as well as a few oddball scenarios involving exceptions and coroutines. In all other cases, elision is only permitted under the as-if rule, which requires the compiler to prove that there is no difference in observable behavior – in practice, this means it has to prove that constructing the temporary, calling the copy/move constructor, and calling the temporary’s destructor, all do not affect program behavior and may be replaced with an in-place construction. This turns out to be fairly challenging.
  
  Elision in the generalized xvalue case might not be valid. It’s possible that you are moving from some object that already existed before the function was called, and so it is not possible to construct it in-place without optimizing across the function call boundary. Worse, the object might be visible to other parts of the program, or even to other threads executing concurrently. Such objects should not have their destructors skipped, because it would wreak havoc on all manner of standard-conforming programs that exist today, so the standard really shouldn’t authorize an elision in that case. The standard probably could make a special case for “NVRO, but somebody used std::move() incorrectly,” but it’s easier to just tell people not to do that.
  
  Read less
José Silva November 25, 2023

Raymond, now we now that you are six months ahead. So, maybe you can give us a glimpse about the future? Pleeease. 🙂
Solomon Ucko November 25, 2023

In contrast, except for trivially copiable types (`Copy`), Rust has (trivial) move semantics by default, and requires you to explicitly opt in to copying (`Clone::clone`). It looks like Rust does have a form of NRVO: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/nrvo/struct.RenameReturnPlace.html
- Simon Farnsworth November 28, 2023
  
  Note, though, that unlike the C++17 and later NRVO, Rust’s one is entirely optional. In C++17 and later, when a set of conditions are met – Prvalue semantics (“guaranteed copy elision”) – the compiler is required to perform NRVO.
  
  In Rust, NRVO is entirely optional; indeed, the optimization you linked has been disabled due to https://github.com/rust-lang/rust/issues/111005 in rustc versions up to 1.74, and has not been re-enabled in nightly yet, since it was unsound in some circumstances.
Kevin Norris November 24, 2023

What I'm getting from this: std::move is a cast. It should be treated with exactly the same suspicion as std::static_cast. Only use it if you can articulate what the cast is formally doing (in particular: Why do you expect the compiler to treat the value differently when it is cast to rvalue reference?), why the use case is not covered by copy elision (which includes, but is not limited to, NVRO), and what will become of the moved-from value (in particular: You should be reasonably confident that the moved-from value is never used again).

Read more
What I’m getting from this: std::move is a cast. It should be treated with exactly the same suspicion as std::static_cast. Only use it if you can articulate what the cast is formally doing (in particular: Why do you expect the compiler to treat the value differently when it is cast to rvalue reference?), why the use case is not covered by copy elision (which includes, but is not limited to, NVRO), and what will become of the moved-from value (in particular: You should be reasonably confident that the moved-from value is never used again).

Read less
紅樓鍮 November 24, 2023 · Edited
It gets even spicier when you combine abuse with abuse: whereas
<code>
will just needlessly create a reference that will inhibit NRVO,
<code>
will actually create a dangling reference, because C++ will not extend the lifetime of a temporary if there's a function call between it and the local reference you're declaring.

The punchline, though, is that this will again create a valid reference:
<code>
You may already know that is just implemented as , and yet substituting the former for the latter can introduce a use-after-free!

Finally, I suspect that NRVO too may be restored if you replace...
Read more
It gets even spicier when you combine std::move abuse with auto && abuse: whereas
```
auto &&name = get_name();
```
will just needlessly create a reference that will inhibit NRVO,
```
auto &&name = std::move(get_name());
```
will actually create a dangling reference, because C++ will not extend the lifetime of a temporary if there’s a function call between it and the local reference you’re declaring.

The punchline, though, is that this will again create a valid reference:
```
auto &&name = static_cast<std::string &&>(get_name());
```
You may already know that std::move is just implemented as static_cast<T &&>, and yet substituting the former for the latter can introduce a use-after-free!

Finally, I suspect that NRVO too may be restored if you replace the std::move in the first example with static_cast<std::string &&>; unfortunately godbolt.org effectively can’t be used on mobile, so I’ll have to try it for myself later.
Read less
- Neil Rashbrook November 25, 2023
  
  Assuming I did it correctly, you don’t get the optimisation if you move or cast the value.
  - 紅樓鍮 November 25, 2023 · Edited
    
    z/qhhj18rKT
    
    Interestingly it seems like MSVC can optimize out the move with return static_cast<T &&>(...); but neither GCC nor Clang can. And while both GCC and Clang issue a warning about return std::move(...);, they don’t for return static_cast<T &&>(...);.
    
    (godbolt.org can’t execute MSVC’s output, but you can see it in the disassembly.)
skSdnW November 24, 2023

CppCon 2018: Arthur O’Dwyer “Return Value Optimization: Harder Than It Looks”

Offtopic: Raymond’s full interview with Dave drops on YouTube tomorrow…