More on harmful overuse of `std::move`

Raymond Chen

Some time ago, I wrote about harmful overuse of std::move. Jonathan Duncan asked,

Is there some side-effect or other reason I can’t see return std::move(name); case isn’t possible to elide? Or is this just a case of the standards missing an opportunity and compilers being bound to obey the standards?

In the statement return std::move(name);, what the compiler sees is return f(...); where f(...) is some mysterious function call that returns an rvalue. For all it knows, you could have written return object.optional_name().value();, which is also a mysterious function call that returns an rvalue. There is nothing in the expression std::move(name) that says, “Trust me, this rvalue that I return is an rvalue of a local variable from this very function!”
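To make the difference concrete, here is a minimal sketch that counts constructor calls. The `Name` type, its counters, and the function names are all hypothetical, invented for illustration, and the counts assume C++17 or later (so that initializing the caller's variable from the returned prvalue adds no extra move).

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical instrumented type: counts how often it is copied or moved.
struct Name {
    static inline int copies = 0;
    static inline int moves = 0;

    std::string text;

    explicit Name(std::string t) : text(std::move(t)) {}
    Name(const Name& other) : text(other.text) { ++copies; }
    Name(Name&& other) noexcept : text(std::move(other.text)) { ++moves; }
};

void reset_counters() {
    Name::copies = 0;
    Name::moves = 0;
}

// The operand is the name of a local variable, so the compiler is
// permitted (not required) to construct it directly in the caller's
// storage. Without NRVO, it falls back to a single move.
Name make_plain() {
    Name name("contoso");
    return name;
}

// The operand is now the result of a function call. The compiler sees
// no local variable name here, so the move constructor has to run.
Name make_with_move() {
    Name name("contoso");
    return std::move(name);
}

// Runs both versions and checks only what the language guarantees.
void compare_returns() {
    reset_counters();
    Name a = make_plain();
    assert(a.text == "contoso");
    assert(Name::copies == 0 && Name::moves <= 1);  // 0 with NRVO, 1 without

    reset_counters();
    Name b = make_with_move();
    assert(b.text == "contoso");
    assert(Name::copies == 0 && Name::moves == 1);  // the move is never elided
}
```

Calling `compare_returns()` exercises both paths. The plain version is allowed to report zero moves, whereas the `std::move` version always reports exactly one.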

Now, you might say, “Sure, the compiler doesn’t know that, but what if we made it know that?” Make the function std::move a magic function, one of the special cases where the core language is in cahoots with the standard library.

This sort of in-cahoots-ness is not unheard of. For example, the compiler has special understanding of std::launder, so that it won’t value-propagate memory values across it, and the compiler has special understanding of memory barriers, so that it won’t optimize loads and stores across them.

So why not add std::move to the list of functions that the compiler has special understanding of? Technically, this is already permitted by the standard, because the standard requires that any specialization of a templated standard library function “meets the standard library requirements for the original template,” so you can’t write a specialization of std::move that, say, returns a copy of the object. However, I think it’s still legal for the specialization to send angry email to your boss¹ before returning the rvalue reference.

Okay, so we add a new clause to the standard that says that specializations of std::move are disallowed.

This does leave in the lurch alternate implementations of std::move. For example, the Windows Implementation Library (WIL) has its own implementation of std::move called wistd::move. It does this because some of the components that use WIL operate under a constraint that C++ exceptions are disallowed, which means that they cannot #include <memory>. Making std::move magic would mean that wistd::move is no longer a drop-in replacement for std::move: The compiler would recognize std::move as special, but not wistd::move.
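For reference, an alternate implementation is nothing more than a cast. The following is a rough sketch, not WIL's actual code: the `mylib` namespace and the `Probe` type are hypothetical, and the sketch freely uses standard headers for brevity, which a real exception-free replacement may not be able to do.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace mylib {  // hypothetical stand-in for a library like wistd

// The same cast that std::move performs: produce an rvalue reference
// to whatever the argument refers to.
template <typename T>
constexpr std::remove_reference_t<T>&& move(T&& value) noexcept {
    return static_cast<std::remove_reference_t<T>&&>(value);
}

}  // namespace mylib

// The result type matches what std::move would give for an lvalue int.
static_assert(
    std::is_same_v<decltype(mylib::move(std::declval<int&>())), int&&>);

// Hypothetical type whose move constructor marks the source object.
struct Probe {
    bool moved_from = false;
    Probe() = default;
    Probe(const Probe&) = default;
    Probe(Probe&& other) noexcept { other.moved_from = true; }
};

// Returns true if mylib::move caused the move constructor to be chosen.
bool selects_move_constructor() {
    Probe source;
    Probe dest = mylib::move(source);
    (void)dest;
    return source.moved_from;
}

void check_mylib_move() {
    assert(selects_move_constructor());
}
```

To overload resolution, `mylib::move` behaves just like `std::move`. A rule keyed to the name `std::move` would treat the two differently even though they perform the same cast.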

Okay, so we tell those people, “Oh, stop being such a stick in the mud. Come on in, the water’s fine! Use std::move!”

If we operated naïvely, we would say, “Sure you can return the std::move of a local variable, and we’ll reuse the return value slot.” But that would be wrong, because that would be move-constructing an object from another object that resides at the same address, which is not something that happens in normal C++, and I suspect that a lot of move constructors don’t handle that case. (Not that I expect them to.)
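One way to see the two objects involved is to have a type remember where it was first constructed. This is only a sketch: `Tracker` and the function names are hypothetical, addresses are stored as integers so that nothing ever compares a pointer to a dead object, and the checks assume C++17 or later.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical type that records the address at which it was first built.
struct Tracker {
    static inline int moves = 0;
    std::uintptr_t born_at;

    Tracker() : born_at(reinterpret_cast<std::uintptr_t>(this)) {}
    Tracker(Tracker&& other) noexcept : born_at(other.born_at) { ++moves; }
    Tracker(const Tracker&) = delete;

    // True if this object still lives where the original was constructed.
    bool still_at_birthplace() const {
        return born_at == reinterpret_cast<std::uintptr_t>(this);
    }
};

Tracker make_named() {
    Tracker local;
    return local;             // NRVO candidate
}

Tracker make_moved() {
    Tracker local;
    return std::move(local);  // moves from local into the return slot
}

// With a plain return, either NRVO happened (same address, no move)
// or it did not (different address, one move).
bool named_is_consistent() {
    Tracker::moves = 0;
    Tracker t = make_named();
    return (t.still_at_birthplace() && Tracker::moves == 0) ||
           (!t.still_at_birthplace() && Tracker::moves == 1);
}

// With std::move, source and destination are two distinct objects
// that are alive at the same time, so the addresses must differ.
bool moved_is_distinct() {
    Tracker::moves = 0;
    Tracker t = make_moved();
    return !t.still_at_birthplace() && Tracker::moves == 1;
}
```

In the `std::move` version, the move constructor always sees a source and a destination at different addresses. Naïvely placing the local in the return slot while still running the move constructor would break that.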

So the C++ language would have to skip the move constructor entirely. It could say that if the return statement takes the form return std::move(name), where name is the name of a local variable eligible for NRVO, then the std::move may be elided.

And maybe to accommodate those people who are afraid of exception-infested waters, you could expand the rule to say that if the compiler can determine that the returned value is an rvalue that refers to a local variable eligible for NRVO, then the return can be rewritten as returning that local variable via NRVO (while still preserving any other observable behaviors of the relevant expression).

I mean, you could do this. Maybe you can even write up a proposal and see what the language committee thinks.

Oh wait, somebody already wrote that proposal! Stop Forcing std::move to Pessimize was presented to the C++ standard committee in November 2023, and the response was “Weak consensus, needs more work”.

Bonus viewing: CppCon 2018: Arthur O’Dwyer “Return Value Optimization: Harder Than It Looks”.

¹ More practical examples would be “doing performance logging” or “doing debug logging” rather than “sending angry email to your boss”.

9 comments


Gerald Squelart June 4, 2024

I thought overloading `namespace std` was already UB? (With some exceptions, of course there are exceptions!) https://en.cppreference.com/w/cpp/language/extending_std
And in fact, I’ve discovered that recent Clang versions actually ignore user-provided `std::move` overloads!

So based on that, compilers should be allowed to special-case `std::move`… But then in the case of `return std::move(name)`, this would actually change the behavior (probably to what the user actually intended, but the compiler couldn’t be certain), so maybe *that* is not allowed when optimizing?
- Raymond Chen Author June 11, 2024
  
  From that linked page: “It is allowed to add template specializations for any standard library function template to the namespace std only if the declaration depends on at least one program-defined type and the specialization satisfies all requirements for the original template, except where such specializations are prohibited.” So you can specialize std::move for your custom type, provided it returns an rvalue reference to its parameter. But it appears that this permission was rescinded in C++20.
Adam Jensen June 3, 2024 · Edited

Hmm now that we have [[msvc::intrinsic]], can compiler optimize this?
- Faheem Sarwar June 4, 2024
  
  Now that we have [[msvc::intrinsic]], can the compiler optimize this?
紅樓鍮 June 3, 2024 · Edited
Random thoughts: because
```
int x = f(&x);
```
is valid in C++ (that is, x is already in scope in its own initializer), a constructor of a class type must not assume the this pointer is unaliased; e. g. in
```
struct S {
  int a;
  template <typename F>
  S(F f) : a(42) { f(); }
};

void test() {
  S s([&] { std::println("{}", s.a); });
}
```
writing of 42 to a must not be reordered after the call to f. Here we can see if the reorder happened, the lambda would access s.a before it gets initialized.

Similarly, an NRVO function must not assume its return slot is unaliased:
```
template <typename F, typename G>
Huge make_huge(F f, G g) {
  f();
  Huge result(some_args...); // NRVO
  g();
  return result;
}

void test() {
  alignas(Huge) std::byte
      storage[sizeof(Huge)]{};
  new(storage) Huge(make_huge(
      [&] { inspect_bytes(storage); },
      [&] { inspect_bytes(storage); }));
}
```
(I zero-initialize storage so that the first inspect_bytes doesn’t touch uninitialized memory.) make_huge must construct its return value exactly in between f() and g(), since f() and g() can and will observe memory writes to the return slot that is result. (I’m not saying that we don’t want this specific behavior; read on.)

And now we get to the punchline: if we change make_huge to
```
template <typename F, typename G>
Huge make_huge(F f, G g) {
  f();
  Huge local(some_args...);
  g();
  return std::move(local); // !!!
}
```
make_huge now must construct its return value exactly after g()! The initialization of the return value is now performed in the return statement after g(), and the compiler cannot reorder it above g() for the same reason it couldn’t move the initialization around in the previous example. But local’s constructor has to be called before g() (unless the compiler knows the constructor is pure, in which case it can reorder it at will), and since the constructor must not be called on the return slot (it must not be touched before g() returns), the only way to call the constructor is to call it on a Huge chunk of locally allocated stack memory.

And there you have it: unless we change the abstract machine semantics, not just NRVO, but any form of RVO is outright made impossible by the fact that C++ doesn’t restrict aliasing of under-construction objects sufficiently.
- Raymond Chen Author June 3, 2024
  
  The standard already solved this problem. It simply says that NRVO is allowed, without any requirement that it preserve observable behavior.
  - 紅樓鍮 June 3, 2024
    
    Do you think more forms of RVO can be made possible by making it undefined behavior to alias an under-construction object from outside its constructor? And if so, does it break enough code to make it problematic in practice?
  - Kevin Norris June 4, 2024
    
    Any alias, of any partially constructed object? No, there’s no way you can do that without breaking all sorts of things. Any function call from the constructor, which passes the this pointer or a pointer/reference to any field as an argument, would invoke UB, and that’s far too wide a net. You would break logging, object registration, probably some kinds of caching/interning, possibly even some kinds of dependency injection, and presumably a whole pile of other stuff.
  - 紅樓鍮 June 4, 2024
    
    My description was inaccurate; I meant changing the constructor’s this pointer to something similar to Rust’s exclusive references (&mut) or C’s restrict pointers, which can be used to derive child pointers but cannot be aliased by pointers that are not transitively derived from it.
    
    Notwithstanding that, I’ve realized adding the rule above is not sufficient to enable RVO because logging, object registration, etc. can be in cahoots with g() and smuggle the pointer to somewhere g() can access, which will again enable g() to observe writes to the return slot.
    