April 27th, 2023

What’s up with this new memory_order_consume memory order?

C++20 introduces a new atomic memory order: std::memory_order::consume, more commonly known as std::memory_order_consume, What is this guy?

The consume memory order is a weaker form of acquire. Whereas acquire prevents all future memory accesses from being ordered ahead of the load, the consume order only prevents dependent future memory accesses from being reorder ahead of the load.

In all the examples, let’s assume global variables declared and initialized as

int v1 = 1;
int v2 = 2;
std::atomic<int*> p{ &v1 };

Okay, let’s do some consuming.

auto sample_consume()
{
    auto q = p.load(std::memory_order_consume);
    return *q + v2;
}

The compiler is required to read the value from p into q, and any future calculations depending on that value must occur after the load.

This reordering is allowed:

auto sample_consume_allowed()
{
    auto prefetch2 = v2;
    auto q = p.load(std::memory_order_consume);
    return *q + prefetch2;
}

The value of v2 is not dependent on what was loaded from p. Therefore, the compiler and processor are permitted to advance the fetch of v2 ahead of the load of p. Note that an acquire load of p would have prohibited this reordering, since acquire loads block all future memory access, even if unrelated to the value being acquired.

However, this reordering of the above code is not allowed:

auto sample_consume_disallowed()
{
    auto speculate1 = v1;
    auto q = p.load(std::memory_order_consume);
    if (q == &v1) return speculate1 + v2;
    return *q + v2;
}

This speculation lets the code hide the memory latency of accessing v1 behind the load of p, and a compiler might choose to take advantage of this based on profiling feedback, and a processor might do it unilaterally because processors like to do speculative things nowadays. This would be allowed if the load from p were relaxed.

However, the consume memory order prohibits this transformation: The value loaded from p is dereferenced, and that dereference operation is dependent upon the value that was loaded, so the consume memory order requires that the dereference occur after the load.

Here’s a table, because people like tables.

Ordering Relaxed Consume Acquire
Load v2 before p Allowed Allowed Prohibited
Dereference p before load Allowed Prohibited Prohibited

The consume memory order is not used much. Atomic variables are typically tied to other variables in ways that don’t show up in expression dependency graphs, such as for use as mutual exclusion locks. The acquire memory order is much more commonly used than consume.

Topics
Code

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

2 comments

Discussion is closed. Login to edit/delete existing comments.

  • George Tokmaji

    Tiny nitpick: According to cppreference, `std::memory_order::consume` and Release-Consume ordering have been around since C++11, although some semantic revisions (and the std::memory_order_consume alias) were added with C++20.

    • Simon Farnsworth

      If your compiler predates C++20, however, the consume memory order is almost certainly treated as acquire (legal, just not faster). C++17 temporarily discouraged the use of the consume order because the semantics were impossible to use correctly, and thus all compilers were implementing it as acquire.

      The revisions in C++20 took some very smart people a lot of effort, and changed the definitions subtly so that you can use consume and get weaker memory ordering than...

      Read more