October 14th, 2020

A brief introduction to C++ structured binding

C++17 introduced a feature known as structured binding. It allows a single source object to be taken apart:

std::pair<int, double> p{ 42, 0.0 };
auto [i, d] = p;
// int i = 42;
// double d = 0.0;

It seems that no two languages agree on what to call this feature. C# calls it deconstructing. JavaScript calls it destructuring. (Python doesn’t seem to have a specific name for this concept, although the common case where the source is a list does have the name list comprehension.) Python calls it unpacking. My guess is that C++ avoided both of these terms to avoid confusion with the word destructor.

There is a subtlety in the way structured binding works: Binding qualifiers on the auto apply to how the source is bound, not on how the destination is bound.¹

For example,

auto&& [i, d] = p;

becomes (approximately)¹

If p.get<N> exists If p.get<N> does not exist
auto&& hidden = p;
decltype(auto) i = p.get<0>();
decltype(auto) d = p.get<1>();
auto&& hidden = p;
decltype(auto) i = get<0>(p);
decltype(auto) d = get<1>(p);

where hidden is a hidden variable introduced by the compiler. The declarations of i and d are inferred from the get method or free function.²

(In a cruel twist of fate, if hidden is const or a const reference, then that const-ness propagates to the destinations. But the reference-ness of hidden does not propagate.)

The decltype(auto) means that the reference-ness of the return type is preserved rather than decayed. If get returns a reference, that reference or qualifier is preserved. This differs from auto which will decay references to copies.

All of this comes into play when you want to make your own objects available to structured binding, which we’ll look at next time.

Bonus chatter: There is no way to specify that you want only selected pieces. You must bind all the pieces.

¹ In reality, the bound variables have underlying type std::tuple_element_t<N, T> (where N is the zero-based index and T is the type of the source), possibly with references added. But in practice, these types match the return types of get, so it’s easier just to say that they come from get.

² My reading of the language specification is that the destination variables are always references:

[dcl.struct.bind]
3. … [E]ach vᵢ is a variable of type “reference to Tᵢ” initialized with the initializer, where the reference is an lvalue reference if the initializer is an lvalue and an rvalue reference otherwise.

and therefore the expansion of the structured binding would be

auto&& i = get<0>(p);
auto&& d = get<1>(p);

However, in practice, the compilers declare the destination variables as matching the return value of get, as I noted above.

So I must be reading the specification wrong.

(The text was revised for C++20, but even in the revision, it’s still a reference.)

¹ That’s because a structured binding really is a hidden variable plus a bunch of references to the pieces of that hidden variable. That’s why the qualifiers apply to the hidden variable, not to the aliases.

Topics
Code

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

7 comments

Discussion is closed. Login to edit/delete existing comments.

  • Buster C

    In the table, “hidden” is declared and not used. It looks like the second and third “p” in two of the cells should be “hidden”.

  • Julien Oster

    This is called “Pattern Matching” in ML, Haskell, and many other functional languages, which have had this feature as a central concept for many decades.

    And I’m absolutely not arguing that this means it should be called “Pattern Matching” anywhere else, since that’s just plain confusing in today’s world. You quickly tire of saying “pattern matching… no, not the regular expression kind, the data type kind”.

  • Dwayne Robinson

    I love this addition to the language, but I don't love the zoo of when to use [] vs () vs {} now :/. Historically a heterogeneous group (tuple) of arguments has been expressed via either parentheses (e.g. function calls and constructor calls: foo(a, b, c) or MyClass myClass(42, 3.14159f)) or braces (myStruct = {42, 3.14159f};), and so one would expect either...

    <code>
    <code>

    ...but instead we get something that (without spaces) can easily look more like...

    Read more
  • Adam Rosenfield

    Rust follows JavaScript’s terminology and calls this feature destructuring as well.

    (Side note: there are two separate footnotes #1 in this article, which are made even more confusing since the order seen in the article is 1, 1, 2, while the order in the trailer is 1, 2, 1.)

    • Kalle Niemitalo

      Common Lisp likewise calls it destructuring; see Macro DESTRUCTURING-BIND.

  • Alexandre Brault

    Python uses the term unpacking (usually tuple unpacking) for its equivalent feature. List comprehensions are a different thing

    • Raymond ChenMicrosoft employee Author

      Thanks. I’ve updated the article.