How can I prevent myself from accessing a lambda captured variable or a parameter after I’m done with it?

Raymond

We saw last time that you can have your cake and eat it too: You can have a capturing lambda coroutine provided you copy the captures to the coroutine frame before the coroutine’s first suspension point. From that point onward, you use the copies in the frame instead of the original captured variables, because the original captured variables may no longer exist!

You therefore must exercise willpower to resist the temptation to access those captured variables, since they are now poison. Is there a way to help stop yourself from doing a bad thing? Like, put a “Do not touch” sticker on the captured variable?

We can do this by taking advantage of name hiding, which also goes by the name shadowing. Name hiding is normally something you are scared of, because it often happens by mistake:

int open_or_alt(char const* name, char const* altname)
{
    int fd = open(name, 0);
    if (fd == -1 && errno == ENOENT) {
        // accidentally shadowed outer "fd" variable
        int fd = open(altname, 0);
    }
    return fd;
}

In our case, we can use it on purpose. Suppose we define some type, call it hide_name, and we use it to hide the names of things we don’t want to see any more.

void RegisterClickHandler(Button const& button, int key)
{
    button.Click([key](auto&&, auto&&)
        -> winrt::fire_and_forget
        {
            auto copiedKey = key;
            hide_name key; // "key is now dead to me"
            co_await winrt::resume_background();
            NotifyClick(key); // want this to be an error
        });
}

What we’re doing is declaring a local variable named key which hides the captured variable with the same name. Any attempt to use key after the point of declaration will use the local variable version and not the captured variable version. Let’s assume that hide_name is carefully defined so that using it instead of the hidden variable has maximum likelihood of generating a compiler error. (We’ll look at how we can do this later.)

You can also hide parameters, which is handy if you have a coroutine that must accept parameters by reference (say, for some compatibility reason), but needs to access them after a suspension point.

// This used to refresh synchronously, but now we refresh
// asynchronously. The parameter is still passed by reference
// for compatibility.
winrt::fire_and_forget Refresh(winrt::com_ptr<Widget> const& widget)
{
    auto copiedWidget = widget;
    hide_name widget;
    co_await winrt::resume_background();
    copiedWidget->Reset();
    copiedWidget->Reload();
}

Note that you can hide names from outer scopes. You cannot hide a name from the same scope.

winrt::fire_and_forget GiveAwayTheWidget()
{
    Widget widget;
    widget.Initialize();
    GiveAway(std::move(widget));

    // Try to prevent accidentally using a moved-from variable.
    hide_name widget; // error

    ...
}

Since hide_name blah; is a declaration, you can hide multiple names by separate them with commas.

    hide_name a, b, c; // all three names are hidden

Okay, so the big question is “What should hide_name be in order to maximize the likelihood that using it will result in a compiler error?”

I came up with this:

using hide_name = void(struct hidden_name);

This says that hide_name is an alias for a function that accepts an incomplete type called hidden_name and which returns a void. When you write

hide_name a;

what you’re really saying is “a is a function that accepts an object of type hidden_name and which returns void.” It’s just a function declaration, not a variable declaration. But function names are still names, so they participate in name hiding.

Using the name a by itself will decay to a function pointer, which is not convertible to anything interesting. In particular, you can’t even do this:

void* p = &a;

Function pointers are not compatible with data pointers, so this catches the case where somebody accidentally tried to pass the address of a captured variable to something that accepts a void*.

You can’t do arithmetic on function pointers, so those will generate errors. And you can’t call the function pointer because its sole parameter is hidden_name which is an incomplete type.

Functions cannot be assigned to, so that blocks any attempt to write to a.

If somehow somebody gets past all these barriers (say by explicitly casting to uintptr_t on an implementation that supports it), they will fail at link time with an unresolved external complaining that there is no function named a that accepts a hidden_name as a parameter. (No such function can exist because hidden_name is an incomplete type.)

Here’s what I found in my experiments:

Accessing the variable in various ways.

void test(int);
int i;
hide_name a;

// All accept this. "v" is a pointer to function.
// You will probably get errors when you try to use "v", though.
auto v = a;

// msvc: C2440: cannot convert from 'hide_name (__cdecl *)' to 'int'
// gcc: invalid conversion from 'void (*)(hidden_name)' to 'int' in assignment
// clang: assigning to 'int' from incompatible type 'hide_name'
// icc: a value of type "hide_name *" cannot be assigned to an entity of type "int"
i = a;

// msvc: C2659: '=': function as left operand
// gcc: assignment of function 'void a(hidden_name)'
// clang: non-object type 'hide_name' is not assignable
// icc: expression must be a modifiable lvalue
a = i;

// msvc: cannot convert argument 1 from 'hide_name (__cdecl *)' to 'int'
// gcc: nvalid conversion from 'void (*)(hidden_name)' to 'int'
// clang: no matching function for call to 'test'
// icc: argument of type "hide_name *" is incompatible with parameter of type "int"
test(a);

// msvc: C2296: '+': illegal, left operand has type 'hide_name (__cdecl *)'
// gcc: pointer to a function used in arithmetic
// clang: arithmetic on a pointer to the function type 'hide_name'
// icc: a value of type "hide_name *" cannot be assigned to an entity of type "int"
i = a + 1;

// msvc: C2296: '+': illegal, left operand has type 'hide_name (__cdecl *)'
// gcc: pointer to a function used in arithmetic
// clang: arithmetic on a pointer to the function type 'hide_name'
// icc: NO ERROR AT COMPILE TIME! (Fails to link.)¹
auto p = a + 1;

// msvc: C2070: 'hide_name': illegal sizeof operand
// gcc: ISO C++ forbids applying 'sizeof' to an expression of function type²
// clang: invalid application of 'sizeof' to a function type
// icc: warning: operand of sizeof may not be a function
auto size = sizeof(a);

// msvc: C228: left of '.method' must have class/struct/union
// gcc: request for member 'method' in 'a', which is of non-class type 'hide_name'
// clang: member reference base type 'hide_name' is not a structure or union
// icc: expression must have class type
a.method();

Furthermore, since this is a function declaration and not a variable declaration, you are less likely to get “unused local variable” errors. gcc, clang, and icc ignore the unused function declaration.

Unfortunately, the place where this trick is most useful is in coroutines, and that’s where compilers today are the weakest. MSVC 19.28 was okay with it, but it seems that MSVC 19.29 does complain about unused function declarations in coroutines. As of this writing, a fix is pending release. gcc, on the other hand, totally freaks out about function declarations in coroutines: Version 11 produces the inscrutable error “data member ‘enclosing_function()::_Z18enclosing_functionv.frame::__a.2.3’ invalid declared function.” Version 12 blows up with an internal compiler error.

So maybe you should hold off on this for a bit longer.

¹ For some reason, icc treats the size of a function as 1. Sometimes it warns you that it’s making this nonstandard decision. Other times, it just does it without telling you.

² I find it amusing that one of the gcc error messages says “ISO C++ forbids…”, as if it’s saying “Hey, I totally would let you get away with this, but those pesky C++ standard folks tell me I’m not allowed to. So don’t blame me!”

12 comments

Comments are closed. Login to edit/delete your existing comments

  • Stuart Ballard

    I’m guessing there’s some reason why you can’t just use the copy you’re already creating to do the hiding?

    void RegisterClickHandler(Button const& button, int key)
    {
        button.Click([key](auto&&, auto&&)
            -> winrt::fire_and_forget
            {
                auto key = key;
                co_await winrt::resume_background();
                NotifyClick(key);
            });
    }
    • Jacob Manaker

      In

      auto key=key;

      the right-hand key refers to the new variable you’re defining. (I recently ran into this teaching my students.) On the bright side, this allows you to use the size of a variable in its creation:

      DRAWTEXTPARAMS params = { sizeof(params), /*other values omitted*/ };
      • Stuart Ballard

        Good old C++, where there’s always a syntax rule to make impossible things easy and easy things impossible. I’m assuming that the end result of the rule you describe is that

        auto key = key;

        won’t even compile because of infinite recursion in deriving the implicit type.

        From my outsider’s knowledge of C++ I’m scratching my head trying to figure out if there’s some weird trick to get the equivalent result, but having followed this blog for a while, I imagine that if there is it requires a bunch of obscure template black magic, especially if you want to cover every possible type. I’m imagining some fantastical sorcery where the value becomes part of the declared datatype so it can appear *before* the newly-declared name to get the right scope!

        capture_the_value_of<key> key;
      • Harry Johnston

        The obvious thing to try instead would be

        auto temp = key;
        auto key = temp;

        and I would assume the compiler would optimize away the extra variable?

        (Maybe not, for non-trivial classes. C++ is too complicated for me.)

    • Neil Rashbrook

      I think there are some languages such as LISP that do let you do something similar (in LISP it would be something like (let ((foo foo)) (...))).

      • 紅樓鍮

        I think most functional (or modern) languages allow it. let input = sanitize input is a widely taught idiom in OCaml.

  • Neil Rashbrook

    auto p = a + 1; looks like an ICC bug; it generates the same code as it does for auto p = a; (according to my basic testing on godbolt).

  • wm w

    Interesting!

    Based on your example, I came up with the following alternative for `hide_name`:

    #define shadow(val) [[maybe_unused]] auto val = [](){ \
            struct shadowed_name{ \
                shadowed_name() = default; \
                shadowed_name(const shadowed_name&) = delete; \
                shadowed_name& operator= (const shadowed_name&) = delete; \
            }; \
            return shadowed_name{}; \
        }()

    This uses a ‘voldemort type’ with deleted constructors/assignment to ensure that there is nothing useful we can do with the type.
    While calling this does require the usage of parentheses and does not allow shadowing multiple things at once
    (although a variadic macro that does that could be added if you really want), the compiler errors are a lot more readable than with `hide_name`.

    The only test this does not prevent, is the `sizeof(a)` one.

    ——

    The alternative which some people noted where it would be nicer to make a copy directly rather than hiding the name combined with introducing a new name, might be implemented as follows:

    #define NAME_CONCAT_IMPL(a, b) a ## b
    #define NAME_CONCAT(a, b) NAME_CONCAT_IMPL(a, b)
    #define co_safe_capture(val) auto NAME_CONCAT(temp, __LINE__){val}; auto val{NAME_CONCAT(temp, __LINE__)};
    

    This could then be called, using one of your examples, like so:

    void RegisterClickHandler(Button const& button, int key)
    {
        button.Click([key](auto&&, auto&&)
            -> winrt::fire_and_forget
            {
                co_safe_capture(key);
                co_await winrt::resume_background();
                NotifyClick(key); // want this to be an error
            });
    }
    
    • Raymond ChenMicrosoft employee

      The `shadow()` macro does have another hole: I can accidentally do memcpy(&val, new_value, sizeof(new_value)); and corrupt memory. Using a function type avoids this problem because function pointers cannot be implicitly cast to data pointers.

      Your co_safe_capture could be ... auto& val{...}; to avoid a second copy.