Earlier, we improved our simple coroutine promise by delaying the resumption of awaiting coroutines until local variables have been destructed. This time, we’ll look at another improvement.
Recall that our coroutine is structured like this:
| Coroutine state | | | | Caller |
|---|---|---|---|---|
| bookkeeping | | | | |
| promise (contains holder) | → | result_ state | ← | holder |
| stack frame | | | | |
There are two allocations, one for the coroutine state, and one for the shared state internal to the result_. But what if we put the result_ shared state inside the promise? In other words, what if we made the promise be the result_ shared state?¹
This trick takes advantage of the fact that you are permitted to suspend in the final_. This lets you pause the coroutine execution before it gets to the point where it destroys the coroutine state.
The idea is that we move into the promise object all of the result_ shared state, including the reference count hiding inside the shared_ptr.
Let’s make the original diagram a bit more honest about the shared pointer control block. Recall that a shared_ptr is a pair of pointers, one to a control block and one to the shared data, and the control block consists of two reference counts, one for strong references and one for weak references.
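The two counts in the control block can be observed indirectly through the standard library. The following is a small sketch; the `counts` struct and the `observe_control_block` function are made up for this illustration. It relies only on `std::shared_ptr::use_count` and `std::weak_ptr::expired`, and it is single-threaded, so the counts it reads are exact.

```cpp
#include <cassert>
#include <memory>

// Hypothetical record of what we observed along the way.
struct counts {
    long strong_before;      // strong count with one shared_ptr
    long strong_after_copy;  // strong count with two shared_ptrs
    bool alive_after_reset;  // object still alive after dropping one
    bool weak_expired;       // weak_ptr reports the object is gone
};

counts observe_control_block() {
    auto first = std::make_shared<int>(42);
    std::weak_ptr<int> watcher = first;   // adds a weak reference only
    long before = first.use_count();      // one strong reference
    auto second = first;                  // adds a strong reference
    long after = first.use_count();       // two strong references
    first.reset();                        // drop one strong reference
    bool alive = !watcher.expired();      // second still keeps it alive
    second.reset();                       // last strong reference gone
    bool expired = watcher.expired();     // object destroyed
    return {before, after, alive, expired};
}
```

Note that `watcher` never contributes to `use_count()`: weak references are tracked separately, and they keep only the control block alive, not the shared data.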
| Coroutine state | | | | Caller |
|---|---|---|---|---|
| bookkeeping | | | | |
| promise (contains holder) | → | refcounts | ← | holder |
| | → | result_ state | ← | |
| stack frame | | | | |
What we’re doing is moving the shared pointer control block and the shared state into the promise.
| Coroutine state | | Caller |
|---|---|---|
| bookkeeping | | |
| promise: refcount | ← | holder |
| promise: result_ state | | |
| stack frame | | |
We don’t need to support weak references at all, so we are down to just one reference count.
A running coroutine has a reference to its own state, and any outstanding holder objects also have a reference to the coroutine state. Only when all references go away do we destroy the coroutine state.
We’re going to have to rewrite a bunch of stuff basically from scratch, seeing as we’re abandoning the entire shared_ptr model that we had been using up until now. Let’s hope it’s worth it.
Bonus chatter: I figured I’d do the whole shared_ptr thing first, since it makes the several-week-long path to this point easier to follow. If I had started directly with the “result holder state embedded in the coroutine state”, it would probably have been too confusing.
¹ Thanks to Gor Nishanov for providing this inspiration.
I was thinking of the opposite: overload promise_type::operator new to use make_shared_for_overwrite to allocate for the coroutine state.