{"id":108387,"date":"2023-07-03T07:00:00","date_gmt":"2023-07-03T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=108387"},"modified":"2023-06-19T06:47:17","modified_gmt":"2023-06-19T13:47:17","slug":"20230703-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20230703-00\/?p=108387","title":{"rendered":"How to wait for multiple C++ coroutines to complete before propagating failure, symmetric transfer"},"content":{"rendered":"<p>Last time, <a title=\"How to wait for multiple C++ coroutines to complete before propagating failure, custom promise\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20230630-00\/?p=108382\"> we wrote a simple coroutine promise<\/a> to help us with our <code>when_<wbr \/>all_<wbr \/>completed<\/code> function. One obvious refinement we can make is to avoid stack build-up by using symmetric transfer.<\/p>\n<p>Observe that in both of the <code>await_suspend<\/code> flows, we resume another coroutine. In our initial implementation, we accomplished this by calling the <code>resume()<\/code> method on the desired coroutine handle. However, this results in stack build-up: The caller awaits the coroutine by calling <code>resume()<\/code> on the coroutine, and when the coroutine finishes, it returns control to the caller by calling <code>resume()<\/code> on the caller&#8217;s coroutine handle. 
So we&#8217;re two frames deep.<\/p>\n<p>This cycle repeats for each awaitable passed to the <code>when_<wbr \/>all_<wbr \/>completed<\/code> function, and there could be quite a few of them.<\/p>\n<p>We can use symmetric transfer to avoid the stack build-up, since the last thing each function does is resume some other coroutine.<\/p>\n<p>First, we&#8217;ll use symmetric transfer when starting the coroutine:<\/p>\n<pre>struct all_completed_result\r\n{\r\n    all_completed_promise&amp; promise;\r\n    bool await_ready() noexcept { return false; }\r\n    <span style=\"border: solid 1px currentcolor;\">auto<\/span> await_suspend(\r\n        std::coroutine_handle&lt;&gt; handle) noexcept;\r\n    std::exception_ptr await_resume() noexcept;\r\n};\r\n\r\n<span style=\"border: solid 1px currentcolor;\">auto<\/span> all_completed_result::\r\n    await_suspend(std::coroutine_handle&lt;&gt; handle)\r\n    noexcept\r\n{\r\n    promise.awaiting_coroutine = handle;\r\n    <span style=\"border: solid 1px currentcolor;\">return<\/span> promise.coroutine();\r\n}\r\n<\/pre>\n<p>When lazy-starting the coroutine, we return the coroutine&#8217;s handle instead of manually resuming it. This activates symmetric transfer, so the compiler can use a tail call to jump directly to the coroutine, avoiding a stack frame.<\/p>\n<p>Doing the same thing when the coroutine finishes takes a little more work because the symmetric transfer happens in <code>await_suspend<\/code>, but our original version was resuming the caller in <code>final_suspend<\/code>. 
We&#8217;ll have to arrange for the caller&#8217;s handle to be returned from <code>await_suspend<\/code>.<\/p>\n<pre>struct all_completed_promise\r\n{\r\n    ...\r\n\r\n    <span style=\"border: solid 1px currentcolor; border-bottom: none;\">auto final_suspend() noexcept {                               <\/span>\r\n    <span style=\"border: 1px currentcolor; border-style: none solid;\">    struct awaiter : std::suspend_always                      <\/span>\r\n    <span style=\"border: 1px currentcolor; border-style: none solid;\">    {                                                         <\/span>\r\n    <span style=\"border: 1px currentcolor; border-style: none solid;\">        std::coroutine_handle&lt;&gt; other;                        <\/span>\r\n    <span style=\"border: 1px currentcolor; border-style: none solid;\">        auto await_suspend(std::coroutine_handle&lt;&gt;) noexcept {<\/span>\r\n    <span style=\"border: 1px currentcolor; border-style: none solid;\">            return other;                                     <\/span>\r\n    <span style=\"border: 1px currentcolor; border-style: none solid;\">        }                                                     <\/span>\r\n    <span style=\"border: 1px currentcolor; border-style: none solid;\">    };                                                        <\/span>\r\n    <span style=\"border: solid 1px currentcolor; border-top: none;\">    return awaiter{{}, awaiting_coroutine};                   <\/span>\r\n    }\r\n};\r\n<\/pre>\n<p>This is the actual symmetric transfer part: We save the coroutine handle we want to resume in the awaiter, so that the awaiter can return it from <code>await_suspend<\/code>. Again, symmetric transfer allows the resumption of the awaiting coroutine to happen as a tail call, avoiding a stack frame.<\/p>\n<p>But we&#8217;re not done yet.<\/p>\n<p>We suspended the promise&#8217;s coroutine, so it remains allocated in memory. 
We need to destroy it after we extract the <code>eptr<\/code> in the <code>await_resume<\/code> that returns it to the caller.<\/p>\n<pre>std::exception_ptr all_completed_result::\r\n    await_resume() noexcept\r\n{\r\n    <span style=\"border: solid 1px currentcolor; border-bottom: none;\">auto eptr = promise.eptr;     <\/span>\r\n    <span style=\"border: 1px currentcolor; border-style: none solid;\">promise.coroutine().destroy();<\/span>\r\n    <span style=\"border: solid 1px currentcolor; border-top: none;\">return eptr;                  <\/span>\r\n}\r\n<\/pre>\n<p>Okay, so that reduces the likelihood of stack exhaustion issues when awaiting a whole bunch of awaitables inside <code>when_<wbr \/>all_<wbr \/>completed<\/code>.<\/p>\n<p>But wait, we haven&#8217;t addressed the <code>std::<wbr \/>bad_alloc<\/code> problem that we identified a while back. We got distracted with all the simplifications that a custom promise offered, but forgot why we wrote our own custom promise in the first place. 
Let&#8217;s return to that next time.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Avoiding stack build-up.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-108387","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Avoiding stack build-up.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/108387","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=108387"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/108387\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=108387"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=108387"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=108387"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}