{"id":104922,"date":"2021-03-03T07:00:00","date_gmt":"2021-03-03T15:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=104922"},"modified":"2021-03-16T20:11:20","modified_gmt":"2021-03-17T03:11:20","slug":"20210303-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20210303-00\/?p=104922","title":{"rendered":"Creating a co_await awaitable signal that can be awaited multiple times, part 3"},"content":{"rendered":"<p>Last time, we <a title=\"Creating a co_await awaitable signal that can be awaited multiple times, part 2\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20210302-00\/?p=104918\">created an awaitable signal that can be awaited multiple times<\/a>, but noted that it required a lot of kernel transitions. Let&#8217;s implement the entire thing in user mode.<\/p>\n<pre>struct awaitable_event\r\n{\r\n  void set() const { shared-&gt;set(); }\r\n\r\n  auto await_ready() const noexcept\r\n  {\r\n    return shared-&gt;await_ready();\r\n  }\r\n\r\n  auto await_suspend(\r\n    std::experimental::coroutine_handle&lt;&gt; handle) const\r\n  {\r\n    return shared-&gt;await_suspend(handle);\r\n  }\r\n\r\n  auto await_resume() const noexcept\r\n  {\r\n    return shared-&gt;await_resume();\r\n  }\r\n\r\nprivate:\r\n  struct state\r\n  {\r\n    std::atomic&lt;bool&gt; signaled = false;\r\n    winrt::slim_mutex mutex;\r\n    std::vector&lt;std::experimental::coroutine_handle&lt;&gt;&gt; waiting;\r\n\r\n    void set()\r\n    {\r\n      std::vector&lt;std::experimental::coroutine_handle&lt;&gt;&gt; ready;\r\n      {\r\n        auto guard = winrt::slim_lock_guard(mutex);\r\n        signaled.store(true, std::memory_order_relaxed);\r\n        std::swap(waiting, ready);\r\n      }\r\n      for (auto&amp;&amp; handle : ready) handle();\r\n    }\r\n\r\n    bool await_ready() const noexcept\r\n    { return signaled.load(std::memory_order_relaxed); }\r\n\r\n    bool await_suspend(\r\n      
std::experimental::coroutine_handle&lt;&gt; handle)\r\n    {\r\n      auto guard = winrt::slim_lock_guard(mutex);\r\n      if (signaled.load(std::memory_order_relaxed)) return false;\r\n      waiting.push_back(handle);\r\n      return true;\r\n    }\r\n\r\n    void await_resume() const noexcept { }\r\n  };\r\n\r\n  std::shared_ptr&lt;state&gt; shared = std::make_shared&lt;state&gt;();\r\n};\r\n<\/pre>\n<p>The <code>awaitable_<wbr \/>event<\/code> contains a <code>shared_<wbr \/>ptr<\/code> to an internal <code>state<\/code> object, which is where all the work really happens. Operations on the <code>awaitable_<wbr \/>event<\/code> are all forwarded to the <code>state<\/code> object, so all of the public methods are relatively uninteresting. The excitement happens in the <code>state<\/code> object, so let&#8217;s focus on that.<\/p>\n<p>To wait for the <code>awaitable_<wbr \/>event<\/code>, we begin with <code>await_<wbr \/>ready<\/code>, which returns whether the event is already signaled. If it is already signaled, then <code>await_<wbr \/>ready<\/code> returns <code>true<\/code>, which bypasses the suspension entirely. An event that represents &#8220;initialization complete&#8221; will spend nearly all of its time in the signaled state, and this short-circuit gives an optimized path for the compiler so it doesn&#8217;t have to spill register variables in the case that the event is already signaled.<\/p>\n<p>If the event is not signaled, then we get to <code>await_<wbr \/>suspend<\/code>. We take the lock and check a second time whether the event has been signaled. If so, then we return <code>false<\/code> meaning &#8220;I reject the suspension. 
Keep running.&#8221;\u00b9<\/p>\n<p>On the other hand, if the event is truly not signaled, then we push the coroutine handle onto our list of waiting coroutine handles, and we&#8217;re done.<\/p>\n<p>To signal the event, we take the lock, mark the event as signaled, and swap out the vector of waiting coroutine handles for an empty list. These coroutine handles are now ready: We iterate over the vector and resume each one.<\/p>\n<p>This works relatively well, except that once you have a large number of waiting coroutines (say, because initialization is taking a really long time), the <code>push_back<\/code> on the vector might take a long time if the vector needs to be reallocated. The operation is still amortized <var>O<\/var>(1), but the per-instance cost can be as high as <var>O<\/var>(<var>n<\/var>).<\/p>\n<p>Furthermore, the <code>push_back<\/code> can throw an exception due to low memory (note that <code>await_suspend<\/code> is not marked <code>noexcept<\/code>).<\/p>\n<p>We&#8217;ll address both of these issues next time.<\/p>\n<p>\u00b9 I always have to pause to think whenever I get to the <code>return<\/code> statements in the <code>await_<wbr \/>ready<\/code> and <code>await_<wbr \/>suspend<\/code> methods, because the return values have opposite sense. 
I have to remember that you want to &#8220;suspend if not ready&#8221;.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Doing it all ourselves, without any need to go into kernel mode.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-104922","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Doing it all ourselves, without any need to go into kernel mode.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/104922","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=104922"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/104922\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=104922"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=104922"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=104922"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}