{"id":107770,"date":"2023-01-31T07:00:00","date_gmt":"2023-01-31T15:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=107770"},"modified":"2023-01-30T12:11:46","modified_gmt":"2023-01-30T20:11:46","slug":"20230131-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20230131-00\/?p=107770","title":{"rendered":"Inside C++\/WinRT: Apartment switching: Error reporting"},"content":{"rendered":"<p>So far, we&#8217;ve been looking at how C++\/WinRT handles apartment switching, and I noted that everything works when it works. But what if it doesn&#8217;t work?<\/p>\n<p>Recall that the core of the apartment-switching code is this function:<\/p>\n<pre>void resume_apartment_sync(\r\n    com_ptr&lt;IContextCallback&gt; const&amp; context,\r\n    std::coroutine_handle&lt;&gt; handle)\r\n{\r\n    com_callback_args args{};\r\n    args.data = handle.address();\r\n\r\n    check_hresult(\r\n        context-&gt;ContextCallback(resume_apartment_callback,\r\n            &amp;args,\r\n            guid_of&lt;ICallbackWithNoReentrancyToApplicationSTA&gt;(),\r\n            5, nullptr));\r\n}\r\n<\/pre>\n<p>If the <code>ContextCallback<\/code> method fails, <code>check_hresult<\/code> will throw a C++\/WinRT exception to whoever is calling.<\/p>\n<p>In the case of <code>co_await<\/code>&#8216;ing an <code>apartment_<wbr \/>context<\/code>, the caller is the <code>await_<wbr \/>suspend()<\/code> that is running in the context of the calling coroutine, so the caller can handle (or not handle) the exception as it sees fit.<\/p>\n<p>The case that doesn&#8217;t work is the case where we are trying to return to the original apartment context when an awaited coroutine completes. In that case, the exception is thrown from the completion handler, which runs in the context of the completed coroutine, rather than the context of the resuming coroutine. 
That means that a failure to return to the original context is not catchable by the caller:<\/p>\n<pre>winrt::IAsyncAction Outer()\r\n{\r\n    co_await Inner();\r\n}\r\n<\/pre>\n<p>After <code>Inner()<\/code> completes, we try to return to the original COM context of <code>Outer()<\/code>, but if that fails, the exception is thrown in the <code>Completed<\/code> handler that we passed to <code>Inner<\/code>. The <code>Outer<\/code> never gets to see it. The <code>Outer<\/code> coroutine never resumes, which manifests itself as a hung coroutine (that is also leaked).<\/p>\n<p>To fix this, we need to resume the <code>Outer<\/code> coroutine, and then throw the exception as part of the execution of <code>Outer<\/code>. The <code>resume_<wbr \/>apartment_<wbr \/>sync()<\/code> function is not running in the context of the <code>Outer<\/code>, so it can&#8217;t throw the exception yet. It has to save the error, so it can be thrown later.<\/p>\n<pre>inline void resume_apartment_sync(\r\n    com_ptr&lt;IContextCallback&gt; const&amp; context,\r\n    coroutine_handle&lt;&gt; handle,\r\n    <span style=\"color: #08f;\">int32_t* failure<\/span>)\r\n{\r\n    com_callback_args args{};\r\n    args.data = handle.address();\r\n    <span style=\"color: #08f;\">auto result =<\/span>\r\n        context-&gt;ContextCallback(resume_apartment_callback,\r\n            &amp;args,\r\n            guid_of&lt;ICallbackWithNoReentrancyToApplicationSTA&gt;(),\r\n            5, nullptr);\r\n\r\n    <span style=\"color: #08f;\">if (result &lt; 0) {\r\n        *failure = result;\r\n        handle();\r\n    }<\/span>\r\n}\r\n<\/pre>\n<p>If we are unable to resume the coroutine in the correct apartment, then we record the failure in the caller-provided location and then <i>resume the coroutine anyway<\/i> on the wrong thread. 
The expectation is that upon resumption, the coroutine will check that location and see that the apartment-switch failed and re-throw the exception, this time while inside the execution context of the <code>Outer<\/code>.<\/p>\n<p><b>Exercise<\/b>: Why does <code>resume_<wbr \/>apartment_<wbr \/>sync()<\/code> update <code>*failure<\/code> only if <code>Context\u00adCallback<\/code> failed? Shouldn&#8217;t we update it on success, too?<\/p>\n<pre>    <span style=\"color: #08f;\">*failure =<\/span>\r\n        context-&gt;ContextCallback(resume_apartment_callback,\r\n            &amp;args,\r\n            guid_of&lt;ICallbackWithNoReentrancyToApplicationSTA&gt;(),\r\n            5, nullptr);\r\n    <span style=\"color: #08f;\">if (*failure &lt; 0) {\r\n        handle();\r\n    }<\/span>\r\n<\/pre>\n<p>The answer to the exercise is at the end of this article.<\/p>\n<p>We now need to teach our callers to call <code>resume_<wbr \/>apartment_<wbr \/>sync<\/code> in the new way:<\/p>\n<pre>struct apartment_awaiter\r\n{\r\n    apartment_context const&amp; context;\r\n    <span style=\"color: #08f;\">int32_t failure = 0;<\/span>\r\n\r\n    bool await_ready() const noexcept\r\n    {\r\n        return false;\r\n    }\r\n\r\n    void await_suspend(coroutine_handle&lt;&gt; handle)\r\n    {\r\n        apartment_context extend_lifetime = context;\r\n        resume_apartment(context.context, handle,\r\n            <span style=\"color: #08f;\">&amp;failure<\/span>);\r\n    }\r\n\r\n    void await_resume() const <span style=\"color: #c65353;\">\/\/ <span style=\"text-decoration: line-through;\">noexcept<\/span><\/span>\r\n    {\r\n        <span style=\"color: #08f;\">check_hresult(failure);<\/span>\r\n    }\r\n};\r\n<\/pre>\n<p>When the coroutine resumes, it calls <code>await_resume()<\/code>, and that is where we check whether the apartment switch was successful. 
If not, we throw an exception from <code>await_resume()<\/code>, which is running in the context of <code>Outer<\/code> and therefore can be caught and reported like any other exception that occurs in a coroutine.<\/p>\n<p>We do the same thing for coroutine resumption after <code>co_await<\/code>&#8216;ing a Windows Runtime asynchronous operation.<\/p>\n<pre>template &lt;typename Async&gt;\r\nstruct await_adapter\r\n{\r\n    await_adapter(Async const&amp; async) : async(async) { }\r\n\r\n    Async const&amp; async;\r\n    <span style=\"color: #08f;\">int32_t failure = 0;<\/span>\r\n\r\n    bool await_ready() const noexcept\r\n    {\r\n        return false;\r\n    }\r\n\r\n    void await_suspend(coroutine_handle&lt;&gt; handle)\r\n    {\r\n        auto extend_lifetime = async;\r\n        async.Completed([\r\n            handle,\r\n            <span style=\"color: #08f;\">this,<\/span>\r\n            context = resume_apartment_context()\r\n        ](auto&amp;&amp; ...)\r\n        {\r\n            resume_apartment(context.context, handle,\r\n                <span style=\"color: #08f;\">&amp;failure<\/span>);\r\n        });\r\n    }\r\n\r\n    auto await_resume() const\r\n    {\r\n        <span style=\"color: #08f;\">check_hresult(failure);<\/span>\r\n        return async.GetResults();\r\n    }\r\n};\r\n<\/pre>\n<p>Things are getting better, but there is still room for improvement. We&#8217;ll continue our study next time.<\/p>\n<p><b>Answer to exercise<\/b>: We cannot store the result into <code>*failure<\/code> unconditionally, because a successful call to <code>Context\u00adCallback<\/code> resumes the coroutine. When the coroutine resumes, it calls <code>await_resume()<\/code> and then destructs the awaiter. 
If we had updated <code>*failure<\/code> on success, we risk writing to an already-destructed object and corrupting memory.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you can&#8217;t get back to where you started, who you gonna call?<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-107770","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>If you can&#8217;t get back to where you started, who you gonna call?<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/107770","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=107770"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/107770\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=107770"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=107770"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=107770"}],"curies":[{"name":"wp","href":"h
ttps:\/\/api.w.org\/{rel}","templated":true}]}}