{"id":105815,"date":"2021-10-20T07:00:00","date_gmt":"2021-10-20T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=105815"},"modified":"2023-08-04T07:00:28","modified_gmt":"2023-08-04T14:00:28","slug":"20211020-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20211020-00\/?p=105815","title":{"rendered":"My code crashed when I asked WIL to convert an exception to an HRESULT, did I throw an improper exception type?"},"content":{"rendered":"<p>A customer used <a href=\"https:\/\/docs.microsoft.com\/en-us\/cpp\/cppcx\/wrl\/windows-runtime-cpp-template-library-wrl?view=msvc-160\"> WRL<\/a> to implement their COM objects, with the help of <a href=\"https:\/\/github.com\/microsoft\/wil\">wil<\/a>. Meanwhile, they also had client code that instantiated one of those COM objects. An exception was thrown and caught, but the code crashed trying to convert it to an <code>HRESULT<\/code>:<\/p>\n<pre>\/\/ Simplified\r\nint __cdecl wmain()\r\n{\r\n    CoInitialize(nullptr);\r\n    HRESULT result;\r\n    try {\r\n        auto widget = wil::CoCreateInstance&lt;ContosoWidget,\r\n                                            IContosoWidget&gt;();\r\n        result = widget-&gt;ReversePolarity();\r\n    } catch (...) 
{\r\n        result = wil::ResultFromCaughtException(); \/\/ crash here\r\n    }\r\n\r\n    if (SUCCEEDED(result)) {\r\n        return 0;\r\n    }   \r\n\r\n    DisplayErrorMessage(result);\r\n    return 1;\r\n}\r\n<\/pre>\n<p>The crash looked like this:<\/p>\n<pre>ucrtbase!FindHandler&lt;__FrameHandler4&gt;+0x2d3:\r\n00007ff9`c3ffec2f movsxd  r13,dword ptr [rcx+0Ch]\r\n                                    ds:00007ff9`abf00b94=????????\r\n\r\nucrtbase!FindHandler&lt;__FrameHandler4&gt;+0x2d3\r\nucrtbase!__InternalCxxFrameHandler&lt;__FrameHandler4&gt;+0x278\r\nucrtbase!__CxxFrameHandler4+0xa0\r\nntdll!RtlpExecuteHandlerForException+0xf\r\nntdll!RtlDispatchException+0x244\r\nntdll!RtlRaiseException+0x185\r\nKERNELBASE!RaiseException+0x69\r\nucrtbase!_CxxThrowException+0xad\r\nclient!wil::details::ResultFromCaughtExceptionInternal+0xad\r\nclient!wil::ResultFromCaughtException+0x33\r\nclient!wmain$catch$6+0x12\r\nucrtbase!_CallSettingFrame_LookupContinuationIndex+0x20\r\nucrtbase!__FrameHandler4::CxxCallCatchBlock+0x10d\r\nntdll!RcFrameConsolidation+0x6\r\nclient!wmain+0x99\r\nclient!invoke_main+0x22\r\nclient!__scrt_common_main_seh+0x10c\r\nkernel32!BaseThreadInitThunk+0x14\r\nntdll!RtlUserThreadStart+0x21\r\n<\/pre>\n<p>The default behavior of <code>wil::Result\u00adFrom\u00adCaught\u00adException<\/code> is to fail fast on an unrecognized exception. But the above looks like a crash, not a fail-fast exception. Is that what fail-fasts look like now? How can we dig out the exception type that went unrecognized?<\/p>\n<p>The crash we see is not a fail-fast exception. What happened is that we crashed while trying to decode the exception. We haven&#8217;t gotten to the point of rejecting it and failing fast; we don&#8217;t yet know what it is!<\/p>\n<p>Visual Studio comes with source code for the <code>FindHandler<\/code> function (in <code>frame.cpp<\/code>), so we can use that to help us figure out where things blew up. 
In fact, all we need is the function prototype:<\/p>\n<pre>template &lt;class T&gt;\r\nstatic void FindHandler(\r\n    EHExceptionRecord *pExcept, \/\/ Information for this (logical)\r\n                                \/\/   exception\r\n    EHRegistrationNode *pRN,    \/\/ ...\r\n    CONTEXT *pContext,          \/\/ Context info\r\n    \/* other parameters not interesting *\/\r\n)\r\n<\/pre>\n<p>The valuable information is the <code>pExcept<\/code>, which tells us the exception we&#8217;re trying to handle, and the <code>pContext<\/code> which tells us who threw it.<\/p>\n<p>The Windows x86-64 calling convention passes the first parameter in <code>rcx<\/code> and the third parameter in <code>r8<\/code>, so those are the ones we need to track.<\/p>\n<pre>0:000&gt; u .-2d3\r\nucrtbase!FindHandler&lt;__FrameHandler4&gt;\r\n00007ff9`c3ffe95c push    rbp\r\n00007ff9`c3ffe95e push    rbx\r\n00007ff9`c3ffe95f push    rsi\r\n00007ff9`c3ffe960 push    rdi\r\n00007ff9`c3ffe961 push    r12\r\n00007ff9`c3ffe963 push    r13\r\n00007ff9`c3ffe965 push    r14\r\n00007ff9`c3ffe967 push    r15\r\n00007ff9`c3ffe969 lea     rbp,[rsp-88h]\r\n00007ff9`c3ffe971 sub     rsp,188h\r\n00007ff9`c3ffe978 mov     rax,qword ptr [ucrtbase!__security_cookie (00007ff9`c40af450)]\r\n00007ff9`c3ffe97f xor     rax,rsp\r\n00007ff9`c3ffe982 mov     qword ptr [rbp+70h],rax\r\n00007ff9`c3ffe986 mov     rax,qword ptr [rbp+108h]\r\n00007ff9`c3ffe98d mov     r12,rdx\r\n00007ff9`c3ffe990 mov     r14,qword ptr [rbp+0F0h]\r\n00007ff9`c3ffe997 <span style=\"border: solid 1px currentcolor;\">mov     rbx,rcx<\/span> \/\/ exception\r\n00007ff9`c3ffe99a mov     qword ptr [rbp-60h],rdx\r\n00007ff9`c3ffe99e xor     r13b,r13b\r\n00007ff9`c3ffe9a1 mov     rcx,r14\r\n00007ff9`c3ffe9a4 <span style=\"border: solid 1px currentcolor;\">mov     qword ptr [rsp+70h],r8<\/span> \/\/ context\r\n00007ff9`c3ffe9a9 mov     rdx,r9\r\n00007ff9`c3ffe9ac mov     qword ptr [rbp-80h],rax\r\n00007ff9`c3ffe9b0 mov     
rsi,r9\r\n<\/pre>\n<p>Okay, so the exception pointer is in <code>rbx<\/code> and the context pointer is on the stack at <code>rsp+70h<\/code>.<\/p>\n<pre>0:000&gt; .exr @rbx\r\nExceptionAddress: 00007ff9c3a64ed9 (KERNELBASE!RaiseException+0x0000000000000069)\r\n   ExceptionCode: e06d7363 (C++ EH exception)\r\n  ExceptionFlags: 00000001\r\nNumberParameters: 4\r\n   Parameter[0]: 0000000019930520\r\n   Parameter[1]: 000000638477c910\r\n   Parameter[2]: 00007ff9abf00b88\r\n   Parameter[3]: 00007ff9abec0000\r\n  pExceptionObject: 000000638477c910\r\n  _s_ThrowInfo    : 00007ff9abf00b88\r\n<\/pre>\n<p>The <code>.exr<\/code> command was kind enough to decode the parameters of the thrown C++ exception and give us the exception object. Let&#8217;s look at that exception object:<\/p>\n<pre>0:000&gt; dps 000000638477c910\r\n00000063`8477c910  00007ff9`abef3110 &lt;Unloaded_contoso.dll&gt;+0x33110\r\n00000063`8477c918  00000000`00000000\r\n00000063`8477c920  00000000`00000000\r\n00000063`8477c928  80070005`00000000\r\n<\/pre>\n<p>Uh-oh.<\/p>\n<p>What happened here is that <code>contoso.dll<\/code> threw an exception, and it escaped the module and was caught by <code>client!wmain<\/code>. And as the stack unwound, at some point the DLL got unloaded, so when <code>client!wmain<\/code> tried to inspect the object, it crashed trying to figure out <i>what it was<\/i>.<\/p>\n<p>The <code>ucrtbase!<wbr \/>FindHandler&lt;<wbr \/>__FrameHandler4&gt;<\/code> appears to be going through the same exercise I described some time ago when I explained how to <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20100730-00\/?p=13273\"> decode the parameters of a thrown C++ exception<\/a>. But it crashed before it could get to the type information.<\/p>\n<p>Notice the value <code>0x80070005<\/code>, which corresponds to <code>E_<wbr \/>ACCESSDENIED<\/code>. 
There&#8217;s a good chance that that is the exception being thrown.<\/p>\n<p>We hope that the exception that was thrown was a WIL exception. (Remember, debugging is an exercise in optimism.) Since <code>client.dll<\/code> also uses WIL, we can use <code>client.dll<\/code> debugging information to parse the <code>wil::<wbr \/>Result\u00adException<\/code>:<\/p>\n<pre>0:000&gt; dt client!wil::ResultException\r\n   +0x000 __VFN_table : Ptr64\r\n   +0x008 _Data            : __std_exception_data\r\n   +0x018 m_failure        : wil::StoredFailureInfo\r\n   +0x0b8 m_what           : wil::details::shared_buffer\r\n0:000&gt; ?? ((client!wil::ResultException*)0x00000063`8477c910)\r\nclass wil::ResultException * 0x00000063`8477c910\r\n   +0x000 __VFN_table : 0x00007ff9`abef3110\r\n   +0x008 _Data            : __std_exception_data\r\n   +0x018 m_failure        : wil::StoredFailureInfo\r\n   +0x0b8 m_what           : wil::details::shared_buffer\r\n0:000&gt; ?? ((client!wil::ResultException*)0x00000063`8477c910)-&gt;m_failure\r\nclass wil::StoredFailureInfo\r\n   +0x000 m_failureInfo    : wil::FailureInfo\r\n   +0x090 m_spStrings      : wil::details::shared_buffer\r\n0:000&gt; ?? 
((client!wil::ResultException*)0x00000063`8477c910)-&gt;m_failure.m_failureInfo\r\nstruct wil::FailureInfo\r\n   +0x000 type             : 0 ( Exception )\r\n   +0x004 hr               : 80070005\r\n   +0x008 failureId        : 0n1\r\n   +0x010 pszMessage       : (null)\r\n   +0x018 threadId         : 0x7d830\r\n   +0x020 pszCode          : (null)\r\n   +0x028 pszFunction      : (null)\r\n   +0x030 pszFile          : 0x000001d5`679582e4  \"contoso\widget\connection.cpp\"\r\n   +0x038 uLineNumber      : 44\r\n   +0x03c cFailureCount    : 0n1\r\n   +0x040 pszCallContext   : (null)\r\n   +0x048 callContextOriginating : wil::CallContextInfo\r\n   +0x060 callContextCurrent : wil::CallContextInfo\r\n   +0x078 pszModule        : 0x000001d5`67958314  \"contoso.dll\"\r\n   +0x080 returnAddress    : 0x00007ff9`abed698c Void\r\n   +0x088 callerReturnAddress : 0x00007ff9`abed6654 Void\r\n<\/pre>\n<p>Things seem to line up pretty well. Line 44 of <code>connection.cpp<\/code> could indeed throw an exception:<\/p>\n<pre>Connection::Connection()\r\n{\r\n    ...\r\n    session = wil::CoCreateInstance&lt;ContosoUserSession,\r\n                                    IContosoUserSession&gt;();\r\n    ...\r\n}\r\n<\/pre>\n<p>Let&#8217;s tell the debugger to load symbols for <code>contoso.dll<\/code> based on its last known address. That will make those return addresses decodable.<\/p>\n<pre>0:000&gt; !reload \/unl contoso.dll\r\n\r\n0:000&gt; ln 0x00007ff9`abed698c\r\n(00007ff9`abed6864)   contoso!Microsoft::WRL::Details::\r\n             MakeAndInitialize&lt;ContosoWidget,IUnknown&gt;+0x128\r\n\r\n0:000&gt; ln 0x00007ff9`abed6654\r\n(00007ff9`abed6600)   contoso!Microsoft::WRL::\r\n    SimpleClassFactory&lt;ContosoWidget,0&gt;::CreateInstance+0x54\r\n<\/pre>\n<p>The exception was thrown from <code>MakeAndInitialize<\/code>, which strongly suggests that it came from the inlined constructor of <code>ContosoWidget<\/code>. The <code>pContext<\/code> will help us confirm this theory. 
Recall that we learned that the context pointer is on the stack at <code>rsp+70h<\/code>.<\/p>\n<pre>0:000&gt; .cxr poi(@rsp+70)\r\nrax=00007ff9abef55c8 rbx=00007ff9abf00b88 rcx=000000638477c690\r\nrdx=0000000700000020 rsi=000000638477f0e0 rdi=000000638477c910\r\nrip=00007ff9c3a64ed9 rsp=000000638477c7a0 rbp=000000638477c8e0\r\n r8=000001d567966d52  r9=0000000000000000 r10=000000638477c179\r\nr11=0000000000000003 r12=0000000000000000 r13=0000000000000000\r\nr14=000000000000002c r15=00007ff9abef6710\r\niopl=0         nv up ei pl nz na po nc\r\ncs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b\r\nKERNELBASE!RaiseException+0x69:\r\n00007ff9`c3a64ed9 0f1f440000      nop     dword ptr [rax+rax]\r\n0:000&gt; k\r\n  *** Stack trace for last set context - .thread\/.cxr resets it\r\nChild-SP          Call Site\r\n00000063`8477c7a0 KERNELBASE!RaiseException+0x69\r\n00000063`8477c880 ucrtbase!_CxxThrowException+0xad\r\n00000063`8477c8f0 contoso!wil::details::ThrowResultExceptionInternal+0x25\r\n00000063`8477c9f0 contoso!wil::ThrowResultException+0x16\r\n00000063`8477ca20 contoso!wil::details::ReportFailure+0x174\r\n00000063`8477df90 contoso!wil::details::ReportFailure_Hr+0x44\r\n00000063`8477dff0 contoso!wil::details::in1diag3::_Throw_Hr+0x26\r\n(Inline Function) contoso!wil::details::in1diag3::Throw_IfFailed+0x6a\r\n(Inline Function) contoso!Connection::{ctor}+0x98\r\n00000063`8477e040 contoso!Microsoft::WRL::Details::MakeAndInitialize&lt;ContosoWidget,IUnknown&gt;+0x128\r\n00000063`8477e0b0 contoso!Microsoft::WRL::SimpleClassFactory&lt;ContosoWidget,0&gt;::CreateInstance+0x54\r\n00000063`8477e0f0 combase!CServerContextActivator::CreateInstance+0x1d4\r\n00000063`8477e270 combase!ActivationPropertiesIn::DelegateCreateInstance+0x90\r\n00000063`8477e300 combase!CApartmentActivator::CreateInstance+0x9c\r\n00000063`8477e3b0 combase!CProcessActivator::CCICallback+0x58\r\n00000063`8477e400 combase!CProcessActivator::AttemptActivation+0x40\r\n00000063`8477e450 
combase!CProcessActivator::ActivateByContext+0x91\r\n00000063`8477e4e0 combase!CProcessActivator::CreateInstance+0x80\r\n00000063`8477e530 combase!ActivationPropertiesIn::DelegateCreateInstance+0x90\r\n00000063`8477e5c0 combase!CClientContextActivator::CreateInstance+0x17f\r\n00000063`8477e870 combase!ActivationPropertiesIn::DelegateCreateInstance+0x90\r\n00000063`8477e900 combase!ICoCreateInstanceEx+0x90a\r\n00000063`8477f7f0 combase!CComActivator::DoCreateInstance+0x169\r\n(Inline Function) combase!CoCreateInstanceEx+0xd1\r\n00000063`8477f950 combase!CoCreateInstance+0x10c\r\n00000063`8477f9f0 client!wil::CoCreateInstance&lt;ContosoWidget,IContosoWidget,wil::err_exception_policy&gt;+0x3f\r\n00000063`8477fa40 client!wmain+0x99\r\n(Inline Function) client!invoke_main+0x22\r\n00000063`8477fac0 client!__scrt_common_main_seh+0x10c\r\n00000063`8477fb00 kernel32!BaseThreadInitThunk+0x14\r\n00000063`8477fb30 ntdll!RtlUserThreadStart+0x21\r\n<\/pre>\n<p>This gives us a much better view of what&#8217;s going on. The <code>ContosoWidget<\/code> object has a member variable that is a <code>Connection<\/code>, and construction of the <code>Connection<\/code> failed with an exception. The exception propagated out of the constructor, which destructed the partially-constructed <code>ContosoWidget<\/code>. It then propagated past <code>Make\u00adAnd\u00adInitialize<\/code> and the COM infrastructure, and was finally caught back in the client.<\/p>\n<p>The WRL library operates at the COM ABI layer, which means that it generally requires that nothing throws exceptions. (There are some places where it does support exceptions, but in general it doesn&#8217;t.) 
You can see that it slaps <code>throw()<\/code> around all its COM methods, meaning &#8220;These COM methods don&#8217;t throw any exceptions (<a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20180928-00\/?p=99855\">and I&#8217;m trusting you to honor that rule, no enforcement<\/a>).&#8221;<\/p>\n<p>In this case, the exception escaped <code>contoso.dll<\/code> and unwound all the way across <code>combase.dll<\/code>, which violates the rule against <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20120910-00\/?p=6653\"> throwing exceptions across stack frames you don&#8217;t control<\/a>.<\/p>\n<p>What seems to have happened is that as the exception propagated out of COM, COM realized that something bad happened in <code>contoso.dll<\/code>. &#8220;Hey there buddy, you feeling okay? Do you want to go home?&#8221; COM called <code>contoso.dll<\/code>&#8217;s <code>Dll\u00adCan\u00adUnload\u00adNow<\/code> function, and since there were no active COM objects in <code>contoso.dll<\/code>, it said, &#8220;Yeah, I&#8217;m not needed here. You can unload me.&#8221;<\/p>\n<p>But there <i>was<\/i> an active object in <code>contoso.dll<\/code>: the exception object that it just threw!<\/p>\n<p>COM unloaded <code>contoso.dll<\/code> so it could go home and get some rest. And then when <code>client!wmain<\/code> caught the exception and tried to interrogate it, it crashed because the exception source had already been unloaded.<\/p>\n<p>The fix here is not to throw exceptions from constructors of COM objects that use <code>WRL<\/code> as their factory, because the WRL factory is just going to wave good-bye to the exception as it leaves the DLL. 
(It doesn&#8217;t have much choice, seeing as it has no way of interpreting the exception and converting it to an <code>HRESULT<\/code>.)<\/p>\n<p>If you want to fail the creation of an object implemented in WRL, you can do so by moving all the potentially-failable things out of the constructor and into the <code>Runtime\u00adClass\u00adInitialize<\/code> method. You can have that method return a failure <code>HRESULT<\/code> when it is not happy.<\/p>\n<p>We went back to the code and found that in the time since this crash was identified, somebody else had already fixed the bug by accident! The member variable type was changed from <code>Connection<\/code> to <code>std::unique_ptr&lt;Connection&gt;<\/code>, and the <code>Connection<\/code> itself was created on demand rather than in the constructor. The reason for the change was commented as<\/p>\n<pre>    \/\/ Establish a connection on first use. This is done here instead of the\r\n    \/\/ constructor so we can return a meaningful HRESULT to the caller.\r\n<\/pre>\n<p>Moving the failure out of the constructor also has the nice benefit of not crashing.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Digging into the failure more closely.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-105815","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Digging into the failure more 
closely.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/105815","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=105815"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/105815\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=105815"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=105815"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=105815"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}