{"id":110257,"date":"2024-09-13T07:00:00","date_gmt":"2024-09-13T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=110257"},"modified":"2024-09-13T09:13:11","modified_gmt":"2024-09-13T16:13:11","slug":"20240913-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20240913-00\/?p=110257","title":{"rendered":"The case of the fail-fast crashes coming from the power management system"},"content":{"rendered":"<p>A customer reported that they were seeing around four thousand crashes a day from an internal function <code>Rtlp\u00adHandle\u00adInvalid\u00adUser\u00adCall\u00adTarget<\/code>. Here&#8217;s one of their crash dumps:<\/p>\n<pre style=\"line-height: normal;\">Child-SP          RetAddr               Call Site\r\n0000009b`c5bfefc8 00007ffd`e32ed7ad     ntdll!RtlFailFast2\r\n0000009b`c5bfefd0 00007ffd`e327c798     ntdll!RtlpHandleInvalidUserCallTarget+0x5d\r\n0000009b`c5bff000 00007ffd`dfdf2dee     ntdll!LdrpHandleInvalidUserCallTarget+0x38\r\n(Inline Function) --------`--------     powrprof!PowerpNotifyCallbackSafe+0x13\r\n<\/pre>\n<p>The <code>Rtlp\u00adHandle\u00adInvalid\u00adUser\u00adCall\u00adTarget<\/code> function is used by <a href=\"https:\/\/learn.microsoft.com\/windows\/win32\/secbp\/control-flow-guard\"> Control Flow Guard<\/a> when it detects that somebody is trying to call an invalid function pointer. 
So what is the invalid pointer?<\/p>\n<p>Since debugging is an exercise in optimism, let&#8217;s hope that the pointer is still in one of the registers.<\/p>\n<pre style=\"line-height: normal;\">0:103&gt; r\r\nrax=0000000000000000 rbx=<span style=\"border: solid 1px currentcolor;\">00007ffd5fe8be40<\/span> rcx=000000000000000a\r\nrdx=<span style=\"border: solid 1px currentcolor;\">00007ffd5fe8be40<\/span> rsi=0000000000000004 rdi=0000000000000000\r\nrip=00007ffde3292350 rsp=0000009bc5bfefc8 rbp=0000000000000000\r\n r8=0000000000000000  r9=0000000000000003 r10=0000000000000001\r\nr11=0000000000000000 r12=0000000000000000 r13=0000000000000000\r\nr14=0000009bc5bffa98 r15=000000007ffe03b0\r\nntdll!RtlFailFast2:\r\n0:103&gt;\r\n<\/pre>\n<p>There are only two things that look like possible function pointers,\u00b9 and they are both equal, so let&#8217;s see if we&#8217;re lucky.<\/p>\n<pre style=\"line-height: normal;\">0:103&gt; u @rbx L1\r\n&lt;Unloaded_ContosoVirtualCamera.dll&gt;+0x7be40:\r\n00007ffd`5fe8be40 ??              ???\r\n<\/pre>\n<p>Bingo. Got it in one.<\/p>\n<p>It kind of makes sense that we&#8217;d find the function pointer in the <code>rdx<\/code> register, since that holds the second function parameter. (The first function parameter is <code>rcx<\/code>, which holds the fail-fast code <code>0x0000000A<\/code>:<\/p>\n<pre style=\"line-height: normal;\">#define FAST_FAIL_GUARD_ICALL_CHECK_FAILURE         10\r\n<\/pre>\n<p>which tells us that we have a CFG failure. So it&#8217;s not too surprising that the second parameter is the pointer that failed validation.)<\/p>\n<p>If we wanted to be more methodical about it, we could look where the function pointer got saved. Let&#8217;s look at the code in <code>Rtlp\u00adHandle\u00adInvalid\u00adUser\u00adCall\u00adTarget<\/code> up to the point where it called <code>Rtl\u00adFail\u00adFast2<\/code> and see if we can follow where the function pointer went. 
The goal is to find a path from the start of the function to the <code>Rtl\u00adFail\u00adFast2<\/code>, so I&#8217;ll highlight that path and de-emphasize the rest.<\/p>\n<pre style=\"line-height: normal;\">ntdll!RtlpHandleInvalidUserCallTarget:\r\n<span style=\"border: solid 1px currentcolor; border-bottom: none;\">    push    rbx                           <\/span>\r\n<span style=\"border: 1px currentcolor; border-style: none solid;\">    sub     rsp,20h                       <\/span>\r\n<span style=\"border: 1px currentcolor; border-style: none solid;\">    cmp     byte ptr [00007ffd`e33712a2],0<\/span>\r\n<span style=\"border: 1px currentcolor; border-style: none solid;\">    mov     rbx,rcx \u2190 saved rcx in rbx    <\/span>\r\n<span style=\"border: solid 1px currentcolor; border-top: none;\">    je      00007ffd`e32ed77f             <\/span>\r\n<span style=\"opacity: 50%;\">    call    ntdll!RtlpGuardIsSuppressedAddress (00007ffd`e32ed720)<\/span>\r\n<span style=\"opacity: 50%;\">    test    al,al<\/span>\r\n<span style=\"opacity: 50%;\">    je      00007ffd`e32ed77f<\/span>\r\n<span style=\"opacity: 50%;\">    mov     edx,1<\/span>\r\n<span style=\"opacity: 50%;\">    mov     rcx,rbx<\/span>\r\n<span style=\"opacity: 50%;\">    call    ntdll!RtlpGuardGrantSuppressedCallAccess (00007ffd`e32375b8)<\/span>\r\n<span style=\"opacity: 50%;\">00007ffd`e32ed778:<\/span>\r\n<span style=\"opacity: 50%;\">    add     rsp,20h<\/span>\r\n<span style=\"opacity: 50%;\">    pop     rbx<\/span>\r\n<span style=\"opacity: 50%;\">    ret<\/span>\r\n<span style=\"opacity: 50%;\">    int     3<\/span>\r\n00007ffd`e32ed77f:\r\n<span style=\"border: solid 1px currentcolor; border-bottom: none;\">    call    ntdll!LdrControlFlowGuardEnforcedWithExportSuppression (00007ffd`e32234e8)<\/span>\r\n<span style=\"border: 1px currentcolor; border-style: none solid;\">    test    eax,eax                                                                   <\/span>\r\n<span style=\"border: 
solid 1px currentcolor; border-top: none;\">    je      00007ffd`e32ed7a0                                                         <\/span>\r\n<span style=\"opacity: 50%;\">    mov     rcx,rbx<\/span>\r\n<span style=\"opacity: 50%;\">    call    ntdll!RtlGuardIsExportSuppressedAddress (00007ffd`e323765c)<\/span>\r\n<span style=\"opacity: 50%;\">    test    al,al<\/span>\r\n<span style=\"opacity: 50%;\">    je      00007ffd`e32ed7a0<\/span>\r\n<span style=\"opacity: 50%;\">    mov     rcx,rbx<\/span>\r\n<span style=\"opacity: 50%;\">    call    ntdll!RtlpUnsuppressForwardReferencingCallTarget (00007ffd`e32ed7b4)<\/span>\r\n<span style=\"opacity: 50%;\">    test    eax,eax<\/span>\r\n<span style=\"opacity: 50%;\">    jns     00007ffd`e32ed778<\/span>\r\n00007ffd`e32ed7a0:\r\n<span style=\"border: solid 1px currentcolor; border-bottom: none;\">    mov     rdx,rbx \u2190 rbx moved to rdx            <\/span>\r\n<span style=\"border: 1px currentcolor; border-style: none solid;\">    mov     ecx,0Ah                               <\/span>\r\n<span style=\"border: solid 1px currentcolor; border-top: none;\">    call    ntdll!RtlFailFast2 (00007ffd`e3292350)<\/span>\r\n<\/pre>\n<p>By following the flow, we see that the inbound <code>rcx<\/code> was saved in <code>rbx<\/code>, and then copied back to <code>rdx<\/code> for the fail-fast. So that&#8217;s where the function pointer is, and that also explains why we see the same value in both <code>rbx<\/code> and <code>rdx<\/code>.<\/p>\n<p>The conclusion, therefore, is that the Contoso virtual camera driver registered a power management callback (hard to tell which one, but it&#8217;s going to be <code>Power\u00adRegister\u00adSuspend\u00adResume\u00adNotification<\/code> or something like that), and they forgot to unregister it before their DLL unloaded. 
And then a power event occurred, and the power management system called a callback that pointed into the unloaded DLL.<\/p>\n<p>So the next step here is to reach out to Contoso and let them know about the crashing bug in their virtual camera driver. Meanwhile, the customer can put the buggy versions of the Contoso virtual camera driver on their &#8220;do not use&#8221; list.<\/p>\n<p>\u00b9 Well, three if you count <code>rip<\/code>, but that&#8217;s not interesting because that&#8217;s the current instruction pointer!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Understanding why it decided to fail fast.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-110257","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Understanding why it decided to fail 
fast.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/110257","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=110257"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/110257\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=110257"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=110257"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=110257"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}