{"id":43293,"date":"2014-12-26T07:00:00","date_gmt":"2014-12-26T07:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2014\/12\/26\/debugging-walkthrough-access-violation-on-nonsense-instruction\/"},"modified":"2014-12-26T07:00:00","modified_gmt":"2014-12-26T07:00:00","slug":"debugging-walkthrough-access-violation-on-nonsense-instruction","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20141226-00\/?p=43293","title":{"rendered":"Debugging walkthrough: Access violation on nonsense instruction"},"content":{"rendered":"<p>\nA colleague of mine asked for help puzzling out a mysterious crash\ndump which arrived via Windows Error Reporting.\n<\/p>\n<pre>\nrax=00007fff219c5000 rbx=00000000023c8380 rcx=00000000023c8380\nrdx=0000000000000000 rsi=00000000043f0148 rdi=0000000000000000\nrip=00007fff21af2d22 rsp=000000000392e518 rbp=000000000392e580\n r8=00000000276e4639  r9=00000000043b2360 r10=00000000ffffffff\nr11=0000000000000000 r12=0000000000000001 r13=0000000000000000\nr14=000000000237cfc0 r15=00000000023d3ea0\niopl=0         nv up ei pl zr na po nc\ncs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246\nnosebleed!CNosebleed::OnFrimble+0x1f891a:\n00007fff`21af2d22 30488b xor byte ptr [rax-75h],cl ds:00007fff`219c4f8b=41\n<\/pre>\n<p>\nWell that&#8217;s a pretty strange instruction.\nEspecially since it doesn&#8217;t match up with the source code at all.\n<\/p>\n<pre>\nvoid CNosebleed::OnFrimble(...)\n{\n    ...\n    if (CanFrumble(...))\n    {\n        ...\n    }\n    else\n    {\n        hr = pCereal-&gt;AddMilk(pCarton);\n        if (SUCCEEDED(hr))\n        {\n            pCereal-&gt;Snap();\n            pCereal-&gt;Crackle(false);\n            if (SUCCEEDED(pCereal-&gt;Pop(uId)) \/\/ &larr; crash here\n            {\n                ....\n            }\n        }\n    }\n    ....\n}\n<\/pre>\n<p>\nThere is no bit-toggling in the actual code.\nThe method calls to\n<a HREF=\"http:\/\/www.ricekrispies.com\/snap-crackle-pop\">\n<code>Snap<\/code>,\n<code>Crackle<\/code>, and\n<code>Pop<\/code><\/a>\nare all interface calls and therefore should be vtable calls.\nWe are clearly in a case of a bogus return address,\npossibly a stack smash (and therefore cause for concern from\na security standpoint).\n<\/p>\n<p>\nMy approach was to try to figure out what was happening\njust before the crash.\nAnd that meant figuring out how we ended up in the middle of\nan instruction.\n<\/p>\n<p>\nHere is the code surrounding the crash point.\n<\/p>\n<pre>\n00007fff`21af2d17 ff90d0020000    call    qword ptr [rax+2D0h]\n00007fff`21af2d1d 488b03          mov     rax,qword ptr [rbx]\n00007fff`21af2d20 8b5530          mov     edx,dword ptr [rbp+30h]\n00007fff`21af2d23 488bcb          mov     rcx,rbx\n<\/pre>\n<p>\nNotice that the code that crashed is actually the last byte of\nthe <code>mov edx, dword ptr [rbp+30h]<\/code> (the <code>30<\/code>)\nand the first two bytes of the <code>mov rcx, rbx<\/code> (the\n<code>488b<\/code>).\n<\/p>\n<p>\nDisassembling backward is a tricky business on a processor with\nvariable-length\ninstructions, so\nto get my bearings, I looked for the call to\n<code>Can&shy;Frumble<\/code>:\n<\/p>\n<pre>\n0:011&gt; #CanFrumble nosebleed!CNosebleed::OnFrimble\nnosebleed!CNosebleed::OnFrimble+0x1f883b\n00007fff`21af2c43 e8e0e40f00 call nosebleed!CNosebleed::CanFrumble\n<\/pre>\n<p>\nThe <code>#<\/code> command means\n&#8220;Start disassembling at the specified location and stop when you\nsee the string I passed.&#8221;\nThis is an automated way of just hitting <code>u<\/code> until you\nget to the thing you are looking for.\n<\/p>\n<p>\nNow that I am at some known good code, I can disassemble forward:\n<\/p>\n<pre>\n00007fff`21af2c48 488bcb          mov     rcx,rbx\n00007fff`21af2c4b 84c0            test    al,al\n00007fff`21af2c4d 0f849a000000    je      nosebleed!CNosebleed::OnFrimble+0x1f88e5 (00007fff`21af2ced)\n<\/pre>\n<p>\nThe above instructions check whether the <code>Can&shy;Frumble<\/code>\nreturned <code>true<\/code>, and if not, it jumps to\n<code>00007fff`21af2ced<\/code>.\nSince we know that we are in the <code>false<\/code> path,\nwe follow the jump.\n<\/p>\n<pre>\n\/\/ Make a vtable call into pCereal-&gt;AddMilk()\n00007fff`21af2ced 488b03          mov     rax,qword ptr [rbx] ; vtable\n00007fff`21af2cf0 498bd7          mov     rdx,r15 ; pCarton\n00007fff`21af2cf3 ff9068010000    call    qword ptr [rax+168h] ; call\n00007fff`21af2cf9 8bf8            mov     edi,eax ; save to hr\n00007fff`21af2cfb 85c0            test    eax,eax ; succeeded?\n00007fff`21af2dfd 0f880dffffff    js      nosebleed!CNosebleed::OnFrimble+0x1f8808 (00007fff`21af2c10)\n\/\/ Now call Snap()\n00007fff`21af2d03 488b03          mov     rax,qword ptr [rbx] ; vtable\n00007fff`21af2d06 488bcb          mov     rcx,rbx ; \"this\"\n00007fff`21af2d09 ff9070020000    call    qword ptr [rax+270h] ; Snap\n\/ Now call Crackle\n00007fff`21af2d0f 488b03          mov     rax,qword ptr [rbx] ; vtable\n00007fff`21af2d12 33d2            xor     edx,edx ; parameter: false\n00007fff`21af2d14 488bcb          mov     rcx,rbx ; \"this\"\n00007fff`21af2d17 ff90d0020000    call    qword ptr [rax+2D0h] ; Crackle\n\/\/ Get ready to Pop\n00007fff`21af2d1d 488b03          mov     rax,qword ptr [rbx] ; vtable\n00007fff`21af2d20 8b5530          mov     edx,dword ptr [rbp+30h] ; uId\n00007fff`21af2d23 488bcb          mov     rcx,rbx ; \"this\"\n<\/pre>\n<p>\nBut we never got to execute the <code>Pop<\/code>\nbecause our return address from <code>Crackle<\/code> got messed up.\n<\/p>\n<p>\nLet&#8217;s follow the call into <code>Crackle<\/code>.\n<\/p>\n<pre>\n0:011&gt; dps @rbx l1\n00000000`02b4b790  00007fff`219c50a0 nosebleed!CCereal::`vftable'\n0:011&gt; dps 00007fff`219c50a0+2d0 l1\n00007fff`219c5370  00007fff`21aa5c28 nosebleed!CCereal::Crackle\n0:011&gt; u 00007fff`21aa5c28\nnosebleed!CCereal::Crackle:\n00007fff`21aa5c28 889163010000    mov     byte ptr [rcx+163h],dl\n00007fff`21aa5c2e c3              ret\n<\/pre>\n<p>\nSo at least the <code>pCereal<\/code> pointer seems to be okay.\nIt has a vtable and the slot in the vtable points to the function\nwe expect.\nThe <code>Crackle<\/code> method merely stashes the\n<code>bool<\/code> parameter\ninto a member variable.\nNo stack corruption here because <code>rbx<\/code> is nowhere near\n<code>rsp<\/code>.\n<\/p>\n<pre>\n0:012&gt; db @rbx+163 l1\n00000000`02b4b8f3  ??                                               ?\n<\/pre>\n<p>\nSadly, the byte in question was not captured in the dump,\nso we cannot verify whether the call actually was made.\nSimilarly, the members of <code>CCereal<\/code> manipulated by the\n<code>Snap<\/code> method were also not captured in the dump,\nso we can&#8217;t verify that either.\n(The only member of <code>CCereal<\/code> captured in the dump\nis the vtable itself.)\n<\/p>\n<p>\nSo we can&#8217;t find any evidence one way or the other as to whether\nany of the calls leading up to <code>Pop<\/code> actually occurred.\nMaybe we can try to figure out how many misaligned instructions\nwe managed to execute before we crashed, see if that reveals\nanything.\nTo do this,\nI&#8217;m going to disassemble at varying incorrect offsets and see which\nones lead to the instruction that crashed.\n<\/p>\n<pre>\n0:011&gt; u .-1 l2\nnosebleed!CNosebleed::OnFrimble+0x1f8919:\n00007fff`21af2d21 55              push    rbp\n00007fff`21af2d22 30488b          xor     byte ptr [rax-75h],cl\n\/\/ ^^ this looks interesting; we'll come back to it\n0:011&gt; u .-3 l2\nnosebleed!CNosebleed::OnFrimble+0x1f8917:\n00007fff`21af2d1f 038b5530488b    add     ecx,dword ptr [rbx-74B7CFABh]\n00007fff`21af2d25 cb              retf\n\/\/ ^^ this doesn't lead to the crashed instruction\n0:011&gt; u .-4 l2\nnosebleed!CNosebleed::OnFrimble+0x1f8916:\n00007fff`21af2d1e 8b03            mov     eax,dword ptr [rbx]\n00007fff`21af2d20 8b5530          mov     edx,dword ptr [rbp+30h]\n\/\/ ^^ this doesn't lead to the crashed instruction\n0:012&gt; u .-5 l3\nnosebleed!CNosebleed::OnFrimble+0x1f8914:\n00007fff`21af2d1c 00488b          add     byte ptr [rax-75h],cl\n00007fff`21af2d1f 038b5530488b    add     ecx,dword ptr [rbx-74B7CFABh]\n00007fff`21af2d25 cb              retf\n\/\/ ^^ this doesn't lead to the crashed instruction\n0:012&gt; u .-6 l3\nnosebleed!CNosebleed::OnFrimble+0x1f8913:\n00007fff`21af2d1b 0000            add     byte ptr [rax],al\n00007fff`21af2d1d 488b03          mov     rax,qword ptr [rbx]\n00007fff`21af2d20 8b5530          mov     edx,dword ptr [rbp+30h]\n\/\/ ^^ this doesn't lead to the crashed instruction\n<\/pre>\n<p>\n<b>Exercise<\/b>: Why didn&#8217;t I bother checking\n<code>.-2<\/code>?\n<\/p>\n<p>\nYou only need to test as far back as the maximum instruction length,\nand in practice you can give up much sooner because the maximimum\ninstruction length involves a lot of prefixes which are unlikely to\noccur in real code.\n<\/p>\n<p>\nThe only single-instruction rewind that makes sense is the\n<code>push rbp<\/code>.\nLet&#8217;s see if it matches.\n<\/p>\n<pre>\n0:011&gt; ?? @rbp\nunsigned int64 0x453e700\n0:011&gt; dps @rsp l1\n00000000`0453e698  00000000`0453e700\n<\/pre>\n<p>\nYup, it lines up.\nThis wayward push is also consistent with the stack frame layout\nfor the function.\n<\/p>\n<pre>\nnosebleed!CNosebleed::OnFrimble:\n00007fff`218fa408 48895c2410      mov     qword ptr [rsp+10h],rbx\n00007fff`218fa40d 4889742418      mov     qword ptr [rsp+18h],rsi\n00007fff`218fa412 55              push    rbp\n00007fff`218fa413 57              push    rdi\n00007fff`218fa414 4154            push    r12\n00007fff`218fa416 4156            push    r14\n00007fff`218fa418 4157            push    r15\n00007fff`218fa41a 488bec          mov     rbp,rsp\n00007fff`218fa41d 4883ec60        sub     rsp,60h\n<\/pre>\n<p>\nThe values of <code>rbp<\/code> and <code>rsp<\/code>\nshould differ by <code>0x60<\/code>.\n<\/p>\n<pre>\n0:012&gt; ?? @rbp-@rsp\nunsigned int64 0x68\n<\/pre>\n<p>\nThe difference is in error by 8 bytes, exactly the size of the\n<code>rbp<\/code> register that was pushed.\n<\/p>\n<p>\nIt therefore seems highly likely that the\n<code>push rbp<\/code> was executed.\n<\/p>\n<p>\nRepeating the exercise to find the instruction before the\n<code>push rbp<\/code> shows that no instruction fell through to\nthe <code>push rbp<\/code>.\nTherefore, execution jumped to\n<code>00007fff`21af2d21<\/code> somehow.\n<\/p>\n<p>\nAnother piece of data is that <code>rax<\/code> matches the value\nwe expect it to have, sort of.\nHere are some selected lines from earlier in the debug session:\n<\/p>\n<pre>\n\/\/ What we expected to have executed\n00007fff`21af2d1e 8b03            mov     eax,dword ptr [rbx]\n\/\/ The value we expected to have fetched\n0:011&gt; dps @rbx l1\n00000000`02b4b790  00007fff`219c50a0 nosebleed!CCereal::`vftable'\n\/\/ The value in the rax register\nrax=00007fff219c5000 ...\n<\/pre>\n<p>\nThe value we expect is\n<code>00007fff`219c50a0<\/code>,\nbut the value in the register has the bottom eight bits cleared.\n<\/p>\n<p>\nPutting this all together, my theory is that the CPU executed\nthe instruction at\n<code>00007fff`21af2d1e<\/code>,\nand then due to some sort of hardware failure,\ninstead of incrementing the <code>rip<\/code> register by two,\nit (1)&nbsp;incremented it by three, and then\n(2)&nbsp;as part of its confusion,\nzeroed out the bottom byte of <code>rax<\/code>.\nThe erroneous <code>rip<\/cODE> led\nto the rogue <code>push rbp<\/code> and the crash\non the nonsensical <code>xor<\/code>.\n<\/p>\n<p>\nIt's not a great theory, but it's all I got.\n<\/p>\n<p>\nAs to what sort of hardware failure could have occurred:\nThis particular failure was reported twice,\nso a cosmic ray is less likely to be the culprit\n(because you have to get lightning to strike twice)\nthan overheating or\n<a HREF=\"http:\/\/blogs.msdn.com\/b\/oldnewthing\/archive\/2005\/04\/12\/407562.aspx\">\noverclocking<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A colleague of mine asked for help puzzling out a mysterious crash dump which arrived via Windows Error Reporting. rax=00007fff219c5000 rbx=00000000023c8380 rcx=00000000023c8380 rdx=0000000000000000 rsi=00000000043f0148 rdi=0000000000000000 rip=00007fff21af2d22 rsp=000000000392e518 rbp=000000000392e580 r8=00000000276e4639 r9=00000000043b2360 r10=00000000ffffffff r11=0000000000000000 r12=0000000000000001 r13=0000000000000000 r14=000000000237cfc0 r15=00000000023d3ea0 iopl=0 nv up ei pl zr na po nc cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010246 nosebleed!CNosebleed::OnFrimble+0x1f891a: 00007fff`21af2d22 30488b [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[26],"class_list":["post-43293","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-other"],"acf":[],"blog_post_summary":"<p>A colleague of mine asked for help puzzling out a mysterious crash dump which arrived via Windows Error Reporting. rax=00007fff219c5000 rbx=00000000023c8380 rcx=00000000023c8380 rdx=0000000000000000 rsi=00000000043f0148 rdi=0000000000000000 rip=00007fff21af2d22 rsp=000000000392e518 rbp=000000000392e580 r8=00000000276e4639 r9=00000000043b2360 r10=00000000ffffffff r11=0000000000000000 r12=0000000000000001 r13=0000000000000000 r14=000000000237cfc0 r15=00000000023d3ea0 iopl=0 nv up ei pl zr na po nc cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010246 nosebleed!CNosebleed::OnFrimble+0x1f891a: 00007fff`21af2d22 30488b [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/43293","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=43293"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/43293\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=43293"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=43293"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=43293"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}