{"id":93635,"date":"2016-06-09T07:00:00","date_gmt":"2016-06-09T21:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/?p=93635"},"modified":"2019-03-13T11:50:50","modified_gmt":"2019-03-13T18:50:50","slug":"20160609-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20160609-00\/?p=93635","title":{"rendered":"Investigating an app compat problem: Part 2: Digging in"},"content":{"rendered":"<p>We left our story with the conclusion that the program crashed because its TLS slot was null. But how can we figure out who sets the TLS slot and why it failed to set the TLS slot? <\/p>\n<p>Let&#8217;s hope that the reason is close to the failure (because debugging is an exercise in optimism) and see if we can find the code that is supposed to set the TLS value and figure out why it failed. <\/p>\n<p>This is where we roll up our sleeves and get our hands dirty. <\/p>\n<p>Here is the function that crashed. Let&#8217;s do some reverse-compilation. My personal convention is as follows: <\/p>\n<ul>\n<li>Register-sized variables are left untyped until I figure out what type it really is. If I must specify a type for a variable declaration, I use <code>int<\/code> or <code>void*<\/code>. (If the type turns out really to be an <code>int<\/code>, I use <code>int32_t<\/code>.) 
<\/li>\n<li>Local variables are named <code>localXX<\/code> where <code>XX<\/code> is the offset of the variable relative to the frame pointer.<\/li>\n<li>Member variables are named <code>m_XX<\/code> where <code>XX<\/code> is the offset of the member relative to the start of the object.<\/li>\n<li>Functions are named <code>f_XXXXXXXX<\/code> where <code>XXXXXXXX<\/code> is the address of the first instruction.<\/li>\n<\/ul>\n<pre>\ncontoso!ContosoInitialize+0x4d40:\n314259a0 push    ebp\n314259a1 mov     ebp, esp\n314259a3 sub     esp, 10h                   \/\/ 16 bytes of local variables\n314259a6 mov     dword ptr [ebp-10h], ecx   \/\/ local10 = this\n314259a9 mov     eax, dword ptr [ebp+8]     \/\/ arg1\n314259ac mov     dword ptr [ebp-8], eax     \/\/ local8 = arg1\n314259af lea     ecx, [ebp-0Ch]             \/\/ &amp;localc\n314259b2 push    ecx\n314259b3 lea     edx, [ebp-4]               \/\/ &amp;local4\n314259b6 push    edx\n314259b7 mov     eax, dword ptr [ebp-8]     \/\/ local8\n314259ba push    eax\n314259bb call    contoso!ContosoInitialize+0x4db0 (31425a10)\n314259c0 add     esp, 0Ch\n314259c3 mov     edx, 1\n314259c8 mov     ecx, dword ptr [ebp-0Ch]   \/\/ localc\n314259cb shl     edx, cl                    \/\/ 1 &lt;&lt; localc\n314259cd mov     eax, dword ptr [ebp-4]     \/\/ local4\n314259d0 mov     ecx, dword ptr [ebp-10h]   \/\/ this\n314259d3 mov     eax, dword ptr [ecx+eax*4] \/\/ this-&gt;m_0[local4]\n314259d6 and     eax, edx                   \/\/ this-&gt;m_0[local4] &amp; (1 &lt;&lt; localc)\n314259d8 test    eax, eax\n314259da je      contoso!ContosoInitialize+0x4d83 (314259e3) \/\/ jump if bit was clear\n314259dc mov     eax, 1                     \/\/ return 1\n314259e1 jmp     contoso!ContosoInitialize+0x4da3 (31425a03)\n314259e3 mov     edx, 1\n314259e8 mov     ecx, dword ptr [ebp-0Ch]   \/\/ localc\n314259eb shl     edx, cl                    \/\/ 1 &lt;&lt; localc\n314259ed mov     eax, dword ptr [ebp-4]     \/\/ 
local4\n314259f0 mov     ecx, dword ptr [ebp-10h]   \/\/ this\n314259f3 mov     eax, dword ptr [ecx+eax*4] \/\/ this-&gt;m_0[local4]\n314259f6 or      eax, edx                   \/\/ this-&gt;m_0[local4] | (1 &lt;&lt; localc)\n314259f8 mov     ecx, dword ptr [ebp-4]     \/\/ local4\n314259fb mov     edx, dword ptr [ebp-10h]   \/\/ this\n314259fe mov     dword ptr [edx+ecx*4], eax \/\/ this-&gt;m_0[local4] = this-&gt;m_0[local4] | (1 &lt;&lt; localc)\n31425a01 xor     eax, eax                   \/\/ return 0\n31425a03 mov     esp, ebp\n31425a05 pop     ebp\n31425a06 ret     4\n0:000&gt;<\/pre>\n<p>\nThe lack of common subexpression elimination and the frequent\nspilling and reloading of registers tell me that this code was compiled\nwith optimizations disabled.\nBad for performance, but it makes reverse-engineering so much easier.\nWe end up with this, after renaming some variables and propagating stores.\n<\/p>\n<pre>\nBOOL Class1::f_314259a0(int arg1)\n{\n    int elementIndex;\n    int relativeBitIndex;\n    f_31425a10(arg1, &amp;elementIndex, &amp;relativeBitIndex);\n    if (this-&gt;m_0[elementIndex] &amp; (1 &lt;&lt; relativeBitIndex))\n    {\n        return TRUE;\n    }\n    else\n    {\n        this-&gt;m_0[elementIndex] =\n        this-&gt;m_0[elementIndex] | (1 &lt;&lt; relativeBitIndex);\n        return FALSE;\n    }    \n}\n<\/pre>\n<p>This function locates a bit in a buffer, and if the bit is not set, it sets it. Either way, the function returns the previous state of the bit. Let&#8217;s look at the function that calculates which bit to set. 
<\/p>\n<pre>\ncontoso!ContosoInitialize+0x4db0:\n31425a10 push    ebp\n31425a11 mov     ebp,esp\n31425a13 mov     eax,dword ptr [ebp+8]      \/\/ arg1\n31425a16 shr     eax,5                      \/\/ arg1 \/ 32 (unsigned)\n31425a19 mov     ecx,dword ptr [ebp+0Ch]    \/\/ arg2\n31425a1c mov     dword ptr [ecx],eax        \/\/ *arg2 = arg1 \/ 32\n31425a1e mov     eax,dword ptr [ebp+8]      \/\/ arg1\n31425a21 xor     edx,edx                    \/\/ zero-extend to 64 bits\n31425a23 mov     ecx,20h\n31425a28 div     eax,ecx                    \/\/ edx = arg1 % 32 (unsigned)\n31425a2a mov     eax,dword ptr [ebp+10h]    \/\/ arg3\n31425a2d mov     dword ptr [eax],edx        \/\/ *arg3 = arg1 % 32\n31425a2f pop     ebp\n31425a30 ret\n<\/pre>\n<p>Okay, so the bit index is nothing fancy. The buffer at <code>m_0<\/code> is treated as a giant bit array, and this function figures out which element holds that bit (the bit index divided by 32) and where that bit is within the element (the bit index modulo 32). We also learned that the incoming and outgoing parameters are unsigned 32-bit integers, because the code uses <code>shr<\/code> and <code>div<\/code>, which are unsigned operations, rather than their signed counterparts <code>sar<\/code> and <code>idiv<\/code>. We don&#8217;t know how big the bit array is, but at least we can give the function a nicer name. 
<\/p>\n<p>We can capture what we&#8217;ve learned as follows: <\/p>\n<pre>\nclass SomeBitArrayClass1\n{\npublic:\n    BOOL SetBit(uint32_t bitIndex);\n\nprivate:\n    static void CalcBitPosition(\n        uint32_t bitIndex,\n        uint32_t* elementIndex,\n        uint32_t* relativeBitIndex);\n\n    uint32_t buffer[unknown_size];\n};\n\nBOOL SomeBitArrayClass1::SetBit(uint32_t bitIndex)\n{\n    uint32_t elementIndex;\n    uint32_t relativeBitIndex;\n    CalcBitPosition(bitIndex, &amp;elementIndex, &amp;relativeBitIndex);\n    if (this-&gt;buffer[elementIndex] &amp; (1 &lt;&lt; relativeBitIndex))\n    {\n        return TRUE;\n    }\n    else\n    {\n        this-&gt;buffer[elementIndex] =\n        this-&gt;buffer[elementIndex] | (1 &lt;&lt; relativeBitIndex);\n        return FALSE;\n    }    \n}\n<\/pre>\n<p>Sure, the code that sets the bit could have been written as <\/p>\n<pre>\nthis-&gt;buffer[elementIndex] |= (1 &lt;&lt; relativeBitIndex);\n<\/pre>\n<p>but I&#8217;m just repeating the code that was written, and what they wrote calculates the indexed element address twice. <\/p>\n<p>We&#8217;re off to a good start, but we haven&#8217;t really learned much yet. Much more interesting is the function that produced the null pointer that caused us to crash. <\/p>\n<p>We&#8217;ll pick that up next time. 
<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Understanding the scenario a little more.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-93635","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Understanding the scenario a little more.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/93635","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=93635"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/93635\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=93635"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=93635"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=93635"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}