{"id":6463,"date":"2012-09-28T07:00:00","date_gmt":"2012-09-28T07:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2012\/09\/28\/data-in-crash-dumps-are-not-a-matter-of-opinion\/"},"modified":"2012-09-28T07:00:00","modified_gmt":"2012-09-28T07:00:00","slug":"data-in-crash-dumps-are-not-a-matter-of-opinion","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20120928-00\/?p=6463","title":{"rendered":"Data in crash dumps are not a matter of opinion"},"content":{"rendered":"<p>\nA customer reported a problem with the\n<code>System&shy;Time&shy;To&shy;Tz&shy;Specific&shy;Local&shy;Time<\/code>\nfunction.\n(Gosh, why couldn&#8217;t they have reported a problem with a function with\na shorter name!\nNow I have to type that thing over and over again.)\n<\/p>\n<blockquote CLASS=\"q\">\n<p>\nWe&#8217;re having a problem with the\n<code>System&shy;Time&shy;To&shy;Tz&shy;Specific&shy;Local&shy;Time<\/code>\nfunction.\nWe call it like this:\n<\/p>\n<pre>\ns_pTimeZones-&gt;SystemTimeToTzSpecificLocalTime((BYTE)timeZoneId,\n                                 &amp;sysTime, &amp;localTime);\n<\/pre>\n<p>\nOn some but not all of our machines,\nour program crashes with the following call stack:\n<\/p>\n<pre>\nExceptionAddress: 77d4a0d0 (kernel32!SystemTimeToTzSpecificLocalTime+0x49)\n   ExceptionCode: c0000005 (Access violation)\n  ExceptionFlags: 00000000\nNumberParameters: 2\n   Parameter[0]: 00000000\n   Parameter[1]: 000000ac\nAttempt to read from address 000000ac\nkernel32!SystemTimeToTzSpecificLocalTime+0x49\nContoso!CTimeZones::SystemTimeToTzSpecificLocalTime+0x26\nContoso!CContoso::ResetTimeZone+0xc0\nContoso!ResetTimeZoneThreadProc+0x32\n<\/pre>\n<p>\nThis problem appears to occur only with the release build;\nthe debug build does not have this problem.\nAny ideas?\n<\/p>\n<\/blockquote>\n<p>\nNotice that in the line of code the customer provided,\nthey are <i>not<\/i> calling\n<code>System&shy;Time&shy;To&shy;Tz&shy;Specific&shy;Local&shy;Time<\/code>;\nthey are instead calling some application-defined method\nwith the same name,\nwhich takes different parameters from the system function.\n<\/p>\n<p>\nThe customer apologized and included the\nsource file they were using, as well as a crash dump file.\n<\/p>\n<pre>\nvoid CContoso::ResetTimeZone()\n{\n SYSTEMTIME sysTime, localTime;\n GetLastModifiedTime(&amp;sysTime);\n for (int timeZoneId = 1;\n      timeZoneId &lt; MAX_TIME_ZONES;\n      timeZoneId++) {\n  if (!s_pTimeZones-&gt;SystemTimeToTzSpecificLocalTime((BYTE)timeZoneId,\n                                  &amp;sysTime, &amp;localTime)) {\n    LOG_ERROR(\"...\");\n    return;\n  }\n  ... do something with localTime ...\n }\n}\nBOOL CTimeZones::SystemTimeToTzSpecificLocalTime(\n    BYTE bTimeZoneID,\n    LPSYSTEMTIME lpUniversalTime,\n    LPSYSTEMTIME lpLocalTime)\n{\n    return ::SystemTimeToTzSpecificLocalTime(\n        &amp;m_pTimeZoneInfo[bTimeZoneID],\n        lpUniversalTime, lpLocalTime);\n}\n<\/pre>\n<p>\nAccording to the crash dump,\nthe first parameter passed to\n<code>CTime&shy;Zones::System&shy;Time&shy;To&shy;Tz&shy;Specific&shy;Local&shy;Time<\/code>\nwas <code>1<\/code>,\nand the <code>m_pTimeZoneInfo<\/code> member was <code>nullptr<\/code>.\nAs a result, a bogus non-null pointer was passed as the first\nparameter to\n<code>System&shy;Time&shy;To&shy;Tz&shy;Specific&shy;Local&shy;Time<\/code>,\nwhich resulted in a crash when the function tried to dereference it.\n<\/p>\n<p>\nThis didn&#8217;t require any special secret kernel knowledge;\nall I did was look at the stack trace and the value of the\nmember variable.\n<\/p>\n<p>\nSo far, it was just a case of a lazy developer who didn&#8217;t know\nhow to debug their own code.\nBut the reply from the customer was most strange:\n<\/p>\n<blockquote CLASS=\"q\">\n<p>\nI don&#8217;t think so,\nfor two reasons.\n<\/p>\n<ol>\n<li>The exact same build on another machine does not crash,\n    so it must be a machine-specific or OS-specific bug.<\/p>\n<li>The code in question has not changed in several months,\n    so if the problem were in that code,\n    we would have encountered it much earlier.\n<\/ol>\n<\/blockquote>\n<p>\nI was momentarily left speechless by this response.\nIt sounds like the customer simply refuses to believe the\ninformation that&#8217;s right there in the crash dump.\n&#8220;La la la, I can&#8217;t hear you.&#8221;\n<\/p>\n<p>\nMemory values are not a matter of opinion.\nIf you look in memory and find that the value 5\nis on the stack,\nthen the value 5 is on the stack.\nYou can&#8217;t say, &#8220;No it isn&#8217;t; it&#8217;s 6.&#8221;\nYou can have different opinions on how the value 5\nended up on the stack, but the fact that the value is 5\nis beyond dispute.\n<\/p>\n<p>\nIt&#8217;s like a box of cereal that has been spilled on the floor.\nPeople may argue over who spilled the cereal,\nor who placed the box in such a precarious position in the first place,\nbut to take the position\n&#8220;There is no cereal on the floor&#8221; is a pretty audacious move.\n<\/p>\n<p>\nWhether you like it or not, the value is not correct.\nYou can&#8217;t deny what&#8217;s right there in the dump file.\n(Well, unless you think the dump file itself is incorrect.)\n<\/p>\n<p>\nA colleague studied the customer&#8217;s code more closely\nand pointed out a race condition where the thread which calls\n<code>CContoso::ResetTimeZone<\/code>\nmay do so before the <code>CTimeZone<\/code> object has\nallocated the <code>m_pTimeZoneInfo<\/code> array.\nAnd it wasn&#8217;t anything particularly subtle either.\nIt went like this, in pseudocode:\n<\/p>\n<pre>\nCreateThread(ResetTimeZoneThreadProc);\ns_pTimeZones = new CTimeZones;\ns_pTimeZones-&gt;Initialize();\n\/\/ the CTimeZones::Initialize method allocates m_pTimeZoneInfo\n\/\/ among other things\n<\/pre>\n<p>\nThe customer never wrote back once the bug was identified.\nPerhaps the sheer number of impossible things all happening\nat once caused their head to explode.\n<\/p>\n<p>\nI discussed this incident later with another colleague,\nwho remarked\n<\/p>\n<blockquote CLASS=\"m\">\n<p>\nFrequently, some problem&nbsp;X will\noccur, and the people debugging it will say,\n&#8220;The only way that problem&nbsp;X to occur is if\nwe are in situation&nbsp;Y,\nbut we know that situation&nbsp;Y is impossible,\nso we didn&#8217;t bother investigating that possibility.\nCan you suggest another idea?&#8221;\n<\/p>\n<p>\nYeah, I can suggest another idea.\n&#8220;The computer is always right.&#8221;\nYou already saw that problem&nbsp;X occurred.\nIf the only way that problem&nbsp;X can occur is if\nyou are in situation&nbsp;Y,\nthen the first thing you should do is assume that\nyou are in situation&nbsp;Y and work from there.&#8221;\n<\/p>\n<p>\nTeaching people to follow this simple axiom has\navoid a lot of fruitless misdirected speculative\ndebugging.\nPeople seem hard-wired to\n<a HREF=\"http:\/\/blogs.msdn.com\/b\/oldnewthing\/archive\/2008\/07\/23\/8765362.aspx\">\nprefer speculation<\/a>,\nthough, and it&#8217;s common to slip back into forgetting\nsimple logic.\n<\/p>\n<\/blockquote>\n<p>\nTo put it another way:\n<\/p>\n<ul>\n<li>If X, then Y.\n<li>X is true.\n<li>Y cannot possible be true.\n<\/ul>\n<p>\nIn order for these three statements to hold simultaneously,\nyou must have found a fundamental flaw in the underlying axioms\nof logic as they have been understood for thousands of years.\n<\/p>\n<p>\nThis is unlikely to be the case.\n<\/p>\n<p>\nGiven that you have X right in front of you,\nX&nbsp;is true by observation.\nThat leaves the other two statements.\nMaybe there&#8217;s a case where X does not guarantee Y.\nMaybe Y&nbsp;is true after all.\n<\/p>\n<p>\nAs Sherlock Holmes is famous for saying,\n&#8220;<a HREF=\"http:\/\/en.wikiquote.org\/wiki\/Sherlock_Holmes#The_Sign_of_the_Four_.281890.29\">When you have eliminated the impossible,\nwhatever remains,\nhowever improbable,\nmust be the truth<\/a>.&#8221;\nBut before you rule out the impossible,\nmake sure it&#8217;s actually impossible.\n<\/p>\n<p>\n<b>Bonus chatter<\/b>:\nNow that I&#8217;ve told you that the debugger never lies,\nI get to confuse you in a future entry by debugging a crash\nwhere the debugger lied.\n(Or at least wasn&#8217;t telling the whole truth.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A customer reported a problem with the System&shy;Time&shy;To&shy;Tz&shy;Specific&shy;Local&shy;Time function. (Gosh, why couldn&#8217;t they have reported a problem with a function with a shorter name! Now I have to type that thing over and over again.) We&#8217;re having a problem with the System&shy;Time&shy;To&shy;Tz&shy;Specific&shy;Local&shy;Time function. We call it like this: s_pTimeZones-&gt;SystemTimeToTzSpecificLocalTime((BYTE)timeZoneId, &amp;sysTime, &amp;localTime); On some but not [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[26],"class_list":["post-6463","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-other"],"acf":[],"blog_post_summary":"<p>A customer reported a problem with the System&shy;Time&shy;To&shy;Tz&shy;Specific&shy;Local&shy;Time function. (Gosh, why couldn&#8217;t they have reported a problem with a function with a shorter name! Now I have to type that thing over and over again.) We&#8217;re having a problem with the System&shy;Time&shy;To&shy;Tz&shy;Specific&shy;Local&shy;Time function. We call it like this: s_pTimeZones-&gt;SystemTimeToTzSpecificLocalTime((BYTE)timeZoneId, &amp;sysTime, &amp;localTime); On some but not [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/6463","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=6463"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/6463\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=6463"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=6463"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=6463"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}