{"id":5943,"date":"2012-11-30T07:00:00","date_gmt":"2012-11-30T07:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2012\/11\/30\/the-debugger-lied-to-you-because-the-cpu-was-still-juggling-data-in-the-air\/"},"modified":"2012-11-30T07:00:00","modified_gmt":"2012-11-30T07:00:00","slug":"the-debugger-lied-to-you-because-the-cpu-was-still-juggling-data-in-the-air","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20121130-00\/?p=5943","title":{"rendered":"The debugger lied to you because the CPU was still juggling data in the air"},"content":{"rendered":"<p>\nA colleague was studying a very strange failure,\nwhich I&#8217;ve simplified for expository purpose.\n<\/p>\n<p>\nThe component in question has the following\nbasic shape, ignoring error checking:\n<\/p>\n<pre>\n\/\/ This is a multithreaded object\nclass Foo\n{\npublic:\n void BeginUpdate();\n void EndUpdate();\n \/\/ These methods can be called at any time\n int GetSomething(int x);\n \/\/ These methods can be called only between\n \/\/ BeginUpdate\/EndUpdate.\n void UpdateSomething(int x);\nprivate:\n Foo() : m_cUpdateClients(0), m_pUpdater(nullptr) { ... 
}\n LONG m_cUpdateClients;\n Updater *m_pUpdater;\n};\n<\/pre>\n<p>\nThere are two parts of the <code>Foo<\/code> object.\nOne part is essential to the object&#8217;s task,\nand another part is needed only when updating.\nThe parts related to updating are expensive, so the\n<code>Foo<\/code> object sets them up only when\nan update is active.\nYou indicate that an update is active by calling\n<code>Begin&shy;Update<\/code>, and you indicate\nthat you are finished updating by calling\n<code>End&shy;Update<\/code>.\n<\/p>\n<pre>\n<i>\/\/ Code in italics is wrong\nvoid Foo::BeginUpdate()\n{\n LONG cClients = InterlockedIncrement(&amp;m_cUpdateClients);\n if (cClients == 1) {\n  \/\/ remember, error checking has been elided\n  m_pUpdater = new Updater();\n }\n \/\/ else, we are already initialized for updating,\n \/\/ so nothing to do\n}\nvoid Foo::EndUpdate()\n{\n LONG cClients = InterlockedDecrement(&amp;m_cUpdateClients);\n if (cClients == 0) {\n  \/\/ last update client has disconnected\n  delete m_pUpdater;\n  m_pUpdater = nullptr;\n }\n}<\/i>\n<\/pre>\n<p>\nThere are a few race conditions here,\nand one of them manifested itself in a crash.\n(If two threads call <code>Begin&shy;Update<\/code> at the same time,\none of them will increment the client count to 1 and the other\nwill increment it to 2.\nThe one that increments it to 1 will get to work initializing\n<code>m_pUpdater<\/code>,\nwhereas the second one will run ahead on the assumption that the\nupdater is fully initialized.)\n<\/p>\n<p>\nWhat we saw in the crash dump was that <code>Update&shy;Something<\/code>\ntried to use <code>m_pUpdater<\/code> and crashed on a null pointer.\nWhat made the crash dump strange was that if you actually looked\nat the <code>Foo<\/code> object in memory, the <code>m_pUpdater<\/code>\nwas non-null!\n<\/p>\n<pre>\n    mov ecx, [esi+8] \/\/ load m_pUpdater\n    mov eax, [ecx]   \/\/ load vtable -- crash here\n<\/pre>\n<p>\nIf you actually looked at the memory 
pointed to by\n<code>ESI+8<\/code>,\nthe value there was not null,\nyet in the register dump, <code>ECX<\/code> was zero.\n<\/p>\n<p>\nWas the CPU hallucinating?\nThe value in memory is nonzero.\nThe CPU loaded a value from memory.\nBut the value it read was zero.\n<\/p>\n<p>\nThe CPU wasn&#8217;t hallucinating.\nThe value it read from memory was in fact zero.\nThe reason you saw the nonzero value in memory was\nthat in the time it took for the null pointer exception to be raised\nand then caught by the debugger,\nthe other thread managed to finish calling <code>new Updater()<\/code>,\nstore the result back into memory,\nand then return to its caller and proceed as if everything\nwere just fine.\nThus, when the debugger went to capture the memory dump,\nit captured a nonzero value,\nand the code that updated <code>m_pUpdater<\/code> was long gone.\n<\/p>\n<p>\nThis type of race condition is more likely to manifest on multi-core\nmachines, because on those machines, different CPUs can have\ndifferent views of memory.\nThe thread doing the initialization can update\n<code>m_pUpdater<\/code> in memory,\nand other CPUs may not find out about it until some time later.\nThe updated value was still in flight when the crash occurred.\nBefore the debugger can get around to capturing the\n<code>m_pUpdater<\/code> member in the crash dump,\nthe in-flight value lands, and what you see in the crash dump\ndoes not match what the crashing CPU saw.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A colleague was studying a very strange failure, which I&#8217;ve simplified for expository purpose. 
The component in question has the following basic shape, ignoring error checking: \/\/ This is a multithreaded object class Foo { public: void BeginUpdate(); void EndUpdate(); \/\/ These methods can be called at any time int GetSomething(int x); \/\/ These methods [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[26],"class_list":["post-5943","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-other"],"acf":[],"blog_post_summary":"<p>A colleague was studying a very strange failure, which I&#8217;ve simplified for expository purpose. The component in question has the following basic shape, ignoring error checking: \/\/ This is a multithreaded object class Foo { public: void BeginUpdate(); void EndUpdate(); \/\/ These methods can be called at any time int GetSomething(int x); \/\/ These methods 
[&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/5943","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=5943"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/5943\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=5943"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=5943"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=5943"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}