{"id":40363,"date":"2004-03-08T07:00:00","date_gmt":"2004-03-08T15:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2004\/03\/08\/c-scoped-static-initialization-is-not-thread-safe-on-purpose\/"},"modified":"2004-03-08T07:00:00","modified_gmt":"2004-03-08T15:00:00","slug":"c-scoped-static-initialization-is-not-thread-safe-on-purpose","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20040308-00\/?p=40363","title":{"rendered":"C++ scoped static initialization is not thread-safe, on purpose!"},"content":{"rendered":"<p>\n[<b>Note<\/b>:\nAfter this article was written,\nthe C++ standard was revised.\nStarting in C++11,\nscoped static initialization is now thread-safe,\nbut it comes with a cost: Reentrancy now invokes undefined behavior.]\n<\/p>\n<p>\nThe rule for static variables at block scope\n(as opposed to static variables with global scope)\nis that they are initialized the first time execution\nreaches their declaration.\n<\/p>\n<p>\nFind the race condition:\n<\/p>\n<pre>\nint ComputeSomething()\n{\n  static int cachedResult = ComputeSomethingSlowly();\n  return cachedResult;\n}\n<\/pre>\n<p>\nThe intent of this code is\nto compute something expensive the first time the\nfunction is called, and then cache the result to be\nreturned by future calls to the function.\n<\/p>\n<p>\nA variation on this basic technique\n<a HREF=\"http:\/\/users.utu.fi\/~sisasa\/oasis\/cppfaq\/ctors.html#[10.9]\">\nis advocated by this web site to avoid the &#8220;static initialization\norder fiasco&#8221;<\/a>.\n(Said fiasco is well-described on that page so I encourage you\nto read it and understand it.)\n<\/p>\n<p>\nThe problem is that this code is not thread-safe.  
Statics\nwith local scope are internally converted by the compiler into\nsomething like this:<\/p>\n<pre>\nint ComputeSomething()\n{\n<font COLOR=\"blue\">  static bool cachedResult_computed = false;\n  static int cachedResult;\n  if (!cachedResult_computed) {\n    cachedResult_computed = true;\n    cachedResult = ComputeSomethingSlowly();\n  }<\/font>\n  return cachedResult;\n}\n<\/pre>\n<p>\nNow the race condition is easier to see.\n<\/p>\n<p>\nSuppose two threads both call this function for the first time.\nThe first thread gets as far as setting\ncachedResult_computed&nbsp;=&nbsp;true,\nand then gets pre-empted.\nThe second thread now sees that cachedResult_computed is true\nand skips over the body of the &#8220;if&#8221; branch and returns\nan uninitialized variable.\n<\/p>\n<p>\nWhat you see here is not a compiler bug.\nThis behavior is <strong>required by the C++ standard<\/strong>.\n<\/p>\n<p>\nYou can write variations on this theme to create even worse\nproblems:\n<\/p>\n<pre>\nclass Something { ... };\nint ComputeSomething()\n{\n  static Something s;\n  return s.ComputeIt();\n}\n<\/pre>\n<p>\nThis gets rewritten internally as\n(this time, using pseudo-C++):\n<\/p>\n<pre>\nclass Something { ... 
};\nint ComputeSomething()\n{\n<font COLOR=\"blue\">  static bool s_constructed = false;\n  static uninitialized Something s;\n  if (!s_constructed) {\n    s_constructed = true;\n    new(&amp;s) Something; \/\/ construct it\n    atexit(DestructS);\n  }<\/font>\n  return s.ComputeIt();\n}\n<font COLOR=\"blue\">\/\/ Destruct s at process termination\nvoid DestructS()\n{\n ComputeSomething::s.~Something();\n}<\/font>\n<\/pre>\n<p>\nNotice that there are multiple race conditions here.\nAs before, it&#8217;s possible for one thread to run ahead of the\nother thread and use &#8220;s&#8221; before it has been constructed.\n<\/p>\n<p>\nEven worse, it&#8217;s possible for the first thread to get\npre-empted immediately after testing s_constructed\nbut <strong>before<\/strong> setting it to &#8220;true&#8221;.\nIn this case, the object s gets <strong>double-constructed<\/strong>\nand <strong>double-destructed<\/strong>.\n<\/p>\n<p>\nThat can&#8217;t be good.\n<\/p>\n<p>\nBut wait, that&#8217;s not all.  Now look at what happens if you\nhave <em>two<\/em> runtime-initialized local statics:\n<\/p>\n<pre>\nclass Something { ... };\nint ComputeSomething()\n{\n  static Something s(0);\n  static Something t(1);\n  return s.ComputeIt() + t.ComputeIt();\n}\n<\/pre>\n<p>\nThis is converted by the compiler into the following\npseudo-C++:\n<\/p>\n<pre>\nclass Something { ... 
};\nint ComputeSomething()\n{\n<font COLOR=\"blue\">  static char constructed = 0;\n  static uninitialized Something s;\n  if (!(constructed &amp; 1)) {\n    constructed |= 1;\n    new(&amp;s) Something(0); \/\/ construct it\n    atexit(DestructS);\n  }\n  static uninitialized Something t;\n  if (!(constructed &amp; 2)) {\n    constructed |= 2;\n    new(&amp;t) Something(1); \/\/ construct it\n    atexit(DestructT);\n  }<\/font>\n  return s.ComputeIt() + t.ComputeIt();\n}\n<\/pre>\n<p>\nTo save space, the compiler placed the two\n&#8220;x_constructed&#8221; variables into a bitfield.\nNow there are multiple\n<strong>non-interlocked<\/strong>\nread-modify-store operations on the variable\n&#8220;constructed&#8221;.\n<\/p>\n<p>\nNow consider what happens if one thread\nattempts to execute &#8220;constructed |= 1&#8221;\nat the same time another thread attempts\nto execute &#8220;constructed |= 2&#8221;.\n<\/p>\n<p>\nOn an x86, the statements likely assemble into<\/p>\n<pre>\n  or constructed, 1\n...\n  or constructed, 2\n<\/pre>\n<p>without any &#8220;lock&#8221; prefixes.\nOn multiprocessor machines, it is possible\nfor the two instructions both to read the old value\nand clobber each other with conflicting values.\n<\/p>\n<p>\nOn ia64 and alpha, this clobbering is much more\nobvious since they do not have a single\nread-modify-store instruction; the three\nsteps must be explicitly coded:\n<\/p>\n<pre>\n  ldl t1,0(a0)     ; load\n  addl t1,1,t1     ; modify\n  stl t1,0(a0)     ; store\n<\/pre>\n<p>\nIf the thread gets pre-empted between the load\nand the store, the value stored may no longer\nagree with the value being overwritten.\n<\/p>\n<p>\nSo now consider the following insane sequence of execution:\n<\/p>\n<ul>\n<li>Thread A tests &#8220;constructed&#8221; and finds it zero and prepares\n    to set the value to 1, but it gets pre-empted.<\/p>\n<li>Thread B enters the same function, sees &#8220;constructed&#8221; is zero\n    and proceeds to construct both &#8220;s&#8221; 
and &#8220;t&#8221;, leaving\n    &#8220;constructed&#8221; equal to 3.<\/p>\n<li>Thread A resumes execution and completes its load-modify-store\n    sequence, setting &#8220;constructed&#8221; to 1, then constructs &#8220;s&#8221;\n    (a second time).<\/p>\n<li>Thread A then proceeds to construct &#8220;t&#8221; as well (a second time)\n    setting &#8220;constructed&#8221; (finally) to 3.\n<\/ul>\n<p>\nNow, you might think you can wrap the runtime initialization\nin a critical section:\n<\/p>\n<pre>\nint ComputeSomething()\n{\n EnterCriticalSection(...);\n static int cachedResult = ComputeSomethingSlowly();\n LeaveCriticalSection(...);\n return cachedResult;\n}\n<\/pre>\n<p>\nAfter all, now you&#8217;ve placed the one-time initialization inside\na critical section and made it thread-safe.\n<\/p>\n<p>\nBut what if the second call comes from within the same thread?\n(&#8220;We&#8217;ve traced the call; it&#8217;s coming from inside the thread!&#8221;)\nThis can happen if ComputeSomethingSlowly() itself calls\nComputeSomething(), perhaps indirectly.\nSince that thread already owns the critical section, the code\nenters it just fine and you once again\nend up returning an uninitialized variable.\n<\/p>\n<p>\nConclusion: When you see runtime initialization of a local static\nvariable, be very concerned.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How the design of the C++ language subverts thread safety.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-40363","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>How the design of the C++ language subverts thread 
safety.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/40363","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=40363"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/40363\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=40363"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=40363"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=40363"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}