{"id":11003,"date":"2011-04-07T07:00:00","date_gmt":"2011-04-07T07:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2011\/04\/07\/lock-free-algorithms-the-one-time-initialization\/"},"modified":"2011-04-07T07:00:00","modified_gmt":"2011-04-07T07:00:00","slug":"lock-free-algorithms-the-one-time-initialization","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20110407-00\/?p=11003","title":{"rendered":"Lock-free algorithms: The one-time initialization"},"content":{"rendered":"<p>\nA special case of\n<a HREF=\"http:\/\/blogs.msdn.com\/b\/oldnewthing\/archive\/2011\/04\/06\/10150261.aspx\">\nthe singleton constructor<\/a>\nis simply lazy-initializing a bunch of variables.\nIn a single-threaded application you can do something like this:\n<\/p>\n<pre>\n\/\/ suppose that any valid values for a and b stipulate that\n\/\/ a &ge; 0 and b &ge; a. Therefore, -1 is never a valid value,\n\/\/ and we use it to mean \"not yet initialized\".\nint a = -1, b = -1;\nvoid LazyInitialize()\n{\n if (a != -1) return; \/\/ initialized already\n a = calculate_nominal_a();\n b = calculate_nominal_b();\n \/\/ Adjust the values to conform to our constraint.\n a = max(0, a);\n b = max(a, b);\n}\n<\/pre>\n<p>\nThis works fine in a single-threaded program, but if the\nprogram is multi-threaded, then two threads might end up\ntrying to lazy-initialize the variables, and there are\nrace conditions which can result in one thread using\nvalues before they have been initialized:\n<\/p>\n<table>\n<tbody>\n<tr>\n<th>Thread 1<\/th>\n<th>Thread 2<\/th>\n<\/tr>\n<tr>\n<td><code>if (a != -1)<\/code> [not taken]<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><code>a = calculate_nominal_a();<\/code> \/\/ returns 2<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><code>if (a != -1) return;<\/code> \/\/ premature return!<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\nObserve that if the first thread is pre-empted after\nthe value for <code>a<\/code> is initially set,\nthe second thread will think that everything is initialized\nand may end up using an uninitialized&nbsp;<code>b<\/code>.\n<\/p>\n<p>\n&#8220;Aha,&#8221; you say, &#8220;that&#8217;s easy to fix.\nInstead of <code>a<\/code>,\nI&#8217;ll just use <code>b<\/code> to tell if initialization is complete.&#8221;\n<\/p>\n<pre>\nvoid LazyInitialize()\n{\n if (<font COLOR=\"red\">b<\/font> != -1) return; \/\/ initialized already (test b, not a)\n a = calculate_nominal_a();\n b = calculate_nominal_b();\n \/\/ Adjust the values to conform to our constraint.\n a = max(0, a);\n b = max(a, b);\n}\n<\/pre>\n<p>\nThis still suffers from a race condition:\n<\/p>\n<table>\n<tbody>\n<tr>\n<th>Thread 1<\/th>\n<th>Thread 2<\/th>\n<\/tr>\n<tr>\n<td><code>if (b != -1)<\/code> [not taken]<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><code>a = calculate_nominal_a();<\/code> \/\/ returns 2<\/td>\n<\/tr>\n<tr>\n<td><code>b = calculate_nominal_b();<\/code> \/\/ returns 1<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><code>if (b != -1) return;<\/code> \/\/ premature return!<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\n&#8220;I can fix that too.\nI&#8217;ll just compute the values of <code>a<\/code> and <code>b<\/code>\nin local variables, and update the globals only after the final\nvalues have been computed.\nThat way, the second thread won&#8217;t see partially-calculated values.&#8221;\n<\/p>\n<pre>\nvoid LazyInitialize()\n{\n if (b != -1) return; \/\/ initialized already\n <font COLOR=\"red\">\/\/ perform all calculations in temporary variables first<\/font>\n <font COLOR=\"red\">int temp_a<\/font> = calculate_nominal_a();\n <font COLOR=\"red\">int temp_b<\/font> = calculate_nominal_b();\n \/\/ Adjust the values to conform to our constraint.\n <font COLOR=\"red\">temp_a<\/font> = max(0, <font COLOR=\"red\">temp_a<\/font>);\n <font COLOR=\"red\">temp_b<\/font> = max(<font COLOR=\"red\">temp_a<\/font>, <font COLOR=\"red\">temp_b<\/font>);\n <font COLOR=\"red\">\/\/ make the temporary values permanent\n a = temp_a;\n b = temp_b;<\/font>\n}\n<\/pre>\n<p>\nNearly there,\nbut there is <i>still<\/i> a race condition:\n<\/p>\n<table>\n<tbody>\n<tr>\n<th>Thread 1<\/th>\n<th>Thread 2<\/th>\n<\/tr>\n<tr>\n<td><code>if (b != -1)<\/code> [not taken]<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><code>temp_a = calculate_nominal_a();<\/code> \/\/ returns 2<\/td>\n<\/tr>\n<tr>\n<td><code>temp_b = calculate_nominal_b();<\/code> \/\/ returns 1<\/td>\n<\/tr>\n<tr>\n<td><code>temp_a = max(0, temp_a);<\/code> \/\/ temp_a = 2<\/td>\n<\/tr>\n<tr>\n<td><code>temp_b = max(temp_a, temp_b);<\/code> \/\/ temp_b = 2<\/td>\n<\/tr>\n<tr>\n<td><code>a = temp_a;<\/code> \/\/ store issued to memory<\/td>\n<\/tr>\n<tr>\n<td><code>b = temp_b;<\/code> \/\/ store issued to memory<\/td>\n<\/tr>\n<tr>\n<td>store of <code>b<\/code> completes to memory<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><code>if (b != -1) return;<\/code> \/\/ premature return!<\/td>\n<\/tr>\n<tr>\n<td>store of <code>a<\/code> completes to memory<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\nThere is no guarantee that the assignment <code>b = 2<\/code> will\nbecome visible to other processors after the assignment\n<code>a = 2<\/code>.\nEven though the store operations are issued in that order,\nthe memory management circuitry might\n<a HREF=\"http:\/\/en.wikipedia.org\/wiki\/Memory_barrier#An_illustrative_example\">\ncomplete the memory operations in the opposite order<\/a>,\nresulting in Thread&nbsp;2 seeing <code>a = -1<\/code> and <code>b = 2<\/code>.\n<\/p>\n<p>\nTo prevent this from happening, the store to <code>b<\/code> must\nbe performed with\n<a HREF=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2008\/10\/03\/8969397.aspx\">\nRelease semantics<\/a>,\nindicating that all prior memory stores must complete before\nthe store to <code>b<\/code> can be made visible to other processors.\n<\/p>\n<pre>\nvoid LazyInitialize()\n{\n if (b != -1) return; \/\/ initialized already\n \/\/ perform all calculations in temporary variables first\n int temp_a = calculate_nominal_a();\n int temp_b = calculate_nominal_b();\n \/\/ Adjust the values to conform to our constraint.\n temp_a = max(0, temp_a);\n temp_b = max(temp_a, temp_b);\n \/\/ make the temporary values permanent\n a = temp_a;\n <font COLOR=\"red\">\/\/ since we use \"b\" as our indication that\n \/\/ initialization is complete, we must store it last,\n \/\/ and we must use release semantics.\n InterlockedCompareExchangeRelease(\n    reinterpret_cast&lt;LONG*&gt;&amp;b, temp_b, -1);<\/font>\n}\n<\/pre>\n<p>\nIf you look at the final result,\nyou see that this is pretty much a re-derivation of the\nsingleton initialization pattern:\nDo a bunch of calculations off to the side, then\npublish the result with a single\n<code>Interlocked&shy;Compare&shy;Exchange&shy;Release<\/code>\noperation.\n<\/p>\n<p>\nThe general pattern is therefore\n<\/p>\n<pre>\nvoid LazyInitializePattern()\n{\n if (global_signal_variable != sentinel_value) return;\n ... calculate values into local variables ...\n globalvariable1 = temp_variable1;\n globalvariable2 = temp_variable2;\n ...\n globalvariableN = temp_variableN;\n \/\/ publish the signal variable last, and with release\n \/\/ semantics to ensure earlier values are visible as well\n InterlockedCompareExchangeRelease(\n    reinterpret_cast&lt;LONG*&gt;&amp;global_signal_variable,\n    temp_signal_variable, sentinel_value);\n}\n<\/pre>\n<p>\nIf this all is too much for you\n(and given some of the subtlety of double-check-locking\nthat I messed up the first time through,\nit&#8217;s clearly too much for me),\nyou can let the Windows kernel team do the thinking\nand use the\n<a HREF=\"http:\/\/msdn.microsoft.com\/en-us\/library\/aa363808.aspx\">\none-time initialization functions<\/a>,\nwhich encapsulate all of this logic.\n(My pal\n<a HREF=\"http:\/\/blogs.msdn.com\/doronh\/\">\nDoron<\/a>\ncalled out the one-time initialization functions\n<a HREF=\"http:\/\/blogs.msdn.com\/doronh\/archive\/2006\/11\/29\/support-for-double-checked-locking.aspx\">\na while back<\/a>.)\nVersion 4 of the .NET Framework has corresponding functionality\nin the\n<a HREF=\"http:\/\/msdn.microsoft.com\/en-us\/library\/dd997286.aspx\">\n<code>Lazy&lt;T&gt;<\/code> class<\/a>.<\/p>\n<p>\n<b>Exercise<\/b>:\nWhat hidden assumptions are being made about the functions\n<code>calculate_nominal_a<\/code> and\n<code>calculate_nominal_b<\/code>?\n<\/p>\n<p>\n<b>Exercise<\/b>:\nWhat are the consequences if we use\n<code>Interlocked&shy;Exchange<\/code>\ninstead of <code>Interlocked&shy;Compare&shy;Exchange&shy;Release<\/code>?\n<\/p>\n<p>\n<b>Exercise<\/b>:\nIn the final version of <code>Lazy&shy;Initialize<\/code>, are the variables\n<code>temp_a<\/code> and <code>temp_b<\/code> really necessary,\nor are they just leftovers from previous attempts at fixing\nthe race condition?\n<\/p>\n<p>\n<b>Exercise<\/b>:\nWhat changes (if any) are necessary to the above pattern\nif the global variables are pointers? Floating point variables?\n<\/p>\n<p>\n<b>Update<\/b>: See discussion below\n<a HREF=\"http:\/\/blogs.msdn.com\/b\/oldnewthing\/archive\/2011\/04\/07\/10150728.aspx#10151639\">\nbetween Niall and Anon<\/a>\nregarding the need for acquire semantics on the initial read.\n<\/p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A special case of the singleton constructor is simply lazy-initializing a bunch of variables. In a single-threaded application you can do something like this: \/\/ suppose that any valid values for a and b stipulate that \/\/ a &ge; 0 and b &ge; a. Therefore, -1 is never a valid value, \/\/ and we use [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-11003","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>A special case of the singleton constructor is simply lazy-initializing a bunch of variables. In a single-threaded application you can do something like this: \/\/ suppose that any valid values for a and b stipulate that \/\/ a &ge; 0 and b &ge; a. Therefore, -1 is never a valid value, \/\/ and we use [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/11003","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=11003"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/11003\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=11003"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=11003"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=11003"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}