{"id":44103,"date":"2014-09-11T07:00:00","date_gmt":"2014-09-11T07:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2014\/09\/11\/the-history-of-win32-critical-sections-so-far\/"},"modified":"2014-09-11T07:00:00","modified_gmt":"2014-09-11T07:00:00","slug":"the-history-of-win32-critical-sections-so-far","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20140911-00\/?p=44103","title":{"rendered":"The history of Win32 critical sections so far"},"content":{"rendered":"<p>\nThe <code>CRITICAL_SECTION<\/code> structure has gone through\na lot of changes since its introduction back oh so many decades ago.\nThe amazing thing is that as long as you stick to the documented API,\nyour code is completely unaffected.\n<\/p>\n<p>\nInitially, the critical section object had an owner field\nto keep track of which thread entered the critical section, if any.\nIt also had\na lock count to keep track of how many times the owner thread\nentered the critical section, so that the critical section would\nbe released when the matching number of\n<code>Leave&shy;Critical&shy;Section<\/code> calls was made.\nAnd there was an auto-reset event used to manage contention.\nWe&#8217;ll look more at that event later.\n(It&#8217;s actually more complicated than this, but the details\naren&#8217;t important.)\n<\/p>\n<p>\nIf you&#8217;ve ever looked at the innards of a critical section\n(for entertainment purposes only),\nyou may have noticed that the lock count was off by one:\nThe lock count was the number of active calls to\n<code>Enter&shy;Critical&shy;Section<\/code> <i>minus one<\/i>.\nThe bias was needed because the original version of the\ninterlocked increment and decrement operations\n<a HREF=\"http:\/\/blogs.msdn.com\/b\/oldnewthing\/archive\/2004\/05\/06\/127141.aspx\">\nreturned only the sign of the result, not the revised value<\/a>.\nBiasing the result by 1 means that all three states could be detected:\nUnlocked (negative), locked exactly once (zero), reentrant lock (positive).\n(It&#8217;s actually more complicated than this, but the details\naren&#8217;t important.)\n<\/p>\n<p>\nIf a thread tries to enter a critical section but can&#8217;t\nbecause the critical section is owned by another thread,\nthen it sits and waits on the contention event.\nWhen the owning thread releases all its claims on the critical section,\nit signals the event to say,\n&#8220;Okay, the door is unlocked.\nThe next guy can come in.&#8221;\n<\/p>\n<p>\nThe contention event is allocated only when contention occurs.\n(This is what older versions of MSDN meant when they said that\nthe event is &#8220;allocated on demand.&#8221;)\nWhich leads to a nasty problem:\nWhat if contention occurs,\nbut the attempt to create the contention event fails?\nOriginally,\nthe answer was &#8220;The kernel raises an out-of-memory exception.&#8221;\n<\/p>\n<p>\nNow you&#8217;d think that a clever program could catch this exception\nand try to recover from it, say, by unwinding everything that led\nup to the exception.\nUnfortunately, the weakest link in the chain is the critical section\nobject itself:\nIn the original version of the code,\nthe out-of-memory exception was raised while the critical section was\nin an unstable state.\nEven if you managed to catch the exception and unwind everything you could,\nthe critical section was itself irretrievably corrupted.\n<\/p>\n<p>\nAnother problem with the original design became apparent on multiprocessor\nsystems: If a critical section was typically held for a very brief time,\nthen by the time you called into kernel to wait on the contention event,\nthe critical section was already freed!\n<\/p>\n<pre>\nvoid SetGuid(REFGUID guid)\n{\n EnterCriticalSection(&amp;g_csGuidUpdate);\n g_theGuid = guid;\n LeaveCriticalSection(&amp;g_csGuidUpdate);\n}\nvoid GetGuid(GUID *pguid)\n{\n EnterCriticalSection(&amp;g_csGuidUpdate);\n *pguid = g_theGuid;\n LeaveCriticalSection(&amp;g_csGuidUpdate);\n}\n<\/pre>\n<p>\nThis imaginary code uses a critical section to protect accesses\nto a GUID.\nThe actual protected region is just nine instructions long.\nSetting up to wait on a kernel object is way,\nway more than nine instructions.\nIf the second thread immediately waited on the critical section\ncontention event,\nit would find that by the time the kernel got around to entering\nthe wait state, the event would say,\n&#8220;Dude, what took you so long? I was signaleded, like, a bazillion\ncycles ago!&#8221;\n<\/p>\n<p>\nWindows&nbsp;2000 added the\n<code>Initialize&shy;Critical&shy;Section&shy;And&shy;Spin&shy;Count<\/code>\nfunction,\nwhich lets you avoid the problem where waiting for a critical section\ncosts more than the code the critical section was protecting.\nIf you initialize with a spin count, then when a thread tries to\nenter the critical section and can&#8217;t,\nit goes into a loop trying to enter it over and over again,\nin the hopes that it will be released.\n<\/p>\n<p>\n&#8220;Are we there yet?\nHow about now?\nHow about now?\nHow about now?\nHow about now?\nHow about now?\nHow about now?\nHow about now?\nHow about now?\nHow about now?\nHow about now?\nHow about now?&#8221;\n<\/p>\n<p>\nIf the critical section is not released after the requested number\nof iterations,\nthen the old slow wait code is executed.\n<\/p>\n<p>\nNote that spinning on a critical section is helpful only on\nmultiprocessor systems,\nand only in the case where you know that all the protected code\nsegments are very short in duration.\nIf the critical section is held for a long time,\nthen spinning is wasteful since the odds that the critical section\nwill become free during the\n<a HREF=\"https:\/\/www.youtube.com\/watch?v=czTw2dS5dtE\">\nspin cycle<\/a>\nare very low,\nand you wasted a bunch of CPU.\n<\/p>\n<p>\nAnother feature added in Windows&nbsp;2000 is the ability to\npreallocate the contention event.\nThis avoids the dreaded\n&#8220;out of memory&#8221; exception in\n<code>Enter&shy;Critical&shy;Section<\/code>,\nbut at a cost of a kernel event for every critical section,\nwhether actual contention occurs or not.\n<\/p>\n<p>\nWindows&nbsp;XP solved the problem of the dreaded &#8220;out of memory&#8221;\nexception by using a fallback algorithm that could be used\nif the contention event could not be allocated.\nThe fallback algorithm is not as efficient, but at least\nit avoids the &#8220;out of memory&#8221; situation.\n(Which is a good thing, because nobody really expects\n<code>Enter&shy;Critical&shy;Section<\/code> to fail.)\nThis also means that requests for the contention event to be preallocated\nare now ignored, since the reason for preallocating\n(avoiding the &#8220;out of memory&#8221; exception) no longer exists.\n<\/p>\n<p>\n(And while they were there, the kernel folks also fixed\n<code>Initialize&shy;Critical&shy;Section<\/code> so that\na failed initialization left the critical section object in\na stable state so you could safely try again later.)\n<\/p>\n<p>\nIn Windows Vista, the internals of the critical section object\nwere rejiggered once again,\nthis time to add convoy resistance.\n<a HREF=\"http:\/\/msdn.microsoft.com\/en-us\/library\/windows\/hardware\/ff541979(v=vs.85).aspx\">\nThe internal bookkeeping completely changed<\/a>;\nthe lock count is no longer a 1-biased count of the number of\n<code>Enter&shy;Critical&shy;Section<\/code> calls which are pending.\nAs a special concession to backward compatibility with people\nwho violated the API contract and\n<a HREF=\"http:\/\/blogs.msdn.com\/b\/oldnewthing\/archive\/2013\/07\/12\/10433554.aspx#10435411\">\nlooked directly at the internal data structures<\/a>,\nthe new algorithm goes to some extra effort to ensure that\nif a program breaks the rules and\nlooks at a specific offset inside the critical section\nobject,\nthe value stored there is &minus;1 if and only if the critical section\nis unlocked.\n<\/p>\n<p>\nOften, people will remark that &#8220;your compatibility problems would go\naway if you just open-sourced the operating system.&#8221;\nI think there is some confusion over what &#8220;go away&#8221; means.\nIf you release the source code to the operating system,\nit makes it even <i>easier<\/i> for people to take undocumented\ndependencies on it,\nbecause they no longer have the barrier of &#8220;Well, I can&#8217;t find any\ndocumentation, so maybe it&#8217;s not documented.&#8221;\nThey can just read the source code and say,\n&#8220;Oh, I see that if the critical section is unlocked,\nthe <code>Lock&shy;Count<\/code> variable has the value &minus;1.&#8221;\nBoom, instant undocumented dependency.\nCompatibility is screwed.\n(Unless what people are saying &#8220;your compatibility problems would\ngo away if you just open-sourced <i>all applications<\/i>,\nso that these problems can be identified and fixed as soon as they\nare discovered.&#8221;)\n<\/p>\n<p>\n<b>Exercise<\/b>:\nWhy isn&#8217;t it important that the fallback algorithm be highly efficient?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The CRITICAL_SECTION structure has gone through a lot of changes since its introduction back oh so many decades ago. The amazing thing is that as long as you stick to the documented API, your code is completely unaffected. Initially, the critical section object had an owner field to keep track of which thread entered the [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[2],"class_list":["post-44103","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-history"],"acf":[],"blog_post_summary":"<p>The CRITICAL_SECTION structure has gone through a lot of changes since its introduction back oh so many decades ago. The amazing thing is that as long as you stick to the documented API, your code is completely unaffected. Initially, the critical section object had an owner field to keep track of which thread entered the [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/44103","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=44103"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/44103\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=44103"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=44103"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=44103"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}