{"id":111054,"date":"2025-04-07T07:00:00","date_gmt":"2025-04-07T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=111054"},"modified":"2025-04-07T08:48:53","modified_gmt":"2025-04-07T15:48:53","slug":"20250407-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20250407-00\/?p=111054","title":{"rendered":"On priority inversion in the use of a spinlock to ensure atomic access to a <CODE>shared_ptr<\/CODE>"},"content":{"rendered":"<p>In <a title=\"Inside STL: The atomic shared_ptr\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20241219-00\/?p=110663\">my discussion of the internal implementation of <code>std::atomic&lt;<wbr \/>std::shared_ptr&lt;T&gt;&gt;<\/code><\/a>, I noted that the use of a spinlock without a blocking fallback could result in a deadlock due to priority inversion. <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20241219-00\/?p=110663&amp;commentid=142170#comment-142170\">Commenter Anton Siluanov noted<\/a>, &#8220;While priority inversion is a thing, from atomic I&#8217;d expect as quick as possible operation. And mutex impl can be done manually.&#8221;<\/p>\n<p>It&#8217;s true that the spinlock is not held for long, but even the tiniest window will get hit, and probably sooner than you would like. My colleague Larry Osterman phrases this as &#8220;<a href=\"https:\/\/learn.microsoft.com\/en-us\/archive\/blogs\/larryosterman\/one-in-a-million-is-next-tuesday\">One in a million is next Tuesday<\/a>.&#8221; James Hamilton (formerly of Microsoft, now at Amazon) described it more mundanely as &#8220;<a href=\"https:\/\/perspectives.mvdirona.com\/2017\/04\/at-scale-rare-events-arent-rare\/\">At scale, rare events aren&#8217;t rare<\/a>.&#8221;<\/p>\n<p>In this case, the race condition occurs if a higher priority thread tries to acquire the spinlock while a lower priority thread holds it. 
And even though it&#8217;s a small race window by instruction count, it can actually be quite a long time if there is a poorly timed context switch, and an even longer time if the control block has been paged out.<\/p>\n<table style=\"border-collapse: collapse;\" border=\"0\" cellspacing=\"0\" cellpadding=\"3\">\n<tbody>\n<tr>\n<td style=\"border: 1px currentcolor; border-style: none solid solid none;\">Thread 1 (low priority)<\/td>\n<td style=\"border-bottom: solid 1px currentcolor;\">Thread 2 (high priority)<\/td>\n<\/tr>\n<tr>\n<td style=\"border-right: solid 1px currentcolor;\">set lock bit<\/td>\n<td>&nbsp;<\/td>\n<\/tr>\n<tr>\n<td style=\"border-right: solid 1px currentcolor;\">increment refcount in control block<\/td>\n<td style=\"border: solid 1px currentcolor;\">danger zone<\/td>\n<\/tr>\n<tr>\n<td style=\"border-right: solid 1px currentcolor;\">clear the lock bit<\/td>\n<td>&nbsp;<\/td>\n<\/tr>\n<tr>\n<td style=\"border-right: solid 1px currentcolor;\">set lock bit<\/td>\n<td>&nbsp;<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>If the high priority thread runs during the danger zone, and the system has no idle processors (say, because it&#8217;s a uniprocessor system, or the other processors are busy running medium-priority threads), then you have a priority inversion deadlock, because the high priority thread is spinning and consuming the CPU time that the low priority thread needs in order to release the lock.<\/p>\n<p>Yes, it is a narrow race window, but narrow windows will get hit, and if they hit in just the wrong way, it will ruin somebody&#8217;s day.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Priority inversion may be rare, but correctness doesn&#8217;t care about 
rarity.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-111054","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Priority inversion may be rare, but correctness doesn&#8217;t care about rarity.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/111054","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=111054"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/111054\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=111054"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=111054"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=111054"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}