{"id":110732,"date":"2025-01-08T07:00:00","date_gmt":"2025-01-08T15:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=110732"},"modified":"2025-01-08T08:10:24","modified_gmt":"2025-01-08T16:10:24","slug":"20250108-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20250108-00\/?p=110732","title":{"rendered":"Inside STL: Waiting for a <CODE>std::atomic&lt;std::shared_ptr&lt;T&gt;&gt;<\/CODE> to change, part 1"},"content":{"rendered":"<p>Like other <code>std::atomic<\/code> specializations, <code>std::atomic&lt;<wbr \/>std::shared_ptr&lt;T&gt;&gt;<\/code> supports the <code>wait<\/code> and <code>notify_*<\/code> methods for waiting for the value to change and reporting that the value has changed. The definition of &#8220;changed&#8221; in the C++ language specification is that the value has changed if <i>either<\/i> the stored pointer <i>or<\/i> the control block pointer has changed. A shared pointer is implemented as a <i>pair<\/i> of pointers, but <code>Wait\u00adOn\u00adAddress<\/code> can wait on at most 8 bytes, and unix futexes can wait on only four bytes, so how does this work?\u00b9<\/p>\n<p>The Microsoft implementation waits for the stored pointer to change, and the <code>notify_*<\/code> methods signal the stored pointer. 
But wait, this fails to detect the case where the stored pointer stays the same and only the control block changes.<\/p>\n<pre>std::atomic&lt;std::shared_ptr&lt;int&gt;&gt; p =\r\n    std::make_shared&lt;int&gt;(42);\r\n\r\nvoid change_control_block()\r\n{\r\n    auto old = p.load();\r\n    auto empty = std::shared_ptr&lt;int&gt;();\r\n\r\n    \/\/ Replace with an <a title=\"Phantom and indulgent shared pointers\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20230818-00\/?p=108619\">indulgent<\/a> shared pointer\r\n    \/\/ with the same stored pointer.\r\n    p.store({ empty, old.get() });\r\n    p.notify_all();\r\n}\r\n\r\nvoid wait_for_change()\r\n{\r\n    auto old = p.load();\r\n    p.wait(old);\r\n}\r\n\r\n<\/pre>\n<p>We updated <code>p<\/code> with a <code>shared_ptr<\/code> that has the same stored pointer but a different control block. If the stored pointer is the same, how does the <code>p.wait()<\/code> wake up? The implementation of <code>p.wait()<\/code> waits for the stored pointer to change, but we didn&#8217;t change it.<\/p>\n<p>The answer is that msvc doesn&#8217;t wait indefinitely for the pointer to change. It waits with a timeout, and after the timeout, it checks whether either the stored pointer or the control block has changed. If so, then the <code>wait()<\/code> method returns. Otherwise, msvc waits a little more. The wait <a href=\"https:\/\/github.com\/microsoft\/STL\/blob\/eaf7b316e4d3b6955dade2745563cd0551f9b960\/stl\/inc\/memory#L3998\"> starts at 16ms<\/a> and increases exponentially until it caps at around 17 minutes.<\/p>\n<p>So changes that alter only the control block pointer will still notify, though they might be a little sluggish about it. 
In practice, there is very little sluggishness because <code>Wake\u00adBy\u00adAddress\u00adSingle<\/code> and <code>Wake\u00adBy\u00adAddress\u00adAll<\/code> will wake a <code>Wait\u00adOn\u00adAddress<\/code> that has entered the wait state, so the chances for a sluggish wake are rather slim.<\/p>\n<table border=\"0\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td>lock <tt>p<\/tt><br \/>\ndecide to wait<br \/>\nunlock <tt>p<\/tt><\/td>\n<\/tr>\n<tr>\n<td>prepare for wait \u2190 danger zone<\/td>\n<\/tr>\n<tr>\n<td>add to wait list<br \/>\nprepare to block (or early-exit)<br \/>\nblock thread<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>It&#8217;s only in that danger zone, when the <code>notify_*<\/code> tries to wake up a thread that hasn&#8217;t yet gone to sleep, that you get a sluggish wake.<\/p>\n<p>And you may recall from <a title=\"Spurious wakes, race conditions, and bogus FIFO claims: A peek behind the curtain of WaitOnAddress\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20160826-00\/?p=94185\"> a peek behind the curtain of <code>Wait\u00adOn\u00adAddress<\/code><\/a> that the system tries hard to close the gap of the &#8220;prepare for wait&#8221; step, so in practice, the window of sluggishness is quite small.<\/p>\n<p>Next time, we&#8217;ll look at the libstdc++ implementation of <code>wait<\/code> and <code>notify_*<\/code>. 
The wild ride continues.<\/p>\n<p>\u00b9 <a title=\"P1644R0: Add wait\/notify to atomic&lt;shared_ptr&lt;T&gt;&gt;\" href=\"http:\/\/wg21.link\/p1644r0\"> The proposal to add <code>wait<\/code> and <code>notify_*<\/code> to <code>std::<wbr \/>atomic&lt;<wbr \/>std::<wbr \/>shared_ptr&lt;T&gt;&gt;<\/code><\/a> merely says that their omission was &#8220;due to oversight.&#8221; There was no discussion of whether the proposed <code>wait<\/code> and <code>notify_*<\/code> methods are actually implementable, seeing as shared pointers are twice the size of normal pointers and may exceed implementation limits for atomic operations. That is merely left as an exercise for the implementation. An exercise that <a href=\"https:\/\/github.com\/microsoft\/STL\/pull\/3655\"> msvc got wrong the first time<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Waiting on a single pointer, but checking for two.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-110732","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Waiting on a single pointer, but checking for 
two.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/110732","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=110732"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/110732\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=110732"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=110732"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=110732"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}