{"id":26983,"date":"2007-05-04T10:00:00","date_gmt":"2007-05-04T10:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2007\/05\/04\/how-my-lack-of-understanding-of-how-processes-exit-on-windows-xp-forced-a-security-patch-to-be-recalled\/"},"modified":"2007-05-04T10:00:00","modified_gmt":"2007-05-04T10:00:00","slug":"how-my-lack-of-understanding-of-how-processes-exit-on-windows-xp-forced-a-security-patch-to-be-recalled","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20070504-00\/?p=26983","title":{"rendered":"How my lack of understanding of how processes exit on Windows XP forced a security patch to be recalled"},"content":{"rendered":"<p>Last year, a Windows security update got a lot of flack for causing some machines to hang, and it was my fault. (This makes <a href=\"http:\/\/blogs.msdn.com\/larryosterman\/archive\/2006\/07\/31\/684327.aspx\">messing up a demo at the Financial Analysts Meeting<\/a> look like small potatoes.)\nThe security fix addressed a category of attacks wherein people could construct shortcut files or other items which specified a CLSID that was never intended to be used as a shell extension. As we saw earlier, <a href=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2004\/03\/26\/96777.aspx\">lots of people mess up <code>IUnknown::QueryInterface<\/code><\/a>, and if you pass the CLSID of one of these buggy implementations, Explorer would dutifully create it and try to use it, and then bad things would happen. The object might crash or hang or even corrupt memory and keep running (sort of).\nTo protect against buggy shell extensions, Explorer was modified to use a helper program called <code>verclsid.exe<\/code> whose job was to be the &#8220;guinea pig&#8221; and host the shell extension and do some preliminary sniffing around to make sure the shell extension passed some basic functionality tests before letting it run loose in Explorer. 
That way, if the shell extension went crazy, the victim would be the <code>verclsid.exe<\/code> process and not the main Explorer process.\nThe <code>verclsid.exe<\/code> program created a watchdog thread: If the preliminary sniffing took too long, the watchdog assumed that the shell extension was hung and the watchdog told Explorer, &#8220;Don&#8217;t use this shell extension.&#8221;\nI was one of the people brought in to study this new behavior, poke holes in its design, poke holes in its implementation, review every line of code that changed and make sure that it did exactly what it was supposed to do without introducing any new bugs along the way. We found some issues, testers found some other issues, and all the while, the clock was ticking since this was a security patch and people enjoy mocking Microsoft over how long it takes to put a security patch together.\nThe patch went out, and reports started coming in that machines were hanging. How could that be? We created a watchdog thread specifically to catch the buggy shell extensions that hung; why isn&#8217;t the watchdog thread doing its job?\nThat was a long set-up for today&#8217;s lesson.\nAfter running its sanity tests, the <code>verclsid.exe<\/code> program releases the shell extension, un-initializes COM, and then calls <code>ExitProcess<\/code> with a special exit code that means, &#8220;All tests passed.&#8221; If you read <!--backref: thread termination -->yesterday&#8217;s installment, you already know where I messed up.\nThe DLL that implemented the shell extension created a worker thread, so it did an extra <code>LoadLibrary<\/code> on itself so that it wouldn&#8217;t get unloaded when COM freed it as part of <code>CoUninitialize<\/code> tear-down. 
When the DLL got its <code>DLL_PROCESS_DETACH<\/code>, it shut down its worker thread by the common technique of setting a &#8220;clean up now&#8221; event that the worker thread listened for, and then waiting for the worker thread to respond with an &#8220;Okay, I&#8217;m all done&#8221; event.\nBut recall that the first stage in process exit is the termination of all threads other than the one that called <code>ExitProcess<\/code>. That means that the DLL&#8217;s worker thread no longer exists. After setting the event to tell the (nonexistent) thread to clean up, the DLL then waited for the (nonexistent) thread to say that it was done. And since there was nobody around listening for the clean-up event, the &#8220;all done&#8221; event never got set. The DLL hung in its <code>DLL_PROCESS_DETACH<\/code>.\nWhy didn&#8217;t our watchdog thread save us? Because <strong>the watchdog thread got killed too<\/strong>!\nNow, the root cause for all this was a buggy shell extension that did bad things in its <code>DLL_PROCESS_DETACH<\/code>, but blaming the shell extension misses the point. After all, it was the fact that there existed buggy shell extensions that created the need for the <code>verclsid.exe<\/code> program in the first place.\n<b>Welcome Slashdot readers<\/b>. Since you won&#8217;t read the existing comments before posting your own, I&#8217;ll float some of the more significant ones here.\nThe buggy shell extension was included with a printer driver for a printer that is no longer manufactured. Good luck finding one of those in your test suite.<\/p>\n<p>The security update was recalled and reissued in a single action, which most people would call an <i>update<\/i> or <i>refresh<\/i>, but the word <i>recall<\/i> works better in a title. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last year, a Windows security update got a lot of flack for causing some machines to hang, and it was my fault. 
(This makes messing up a demo at the Financial Analysts Meeting look like small potatoes.) The security fix addressed a category of attacks wherein people could construct shortcut files or other items which [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-26983","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Last year, a Windows security update got a lot of flack for causing some machines to hang, and it was my fault. (This makes messing up a demo at the Financial Analysts Meeting look like small potatoes.) The security fix addressed a category of attacks wherein people could construct shortcut files or other items which [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/26983","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=26983"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/26983\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=26983"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft
.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=26983"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=26983"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}