February 6th, 2009

A process shutdown puzzle: Answers

Last week, I posed a process shutdown puzzle in honor of National Puzzle Day. Let’s see how we did.

Part One asked us to explain why the ThreadFunction thread no longer exists. That’s easy. One of the things that happen inside ExitProcess is that all threads (other than the one calling ExitProcess) are forcibly terminated in the nastiest way possible. This happens before the DLL_PROCESS_DETACH notification is sent. Therefore, the code in StopWorkerThread that waits for the thread completion event waits forever because the ThreadFunction is no longer running. There is nobody around to see the shutdown event and respond by setting the completion event.

Okay, that was the easy part. Part Two asked us to criticize the replacement solution which replaced the completion event with a call to FreeLibraryAndExitThread and changed the StopWorkerThread function to wait for the thread handle to become signaled. This solution is also flawed.

Consider the case that the DLL is receiving its DLL_PROCESS_DETACH notification because the DLL is being unloaded by a call to FreeLibrary, rather than due to process termination. In that case, StopWorkerThread sets the shutdown event, and the ThreadFunction proceeds to clean up and call FreeLibraryAndExitThread. But one of the steps in thread shutdown is sending DLL_THREAD_DETACH notifications, which will not happen until the DLL_PROCESS_DETACH notifications are complete. The WaitForSingleObject waits indefinitely because it won’t complete until the thread exits, but the thread won’t exit until StopWorkerThread returns. Deadlock.

Finally, Part Three asks us to explain why the code doesn’t cause a problem in practice even though the code is flawed. The call to FreeLibraryAndExitThread implies that the code follows the “Worker thread retains its own reference on the DLL” model. After all, that’s why the last thing the thread does is free the library. But if that’s the case, then a call to FreeLibrary coming from the application won’t actually unload the DLL, because the DLL reference count is still nonzero: There is one reference still being held by the worker thread. Therefore, the DLL will never actually unload outside of process termination. All the flaws in the dynamic unload case are masked by the fact that the code never executes.

Led astray: Some of us mentioned that waiting on ThreadHandle returned immediately because the handle to a thread is automatically closed when the thread exits. This is wrong. Handles do not self-close. You have to call CloseHandle to close them. This is “obvious” if you apply the “imagine if the world actually worked this way” rule: Suppose thread handles were invalidated (and eligible for re-use) when a thread exited. Then how could you use a thread handle at all? Any time you use a thread handle, there would be an unavoidable race condition where the thread might have exited just before you used the handle. And it would be impossible to call GetExitCodeThread at all! (Since it only does anything interesting if you pass the handle to a thread that has exited.)

A handle to a thread remains valid until you close it. If the thread has exited, then a wait on the thread handle completes, but the handle is still valid because if it went invalid, programming would become impossible.

Topics
Code

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

0 comments

Discussion are closed.