A customer reported that each time they created and exited a thread, their DLL reference count went up by one. This was a problem because their code loads a DLL and then calls a function in it. That function creates a thread, waits for the thread to finish, and then returns. After the function returns, their main program tries to unload the DLL, but the DLL remains in memory. Their debugging (with the !dlls debugger extension) showed that when they created a thread, the DLL reference count went up by one, but it did not drop back down when the thread exited.
The customer asked why CreateThread increments the DLL reference count, but ExitThread doesn’t decrement it.
This question struck the DLL loader team as odd, because CreateThread doesn’t increment the DLL reference count, and a quick sanity test confirmed their recollection: Creating a thread with CreateThread does not increment the DLL reference count. Consequently, ExitThread is correct in not decrementing it.
We went back to the customer to get some more information.
The customer said that they create the thread with _beginthreadex. Switching to std::thread didn’t help.
Okay, that explains it.¹ They are creating the thread with _beginthreadex and ending it with ExitThread.
The documentation for _beginthread and _beginthreadex has a big banner that says
For an executable file linked with Libcmt.lib, do not call the Win32 ExitThread API so that you don’t prevent the run-time system from reclaiming allocated resources. _endthread and _endthreadex reclaim allocated thread resources and then call ExitThread.
It is not CreateThread that increments the DLL reference count. It’s _beginthreadex. You can see it in the code, which is provided in the Windows Platform SDK under C:\
extern "C" uintptr_t __cdecl _beginthreadex(
    void* const security_descriptor,
    unsigned int const stack_size,
    _beginthreadex_proc_type const procedure,
    void* const context,
    unsigned int const creation_flags,
    unsigned int* const thread_id_result
    )
{
    _VALIDATE_RETURN(procedure != nullptr, EINVAL, 0);

    unique_thread_parameter parameter(
        create_thread_parameter(procedure, context));

    ⟦ ... more stuff ... ⟧
where
static __acrt_thread_parameter* __cdecl create_thread_parameter(
    void* const procedure,
    void* const context
    ) throw()
{
    unique_thread_parameter parameter(
        _calloc_crt_t(__acrt_thread_parameter, 1).detach());
    if (!parameter)
    {
        return nullptr;
    }

    parameter.get()->_procedure = procedure;
    parameter.get()->_context = context;

    // Attempt to bump the reference count of the module in which the user's
    // thread procedure is defined, to ensure that the module will stay loaded
    // as long as the thread is executing. We will release this HMDOULE [sic]
    // when the thread procedure returns or _endthreadex is called.
    GetModuleHandleExW(
        GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS,
        reinterpret_cast<LPCWSTR>(procedure),
        &parameter.get()->_module_handle);

    return parameter.detach();
}
So it’s not CreateThread that is incrementing the DLL reference count. It’s _beginthreadex. And the DLL reference count decrements when the thread procedure returns or when you call _endthreadex:
static void __cdecl common_end_thread(unsigned int const return_code) throw()
{
    ⟦ ... other stuff not relevant here ... ⟧

    if (parameter->_module_handle != INVALID_HANDLE_VALUE &&
        parameter->_module_handle != nullptr)
    {
        FreeLibraryAndExitThread(parameter->_module_handle, return_code);
    }
    else
    {
        ExitThread(return_code);
    }
}

extern "C" void __cdecl _endthread()
{
    return common_end_thread(0);
}

extern "C" void __cdecl _endthreadex(unsigned int const return_code)
{
    return common_end_thread(return_code);
}
If you call ExitThread directly, then you bypass the cleanup code that _endthreadex performs, which means that you leak a bunch of stuff, including the DLL reference.
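To make the bookkeeping concrete, here is a toy model in portable C++. It is only a sketch: ToyModule, ExitPath, and toy_run_thread are hypothetical names invented for illustration, nothing here calls the real CRT or the Windows loader, and the "thread" is just a function call on the current thread. All it shows is that a reference taken on the way in is released on one exit path and leaked on the other.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a loaded DLL; it only tracks a reference count.
struct ToyModule {
    int ref_count = 1;  // the reference held by the original load
};

// How the modeled thread procedure chose to finish.
enum class ExitPath {
    ThroughRuntime,  // returned normally or used the runtime's end-thread call
    Bypassed         // jumped straight to the OS-level exit
};

// Models the pairing described above: take a module reference before the
// procedure runs, and release it only on the runtime's own exit path.
inline void toy_run_thread(ToyModule& module,
                           const std::function<ExitPath()>& procedure) {
    assert(module.ref_count > 0);   // the module must already be loaded
    ++module.ref_count;             // analogous to the bump at thread creation

    const ExitPath path = procedure();

    if (path == ExitPath::ThroughRuntime) {
        --module.ref_count;         // analogous to the release at thread end
    }
    // On ExitPath::Bypassed the release never happens, so the count stays up.
}
```

Running the model once with each exit path leaves the count one higher than it started, which matches the symptom the customer observed.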
The customer noted that they also tried std::thread, and my guess is that they used ExitThread to exit a std::thread:
// Don't do this!
auto thread = std::thread([]
{
    ⟦ ... do some stuff ... ⟧
    ExitThread(42); // all done
});
When you exit the thread with ExitThread(), the thread ends without allowing the active function calls to clean up. If we look at std::thread’s thread procedure,
template <class _Tuple, size_t... _Indices>
static unsigned int __stdcall _Invoke(void* _RawVals) noexcept /* terminates */ {
    // adapt invoke of user's callable object to _beginthreadex's
    // thread procedure
    const unique_ptr<_Tuple> _FnVals(static_cast<_Tuple*>(_RawVals));
    _Tuple& _Tup = *_FnVals.get(); // avoid ADL, handle incomplete types
    _STD invoke(_STD move(_STD get<_Indices>(_Tup))...);
    _Cnd_do_broadcast_at_thread_exit(); // TRANSITION, ABI
    return 0;
}
we see that this bypasses the destructor of _FnVals, which means that we leak the _Tuple that holds the thread callable and parameters. We also bypass the call to _Cnd_do_broadcast_at_thread_exit, which lets other threads know that this thread has exited. This is used by functions like std:: and std::. Bypassing those calls means that things which are waiting for the thread to exit won’t ever run.
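As an illustration of a waiter that depends on normal thread exit, here is a small sketch in standard C++. The choice of std::promise::set_value_at_thread_exit is mine, picked as a representative facility whose result becomes ready only when the thread finishes; the function name value_published_at_thread_exit is hypothetical. The worker exits by returning, so the waiting thread is released.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Starts a worker that stores a value to be published when the thread exits,
// then waits for that value from the calling thread.
inline int value_published_at_thread_exit() {
    std::promise<int> promise;
    std::future<int> result = promise.get_future();

    std::thread worker([p = std::move(promise)]() mutable {
        // The value is stored now, but the future only becomes ready
        // once this thread has finished.
        p.set_value_at_thread_exit(42);
        // Falling off the end of the callable is the normal exit path.
    });

    const int value = result.get();  // blocks until the worker has exited
    assert(worker.joinable());
    worker.join();
    return value;
}
```

If the worker ended itself through an operating-system call instead of returning, the concern raised above is that the notification step would be skipped and result.get() could wait indefinitely.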
Moral of the story: If you use _beginthread or _beginthreadex, you must exit the thread either by returning from your thread function or by calling _endthread or _endthreadex. Don’t go straight to ExitThread().
And if you use std::thread, then you must exit by returning from your thread callable. Don’t call _endthread or _endthreadex or ExitThread().
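A minimal sketch of the recommended pattern, in portable standard C++: when the callable wants to stop early, it uses a plain return. CleanupProbe and run_worker_with_early_return are hypothetical names introduced for this example; the probe stands in for any object whose destructor must run before the thread ends.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical RAII object: counts how many times its destructor runs.
struct CleanupProbe {
    std::atomic<int>& counter;
    explicit CleanupProbe(std::atomic<int>& c) : counter(c) {}
    ~CleanupProbe() { ++counter; }
};

// Runs a worker that may finish early, and reports how many destructors ran.
inline int run_worker_with_early_return(bool finish_early) {
    std::atomic<int> cleanups{0};

    std::thread worker([&cleanups, finish_early] {
        CleanupProbe probe(cleanups);
        if (finish_early) {
            return;  // early exit by returning, not by a thread-exit call
        }
        // ... the rest of the work would go here ...
    });

    assert(worker.joinable());
    worker.join();           // the worker has fully finished after this
    return cleanups.load();  // 1 when the probe's destructor ran
}
```

Either way the callable leaves through a return, so the probe's destructor runs and the library's own thread procedure gets to finish its cleanup.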
¹ It also points out that the customer’s question is misleading. They asked why CreateThread increments the DLL reference count, but they aren’t calling CreateThread themselves, so how do they know that it’s CreateThread that’s doing it?