December 4th, 2024

Why does my DLL reference count go up by one every time I create and exit a thread?

A customer reported that each time they created and exited a thread, their DLL reference count went up by one. This was a problem because their code loads a DLL and then calls a function in it. That function creates a thread, waits for the thread to finish, and then returns. After the function returns, their main program tries to unload the DLL, but the DLL remains in memory. Their debugging (with the !dlls debugger extension) showed that when they created a thread, the DLL reference count went up by one, but it did not drop back down when the thread exited.

The customer asked why Create­Thread increments the DLL reference count, but Exit­Thread doesn’t decrement it.

This question struck the DLL loader team as odd, because Create­Thread doesn’t increment the DLL reference count, and a quick sanity test confirmed their recollection: Creating a thread with Create­Thread does not increment the DLL reference count. Consequently, Exit­Thread is correct in not decrementing it.

We went back to the customer to get some more information.

The customer said that they create the thread with _beginthreadex. Switching to std::thread didn’t help.

Okay, that explains it.¹ They are creating the thread with _beginthreadex and ending it with ExitThread.

The documentation for _beginthread and _beginthreadex has a big banner that says

For an executable file linked with Libcmt.lib, do not call the Win32 ExitThread API so that you don’t prevent the run-time system from reclaiming allocated resources. _endthread and _endthreadex reclaim allocated thread resources and then call ExitThread.

It is not Create­Thread that increments the DLL reference count. It’s _beginthreadex. You can see it in the code, which is provided in the Windows Platform SDK under C:\Program Files (x86)\Windows Kits\10\Source\〈version〉\ucrt\startup\thread.cpp:

extern "C" uintptr_t __cdecl _beginthreadex(
    void*                    const security_descriptor,
    unsigned int             const stack_size,
    _beginthreadex_proc_type const procedure,
    void*                    const context,
    unsigned int             const creation_flags,
    unsigned int*            const thread_id_result
    )
{
    _VALIDATE_RETURN(procedure != nullptr, EINVAL, 0);

    unique_thread_parameter parameter(
        create_thread_parameter(procedure, context));

    ⟦ ... more stuff ... ⟧

where

static __acrt_thread_parameter* __cdecl create_thread_parameter(
    void* const procedure,
    void* const context
    ) throw()
{
    unique_thread_parameter parameter(
        _calloc_crt_t(__acrt_thread_parameter, 1).detach());
    if (!parameter)
    {
        return nullptr;
    }

    parameter.get()->_procedure = procedure;
    parameter.get()->_context   = context;

    // Attempt to bump the reference count of the module in which the user's
    // thread procedure is defined, to ensure that the module will stay loaded
    // as long as the thread is executing.  We will release this HMDOULE [sic]
    // when the thread procedure returns or _endthreadex is called.
    GetModuleHandleExW(
        GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS,
        reinterpret_cast<LPCWSTR>(procedure),
        &parameter.get()->_module_handle);

    return parameter.detach();
}

So it’s not Create­Thread that is incrementing the DLL reference count. It’s _beginthreadex. And the DLL reference count decrements when the thread procedure returns or when you call _endthreadex:

static void __cdecl common_end_thread(unsigned int const return_code) throw()
{
    ⟦ ... other stuff not relevant here ... ⟧

    if (parameter->_module_handle != INVALID_HANDLE_VALUE &&
        parameter->_module_handle != nullptr)
    {
        FreeLibraryAndExitThread(parameter->_module_handle, return_code);
    }
    else
    {
        ExitThread(return_code);
    }
}

extern "C" void __cdecl _endthread()
{
    return common_end_thread(0);
}

extern "C" void __cdecl _endthreadex(unsigned int const return_code)
{
    return common_end_thread(return_code);
}
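To see the same acquire-and-release pairing outside the runtime library, here is a minimal sketch of a thread created with plain CreateThread that takes its own module reference and gives it back on the way out. This is not the runtime's code. The names PinnedThreadProc and RunPinnedThread and the exit code 7 are invented for illustration, and the sketch assumes a Windows toolchain.

```cpp
// Windows-only sketch; PinnedThreadProc and RunPinnedThread are
// hypothetical names that exist only for this illustration.
#ifdef _WIN32
#include <windows.h>

static DWORD WINAPI PinnedThreadProc(void* context)
{
    HMODULE self = static_cast<HMODULE>(context);

    // ... the thread's real work would go here ...

    // Drop the reference taken by the creator and exit in one step, so no
    // code from this module has to run after the reference is released.
    FreeLibraryAndExitThread(self, 7);
}

// Returns the thread's exit code, or ~0u if setup failed.
DWORD RunPinnedThread()
{
    // Take a reference on whichever module contains PinnedThreadProc.
    HMODULE self = nullptr;
    if (!GetModuleHandleExW(
            GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS,
            reinterpret_cast<LPCWSTR>(&PinnedThreadProc),
            &self))
    {
        return ~0u;
    }

    HANDLE thread = CreateThread(nullptr, 0, PinnedThreadProc, self, 0, nullptr);
    if (thread == nullptr)
    {
        FreeLibrary(self); // no thread will release the reference for us
        return ~0u;
    }

    WaitForSingleObject(thread, INFINITE);
    DWORD exit_code = 0;
    GetExitCodeThread(thread, &exit_code);
    CloseHandle(thread);
    return exit_code;
}
#endif
```

If the thread procedure in this sketch called ExitThread instead of FreeLibraryAndExitThread, the reference taken in RunPinnedThread would never be released, which is the same shape of leak the customer was seeing.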

If you call ExitThread directly, then you bypass the cleanup code that _endthreadex performs, which means that you leak a bunch of stuff, including the DLL reference.

The customer noted that they also tried std::thread, and my guess is that they used ExitThread to exit a std::thread:

// Don't do this!
auto thread = std::thread([] {
    ⟦ ... do some stuff ... ⟧
    ExitThread(42); // all done
});

When you exit the thread with Exit­Thread(), the thread ends without allowing the active function calls to clean up. If we look at std::thread’s thread procedure,

template <class _Tuple, size_t... _Indices>
static unsigned int __stdcall _Invoke(void* _RawVals)
    noexcept /* terminates */ {
    // adapt invoke of user's callable object to _beginthreadex's
    // thread procedure
    const unique_ptr<_Tuple> _FnVals(static_cast<_Tuple*>(_RawVals));
    _Tuple& _Tup = *_FnVals.get(); // avoid ADL, handle incomplete types
    _STD invoke(_STD move(_STD get<_Indices>(_Tup))...);
    _Cnd_do_broadcast_at_thread_exit(); // TRANSITION, ABI
    return 0;
}

we see that exiting with ExitThread bypasses the destructor of _FnVals, which means that we leak the _Tuple that holds the thread callable and parameters. We also bypass the call to _Cnd_do_broadcast_at_thread_exit(), which lets other threads know that this thread has exited. This is used by functions like std::notify_all_at_thread_exit and std::promise::set_value_at_thread_exit. Bypassing that call means that anything waiting for the thread to exit is never notified.
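As a rough illustration of what depends on that thread-exit notification, here is a small sketch that uses std::promise::set_value_at_thread_exit. The helper name ComputeOnWorker and the value 42 are made up for the example. It relies only on standard C++, so it is not specific to any one implementation of std::thread.

```cpp
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper: computes a value on a worker thread and publishes it
// only once the worker has exited.
int ComputeOnWorker()
{
    std::promise<int> promise;
    std::future<int> result = promise.get_future();

    std::thread worker([p = std::move(promise)]() mutable {
        // The value is stored now, but the future becomes ready only
        // when this thread exits.
        p.set_value_at_thread_exit(42);
        // Leave by returning from the callable; do not call ExitThread.
    });

    int value = result.get(); // blocks until the future is made ready
    worker.join();
    return value;
}
```

Because the worker leaves by returning, the future is made ready and result.get() comes back with the stored value. Going by the description above, a worker that left through ExitThread would skip the step that makes the future ready, and the waiting thread would be stuck.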

Moral of the story: If you use _beginthread or _beginthreadex, you must exit the thread either by returning from your thread function or by calling _endthread or _endthreadex. Don’t go straight to Exit­Thread().
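Here is a minimal sketch of both permitted exits for a _beginthreadex thread: returning from the thread procedure, and calling _endthreadex from a nested function. WorkItem, DoNestedWork, WorkerProc, and RunWorker are hypothetical names for this example, and it assumes a Windows toolchain that provides process.h. Note that the documentation says _endthreadex does not run pending C++ destructors, so the sketch keeps no objects with destructors on the worker's stack.

```cpp
// Windows-only sketch; all names below are invented for this illustration.
#ifdef _WIN32
#include <windows.h>
#include <process.h>
#include <stdint.h>

struct WorkItem {
    bool     bail_out_early; // ask the worker to leave from a nested call
    unsigned result;         // written by the worker before it exits
};

// A helper one frame below the thread procedure.
static void DoNestedWork(WorkItem* item)
{
    if (item->bail_out_early) {
        item->result = 1;
        // Leave through the runtime so it can release what _beginthreadex
        // acquired, including the module reference. Not ExitThread.
        _endthreadex(1);
    }
    item->result = 2;
}

static unsigned __stdcall WorkerProc(void* context)
{
    WorkItem* item = static_cast<WorkItem*>(context);
    DoNestedWork(item);
    // Returning also goes through the runtime's cleanup path; the return
    // value becomes the thread's exit code.
    return item->result;
}

// Runs the worker to completion and returns its exit code,
// or ~0u if the thread could not be created.
unsigned RunWorker(bool bail_out_early)
{
    WorkItem item = { bail_out_early, 0 };
    uintptr_t raw = _beginthreadex(nullptr, 0, WorkerProc, &item, 0, nullptr);
    if (raw == 0) {
        return ~0u;
    }
    HANDLE thread = reinterpret_cast<HANDLE>(raw);
    WaitForSingleObject(thread, INFINITE);
    DWORD exit_code = 0;
    GetExitCodeThread(thread, &exit_code);
    CloseHandle(thread); // _endthreadex does not close the thread handle
    return exit_code;
}
#endif
```

Either path ends up in the runtime's thread-exit code, so the module reference taken at thread creation is given back.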

And if you use std::thread, then you must exit by returning from your thread callable. Don’t call _endthread or _endthreadex or Exit­Thread().
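For std::thread, an early exit is just a return statement. The sketch below uses a hypothetical ScopeCounter guard and RunAndCountCleanups helper, both invented for this example, to show that locals in the callable are destroyed whether the thread leaves early or runs to the end.

```cpp
#include <thread>

// Hypothetical RAII guard: bumps a counter when it is destroyed, standing
// in for any cleanup that must happen before the thread ends.
struct ScopeCounter {
    int& counter;
    explicit ScopeCounter(int& c) : counter(c) {}
    ~ScopeCounter() { ++counter; }
};

// Runs a worker thread and returns how many guards were cleaned up.
int RunAndCountCleanups(bool stop_early)
{
    int cleanups = 0;
    std::thread worker([&cleanups, stop_early] {
        ScopeCounter guard(cleanups);
        if (stop_early) {
            return; // early exit: the stack unwinds and guard is destroyed
        }
        // ... the rest of the work would go here ...
    });
    worker.join(); // the worker's writes are visible after join returns
    return cleanups;
}
```

Both RunAndCountCleanups(true) and RunAndCountCleanups(false) report one cleanup. Replacing the return with a direct thread-exit call would skip the guard's destructor along with everything else on the stack.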

¹ It also points out that the customer’s question is misleading. They asked why Create­Thread increments the DLL reference count, but they aren’t calling Create­Thread themselves, so how do they know that it’s Create­Thread that’s doing it?

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.
