A customer had a DLL, let’s call it CONTOSO.DLL, and that DLL linked to another DLL, let’s call it WIDGET.DLL. To improve DLL load time, they made WIDGET.DLL a delay-loaded DLL via the /DELAYLOAD linker option. This worked out great, except that sometimes their DLL crashed when shutting down.
When WIDGET.DLL was a static dependency, the loader made a note to ensure that WIDGET.DLL was loaded and ready before calling CONTOSO.DLL’s initialization function, and made sure that WIDGET.DLL remained valid until CONTOSO.DLL completed its uninitialization.
Switching WIDGET.DLL to a /DELAYLOAD DLL removes this static dependency, and the loader isn’t around to help any more.
When the process shuts down, the loader uninitializes the DLLs in an order that tries¹ to preserve the static dependencies, so that a DLL waits until all its dependents are uninitialized before itself uninitializing. However, the loader does not have insight into dynamically-created dependencies, and the DLLs may unload out of order.
What happened is that CONTOSO.DLL initialized without WIDGET.DLL, and then later somebody needed a widget, so it loaded WIDGET.DLL and did some widget stuff, and then cached the widget so it wouldn’t have to go through all that nonsense again.
In the CONTOSO.DLL module’s DLL_PROCESS_DETACH code, it checks² if there is a cached widget, and if so, destroys it.
Since WIDGET.DLL was a dynamic dependency, the module loader doesn’t take it into account when calculating the order in which modules should be uninitialized. The loader sees no static dependency between CONTOSO.DLL and WIDGET.DLL, so the order in which they uninitialize is arbitrary.
And if the arbitrary decision ends up selecting WIDGET.DLL to uninitialize first, then you have a crash when CONTOSO.DLL tries to call into an already-uninitialized DLL.
Note that this problem occurs only at process shutdown. If CONTOSO.DLL unloads via a runtime call to FreeLibrary, it will still be able to call into WIDGET.DLL because it hasn’t yet called FreeLibrary on WIDGET.DLL. But during process shutdown, the module loader needs to free all the things, and the outstanding LoadLibrary won’t prevent that from happening.
The solution is to bypass widget cleanup if the DLL_PROCESS_DETACH handler realizes that the process is terminating. Just leak the widget. The building is being demolished. You don’t need to sweep the floors.
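Here is a minimal sketch of that check. It relies on the documented DllMain convention that, for DLL_PROCESS_DETACH, the reserved parameter is non-null when the process is terminating and null when the DLL is being unloaded by FreeLibrary. The names Widget, DestroyWidget, g_cachedWidget, DetachAction, and DecideDetachAction are hypothetical and do not come from the customer’s code. The decision is pulled into a plain function so it can be exercised without Windows.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// What the DLL_PROCESS_DETACH handler should do with the cached widget.
enum class DetachAction { NothingToDo, DestroyWidget, LeakWidget };

// haveCachedWidget:   a widget was created earlier, so WIDGET.DLL was loaded.
// processTerminating: DllMain's reserved parameter was non-null.
DetachAction DecideDetachAction(bool haveCachedWidget, bool processTerminating)
{
    if (!haveCachedWidget) {
        // Never used a widget: do not touch WIDGET.DLL at all, since a call
        // into it here could trigger the delay-load inside the detach handler.
        return DetachAction::NothingToDo;
    }
    if (processTerminating) {
        // WIDGET.DLL may already be uninitialized. Leak the widget.
        return DetachAction::LeakWidget;
    }
    // Unloading via FreeLibrary: WIDGET.DLL is still loaded and usable.
    return DetachAction::DestroyWidget;
}

#ifdef _WIN32
struct Widget;                       // hypothetical opaque type owned by WIDGET.DLL
void DestroyWidget(Widget* widget);  // hypothetical delay-loaded export of WIDGET.DLL
static Widget* g_cachedWidget = nullptr;

BOOL WINAPI DllMain(HINSTANCE, DWORD reason, LPVOID reserved)
{
    if (reason == DLL_PROCESS_DETACH) {
        DetachAction action =
            DecideDetachAction(g_cachedWidget != nullptr, reserved != nullptr);
        if (action == DetachAction::DestroyWidget) {
            DestroyWidget(g_cachedWidget);
        }
        g_cachedWidget = nullptr;  // destroyed or deliberately leaked
    }
    return TRUE;
}
#endif
```

Checking the cached pointer rather than calling some accessor in WIDGET.DLL is deliberate: the handler decides purely from state that CONTOSO.DLL already holds, so it never calls into WIDGET.DLL unless a widget actually exists and the process is staying alive.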
The DLL was able to start without the widget DLL. It should be able to finish without the widget DLL.
¹ I say “tries” because circular dependencies make such an effort impossible to achieve, but the loader does the best it can.
² It’s important to check for evidence of widgets before trying to clean up widget-related things. Otherwise, you may end up loading a DLL in your DLL_PROCESS_DETACH handler, and that’s not good.
Hi,
Sorry to bother you with this, but do you know if the loader is documented anywhere? I can't find anything. I have an issue of a third party application crashing on startup and all I have so far is that in ntdll!LdrInitializeThunk after a call to ntdll!NtTestAlert esi is 0xc0000139 (which apparently means "STATUS_ENTRYPOINT_NOT_FOUND") and then is sent to ntdll!RtlRaiseStatus. When run from the debugger the application instead of showing the WER UI shows...
A typical cause for error 0xc0000139 is a DLL version mismatch, a particular instance of the dreaded "DLL Hell".
The message "The procedure entry point XXXX could not be located..." cannot be trusted anymore since Windows 8, I think. It currently shows the executable module that tried to reference the entry point, not the module in which the function entry point should actually be found as it was reported in previous versions of Windows. This...
It would be so helpful if descriptions of these OS-provided diagnostics were linked in the LoadLibrary or GetProcAddress function documentation, or the dllimport keyword documentation (in the toolchain), or pretty much anywhere at all having to do with building software that consumes DLL exports. Troubleshooting guides are nice but tracing loader operation should not be considered a separate activity from development and debugging.
This helped a lot, thank you.
After setting gflags, I could see "LdrpNameToOrdinal - WARNING: Procedure "??0SomeFunction@@QAE@H@Z" could not be located in DLL at base 0x10000000" in the debugger. The address 0x10000000 pointed at ApplicationLibrary3.dll. After checking with peviewer, indeed Application.exe tried to import ??0SomeFunction@@QAE@H@Z from ApplicationLibrary3.dll, but the library did not export it. The library was in a previous version (luckily the version in the file's metadata reflects application version). It probably wasn't replaced...
NtTestAlert is a red herring. The error occurred before the NtTestAlert. It’s weird that the executable imports functions from itself.
“a DLL waits until all its dependents are uninitialized before itself uninitializing”
Strike that, reverse it.
I know a lot of developers have balked at the idea of leaking memory or handles, because while running in the debugger they have telemetry tools that report memory or handles leaks – and they don’t want to obscure real memory leaks with the false positives. That’s why I wrap those Free functions in IsDebuggerPresent. This way I still get the quick shutdown, and my memory and GDI leak trackers don’t go insane.
Before using IsDebuggerPresent one should estimate how many undebuggable bugs will make her happy.