September 22nd, 2017

What does it mean when I get an access violation at a very low address when entering a critical section?

Warning: This article talks about implementation details which can change at any time. The information provided is for debugging and diagnostic purposes only.

A customer found that their server program occasionally crashes in the internal function Rtlp­Wait­On­Critical­Section trying to dereference the address 0x00000014.

7789dde3 ff4014          inc     dword ptr [eax+14h]

The dereference was due to a null pointer in the EAX register. This was particularly difficult to debug because the problem usually didn’t surface until the program had been running continuously for a week or more.

The customer chased the null pointer backwards and found that it came from the Debug­Info field of the RTL_CRITICAL_SECTION structure.

typedef struct _RTL_CRITICAL_SECTION
{
                                             // value in memory:
     PRTL_CRITICAL_SECTION_DEBUG DebugInfo;  // 0x00000000
     LONG LockCount;                         // 0xFFFFFFFC
     LONG RecursionCount;                    // 0x00000000
     PVOID OwningThread;                     // 0x00000000
     PVOID LockSemaphore;                    // 0x00005CDC
     ULONG SpinCount;                        // 0x00000000
} RTL_CRITICAL_SECTION, *PRTL_CRITICAL_SECTION;

The customer confirmed that, yes, the Debug­Info of the critical section they were trying to enter was indeed null.

Although the customer didn’t do it in their application (at least not knowingly), they did try a test application which passed the CRITICAL_SECTION_NO_DEBUG_INFO flag to the Initialize­Critical­Section­Ex function, in the hopes of inducing a null pointer for the Debug­Info, but it didn’t work. When initialized in that way, the Debug­Info was set to 0xFFFFFFFF.

Is it possible that this is a critical section that was initialized with the traditional Initialize­Critical­Section function, but the attempt to allocate the debug info failed, so the kernel left it null?

No, that’s not why the the Debug­Info is null. If a critical section has no debug info (either explicitly requested as such with the CRITICAL_SECTION_NO_DEBUG_INFO flag, or because the system couldn’t allocate any debug info), then the Debug­Info is set to the special value 0xFFFFFFFF. The Debug­Info for a valid initialized critical section is never null.

So what does it mean when the Debug­Info is null? The most likely reason is that you are using an uninitialized critical section. Either you never initialized it, or you deleted an initialized critical section (which resets it back to the uninitialized state).

Other evidence that you have an uninitialized critical section is that the critical section is locked, yet has no owner. Furthermore, the spin count is zero, which occurs only on uniprocessor systems. I suspect the server they are running the program on has more than one core. (Heck, my phone has more than one core.)

Bonus reading: Displaying a critical section in the debugger.

Related: I hope you werent using those undocumented critical section fields.

Topics
Code

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

0 comments

Discussion are closed.

Feedback