August 13th, 2015

Crashes in the I/O stack tend to occur in programs which do the most I/O

A customer was diagnosing repeated blue screen errors on their system. They shared a few crash dumps, and they all had a similar profile: The crash occurred in the file system filter stack as the I/O request passed through the anti-virus software.

Some of the crashes reported PROCESS_NAME: ngen.exe. “Could ngen.exe be the problem?”

As a general rule, user-mode code cannot be responsible for blue-screen failures. It’s the job of the kernel to be resistant to misbehavior in user-mode. Failures of the form IRQL_NOT_LESS_THAN_OR_EQUAL and PAGE_FAULT_IN_NON_PAGED_AREA are typically driver bugs or faulty hardware (for example, due to overheating or overclocking).

The application that happened to be active at the time of the failure is not typically interesting in and of itself, although it can give a clue as to what part of the kernel is misbehaving. The fact that ngen appears is more an indication that ngen performs a lot of disk I/O, so if there’s a problem in the I/O stack, there’s a good chance that ngen was involved, simply because ngen is involved in a lot of I/O requests.

  • Bob goes to the beach very frequently.
  • Every time there is a shark attack, Bob is at the beach.
  • Conclusion: Bob causes shark attacks.

Blaming ngen for the kernel crash is like blaming Bob for the shark attacks.

Bonus chatter: Some of my colleagues came to different conclusions:

  • Conclusion: Bob should stop going to the beach.
  • Conclusion: Bob must be the shark.
Topics
Other

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

0 comments

Discussion are closed.