Why is the stack overflow exception raised before the stack has overflowed?

Raymond Chen

Consider this program we looked at last time.

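#include <windows.h>
#include <stdio.h>

int maxdepth = 0;

int f()
{
  maxdepth++; // count how deep the recursion gets
  return f();
}

int main()
{
  __try {
    f();
  } __except (GetExceptionCode() == STATUS_STACK_OVERFLOW ?
              EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH) {
    printf("Overflow at %d\n", maxdepth);
  }
}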

As before, make sure to compile this program with optimizations disabled to ensure that the recursive call occupies stack and doesn’t get tail-call-optimized.

When you run this program, it will break into the debugger at the moment when the stack overflow occurs. And if you look at the stack pointer and the available stack space, you’ll see that the stack overflow exception was raised before the stack actually overflowed.

The exception is raised while the stack still has around 12KB of space left. What’s up with the extra 12KB? “I paid for that extra 12KB of memory, why won’t you let me use it?”

The kernel grows the stack when the stack guard page exception is triggered. (We’ll take a closer look at the stack guard page in a few months.) But when the kernel notices that the stack is nearly depleted, it raises the stack overflow exception instead. It does this before the stack is depleted so that there is still some stack on which to run the exception handlers.

The kernel is indeed letting you use that last 12KB of stack. It’s letting you use it to handle the stack overflow exception!

In the above program, for example, there is a stack overflow handler that wants to print a message and then break out of the infinite recursion. If the kernel didn’t raise the stack overflow exception until the stack was completely depleted, there wouldn’t be any stack for the stack overflow handler to run on.
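To make the arithmetic concrete, here is a minimal model in portable C. It is a sketch, not how the kernel is implemented: the names stack_model, touch_next_page, and bytes_left_at_overflow are hypothetical, and the 4KB page size and three-page (12KB) reserve are assumptions chosen to match the numbers above. The model grows the stack one page per guard-page hit and reports the overflow once only the reserve remains.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096u      /* assumed page size */
#define RESERVE_PAGES 3u     /* assumed reserve: 3 pages = 12KB */

/* Hypothetical model of one thread's stack, measured in pages. */
typedef struct {
    size_t total_pages;      /* pages reserved for the whole stack */
    size_t committed_pages;  /* pages handed out so far */
} stack_model;

/* Simulates a guard-page hit. Returns true if the stack grew by one page,
   false if growing would eat into the reserve, which is the point where
   this model reports a stack overflow. */
bool touch_next_page(stack_model *s)
{
    assert(s != NULL);
    if (s->committed_pages + RESERVE_PAGES >= s->total_pages) {
        return false;        /* only the reserve is left: overflow */
    }
    s->committed_pages++;
    return true;
}

/* Bytes still unused at the moment the overflow is reported. */
size_t bytes_left_at_overflow(stack_model s)
{
    while (touch_next_page(&s)) {
        /* keep recursing, one page at a time */
    }
    return (s.total_pages - s.committed_pages) * PAGE_SIZE;
}
```

In this model, a 1MB stack (256 pages) that starts with one committed page stops growing with three pages untouched, which is the space left over for the handler to run on.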

Now, you (the person who wants to extract every last drop out of your stack before it overflows) might say, “Well, the kernel could just allocate a special emergency stack for stack overflow exceptions.”

But this would have to be a per-thread emergency stack, since multiple threads could be dealing with stack overflows simultaneously. And there’s already a convenient chunk of per-thread data: The stack itself!

You could say that the kernel does indeed reserve some space for a per-thread emergency stack: It allocates it from the end of the stack. And the stack overflow exception is raised when you reach the emergency stack.

Bonus chatter: Switching to an emergency stack would create a lot of problems. It would mess up stack traces, since the stack trace at the point of a stack overflow would stop when it reached the end of the emergency stack. Split stacks aren’t really a thing in Windows because the kernel needs to know the stack boundaries in order to detect stack buffer overflow attacks: If the chain of stack frames or the chain of exception records leaves the stack, then the kernel assumes that the stack is corrupted. This means that no exception handler from the main stack will be called when the system is running on the emergency stack. Switching to the emergency stack would be pointless, since there won’t be any handlers to run anyway.
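The bounds check described above can be sketched as follows. This is an illustrative model, not the actual kernel logic: the record type, the chain_stays_on_stack function, and the use of plain integers for addresses are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an exception registration record. */
typedef struct record {
    uintptr_t address;          /* where this record claims to live */
    const struct record *next;  /* next record in the chain, or NULL */
} record;

/* Returns true if every record in the chain lies inside the stack
   [stack_limit, stack_base). A record outside that range is treated as
   evidence of corruption, so the walk stops there and no further
   handlers would be considered. */
bool chain_stays_on_stack(const record *head,
                          uintptr_t stack_limit, uintptr_t stack_base)
{
    assert(stack_limit < stack_base);
    for (const record *r = head; r != NULL; r = r->next) {
        if (r->address < stack_limit || r->address >= stack_base) {
            return false;
        }
    }
    return true;
}
```

Under this model, a chain that begins on a separate emergency stack and links back to records on the main stack fails the check as soon as the walk reaches the first main-stack record, so the handlers registered there would never be reached.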

I guess the kernel could add extra support for an emergency stack mode where it knows that it’s on an emergency stack and accepts stack frames and exception handlers from the main stack. But that would be a lot of work compared to just carving the emergency stack out of the end of the regular stack. (It would also break any code that used GetCurrentThreadStackLimits.)



  • Joshua Hudson

    I do recall one system that had an emergency stack, and it worked rather differently. All kernel-mode callbacks were raised on it, including stack overflow. Guess what happened if you overflowed the emergency stack. (I never found out, but the process probably just died.) This did have the upside of a trivial red zone, though. The emergency stack was pretty small; you were supposed to set some flags and potentially call longjmp() to get out of there or exit() to terminate.

  • Joshua Hudson

    On a completely unrelated note, it’s not documented anywhere how much stack you’re supposed to leave for kernel functions (kernel32.dll, advapi32.dll). This comes up quite a bit when the plan is to allocate hundreds of KB of RAM on the worker thread stack because it’s faster than heap allocation and trivially provable that it’s leak free. I’ve been assuming 32KB is enough (including the frames below the start of thread entry — now why that’s more than 1 dword on x86 or 5 qwords on x64 (calling convention) is anybody’s guess).

    • Antonio Rodríguez

      I suppose it’s not documented because it’s an implementation detail. If they said “32 KB” now, and then needed to grow it later (say, in Windows 12 they introduce Win128, where WOW64 runs under WOW128 😛 ), they would have their hands tied.

      This is one of these cases where if you have to ask where the limit is, you are doing something wrong (like thread number or filesystem handles limits). Stack allocation can be convenient, but you should use it wisely: do not allocate on the stack in recursive functions, and limit it to small chunks (the ones that work worst with heap allocations). To me, “hundreds of KB” does not sound so small, at least for a stack allocation.

      • Joshua Hudson

        I’m already calling CreateThread() and telling it how much stack to allocate because I know how much that thread’s going to use (and it’s more than the default value). What more do you want?

        Hands are tied anyway; it’s just that without a documented reserve value, they’re tied empirically by whoever took the lowest guess that worked.

        • 紅樓鍮

          Can’t you use VirtualAlloc? I think if you ask the heap manager for some hundred KBs it will pass your request directly to VirtualAlloc anyways (20110930-00/?p=9513).

  • Antonio Rodríguez

    “Now, you (the person who wants to extract every last drop out of your stack before it overflows) might say, “Well, the kernel could just allocate a special emergency stack for stack overflow exceptions.””

    On top of that, it would be a big contradiction. You want to squeeze out those extra 12 KB at the end of your memory-hungry stacks. Well, allocating an emergency stack of at least 12 KB for *every* thread on the system would waste more memory by definition, because most threads wouldn’t need it. And if it’s an emergency stack for processing exceptions, maybe it should be committed at thread creation, because otherwise, in low-memory situations, it could lead to a “domino effect” where threads would start to fail hard with exceptions that could not be processed. In no time, the system would be unstable and unresponsive.

    In other words, for a typical system with thousands of threads, you would end up wasting tens or hundreds of megabytes of memory to handle an edge case. All because your process wanted to squeeze out an extra 12 KB. *That* is inefficient.

  • Henke37

    Don’t fibers and such involve messing with the stack layout? Oh well, they are all in cahoots anyway.