Windows stack limit checking retrospective: x86-32 also known as i386, second try

The last time we looked at the Windows stack limit checker on x86-32 (also known as i386), we noted that the function has changed over the years. Here’s the revised version.

_chkstk:
    push    ecx             ; preserve register

    lea     ecx, [esp][4]   ; ecx = original stack pointer - 4
    sub     ecx, eax        ; ecx = new stack pointer - 4

    sbb     eax, eax        ; clamp ecx to zero if underflow
    not     eax
    and     ecx, eax

    mov     eax, esp        ; round current stack pointer
    and     eax, -PAGE_SIZE ; to page boundary

    ; eax = most recently probed page
    ; ecx = desired final stack pointer - 4

check:
    cmp     ecx, eax        ; done probing?
    jb      probe           ; N: keep probing

    mov     eax, ecx        ; eax = desired final stack pointer - 4
    pop     ecx             ; restore register
    xchg    esp, eax        ; move stack pointer to final home - 4
                            ; eax gets old stack pointer
    mov     eax, [eax]      ; get return address
    mov     [esp], eax      ; put it on top of the stack
    ret                     ; and "return" to it

probe:
    sub     eax, PAGE_SIZE  ; move to next page
    test    [eax], eax      ; probe it
    jmp     check           ; go back to see if we're done
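
The clamp and the probe loop are easier to follow with the flag tricks spelled out. The following C sketch is only a model of the register arithmetic, not the actual implementation: the function names `clamped_sub` and `pages_probed` are made up for illustration, `PAGE_SIZE` is assumed to be 4KB, and the memory access performed by the `test` instruction is reduced to a counter.

```c
#include <stdint.h>

#define PAGE_SIZE 0x1000u   /* assumption: 4KB pages */

/* Models: sub ecx, eax / sbb eax, eax / not eax / and ecx, eax */
uint32_t clamped_sub(uint32_t sp, uint32_t size)
{
    uint32_t diff   = sp - size;        /* sub: wraps around on underflow   */
    uint32_t borrow = (sp < size);      /* carry flag left behind by sub    */
    uint32_t mask   = 0u - borrow;      /* sbb eax, eax: 0 or 0xFFFFFFFF    */
    mask = ~mask;                       /* not eax: 0xFFFFFFFF if no borrow */
    return diff & mask;                 /* and: force to zero on underflow  */
}

/* Models the check/probe loop, counting pages instead of touching them. */
unsigned pages_probed(uint32_t esp, uint32_t target)
{
    uint32_t page = esp & ~(PAGE_SIZE - 1);  /* and eax, -PAGE_SIZE */
    unsigned count = 0;
    while (target < page) {                  /* cmp ecx, eax / jb probe */
        page -= PAGE_SIZE;                   /* sub eax, PAGE_SIZE      */
        count++;                             /* test [eax], eax         */
    }
    return count;
}
```

For example, with a hypothetical stack pointer of 0x0012FF78 after the push and a request of 0x3000 bytes, the target is 0x0012CF7C and the loop touches three pages (0x0012E000, 0x0012D000, and 0x0012C000) before falling through.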

Instead of jumping to the caller, the code copies the caller’s address to the top of the stack and performs a ret. This is a significant change because it avoids desynchronizing the return address predictor.

The ret will increment the stack pointer by four bytes, so the code over-allocates the stack by 4 bytes to compensate.
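
To double-check the four-byte bookkeeping, here is a small C model of the stack pointer arithmetic on the exit path. It is a sketch under simplifying assumptions: the function name `esp_after_chkstk` is hypothetical, the underflow clamp is ignored, and the return address copy is represented only by a comment.

```c
#include <stdint.h>

/* caller_esp: the caller's stack pointer just before "call _chkstk".
   size:       the number of bytes requested in eax. */
uint32_t esp_after_chkstk(uint32_t caller_esp, uint32_t size)
{
    uint32_t entry_esp = caller_esp - 4;  /* call pushed the return address  */
    uint32_t new_esp   = entry_esp - size; /* ecx = new stack pointer - 4    */
    /* xchg esp, eax moves esp to new_esp, and the return address
       is copied to [esp], which is the extra 4-byte slot. */
    return new_esp + 4;                   /* ret pops the copied address     */
}
```

With a hypothetical caller stack pointer of 0x0012FF80 and a request of 0x3000 bytes, the model returns 0x0012CF80, which is exactly the caller's stack pointer minus the requested size.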

This code remains a drop-in replacement for the old chkstk function, so there is no need to change the compiler’s code generator. It also means that you can link together code compiled with the old chkstk and the new chkstk since the two versions are compatible. It does mean that we still have the wacky calling convention of returning with an adjusted stack pointer, but that’s now part of the ABI so we have to live with it.

Since we perform a ret instruction on a return address that was not placed there by a matching call instruction, this code is not compatible with shadow stacks (which Intel calls Control-Flow Enforcement Technology, or CET). The chkstk function’s wacky calling convention makes it incompatible with shadow stacks.

Okay, so much for that sadness. Next time, we’ll look at the Alpha AXP.

Author

Raymond Chen

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

2 comments

Stefan Kanthak March 26, 2026 · Edited

“Since we perform a ret instruction on a return address that was not placed there by a matching call instruction, this code is not compatible with shadow stacks […]”

OUCH: this statement is but wrong — Intel’s CET verifies that the instruction pointers popped from both stacks match!

Swap Swap March 18, 2026

The “cs20” label should be named “probe”
