{"id":112154,"date":"2026-03-20T07:00:00","date_gmt":"2026-03-20T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=112154"},"modified":"2026-03-20T15:11:23","modified_gmt":"2026-03-20T22:11:23","slug":"20260320-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20260320-00\/?p=112154","title":{"rendered":"Windows stack limit checking retrospective: arm64, also known as AArch64"},"content":{"rendered":"<p>Our survey of stack limit checking wraps up with arm64, also known as AArch64.<\/p>\n<p>The stack limit checking takes two forms, one simple version for pure arm64 processes, and a more complex version for Arm64EC. I&#8217;m going to look at the simple version. The complex version differs in that it has to check whether the code is running on the native arm64 stack or the emulation stack before calculating the stack limit. That part isn&#8217;t all that interesting.<\/p>\n<pre>; on entry, x15 is the number of paragraphs to allocate\r\n;           (bytes divided by 16)\r\n; on exit, stack has been validated (but not adjusted)\r\n; modifies x16, x17\r\n\r\nchkstk:\r\n    subs    x16, sp, x15, lsl #4\r\n                            ; x16 = sp - x15 * 16\r\n                            ; x16 = desired new stack pointer\r\n    csello  x16, xzr, x16   ; clamp to 0 on underflow\r\n\r\n    mov     x17, sp\r\n    and     x17, x17, #-PAGE_SIZE   ; round down to nearest page\r\n    and     x16, x16, #-PAGE_SIZE   ; round down to nearest page\r\n\r\n    cmp     x16, x17        ; on the same page?\r\n    beq     done            ; Y: nothing to do\r\n\r\nprobe:\r\n    sub     x17, x17, #PAGE_SIZE ; move to next page\u00b9\r\n    ldr     xzr, [x17]      ; probe\r\n    cmp     x17, x16        ; done?\r\n    bne     probe           ; N: keep going\r\n\r\ndone:\r\n    ret\r\n<\/pre>\n<p>The inbound value in <code>x15<\/code> is the number of bytes desired <i>divided by 16<\/i>. 
Since the arm64 stack must be kept 16-byte aligned, we know that the division by 16 will not produce a remainder. Passing the amount in paragraphs expands the number of bytes expressible in a single constant load from <code>0xFFF0<\/code> to <code>0x0FFFF0<\/code> (via the <code>movz<\/code> instruction), allowing convenient allocation of stack frames up to just shy of a megabyte in size. Since the default stack size is a megabyte, this is sufficient to cover all typical usages.<\/p>\n<p>Here&#8217;s an example of how a function might use <code>chkstk<\/code> in its prologue:<\/p>\n<pre>    mov     x15, #17328\/16      ; desired stack frame size divided by 16\r\n    bl      chkstk              ; ensure enough stack space available\r\n    sub     sp, sp, x15, lsl #4 ; reserve the stack space\r\n<\/pre>\n<p>Okay, so let&#8217;s summarize all of the different stack limit checks into a table, because people like tables.<\/p>\n<table class=\"cp3\" style=\"border-collapse: collapse; text-align: center;\" border=\"1\" cellspacing=\"0\" cellpadding=\"3\">\n<tbody>\n<tr>\n<th>\u00a0<\/th>\n<th>x86-32<\/th>\n<th>MIPS<\/th>\n<th>PowerPC<\/th>\n<th>Alpha AXP<\/th>\n<th>x86-64<\/th>\n<th>AArch64<\/th>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\">unit requested<\/td>\n<td>Bytes<\/td>\n<td>Bytes<\/td>\n<td>Negative bytes<\/td>\n<td>Bytes<\/td>\n<td>Bytes<\/td>\n<td>Paragraphs<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\">adjusts stack pointer before returning<\/td>\n<td>Yes<\/td>\n<td>No<\/td>\n<td>No<\/td>\n<td>No<\/td>\n<td>No<\/td>\n<td>No<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\">detects stack placement at runtime<\/td>\n<td>No<\/td>\n<td>Yes<\/td>\n<td>Yes<\/td>\n<td>Yes<\/td>\n<td>Yes<\/td>\n<td>Yes<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\">short-circuits<\/td>\n<td>No<\/td>\n<td>Yes<\/td>\n<td>Yes<\/td>\n<td>Yes<\/td>\n<td>Yes<\/td>\n<td>No<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\">probe 
operation<\/td>\n<td>Read<\/td>\n<td>Write<\/td>\n<td>Read<\/td>\n<td>Write<\/td>\n<td>Either<\/td>\n<td>Read<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>As we discussed earlier, if the probe operation is a write, then short-circuiting is mandatory.<\/p>\n<p>\u00b9 If you&#8217;re paying close attention, you may have noticed that <code>PAGE_SIZE<\/code> is too large to fit in a 12-bit immediate constant. No problem, because the assembler rewrites it as<\/p>\n<pre>    sub x17, x17, #PAGE_SIZE\/4096, lsl #12\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Wrapping things up.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-112154","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing"],"acf":[],"blog_post_summary":"<p>Wrapping things 
up.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/112154","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=112154"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/112154\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=112154"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=112154"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=112154"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}