{"id":110380,"date":"2024-10-17T07:00:00","date_gmt":"2024-10-17T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=110380"},"modified":"2024-10-17T10:14:26","modified_gmt":"2024-10-17T17:14:26","slug":"20241017-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20241017-00\/?p=110380","title":{"rendered":"Evaluating tail call elimination in the face of return address protection, part 1"},"content":{"rendered":"<p>Tail call elimination is straightforward if the tail call is to a function whose stack parameter layout is compatible with that of the original function, since you can just replace the parameter slots on the stack with the new parameters. (The register-based parameters you can just overwrite directly in registers.)<\/p>\n<p>The obvious case where this applies is where the tail-calling and tail-called functions both have the same number of stack-based parameters. Just reuse the slots and jump to the next function.<\/p>\n<p>But you can also employ tail calling even if the number of stack-based parameters does not match exactly.<\/p>\n<p>One case where the tail call is possible is if the tail-called function has fewer stack-based parameters than the tail-calling function, and the calling convention is caller-clean. In that case, you can reuse the stack slots for the outbound parameters, and just leave any extra ones uninitialized. The tail-called function won&#8217;t use them, but the original caller will still clean them up. 
(Note that this doesn&#8217;t work in reverse: If the tail-called function has <i>more<\/i> parameters than the tail-calling function, you can&#8217;t just smash the extra parameters onto the stack beyond those of the tail-calling function, because that&#8217;s writing into stack space that belongs to the original caller.)<\/p>\n<p>Here&#8217;s an example of a tail call on x86-32 to a function with fewer stack-based parameters.<\/p>\n<pre>int __cdecl helper(int a, int b); \/\/ assumed __cdecl\r\nint __cdecl g(int c);\r\n\r\nint __cdecl f(int a, int b)\r\n{\r\n    int v = helper(a, b);\r\n\r\n    return g(v);\r\n}\r\n<\/pre>\n<p>You can reuse the stack space for the tail call to <code>g<\/code>.<\/p>\n<pre>    ; on entry, stack parameters are at [esp+4]\r\n    ; and [esp+8]\r\n\r\n    ; v = helper(a, b)\r\n    push    [esp+8]         ; b\r\n    push    [esp+8]         ; a (esp moved by 4)\r\n    call    helper\r\n    add     esp, 8          ; caller-clean, assuming\r\n                            ; helper is __cdecl\r\n\r\n    ; reuse the \"a\" slot for the outbound\r\n    ; \"c\" slot\r\n    mov     [esp+4], eax\r\n\r\n    ; tail call to g\r\n    jmp     g\r\n<\/pre>\n<p>The caller of <code>f<\/code> will clean up two stack slots, and everything will return to normal. What the original caller doesn&#8217;t realize is that we reused one of them for <code>g<\/code>, and the other still contains leftover data from <code>f<\/code>. Logically, you can think of it as if we had inlined all of <code>g<\/code> into <code>f<\/code>.<\/p>\n<p>How does this interact with <a title=\"A quick introduction to return address protection technologies\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20241015-00\/?p=110374\">return address protection<\/a>?<\/p>\n<p>Since we aren&#8217;t creating any imbalance in <code>call<\/code> or <code>ret<\/code> instructions, compact shadow stacks are still happy. And since the return address did not move in memory, parallel shadow stacks and return address signing are still satisfied. 
(For architectures that use a link register, don&#8217;t forget to authenticate the link register before jumping to the tail-called function, so that the link register on entry to the tail-called function is untagged.)<\/p>\n<p>Next time, we&#8217;ll look at another type of tail call elimination and study how it interacts with return address protection.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Reusing the activation frame.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-110380","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Reusing the activation frame.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/110380","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=110380"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/110380\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=110380"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=110380"},{"taxonomy":"
post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=110380"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}