{"id":102792,"date":"2019-08-20T07:00:00","date_gmt":"2019-08-20T14:00:00","guid":{"rendered":"http:\/\/devblogs.microsoft.com\/oldnewthing\/?p=102792"},"modified":"2019-09-13T21:56:35","modified_gmt":"2019-09-14T04:56:35","slug":"20190820-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20190820-00\/?p=102792","title":{"rendered":"The SuperH-3, part 12: Calling convention and function prologues\/epilogues"},"content":{"rendered":"<p>The calling convention used by Windows CE for the SH-3 processor looks very much like the calling convention for other RISC architectures on Windows.<\/p>\n<p>The short version is that the first four parameters (assuming they are all 32-bit integers) are passed in registers <var>r4<\/var> through <var>r7<\/var>, and the rest go onto the stack after a 16-byte gap. The 16-byte gap is the home space for the register parameters, and even if a function accepts fewer than four parameters, you must still provide a full 16 bytes of home space.<\/p>\n<p>More strictly, the first 16 bytes of parameters are passed in registers <var>r4<\/var> through <var>r7<\/var>. If a parameter is a floating point type, then how it gets passed depends on how the parameter is declared in the function prototype.<\/p>\n<ul>\n<li>If the floating point type is prototyped as non-variadic, then it goes into the corresponding register <var>fr4<\/var> through <var>fr7<\/var>, and the integer register goes unused.<\/li>\n<li>If the floating point type is prototyped as variadic, then it stays in the integer register.<\/li>\n<li>If the function has no prototype, then the floating point type goes into both the floating point register and the integer register.<\/li>\n<\/ul>\n<p>The reason for this rule is the same as before. Variadic parameters go into integer registers because the callee doesn&#8217;t know what type they are upon function entry. 
To make things easier, variadic parameters are always passed in integer registers, so that the callee can just spill them into the home space and treat them all as stack-based parameters. And unprototyped functions pass the floating point values in both floating point and integer registers because it doesn&#8217;t know whether the function is going to treat them as variadic or non-variadic, so it has to cover both bases.<\/p>\n<p>Unlike <a href=\"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/20180417-00\/?p=98525\"> the Windows calling convention for the MIPS R4000<\/a>, the Windows calling convention for the SH-3 does not require 64-bit values to be 8-byte aligned. For example:<\/p>\n<pre>void f(int a, __int64 b, int c);\r\n<\/pre>\n<table class=\"cp3\" style=\"border-collapse: collapse; text-align: center;\" border=\"0\" cellpadding=\"3\">\n<tbody>\n<tr>\n<th style=\"border: solid 1px black;\">MIPS<\/th>\n<th style=\"border: solid 1px black;\">Contents<\/th>\n<td rowspan=\"5\">\u00a0<\/td>\n<th style=\"border: solid 1px black;\">SH-3<\/th>\n<th style=\"border: solid 1px black;\">Contents<\/th>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\"><var>a0<\/var><\/td>\n<td style=\"border: solid 1px black;\"><var>a<\/var><\/td>\n<td style=\"border: solid 1px black;\"><var>r4<\/var><\/td>\n<td style=\"border: solid 1px black;\"><var>a<\/var><\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\"><var>a1<\/var><\/td>\n<td style=\"border: solid 1px black;\">unused<\/td>\n<td style=\"border: solid 1px black;\"><var>r5<\/var><\/td>\n<td style=\"border: solid 1px black;\" rowspan=\"2\"><var>b<\/var><\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\"><var>a2<\/var><\/td>\n<td style=\"border: solid 1px black;\" rowspan=\"2\"><var>b<\/var><\/td>\n<td style=\"border: solid 1px black;\"><var>r6<\/var><\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\"><var>a3<\/var><\/td>\n<td style=\"border: solid 1px black;\"><var>r7<\/var><\/td>\n<td 
style=\"border: solid 1px black;\"><var>c<\/var><\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\">on stack<\/td>\n<td style=\"border: solid 1px black;\"><var>c<\/var><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>On entry to the function, the return address is provided in the <var>pr<\/var> register, and on exit the function&#8217;s return value is placed in the <var>r0<\/var> register. However, if the function&#8217;s return value is larger than 32 bits, then a secret first parameter is passed which is a pointer to a buffer to receive the return value. The parameters are caller-clean; the function must return with the stack pointer at the same value it had when control entered.<\/p>\n<p>If the concept of home space offends you, you can think of it as a <a href=\"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/20190111-00\/?p=100685\"> 16-byte red zone that sits above the stack pointer<\/a>.<\/p>\n<p>The stack for a typical function looks like this:<\/p>\n<table class=\"cp3\" style=\"border-collapse: collapse;\" border=\"0\" cellpadding=\"3\">\n<tbody>\n<tr>\n<td align=\"center\">\u22ee<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">param 6<\/td>\n<td>(if function accepts more than 4 parameters)<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">param 5<\/td>\n<td>(if function accepts more than 4 parameters)<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">param 4 home space<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">param 3 home space<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">param 2 home space<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">param 1 home space<\/td>\n<td>\u2190 stack pointer at function entry<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">\n<div>saved registers<\/div>\n<div>\u22ee<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"border: 
solid 1px black;\" align=\"center\">saved return address<\/td>\n<td>\u2190 stack pointer after saving registers<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">\n<div>local variables<\/div>\n<div>\u22ee<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">\n<div>outbound parameters<\/div>\n<div>beyond 4 (if any)<\/div>\n<div>\u22ee<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">param 4 home space<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">param 3 home space<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">param 2 home space<\/td>\n<\/tr>\n<tr>\n<td style=\"border: solid 1px black;\" align=\"center\">param 1 home space<\/td>\n<td>\u2190 stack pointer after prologue complete<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The function typically starts by pushing onto the stack any nonvolatile registers it intends to use, as well as its return address. This takes advantage of the pre-decrement addressing mode. In practice, the Microsoft C compiler allocates nonvolatile registers starting at <var>r8<\/var> and increasing, and preserves them on the stack in that order, followed by the return address.<\/p>\n<p>In this example, the function has four registers to save, plus the return address.<\/p>\n<pre>function_start:\r\n    MOV.L   r8, @-r15   ; push r8\r\n    MOV.L   r9, @-r15   ; push r9\r\n    MOV.L   r10, @-r15  ; push r10\r\n    MOV.L   r11, @-r15  ; push r11\r\n    STS.L   pr, @-r15   ; push pr\r\n<\/pre>\n<p>At some point (perhaps not immediately), the function will adjust its stack pointer to create space for its local variables and outbound parameters. If the function has a small stack frame, it can use the immediate form of the <code>ADD<\/code> instruction with a negative value (the SH-3 has no immediate form of <code>SUB<\/code>). 
Otherwise, it&#8217;s probably going to load a constant into a register and use that as the input to the two-register form of the <code>SUB<\/code> instruction.<\/p>\n<p>If the function has a large stack frame, it will be difficult to access variables far away from <var>r15<\/var> due to the limited reach of the <i>register indirect with displacement<\/i> addressing mode (at most 60 bytes past the base register for a 32-bit access). To help with this problem, the compiler might park the frame pointer register <var>r14<\/var> in the middle of the frame, or at least close to a frequently-used variable, so that it can reach more local variables in a single instruction.<\/p>\n<p>At the exit of the function, the operations performed in the prologue are reversed: The stack pointer is adjusted to point to the saved return address, and the return address and saved registers are popped off the stack in the reverse of the order in which they were pushed. Finally, the function returns with an <code>RTS<\/code>.<\/p>\n<pre>    LDS.L   @r15+, pr   ; pop pr\r\n    MOV.L   @r15+, r11  ; pop r11\r\n    MOV.L   @r15+, r10  ; pop r10\r\n    MOV.L   @r15+, r9   ; pop r9\r\n    RTS                 ; return\r\n    MOV.L   @r15+, r8   ; pop r8 (in the delay slot)\r\n<\/pre>\n<p>Lightweight leaf functions are those which call no other functions and which can accomplish their task using only volatile registers and the 16 bytes of home space. Such functions may not modify the <var>pr<\/var> register or any nonvolatile registers (which includes the stack pointer).<\/p>\n<p>Next time, we\u2019ll look at some code patterns you\u2019ll see in the compiler-generated code, y\u2019know, the stuff that goes <i>inside<\/i> the function. 
<a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20190821-00\/?p=102794\"> We&#8217;ll start with misaligned data<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A typical RISC pattern.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[2],"class_list":["post-102792","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-history"],"acf":[],"blog_post_summary":"<p>A typical RISC pattern.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/102792","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=102792"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/102792\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=102792"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=102792"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=102792"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}