{"id":98555,"date":"2018-04-19T07:00:00","date_gmt":"2018-04-19T21:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/?p=98555"},"modified":"2019-03-13T00:46:11","modified_gmt":"2019-03-13T07:46:11","slug":"20180419-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20180419-00\/?p=98555","title":{"rendered":"The MIPS R4000, part 14: Common patterns"},"content":{"rendered":"<p>Okay, now that we see how function calls work, we can demonstrate some common code sequences. If you are debugging through MIPS code, you&#8217;ll need to be able to recognize these different types of calling sequences in order to keep your bearings. <\/p>\n<p>Non-virtual calls generally look like this: <\/p>\n<pre>\n    ; Put the parameters in a0 through a3,\n    ; and additional parameters go on the stack\n    ; after the home space.\n    sw      t0, 20(sp)  ; parameter 5 passed on the stack\n    move    a3, s1      ; parameter 4 copied from another register\n    addiu   a2, sp, 32  ; parameter 3 is address of local variable\n    addiu   a1, t1, 1   ; parameter 2 is calculated in place\n    jal     destination ; call the function\n    move    a0, s1      ; parameter 1 copied from another register\n<\/pre>\n<p>The parameters could be set up in any order, and there&#8217;s a good chance you&#8217;ll find one of the parameters being set up in the branch delay slot. Note also that the <code>JAL<\/code> instruction might end up jumping to an import stub if this turns out to have been a na&iuml;vely-imported function. <\/p>\n<p>Virtual calls load the destination from the target&#8217;s vtable: <\/p>\n<pre>\n    ; \"this\" passed in a0. 
Other parameters go\n    ; into a1 through a3, with additional parameters\n    ; on the stack after the home space.\n    sw      t0, 20(sp)  ; parameter 5 passed on the stack\n    move    a3, s1      ; parameter 4 copied from another register\n    addiu   a2, sp, 32  ; parameter 3 is address of local variable\n    <font COLOR=\"blue\">lw      t6, 0(a0)   ; t6 -&gt; vtable of target\n    lw      t7, n(t6)   ; t7 = function pointer from vtable\n    jalr    t7          ; call the function<\/font>\n    addiu   a1, t1, 1   ; parameter 2 is calculated in place\n<\/pre>\n<p>I put all of the virtual dispatch code in one block of contiguous instructions, but in practice the compiler may choose to interleave it with the preparation of the function arguments to avoid data load stalls. The above example uses <var>t6<\/var> and <var>t7<\/var> as temporary registers for preparing the call, but in practice, the compiler will use any volatile register that is not being used to pass parameters. <\/p>\n<p>Calls to imported functions indirect through the entry in the import address table. <\/p>\n<pre>\n    ; Put the parameters in a0 through a3,\n    ; and additional parameters go on the stack\n    ; after the home space.\n    sw      t0, 20(sp)  ; parameter 5 passed on the stack\n    move    a3, s1      ; parameter 4 copied from another register\n    addiu   a2, sp, 32  ; parameter 3 is address of local variable\n    addiu   a1, t1, 1   ; parameter 2 is calculated in place\n    <font COLOR=\"blue\">lui     t6, XXXX    ; t6 -&gt; 64KB block containing import address table entry\n    lw      t6, YYYY(t6); t6 = function pointer from import address table entry\n    jalr    t6          ; call the function<\/font>\n    move    a0, s1      ; parameter 1 copied from another register\n<\/pre>\n<p>Again, I put all of the relevant instructions together. In practice, the compiler tends to front-load the fetching of the function pointer. 
<\/p>\n<p>The last interesting calling pattern for today is the jump table, commonly used for dense <code>switch<\/code> statements. Suppose we have this: <\/p>\n<pre>\n    switch (n) {\n    case 1: ...; break;\n    case 2: ...; break;\n    case 3: ...; break;\n    case 4: ...; break;\n    }\n<\/pre>\n<p>The resulting code would look like this:<\/p>\n<pre>\n    ; jump to address based on value in v0\n    addiu   v0,v0,-1    ; subtract 1\n    sltiu   at,v0,4     ; in range of the jump table?\n    beqz    at,default  ; nope - go to default\n    sll     v0,v0,2     ; convert to byte offset (branch delay slot)\n    <font COLOR=\"blue\">lui     at,XXXX     ; load high part of jump table address\n    addu    at,at,v0    ; add in the byte offset\n    lw      v0,YYYY(at) ; add in the low part and load jump table entry<\/font>\n    jr      v0          ; and jump there\n    nop                 ; branch delay slot\n<\/pre>\n<p>The jump table pattern first performs a single-comparison range check by the standard trick of offsetting the control value by the lowest value in the range and using an unsigned comparison against the length of the range. Assuming the range check passes, we load the word at <\/p>\n<pre>\n    address of start of jump table + 4 * index\n<\/pre>\n<p>The <code>lui<\/code> + <code>addu<\/code> + <code>lw<\/code> sequence is a pattern we saw earlier when we studied memory access: It&#8217;s the expansion of the pseudo-instruction <\/p>\n<pre>\n    lw      v0, XXXXYYYY(v0) ; load jump table entry\n<\/pre>\n<p>Once we load the jump target, we perform a register-indirect jump to the intended target, and put a <code>nop<\/code> in the branch delay slot because we don&#8217;t have anything useful to put in there. (In practice, there might be something useful in there.) <\/p>\n<p>Okay, now that we&#8217;ve seen some patterns, next time we&#8217;ll try to understand an entire function. 
<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How to recognize different kinds of jumps and calls.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-98555","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>How to recognize different kinds of jumps and calls.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/98555","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=98555"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/98555\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=98555"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=98555"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=98555"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}