{"id":98425,"date":"2018-04-03T07:00:00","date_gmt":"2018-04-03T21:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/?p=98425"},"modified":"2019-03-13T00:44:51","modified_gmt":"2019-03-13T07:44:51","slug":"20180403-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20180403-00\/?p=98425","title":{"rendered":"The MIPS R4000, part 2: 32-bit integer calculations"},"content":{"rendered":"<p>The MIPS R4000 has the usual collection of arithmetic operations, but the mnemonics are confusingly named. The general notation for arithmetic operations is <\/p>\n<pre>\n    OP      destination, source1, source2\n<\/pre>\n<p>with the destination register on the left and the source register or registers on the right. <\/p>\n<p>Okay, here goes. We start with addition and subtraction. <\/p>\n<pre>\n    ADD     rd, rs, rt      ; rd = rs + rt, trap on overflow\n    ADDU    rd, rs, rt      ; rd = rs + rt, no trap on overflow\n    SUB     rd, rs, rt      ; rd = rs - rt, trap on overflow\n    SUBU    rd, rs, rt      ; rd = rs - rt, no trap on overflow\n<\/pre>\n<p>The <code>ADD<\/code> and <code>SUB<\/code> instructions perform addition and subtraction and raise a trap if a signed overflow occurs. The <code>ADDU<\/code> and <code>SUBU<\/code> instructions do the same thing, but without the overflow trap. The <code>U<\/code> suffix officially means &#8220;unsigned&#8221;, but this is confusing because the addition can be performed on both signed and unsigned values, thanks to two&#8217;s complement. The real issue is whether an overflow trap is raised. 
<\/p>\n<p>There are also versions of the addition instructions that accept a 16-bit signed immediate as a second addend: <\/p>\n<pre>\n    ADDI    rd, rs, imm16   ; rd = rs + (int16_t)imm16, trap on overflow\n    ADDIU   rd, rs, imm16   ; rd = rs + (int16_t)imm16, no trap on overflow\n<\/pre>\n<p>Note that the <code>U<\/code> is double-confusing here, because even though the <code>U<\/code> officially stands for &#8220;unsigned&#8221;, the immediate value is treated as signed, and the addition is suitable for both signed and unsigned values. <\/p>\n<p>There are no corresponding <code>SUBI<\/code> or <code>SUBIU<\/code> instructions, but they can be synthesized: <\/p>\n<pre>\n    ADDI   rd, rs, -imm16   ; SUBI   rd, rs, imm16\n    ADDIU  rd, rs, -imm16   ; SUBIU  rd, rs, imm16\n<\/pre>\n<p>(Of course, this doesn&#8217;t work if the value you want to subtract is &minus;32768, but hey, it mostly works.) <\/p>\n<p>The next group of instructions is the bitwise operations. These never trap.&sup1; <\/p>\n<pre>\n    AND     rd, rs, rt      ; rd = rs &amp; rt\n    ANDI    rd, rs, imm16   ; rd = rs &amp; (uint16_t)imm16\n    OR      rd, rs, rt      ; rd = rs | rt\n    ORI     rd, rs, imm16   ; rd = rs | (uint16_t)imm16\n    XOR     rd, rs, rt      ; rd = rs ^ rt\n    XORI    rd, rs, imm16   ; rd = rs ^ (uint16_t)imm16\n    NOR     rd, rs, rt      ; rd = ~(rs | rt)\n<\/pre>\n<p>Note the inconsistency: The addition instructions treat the immediate as a signed 16-bit value (and sign-extend it to a 32-bit value), but the bitwise logical operations treat it as an unsigned 16-bit value (and zero-extend it to a 32-bit value). Stay alert! <\/p>\n<p>The last group of instructions for today is the shift instructions. These also never trap. 
<\/p>\n<pre>\n    SLL     rd, rs, imm5    ; rd = rs &lt;&lt;  imm5\n    SLLV    rd, rs, rt      ; rd = rs &lt;&lt;  (rt % 32)\n    SRL     rd, rs, imm5    ; rd = rs &gt;&gt;U imm5\n    SRLV    rd, rs, rt      ; rd = rs &gt;&gt;U (rt % 32)\n    SRA     rd, rs, imm5    ; rd = rs &gt;&gt;  imm5\n    SRAV    rd, rs, rt      ; rd = rs &gt;&gt;  (rt % 32)\n<\/pre>\n<p>The mnemonics stand for &#8220;shift left logical&#8221;, &#8220;shift right logical&#8221; and &#8220;shift right arithmetic&#8221;. The <code>V<\/code> suffix stands for &#8220;variable&#8221;, and indicates that the shift amount comes from a register rather than an immediate. <\/p>\n<p>Yup, that&#8217;s another inconsistency. Following the pattern of the addition and bitwise logical groups, these instructions should have been named <code>SLL<\/code> for shifting by an amount specified by a register and <code>SLLI<\/code> for shifting by an amount specified by an immediate. Go figure. <\/p>\n<p>There are no built-in sign-extension or zero-extension instructions. You can get zero-extension in one instruction by explicitly masking out the upper bytes: <\/p>\n<pre>\n    ; zero extend byte to word\n    ANDI    rd, rs, 0xFF    ; rd = ( uint8_t)rs\n\n    ; zero extend halfword to word\n    ANDI    rd, rs, 0xFFFF  ; rd = (uint16_t)rs\n<\/pre>\n<p>Sign extension requires two instructions. <\/p>\n<pre>\n    ; sign extend byte to word\n    SLL     rd, rs, 24      ; rd = rs &lt;&lt; 24\n    SRA     rd, rd, 24      ; rd = (int32_t)rd &gt;&gt; 24\n\n    ; sign extend halfword to word\n    SLL     rd, rs, 16      ; rd = rs &lt;&lt; 16\n    SRA     rd, rd, 16      ; rd = (int32_t)rd &gt;&gt; 16\n<\/pre>\n<\/p>\n<p>And I&#8217;m going to mention these instructions here because I can&#8217;t find a good place to put them: <\/p>\n<pre>\n    SYSCALL imm20           ; system call\n    BREAK   imm20           ; breakpoint\n<\/pre>\n<p>Both instructions trap into the kernel. 
The system call instruction is intended to be used to make operating system calls; the breakpoint instruction is intended to be used for software breakpoints. Both instructions carry a 20-bit immediate payload that can be used for whatever purpose the operating system chooses. <\/p>\n<p>Here are some more instructions you can synthesize from the official instructions: <\/p>\n<pre>\n    SUB     rd, zero, rs    ; NEG     rd, rs\n    SUBU    rd, zero, rs    ; NEGU    rd, rs\n    ADDU    rd, zero, rs    ; MOVE    rd, rs\n    OR      rd, zero, rs    ; MOVE    rd, rs\n    NOR     rd, zero, rs    ; NOT     rd, rs\n    SLL     zero, zero, 0   ; NOP\n    SLL     zero, zero, 1   ; SSNOP\n<\/pre>\n<p>There are many possible ways of synthesizing a <code>MOVE<\/code> instruction, but in order to be able to unwind exceptions, Windows NT requires that register motion in the prologue or epilogue of a function must take one of the two forms given above. <\/p>\n<p>Similarly, there are many ways of performing a <code>NOP<\/code>. Basically, any non-trapping 32-bit computation that targets the <var>zero<\/var> register is functionally a nop, but the two above are treated specially by the processor. <\/p>\n<ul>\n<li>    <code>NOP<\/code> = <code>SLL zero, zero, 0<\/code>     is special-cased by the processor as a nop that can be optimized     out entirely.     Use it when you need to pad out some code for space. <\/li>\n<li>    <code>SSNOP<\/code> = <code>SLL zero, zero, 1<\/code>     is special-cased by the processor as a nop that must be issued,     and it will not be simultaneously issued with any other     instruction.     Use it when you need to pad out some code for time.     (The <code>SS<\/code> stands for &#8220;super-scalar&#8221;.) <\/li>\n<\/ul>\n<p>The encoding of <code>SLL zero, zero, 0<\/code> happens to be <code>0x00000000<\/code>, which I&#8217;m sure is not a coincidence. I&#8217;m not convinced that it&#8217;s a good idea, though. 
I would have chosen <code>0x00000000<\/code> to be the encoding of a breakpoint or invalid instruction. <\/p>\n<p>Okay, those are the 32-bit computation instructions. Next time, we&#8217;ll look at multiplication, division, and the temperamental <var>HI<\/var> and <var>LO<\/var> registers. <\/p>\n<p>&sup1; Alas, there is no <code><a HREF=\"https:\/\/en.wikipedia.org\/wiki\/Nori\">NORI<\/a><\/code> instruction. You think I&#8217;m joking, but I&#8217;m not. Be patient. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>The usual suspects.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-98425","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>The usual suspects.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/98425","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=98425"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/98425\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=98425"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\
/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=98425"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=98425"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}