{"id":102794,"date":"2019-08-21T07:00:00","date_gmt":"2019-08-21T14:00:00","guid":{"rendered":"http:\/\/devblogs.microsoft.com\/oldnewthing\/?p=102794"},"modified":"2019-09-13T21:58:44","modified_gmt":"2019-09-14T04:58:44","slug":"20190821-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20190821-00\/?p=102794","title":{"rendered":"The SuperH-3, part 13: Misaligned data, and converting between signed vs unsigned values"},"content":{"rendered":"<p>When going through compiler-generated assembly language, there are some patterns you&#8217;ll see over and over again. Note that the code you see may not look exactly like this due to compiler instruction scheduling. In particular, the sequences for misaligned memory access may bring additional registers into play in order to avoid register dependencies.<\/p>\n<p>First up is unsigned memory access. Bytes and words loaded from memory are sign-extended by default. If you want to load an unsigned value, you need to perform an explicit zero-extension.<\/p>\n<pre>    ; load unsigned byte from address in r0\r\n    MOV.B   @r0, r1         ; loads sign-extended byte\r\n    EXTU.B  r1, r1          ; zero-extend the byte to a longword\r\n\r\n    ; load unsigned word from address in r0\r\n    MOV.W   @r0, r1         ; loads sign-extended word\r\n    EXTU.W  r1, r1          ; zero-extend the word to a longword\r\n<\/pre>\n<p>Next up is misaligned data. The SH-3 does not support unaligned memory access. Not only that, but the kernel doesn&#8217;t even emulate unaligned memory access. If you access memory from a misaligned address, you take an access violation and your process crashes. So don&#8217;t mess up!<\/p>\n<p>There are no special instructions for accessing misaligned data. 
You are on your own to take individual bytes and combine them into the desired final value, or to take the starting value and decompose it into bytes.<\/p>\n<pre>    ; store 16-bit value in r1 to possibly unaligned address in r0\r\n    ; destroys r1\r\n    ;                           r1      @r0\r\n    ;                         xxxxAABB  xx xx\r\n    MOV.B   r1, @r0         ; xxxxAABB  BB xx\r\n    SHLR8   r1              ; 00xxxxAA  BB xx\r\n    MOV.B   r1, @(1, r0)    ; 00xxxxAA  BB AA\r\n\r\n    ; store 32-bit value in r1 to possibly unaligned address in r0\r\n    ; destroys r1\r\n    ;                           r1      @r0\r\n    ;                         AABBCCDD  xx xx xx xx\r\n    MOV.B   r1, @r0         ; AABBCCDD  DD xx xx xx\r\n    SHLR8   r1              ; 00AABBCC  DD xx xx xx\r\n    MOV.B   r1, @(1, r0)    ; 00AABBCC  DD CC xx xx\r\n    SHLR8   r1              ; 0000AABB  DD CC xx xx\r\n    MOV.B   r1, @(2, r0)    ; 0000AABB  DD CC BB xx\r\n    SHLR8   r1              ; 000000AA  DD CC BB xx\r\n    MOV.B   r1, @(3, r0)    ; 000000AA  DD CC BB AA\r\n\r\n    ; read 16-bit value from possibly unaligned address in r0\r\n    ;                           r1      r2        @r0\r\n    ;                         xxxxxxxx  xxxxxxxx  BB AA\r\n    MOV.B   @(1, r0), r1    ; SSSSSSAA  xxxxxxxx\r\n    SHLL8   r1              ; SSSSAA00  xxxxxxxx\r\n    MOV.B   @r0, r2         ; SSSSAA00  SSSSSSBB\r\n    EXTU.B  r2, r2          ; SSSSAA00  000000BB\r\n    OR      r1, r2          ; SSSSAA00  SSSSAABB\r\n                            ; r2 contains signed 16-bit value\r\n    EXTU.W  r2, r2          ; SSSSAA00  0000AABB\r\n                            ; r2 contains unsigned 16-bit value\r\n\r\n    ; read 32-bit value from possibly unaligned address in r0\r\n    ;                           r1      r2        @r0\r\n    ;                         xxxxxxxx  xxxxxxxx  DD CC BB AA\r\n    MOV.B   @(3, r0), r1    ; SSSSSSAA  xxxxxxxx\r\n    SHLL8   r1              ; SSSSAA00  
xxxxxxxx\r\n    MOV.B   @(2, r0), r2    ; SSSSAA00  SSSSSSBB\r\n    EXTU.B  r2, r2          ; SSSSAA00  000000BB\r\n    OR      r2, r1          ; SSSSAABB  000000BB\r\n    SHLL8   r1              ; SSAABB00  000000BB\r\n    MOV.B   @(1, r0), r2    ; SSAABB00  SSSSSSCC\r\n    EXTU.B  r2, r2          ; SSAABB00  000000CC\r\n    OR      r2, r1          ; SSAABBCC  000000CC\r\n    SHLL8   r1              ; AABBCC00  000000CC\r\n    MOV.B   @r0, r2         ; AABBCC00  SSSSSSDD\r\n    EXTU.B  r2, r2          ; AABBCC00  000000DD\r\n    OR      r1, r2          ; AABBCC00  AABBCCDD\r\n<\/pre>\n<p>Less often, you will see code that sign-extends a 32-bit value to a 64-bit value.<\/p>\n<pre>    ; sign-extend 32-bit value in r0 to 64-bit value in r1:r0\r\n    MOV     r0, r1          ; copy value to r1\r\n    SHLL    r1              ; T contains high bit of value\r\n    SUBC    r1, r1          ; if T=0, then r1 = 00000000\r\n                            ; if T=1, then r1 = FFFFFFFF\r\n<\/pre>\n<p>If you happen to have the value 0 lying around in a register, you could accomplish the task in two instructions:<\/p>\n<pre>    ; sign-extend 32-bit value in r0 to 64-bit value in r1:r0\r\n    ; assumes that r2 already contains the value zero\r\n    CMP\/GT  r0, r2          ; T = (0 &gt; r0)\r\n                            ; in other words, T=0 if r0 is positive or zero\r\n                            ;                 T=1 if r0 is negative\r\n    SUBC    r1, r1          ; if T=0, then r1 = 00000000\r\n                            ; if T=1, then r1 = FFFFFFFF\r\n<\/pre>\n<p>That is just code golf on my part. 
I haven&#8217;t seen the compiler use this trick, or the next one.<\/p>\n<pre>    ; sign-extend 32-bit value in r0 to 64-bit value in r1:r0\r\n    ; preserves flags\r\n    ROTCL   r0              ; rotate r0 left, copying high bit into T\r\n                            ; and saving old T in low bit of r0\r\n    SUBC    r1, r1          ; if T=0, then r1 = 00000000, T stays 0\r\n                            ; if T=1, then r1 = FFFFFFFF, T stays 1\r\n    ROTCR   r0              ; rotate r0 right to restore original value\r\n                            ; and recover original value of T\r\n<\/pre>\n<p>In general, you&#8217;ll see that SH-3 assembly code is somewhat verbose, even more so because compiler technology back in this time period was not as advanced as it is today. But you have to realize that each of these instructions is only half the size of the instructions of its RISC-style contemporaries, so even though you plowed through 2000 instructions, that&#8217;s only 4KB of code.<\/p>\n<p>Okay, next time, we&#8217;re returning to reality and <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20190822-00\/?p=102796\">looking at function call patterns<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Okay, now we&#8217;re doing some programming.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[2],"class_list":["post-102794","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-history"],"acf":[],"blog_post_summary":"<p>Okay, now we&#8217;re doing some 
programming.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/102794","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=102794"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/102794\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=102794"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=102794"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=102794"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}