{"id":106927,"date":"2022-08-02T07:00:00","date_gmt":"2022-08-02T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=106927"},"modified":"2022-08-02T06:38:22","modified_gmt":"2022-08-02T13:38:22","slug":"20220802-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20220802-00\/?p=106927","title":{"rendered":"The AArch64 processor (aka arm64), part 6: Bitwise operations"},"content":{"rendered":"<p>Bitwise logical operations are not normally particularly exciting, but for AArch64, they get exciting not so much for the operations themselves but for the immediates they can encode.<\/p>\n<p>Let&#8217;s get the boring part out of the way.<\/p>\n<pre>    ; bitwise and with immediate\r\n    ; Rd = Rn &amp; imm\r\n    and     Rd\/sp, Rn\/zr, #imm\r\n\r\n    ; bitwise and with shifted register\r\n    ; Rd = Rn &amp; (Rm with shift)\r\n    and     Rd\/zr, Rn\/zr, Rm\/zr, shift\r\n\r\n    ; bitwise and with immediate, set flags\r\n    ; Rd = Rn &amp; #imm, set flags\r\n    ands    Rd\/zr, Rn\/zr, #imm\r\n\r\n    ; bitwise and with shifted register, set flags\r\n    ; Rd = Rn &amp; (Rm with shift), set flags\r\n    ands    Rd\/zr, Rn\/zr, Rm\/zr, shift\r\n\r\n    ; bitwise clear\r\n    ; Rd = Rn &amp; ~(Rm with shift)\r\n    bic     Rd\/zr, Rn\/zr, Rm\/zr, shift\r\n\r\n    ; bitwise clear, set flags\r\n    ; Rd = Rn &amp; ~(Rm with shift), set flags\r\n    bics    Rd\/zr, Rn\/zr, Rm\/zr, shift\r\n\r\n    ; bitwise or with immediate\r\n    ; Rd = Rn | imm\r\n    orr     Rd\/sp, Rn\/zr, #imm\r\n\r\n    ; bitwise or with shifted register\r\n    ; Rd = Rn | (Rm with shift)\r\n    orr     Rd\/zr, Rn\/zr, Rm\/zr, shift\r\n\r\n    ; bitwise or not with shifted register\r\n    ; Rd = Rn | ~(Rm with shift)\r\n    orn     Rd\/zr, Rn\/zr, Rm\/zr, shift\r\n\r\n    ; bitwise exclusive or with immediate\r\n    ; Rd = Rn ^ imm\r\n    eor     Rd\/sp, Rn\/zr, #imm\r\n\r\n    ; bitwise exclusive or with shifted register\r\n    ; 
Rd = Rn ^ (Rm with shift)\r\n    eor     Rd\/zr, Rn\/zr, Rm\/zr, shift\r\n\r\n    ; bitwise exclusive or not with shifted register\u00b9\r\n    ; Rd = Rn ^ ~(Rm with shift)\r\n    eon     Rd\/zr, Rn\/zr, Rm\/zr, shift\r\n<\/pre>\n<p>There are a lot of combinations here. Let&#8217;s put them in a table.<\/p>\n<table class=\"cp3\" style=\"border-collapse: collapse; text-align: center;\" border=\"1\" cellspacing=\"0\" cellpadding=\"3\">\n<tbody>\n<tr>\n<th rowspan=\"2\">Instruction<\/th>\n<th colspan=\"2\">Immediate<\/th>\n<th colspan=\"2\">Shifted register<\/th>\n<\/tr>\n<tr>\n<th>to Rd\/sp<br \/>\nno flags<\/th>\n<th>to Rd\/zr<br \/>\nwith flags<\/th>\n<th>to Rd\/zr<br \/>\nno flags<\/th>\n<th>to Rd\/zr<br \/>\nwith flags<\/th>\n<\/tr>\n<tr>\n<td><code>AND<\/code><\/td>\n<td>\u2022<\/td>\n<td>\u2022<\/td>\n<td>\u2022<\/td>\n<td>\u2022<\/td>\n<\/tr>\n<tr>\n<td><code>BIC<\/code><\/td>\n<td>&nbsp;<\/td>\n<td>&nbsp;<\/td>\n<td>\u2022<\/td>\n<td>\u2022<\/td>\n<\/tr>\n<tr>\n<td><code>ORR<\/code><\/td>\n<td>\u2022<\/td>\n<td>&nbsp;<\/td>\n<td>\u2022<\/td>\n<td>&nbsp;<\/td>\n<\/tr>\n<tr>\n<td><code>ORN<\/code><\/td>\n<td>&nbsp;<\/td>\n<td>&nbsp;<\/td>\n<td>\u2022<\/td>\n<td>&nbsp;<\/td>\n<\/tr>\n<tr>\n<td><code>EOR<\/code><\/td>\n<td>\u2022<\/td>\n<td>&nbsp;<\/td>\n<td>\u2022<\/td>\n<td>&nbsp;<\/td>\n<\/tr>\n<tr>\n<td><code>EON<\/code><\/td>\n<td>&nbsp;<\/td>\n<td>&nbsp;<\/td>\n<td>\u2022<\/td>\n<td>&nbsp;<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>For the instructions that set flags, the N and Z flags represent the result of the operation, and the C and V flags are cleared.\u00b2<\/p>\n<p>Stare at this table a bit and you start to see patterns.<\/p>\n<p>All of the bitwise operations support a shifted register, which could be a <code>LSL #0<\/code> to mean &#8220;no shift&#8221;. The operations that do not complement the second input operand support an immediate. 
(There&#8217;s no need to support an immediate for the complement versions, because you can just complement the immediate.) And the <code>AND<\/code>-like operations are the only ones which support flags. We&#8217;ll see workarounds for the lack of flags support in the other bitwise operations when we get to control transfer.<\/p>\n<p>With these instructions, we can create some pseudo-instructions:\u00b3<\/p>\n<pre>    tst     Rn\/zr, #imm             ; ands zr, Rn\/zr, #imm\r\n    tst     Rn\/zr, Rm\/zr, shift     ; ands zr, Rn\/zr, Rm\/zr, shift\r\n\r\n    mov     Rd, #imm                ; orr  Rd, zr, #imm\r\n    mov     Rd, Rn\/zr, shift        ; orr  Rd, zr, Rn\/zr, shift\r\n\r\n    mvn     Rd, Rn\/zr, shift        ; orn  Rd, zr, Rn\/zr, shift\r\n<\/pre>\n<p>The <code>TST<\/code> pseudo-instruction performs a bitwise <i>and<\/i> of its arguments and sets flags, but discards the result. It&#8217;s common to use a power-of-two immediate here, to test a specific bit.<\/p>\n<p>The <code>MOV<\/code> pseudo-instruction sets a register equal to the value of another register or a supported immediate.<\/p>\n<p>The <code>MVN<\/code> pseudo-instruction sets a register to the bitwise inverse of another register.<\/p>\n<p>Okay, so about those immediates.<\/p>\n<p>The bitwise operations encode the immediates in a very strange way. 
If that&#8217;s the sort of thing that interests you, I encourage you to read <a href=\"https:\/\/dinfuehr.github.io\/blog\/encoding-of-immediate-values-on-aarch64\/\"> Dominik Inf\u00fchr&#8217;s explanation of how they are formed<\/a> for the gory details.<\/p>\n<p>The short version is that the immediate can encode<\/p>\n<ul>\n<li>a 2-bit pattern repeated 32 times,<\/li>\n<li>a 4-bit pattern repeated 16 times,<\/li>\n<li>an 8-bit pattern repeated 8 times,<\/li>\n<li>a 16-bit pattern repeated 4 times,<\/li>\n<li>a 32-bit pattern repeated 2 times, or<\/li>\n<li>a 64-bit pattern repeated 1 time.<\/li>\n<\/ul>\n<p>The pattern consists of a bunch of right-justified 1&#8217;s, with leading bits filled with 0&#8217;s.<\/p>\n<p>Finally, after concatenating the copies of the pattern, you can rotate the whole thing to the right by any amount.<\/p>\n<p>For example, single bits are expressible in this format, because you can ask for a 64-bit pattern consisting of a single rightmost set bit, and then rotate that single bit into the position you like.<\/p>\n<p>Conversely, all bits set except one can be generated by asking for a 64-bit pattern consisting of 63 rightmost set bits (a single clear bit in position 63), and then rotating that 0 bit into the position you like.<\/p>\n<p>Interestingly, you cannot generate all ones or all zeroes with this pattern. Fortunately, you don&#8217;t need to. You can use <var>zr<\/var> for zero and the complement instruction with <var>zr<\/var> for ones. 
And operations with all ones or all zeroes can often be simplified to another instruction anyway, often avoiding a register dependency.<\/p>\n<table class=\"cp3\" style=\"border-collapse: collapse; text-align: center;\" border=\"1\" cellspacing=\"0\" cellpadding=\"3\">\n<tbody>\n<tr>\n<th>Missing instruction<\/th>\n<th>Replacement<\/th>\n<th>Note<\/th>\n<\/tr>\n<tr>\n<td><code>and Rd, Rn, #0<\/code><\/td>\n<td><code>mov Rd, #0<\/code><\/td>\n<td>AND with zero is zero<\/td>\n<\/tr>\n<tr>\n<td><code>and Rd, Rn, #-1<\/code><\/td>\n<td><code>mov Rd, Rn<\/code><\/td>\n<td>AND with -1 is unchanged<\/td>\n<\/tr>\n<tr>\n<td><code>orr Rd, Rn, #0<\/code><\/td>\n<td><code>mov Rd, Rn<\/code><\/td>\n<td>OR with zero is unchanged<\/td>\n<\/tr>\n<tr>\n<td><code>orr Rd, Rn, #-1<\/code><\/td>\n<td><code>orn Rd, zr, zr<\/code><\/td>\n<td>OR with -1 is -1<\/td>\n<\/tr>\n<tr>\n<td><code>eor Rd, Rn, #0<\/code><\/td>\n<td><code>mov Rd, Rn<\/code><\/td>\n<td>EOR with zero is unchanged<\/td>\n<\/tr>\n<tr>\n<td><code>eor Rd, Rn, #-1<\/code><\/td>\n<td><code>orn Rd, zr, Rn<\/code><\/td>\n<td>EOR with -1 is bitwise negation<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Okay, so that&#8217;s it for the bitwise logical operations. Next time, we&#8217;ll look at bit shifting.<\/p>\n<p>\u00b9 The <code>EON<\/code> instruction is new for AArch64. AArch32 does not have this opcode.<\/p>\n<p>\u00b2 AArch32 left C and V unchanged. 
My guess is that AArch64 forces both bits clear in order to avoid partial flags updates, which creates unintended dependencies among instructions.<\/p>\n<p>\u00b3 AArch64 lost the <code>TEQ<\/code> instruction from AArch32, which I noted was <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20210608-00\/?p=105290\"> of limited utility<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>And their very strange immediates.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[2],"class_list":["post-106927","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-history"],"acf":[],"blog_post_summary":"<p>And their very strange immediates.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/106927","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=106927"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/106927\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=106927"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=106927"},{"taxon
omy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=106927"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}