{"id":105317,"date":"2021-06-17T07:00:00","date_gmt":"2021-06-17T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=105317"},"modified":"2021-06-16T17:03:32","modified_gmt":"2021-06-17T00:03:32","slug":"20210617-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20210617-00\/?p=105317","title":{"rendered":"The ARM processor (Thumb-2), part 14: Manipulating flags"},"content":{"rendered":"<p>There are two instructions for accessing the flags register directly.<\/p>\n<pre>    ; move register from special register\r\n    mrs     Rd, apsr        ; Rd = APSR\r\n\r\n    ; move special register from register\r\n    msr     apsr, Rd        ; APSR = Rd\r\n<\/pre>\n<p>These instructions are for accessing special registers, but the only special register available to user mode is <code>APSR<\/code>, so that&#8217;s all you&#8217;re going to see, if you even see this at all.<\/p>\n<p>The format of the Application Program Status Register (APSR) is as follows:<\/p>\n<table class=\"cp3\" style=\"border-collapse: collapse; text-align: center;\" title=\"A 32-bit bitfield with N in bit 31, Z in bit 30, C in bit 29, V in bit 28, Q in bit 27, and GE in bits 16 through 19. 
The other bits are unlabeled.\" border=\"1\" cellspacing=\"0\" cellpadding=\"3\">\n<tbody>\n<tr style=\"font-size: 75%;\">\n<td>3<br \/>\n1<\/td>\n<td>3<br \/>\n0<\/td>\n<td>2<br \/>\n9<\/td>\n<td>2<br \/>\n8<\/td>\n<td>2<br \/>\n7<\/td>\n<td>2<br \/>\n6<\/td>\n<td>2<br \/>\n5<\/td>\n<td>2<br \/>\n4<\/td>\n<td>2<br \/>\n3<\/td>\n<td>2<br \/>\n2<\/td>\n<td>2<br \/>\n1<\/td>\n<td>2<br \/>\n0<\/td>\n<td>1<br \/>\n9<\/td>\n<td>1<br \/>\n8<\/td>\n<td>1<br \/>\n7<\/td>\n<td>1<br \/>\n6<\/td>\n<td>1<br \/>\n5<\/td>\n<td>1<br \/>\n4<\/td>\n<td>1<br \/>\n3<\/td>\n<td>1<br \/>\n2<\/td>\n<td>1<br \/>\n1<\/td>\n<td>1<br \/>\n0<\/td>\n<td>\n9<\/td>\n<td>\n8<\/td>\n<td>\n7<\/td>\n<td>\n6<\/td>\n<td>\n5<\/td>\n<td>\n4<\/td>\n<td>\n3<\/td>\n<td>\n2<\/td>\n<td>\n1<\/td>\n<td>\n0<\/td>\n<\/tr>\n<tr>\n<td>N<\/td>\n<td>Z<\/td>\n<td>C<\/td>\n<td>V<\/td>\n<td>Q<\/td>\n<td colspan=\"7\" bgcolor=\"#c0c0c0\">\u00a0<\/td>\n<td colspan=\"4\">GE[3:0]<\/td>\n<td colspan=\"16\" bgcolor=\"#c0c0c0\">\u00a0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The N, Z, C, and V flags are updated by arithmetic operations. The GE flags are updated by SIMD operations. The Q flag is different: It is set when a saturating arithmetic operation overflows, and the only way to clear it is to issue an <code>MSR<\/code> instruction.<\/p>\n<p>In user mode, the unlabeled bits of the APSR read as zero, and any attempts to modify them are ignored.<\/p>\n<p>The odd placement of the four main numeric flags dates back to the first revision of the ARM processor.<\/p>\n<p>The original ARM processor supported only 26-bit addresses, for a total address space of 64MB, and all instructions had to begin on a four-byte boundary. 
The unused bits of the <var>pc<\/var> register <a href=\"http:\/\/www.peter-cockerell.net\/aalp\/html\/ch-2.html\"> were repurposed to hold the flags<\/a>!<\/p>\n<table class=\"cp3\" style=\"border-collapse: collapse; text-align: center;\" title=\"A 32-bit bitfield with N, Z, C, and V in the upper four bits, and bits 2 through 25 containing the program counter. The other bits are unlabeled.\" border=\"1\" cellspacing=\"0\" cellpadding=\"3\">\n<tbody>\n<tr style=\"font-size: 75%;\">\n<td>3<br \/>\n1<\/td>\n<td>3<br \/>\n0<\/td>\n<td>2<br \/>\n9<\/td>\n<td>2<br \/>\n8<\/td>\n<td>2<br \/>\n7<\/td>\n<td>2<br \/>\n6<\/td>\n<td>2<br \/>\n5<\/td>\n<td>2<br \/>\n4<\/td>\n<td>2<br \/>\n3<\/td>\n<td>2<br \/>\n2<\/td>\n<td>2<br \/>\n1<\/td>\n<td>2<br \/>\n0<\/td>\n<td>1<br \/>\n9<\/td>\n<td>1<br \/>\n8<\/td>\n<td>1<br \/>\n7<\/td>\n<td>1<br \/>\n6<\/td>\n<td>1<br \/>\n5<\/td>\n<td>1<br \/>\n4<\/td>\n<td>1<br \/>\n3<\/td>\n<td>1<br \/>\n2<\/td>\n<td>1<br \/>\n1<\/td>\n<td>1<br \/>\n0<\/td>\n<td>\n9<\/td>\n<td>\n8<\/td>\n<td>\n7<\/td>\n<td>\n6<\/td>\n<td>\n5<\/td>\n<td>\n4<\/td>\n<td>\n3<\/td>\n<td>\n2<\/td>\n<td>\n1<\/td>\n<td>\n0<\/td>\n<\/tr>\n<tr>\n<td>N<\/td>\n<td>Z<\/td>\n<td>C<\/td>\n<td>V<\/td>\n<td colspan=\"2\" bgcolor=\"#c0c0c0\">\u00a0<\/td>\n<td colspan=\"24\">program counter<\/td>\n<td colspan=\"2\" bgcolor=\"#c0c0c0\">\u00a0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The unlabeled bits are used only in kernel mode: In user mode, they read as zero and writes are ignored. The <code>Q<\/code> and <code>GE<\/code> flags had not been invented yet, so the only user-mode flags are N, Z, C, and V.<\/p>\n<p>You can think of the flag bits as stowaways hiding inside the unused bits of the program counter register. 
If <var>pc<\/var> was used as the first source parameter in a binary operation, all the extraneous non-program-counter bits were masked off, allowing you to perform <var>pc<\/var>-relative addressing and <var>pc<\/var>-based arithmetic.\u00b9 In other contexts, however, the full 32-bit value of <var>pc<\/var> was used, flags and all.<\/p>\n<p>When support expanded to a full 32-bit address space in ARM 3(?), those flag bits had to move to the APSR, but to facilitate porting, their bit positions were preserved.<\/p>\n<p>There are no dedicated instructions for manipulating specific flags. If you want to, say, set the carry flag and leave all other flags unchanged, you&#8217;ll have to copy the APSR to a general-purpose register, set the carry bit, and then write it back.<\/p>\n<p>If you don&#8217;t mind corrupting the other flags, then you can use some tricks to coerce a particular flag to a specific state.<\/p>\n<pre>    ; compare a number with itself\r\n    cmp     r0, r0      ; sets N = 0, Z = 1, C = 1, V = 0\r\n<\/pre>\n<p>Comparing a number with itself sets flags according to the result of the subtraction, which produces zero. 
Therefore, the flags are set for nonnegative, zero, carry set (no underflow), and no overflow.<\/p>\n<p>To clear carry, you can add zero:<\/p>\n<pre>    adds    r0, r0, #0\r\n<\/pre>\n<p>Adding zero will never cause unsigned overflow, so this leaves carry clear.<\/p>\n<p>Alternatively, if you don&#8217;t want to create a false write dependency on <var>r0<\/var>, you could use<\/p>\n<pre>    ; add 0 and set flags, but discard result\r\n    cmn     r0, #0\r\n<\/pre>\n<p>This takes advantage of <a title=\"The ARM processor (Thumb-2), part 6: The lie hiding inside the CMN instruction\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20210607-00\/?p=105288\"> the lie hiding inside the <code>CMN<\/code> instruction<\/a> that causes <code>CMN Rd, #0<\/code> to clear carry when it really should have set it.<\/p>\n<p>If you want to force a nonnegative, zero result without affecting carry or overflow, you can use the otherwise-neglected <code>TEQ<\/code> instruction:<\/p>\n<pre>    ; test a number for equivalence with itself\r\n    teq     r0, r0      ; sets N = 0, Z = 1, C and V unchanged\r\n<\/pre>\n<p>To force a nonzero result, you can compare the stack pointer against an odd number, since Thumb-2 does not permit the stack pointer to be odd.<\/p>\n<pre>    cmp     sp, #1      ; force nonzero result\r\n<\/pre>\n<p>You can&#8217;t use <var>pc<\/var> for this trick because Thumb-2 does not allow the <var>pc<\/var> register to be used by a <code>CMP<\/code> instruction.<\/p>\n<p>I couldn&#8217;t think of a single-instruction way to force the negative or overflow bit to be set without modifying any integer registers. Maybe you can come up with something.\u00b2<\/p>\n<p>Okay, so the second half of this article was mostly just code golf. 
Next time, we&#8217;ll return to reality by looking at a few miscellaneous instructions.<\/p>\n<p>\u00b9 This explains <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20210608-00\/?p=105290#comment-138049\"> the comment from Neil Rashbrook<\/a> that you could use <code>TEQ<\/code> to copy the sign bit from a register into the <var>N<\/var> flag:<\/p>\n<pre>    teq     pc, Rn      ; set flags according to (pc &amp; 0x03FFFFFC) ^ Rn\r\n<\/pre>\n<p>Masking out the flag bits from the left-hand side (<var>pc<\/var>) means that the high bit is always clear. Exclusive-or with zero has no effect, so the tested value has the same high bit as <var>Rn<\/var>, which then becomes the <var>N<\/var> flag. This trick stopped working in ARM3, when the flags moved to a separate special register.<\/p>\n<p>\u00b2 I considered taking advantage of the fact that in Thumb-2 mode, the bottom bit of <var>pc<\/var> is always set, but the bit shifting and bit extraction instructions disallow <var>pc<\/var> as a source (or destination).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Reaching in and flipping the switches.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[2],"class_list":["post-105317","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-history"],"acf":[],"blog_post_summary":"<p>Reaching in and flipping the 
switches.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/105317","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=105317"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/105317\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=105317"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=105317"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=105317"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}