{"id":96805,"date":"2017-08-11T07:00:00","date_gmt":"2017-08-11T21:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/?p=96805"},"modified":"2019-03-13T01:15:10","modified_gmt":"2019-03-13T08:15:10","slug":"20170811-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20170811-00\/?p=96805","title":{"rendered":"The Alpha AXP, part 5: Conditional operations and control flow"},"content":{"rendered":"<p>The Alpha AXP has no flags register. Conditional operations are performed based on the current value of a general-purpose register. The conditions available on the Alpha AXP are the following: <\/p>\n<table BORDER=\"0\" CELLSPACING=\"0\">\n<tr>\n<td><code>EQ<\/code>&nbsp;<\/td>\n<td>if zero<\/td>\n<\/tr>\n<tr>\n<td><code>NE<\/code>&nbsp;<\/td>\n<td>if not zero<\/td>\n<\/tr>\n<tr>\n<td><code>GE<\/code>&nbsp;<\/td>\n<td>if signed greater than or equal to zero<\/td>\n<\/tr>\n<tr>\n<td><code>GT<\/code>&nbsp;<\/td>\n<td>if signed greater than zero<\/td>\n<\/tr>\n<tr>\n<td><code>LE<\/code>&nbsp;<\/td>\n<td>if signed less than or equal to zero<\/td>\n<\/tr>\n<tr>\n<td><code>LT<\/code>&nbsp;<\/td>\n<td>if signed less than zero<\/td>\n<\/tr>\n<tr>\n<td><code>LBC<\/code>&nbsp;<\/td>\n<td>if low bit clear (if even)<\/td>\n<\/tr>\n<tr>\n<td><code>LBS<\/code>&nbsp;<\/td>\n<td>if low bit set (if odd)<\/td>\n<\/tr>\n<\/table>\n<p>In the discussion below, the abbreviation <code><u>cc<\/u><\/code> represents one of the above condition codes. <\/p>\n<p>The conditional move instructions test a source register against a condition, and if the condition is true, the destination register receives the second source. <\/p>\n<pre>\n    CMOV<u>cc<\/u>  Ra, Rb\/#b, Rc   ; if Ra meets condition, then Rc = Rb\/#b\n<\/pre>\n<p>You can also generate booleans from conditions. Note that the set of conditions here is not the same as the standard set of conditions above! 
<\/p>\n<pre>\n    CMPEQ   Ra, Rb\/#b, Rc   ; Rc = (Ra == Rb\/#b)\n    CMPLT   Ra, Rb\/#b, Rc   ; Rc = (Ra &lt; Rb\/#b) signed comparison\n    CMPLE   Ra, Rb\/#b, Rc   ; Rc = (Ra &le; Rb\/#b) signed comparison\n    CMPULT  Ra, Rb\/#b, Rc   ; Rc = (Ra &lt; Rb\/#b) unsigned comparison\n    CMPULE  Ra, Rb\/#b, Rc   ; Rc = (Ra &le; Rb\/#b) unsigned comparison\n<\/pre>\n<p>These comparison operators produce values of exactly 0 or 1, according to the result of the comparison, and the comparison is against the full 64-bit register value. <\/p>\n<p>Conditional jump instructions provide a condition and a register, as well as a jump target. <\/p>\n<pre>\n    B<u>cc<\/u>     Ra, destination\n<\/pre>\n<p>where <code><u>cc<\/u><\/code> is one of the condition codes above. The instruction tests the specified register against the condition, and if true, control is transferred to the destination. The test is against the full 64-bit register value, and the destination is encoded as a signed 21-bit value, in units of instructions (4 bytes), which provides a reach of &plusmn;4MB. <\/p>\n<p>Conditional branches backward are predicted taken. Conditional branches forward are predicted not taken. <\/p>\n<p>There are two types of unconditional branches. They are functionally the same but have different consequences for the return address predictor. <\/p>\n<pre>\n    BR      Ra, destination ; not expected to return\n    BSR     Ra, destination ; expected to return\n<\/pre>\n<p>These instructions store the address of the subsequent instruction (the return address) in the <var>Ra<\/var> register and then transfer to the destination. The <code>BR<\/code> instruction does not push the return address onto the return address predictor stack; the <code>BSR<\/code> instruction does. <\/p>\n<p>The <code>BR<\/code> instruction is typically used with <var>zero<\/var> as the register to receive the return address, since the value is almost always thrown away. 
(Recall that there is a special exemption for branch instructions to the usual rule that instructions which write to <var>zero<\/var> can be optimized away.) <\/p>\n<p>The Win32 calling convention dictates that the <var>ra<\/var> register holds the return address on entry to a function. <\/p>\n<p>There are four indirect jump instructions which are all functionally equivalent but differ in their effect on the return address predictor. <\/p>\n<pre>\n    JMP     Ra, (Rb), hint16    ; not expected to return\n    JSR     Ra, (Rb), hint16    ; expected to return\n    RET     Ra, (Rb), hint16    ; end of function\n    JSR_CO  Ra, (Rb), hint16    ; coroutine\n<\/pre>\n<p>The <var>Ra<\/var> register receives the return address, typically <var>zero<\/var> in the case of <code>JMP<\/code> and <code>RET<\/code>, and conventionally <var>ra<\/var> in the case of <code>JSR<\/code>. As you have probably guessed, <code>JMP<\/code> has no effect on the return address predictor, <code>JSR<\/code> pushes the return address onto the predictor stack, and <code>RET<\/code> pops the return address off of the predictor stack and predicts a transfer to the popped value. The weird guy is <code>JSR_CO<\/code>, which replaces the return address at the top of the predictor stack with the new return address and predicts a transfer to the old value. <\/p>\n<p>The official name of <code>JSR_CO<\/code> is <code>JSR_<\/code><code>COROUTINE<\/code>, but it doesn&#8217;t really matter because I have never seen <code>JSR_CO<\/code> in practice. <\/p>\n<p>For the <code>JMP<\/code> and <code>JSR<\/code> instructions, the &#8220;hint&#8221; is a static prediction of the low 16 bits of the value in <var>Rb<\/var>. <\/p>\n<p>The <code>RET<\/code> and <code>JSR_CO<\/code> instructions don&#8217;t need a hint because they have their own return address predictor. However, DEC recommends that the hint for a <code>RET<\/code> instruction be 1 for a return from a procedure, and 0 otherwise. 
We&#8217;ll see more about this another day. <\/p>\n<p>The Microsoft compiler doesn&#8217;t generate hints; it just sets the hint to zero. Profile-guided optimization didn&#8217;t come to Visual C++ until after support for the Alpha AXP was dropped, but if it were still in support, I&#8217;m assuming that profile-guided optimization would have filled in the hint. <\/p>\n<p>Non-virtual calls generally look like this: <\/p>\n<pre>\n    ; Put the parameters in a0 through a5\n    ; by whatever means appropriate.\n    ; Excess parameters go on the stack.\n    ; (Not shown here.)\n    BIS     zero, s1, a0    ; copied from another register\n    LDL     a1, 32(sp)      ; loaded from memory\n    ADDL    zero, #1, a2    ; calculated in place\n\n    BSR     ra, destination ; call the other function\n    ; result is in the v0 register\n<\/pre>\n<p>Virtual calls load the destination from the target&#8217;s vtable: <\/p>\n<pre>\n    ; Put the parameters in a0 through a5\n    ; by whatever means appropriate.\n    ; Excess parameters go on the stack.\n    ; (Not shown here.)\n    ; \"this\" goes into a0.\n    BIS     zero, s1, a0    ; copied from another register\n    LDL     a1, 32(sp)      ; loaded from memory\n    ADDL    zero, #1, a2    ; calculated in place\n\n    LDL     t0, (a0)        ; load vtable\n    LDL     t0, 8(t0)       ; load function from vtable\n    JSR     ra, (t0)        ; call the function pointer\n    ; result is in the v0 register\n<\/pre>\n<p>Calls to exported functions are indirect through a global variable, which means we need to get the address of that global. 
<\/p>\n<pre>\n    ; Put the parameters in a0 through a5\n    ; by whatever means appropriate.\n    ; Excess parameters go on the stack.\n    ; (Not shown here.)\n    BIS     zero, s1, a0    ; copied from another register\n    LDL     a1, 32(sp)      ; loaded from memory\n    ADDL    zero, #1, a2    ; calculated in place\n\n    LDAH    t0, xxxx(zero)  ; 64KB block where global variable resides\n    LDL     t0, yyyy(t0)    ; load the global variable\n    JSR     ra, (t0)        ; call the function pointer\n    ; result is in the v0 register\n<\/pre>\n<p>The above examples use the <code>LDL<\/code> instruction, which loads a register from memory. We&#8217;ll learn more about memory access next time. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>But there is no flags register.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[26],"class_list":["post-96805","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-other"],"acf":[],"blog_post_summary":"<p>But there is no flags 
register.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/96805","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=96805"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/96805\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=96805"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=96805"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=96805"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}