The Alpha AXP: Part 12: How do you detect carry on a processor with no carry?
The Alpha AXP has no corresponding trap variant for arithmetic carry. So how would you detect carry?¹
Answer: The same way you detect carry in C, or pretty much any other programming language that doesn’t support carry.
To detect carry during addition, you check whether the sum is less than either addend. If the sum is less than one addend, then it will also be less than the other addend, so use whichever addend is most convenient.
    ; Rc = Ra + Rb, with Rd receiving carry
    ; Assumes Rc is not the same as Ra
    ADDx    Ra, Rb, Rc      ; Rc = Ra + Rb
    CMPULT  Rc, Ra, Rd      ; Rd = carry (sum < Ra)

    ; Rc = Ra + Rb, with Rd receiving carry
    ; Assumes Rc is not the same as Rb
    ADDx    Ra, Rb, Rc      ; Rc = Ra + Rb
    CMPULT  Rc, Rb, Rd      ; Rd = carry (sum < Rb)

    ; Rc = Rc + Rc, with Rd receiving carry
    ; Assumes Rd is distinct from Rc
    BIS     zero, Rc, Rd    ; Rd = Rc
    ADDx    Rc, Rc, Rc      ; Rc = Rc + Rc
    CMPULT  Rc, Rd, Rd      ; Rd = carry (sum < original Rc)
The last case is where the output overwrites both inputs, so we have to stash one of the inputs in Rd so we can compare it to the result afterwards.
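Since the trick is the same one you would use in C, here is a minimal C sketch of it. The helper name `add_with_carry` is made up for illustration, and the sketch assumes a 64-bit unsigned add. The reason either addend works: if a + b wraps, the stored sum is a + b − 2⁶⁴, which is less than a because b is less than 2⁶⁴, and by symmetry it is also less than b.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: 64-bit unsigned add that also reports carry-out.
   Unsigned arithmetic in C wraps modulo 2^64, so the sum is smaller
   than an addend exactly when a carry occurred. */
static uint64_t add_with_carry(uint64_t a, uint64_t b, bool *carry)
{
    uint64_t sum = a + b;   /* the ADDQ step */
    *carry = sum < a;       /* the unsigned compare; sum < b works equally well */
    return sum;
}
```

Note that the C version keeps `a` alive in its own variable, so it never runs into the "output overwrites both inputs" problem. The compiler worries about register allocation for you.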
To detect borrow during subtraction, you check whether the subtrahend is greater than the minuend.
    ; Rc = Ra - Rb, with Rd receiving borrow
    ; Assumes Rd is distinct from both inputs
    CMPULT  Ra, Rb, Rd      ; Rd = borrow
    SUBx    Ra, Rb, Rc      ; Rc = Ra - Rb
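The borrow check translates to C just as directly. This is only a sketch: `sub_with_borrow` is a hypothetical name, and it assumes 64-bit unsigned operands.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: 64-bit unsigned subtract that also reports borrow.
   A borrow occurs exactly when the minuend is smaller than the subtrahend. */
static uint64_t sub_with_borrow(uint64_t a, uint64_t b, bool *borrow)
{
    *borrow = a < b;   /* depends only on the inputs, not on the difference */
    return a - b;      /* wraps modulo 2^64 when a borrow occurs */
}
```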
To detect carry during multiplication, you capture the upper bits of the extended result.
    ; Rc = Ra *U Rb, with Rd receiving carry; 32-bit multiply
    ZAPNOT  Ra, #15, Ra     ; zero-extend Ra from 32 to 64 bits
    ZAPNOT  Rb, #15, Rb     ; zero-extend Rb from 32 to 64 bits
    MULQ    Ra, Rb, Rc      ; Rc = Ra *U Rb (64-bit multiply)
    SRA     Rc, #32, Rd     ; Rd = excess to carry forward
    ADDL    Rc, zero, Rc    ; Convert Rc to canonical form

    ; Rc = Ra *U Rb, with Rd receiving carry; 64-bit multiply
    ; Assumes Rd is distinct from both inputs
    UMULH   Ra, Rb, Rd      ; Rd = excess to carry forward
    MULQ    Ra, Rb, Rc      ; Rc = Ra *U Rb (64-bit multiply)
In the subtraction and multiplication sequences above, you can elide the final instruction if Rd is identical to Rc. (In other words, if you care only about the carry and not the arithmetic result.)
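For comparison, the 32-bit multiply can be sketched in C by widening to 64 bits and splitting the product. The name `mul32_with_carry` is hypothetical. I show only the 32-bit case because portable C has no guaranteed 128-bit integer type, so the 64-bit case typically needs a compiler extension or intrinsic.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit unsigned multiply that returns the low
   32 bits of the product and stores the upper 32 bits (the excess to
   carry forward) in *high. */
static uint32_t mul32_with_carry(uint32_t a, uint32_t b, uint32_t *high)
{
    uint64_t product = (uint64_t)a * b;   /* zero-extend, then multiply as 64-bit */
    *high = (uint32_t)(product >> 32);    /* upper half of the extended result */
    return (uint32_t)product;             /* lower half is the 32-bit result */
}
```

A nonzero `*high` means the product did not fit in 32 bits.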
Exercise: Why did I sometimes calculate Rd early and sometimes late?
Exercise 2: Why didn’t I have to convert Rd to canonical form at the end of the 32-bit multiply?
¹ The Itanium processor also doesn’t have a flags register, but nobody seemed to be upset that it didn’t provide a way to detect arithmetic carry or overflow.