Recall that the PowerPC had the magical rlwinm
instruction which was the Swiss army knife of bit operations. Well, AArch64 has its own allpurpose instruction, known as UBFM
, which stands for unsigned bitfield move.
; unsigned bitfield move ; ; if immr ≤ imms: ; take bits immr through imms and rotate right by immr ; ; if immr > imms: ; take imms+1 low bits and rotate right by immr ubfm Rd/zr, Rn/zr, #immr, #imms
This instruction hurts my brain. Although the description of the instruction appears to be two unrelated cases, they are handled by the same complex formula internally. It’s just that the formula produces different results depending on which case you’re in. The complex formula is the same one that is used to generate immediates for logical operations, so I’ll give the processor designers credit for the clever way they reduced transistor count.
Fortunately, you never see this instruction in the wild. The two cases are split into separate pseudoinstructions, which reexpress the immr and imms values in a more intuitive way.
; unsigned bitfield extract ; (used when immr ≤ imms) ; extract w bits starting at position lsb ubfx Rd/zr, Rn/zr, #lsb, #w
The UBFX
instruction handles the case of UBFM
where immr ≤ imms and reinterprets it as a bitfield extraction:
w  lsb  


zerofill  
w 
Since immr ≤ imms, the rightrotation by immr is the same as a rightshift by immr.
And then we have the other case, where immr > imms:
; unsigned bitfield insert into zeroes ; (used when immr > imms) ; extract loworder w bits and shift left by lsb ubfiz Rd/zr, Rn/zr, #lsb, #w
The UBFIZ
instruction reinterprets the UBFM
as a bitfield insertion, and reinterprets the rightrotation as a leftshift. This reinterpretation is valid because immr > imms, so we are always rotating more bits than we extracted.
w  


zerofill  zerofill  
w  lsb 
There is also a signed version of this instruction:
; signed bitfield move ; ; if immr ≤ imms: ; take bits immr through imms and rotate right by immr ; signfill upper bits ; ; if immr > imms: ; take imms+1 low bits and rotate right by immr ; signfill upper bits sbfm Rd/zr, Rn/zr, #immr #imms
This behaves the same as the unsigned version, except that the upper bits are filled with the sign bit of the bitfield. Like UBFM
, the SBFM
instruction is also never seen in the wild; it is always replaced by a pseudoinstruction.
; signed bitfield extract ; (used when immr ≤ imms) ; extract w bits starting at position lsb ; signfill upper bits sbfx Rd/zr, Rn/zr, #lsb, #w ; signed bitfield insert into zeroes ; (used when immr > imms) ; extract loworder w bits and shift left by lsb ; signfill upper bits sbfiz Rd/zr, Rn/zr, #lsb, #w
Here is the operation of SBFX
in pictures:
w  lsb  
S  


signfill  ⇐⇐⇐  S  
w 
And here is SBFIZ
:
w  
S  


signfill ⇐⇐⇐  S  zerofill  
w  lsb 
Note that in the case of SBFIZ
, the lower bits are still zerofilled.
The last bitfield opcode is BFM
, which follows the same pattern, but just combines the results differently:
; bitfield move ; ; if immr ≤ imms: ; take bits immr through imms and rotate right by immr ; merge with existing bits in destination ; ; if immr > imms: ; take imms+1 low bits and rotate right by immr ; merge with existing bits in destination bfm Rd/zr, Rn/zr, #immr #imms
Again, you will never see this instruction in the wild because it always disassembles as a pseudoinstruction:
; bitfield extract and insert low ; (used when immr ≤ imms) ; replace bottom w bits in destination ; with w bits of source starting at lsb ; ; Rd[w1:0] = Rn[lsb+w1:lsb] ; bfxil Rd/zr, Rn/zr, #lsb, #w
The BFXIL
instruction is like the UBFX
and SBFX
instructions, but instead of filling the unused bits with zero or sign bits, the original bits of the destination are preserved.
w  lsb  


unchanged  
w 
; bitfield insert ; (used when immr > imms) ; replace w bits in destination starting at lsb ; with low w bits of source ; ; Rd[lsb+w1:lsb] = Rn[w1:0] ; bfi Rd/zr, Rn/zr, #lsb, #w
The BFI
instruction is like the UBFIZ
and SBFIZ
instructions, but instead of filling the unused bits with zero or sign bits, the original bits of the destination are preserved.
w  


unchanged  unchanged  
w  lsb 
; bitfield clear ; replace w bits in destination starting at lsb ; with zero ; ; Rd[lsb+w1:lsb] = 0 ; bfc Rd/zr, #lsb, #w ; bfi Rd/zr, zr, #lsb, #w
The BFC
instruction just inserts zeroes.
unchanged  zerofill  unchanged 
w  lsb 
The last instruction in the bitfield manipulation category is word/doubleword extraction.
; extract a register from a pair of registers ; ; Wd = ((Wn << 32)  Wm)[lsb+31:lsb] ; Xd = ((Xn << 64)  Xm)[lsb+63:lsb] ; extr Rd/zr, Rn/zr, Rm/zr, #lsb
The extract register instruction treats its inputs as a register pair and extracts a registersized stretch of bits from them. This can be used to synthesize multiword shifts.
size  shift  
Rn

Rm




Rd 
Note that the two input registers are concatenated in bigendian order.
It turns out that a lot of other operations can be reinterpreted as bitfield extractions. We’ll look at some of them next time.
Bonus chatter: AArch32 also had instructions bfi
, bfc
, ubfx
, and sbfx
, but each was treated as a unique instruction. AArch64 generalizes them to cover additional scenarios, leaving the classic instructions as special cases of the generalized instructions.
0 comments