September 4th, 2023

Just for fun: What happens when you shift a register by more than the register size?

Just for fun, let’s compare what happens on different processor architectures when you shift a register by more than the register size.

Processor Register size Behavior
8086 Any mod 256 (Note 1)
Alpha AXP 64 mod  64
80386 Any mod  32
x86-64 ≤ 32 mod  32
64 mod  64
Intel ia64 64 full value
MIPS 32 mod  32
64 mod  64
PowerPC 32 mod  64
64 mod 128
SH-4 32 mod  32 + sign (Note 2)
ARM (Thumb-2) Any mod 256
ARM (AArch 64) 32 mod  32
64 mod  64
68000 32 mod  64
SPARC 32 mod  32
RISC-V 32 mod  32
64 mod  64

Notes:

  1. On the 8086, the shift amount is given by the 8-bit cl register. The running time of the instruction is proportional to the number of bits shifted, and the processor does not optimize shifts that are larger than the register size, so if you ask to shift by 255 places, it will run a loop 255 times.
  2. On the SH-4, the sign of the shift amount controls the direction of the shift, and the lower 5 bits control how far to shift.

I’ve added some architectures that are of historic or current interest. Do not assume that the presence of an architecture on this list implies that I will someday cover it, and don’t assume that omission of your favorite architecture means that I never will.

From the above table, you can sort of come up with a taxonomy of shifts.

  Unsigned Signed
mod register size Alpha AXP, x86-32, x86-64,
MIPS, AArch64, SPARC, RISC-V
SH-4
mod 2 × register size PowerPC, 68000  
mod 256 8086, Thumb-2  
full value ia64  

For x86-32, I’m kind of cheating and ignoring the registers smaller than 32 bits.

Bonus chatter: The wide variety of behavior when shifting by more than the register size is one of the reasons why the C and C++ languages leave undefined what happens when you shift by more than the bit width of the shifted type.

Topics
History

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

9 comments

Discussion is closed. Login to edit/delete existing comments.

Newest
Newest
Popular
Oldest
  • alighieri dante

    “The wide variety of behavior when shifting by more than the register size is one of the reasons why the C and C++ languages leave undefined what happens when you shift by more than the bit width of the shifted type.”

    How many of the processors you mention existed when C was invented? Or even C++?

  • Johan Hanson · Edited

    “Register size” for Motorola 68000 should say “≤ 32”, as the amount is masked to six bits also when shifting only the low 8 or 16 bits.

    If you really want to go down this rabbit hole, you could compare shifts of GPR/data registers to shifts of SIMD lanes …
    On several architectures, those are different than integer code. x86-64 even has different behaviours for instructions from SSE, AVX2 and AVX-512, and ARM has different behaviours in NEON and SVE.

  • Dmitry

    So, what’s the difference between 8086/8088 and Itanium? AFAICT, both do not mask the shift counts. The only difference I can see is that for 8086/8088 the count is limited to 8 bits by the instruction formats while Itanium lets one specify larger values while still limiting (not MODding) them, at least for parallel shifts.

    I’m not an Itanium expert (just looked through the docs before writing this). I guess it would be nice to have some clarification on what was really meant by ”full value”.

    • Simon Farnsworth

      It’s a matter of perspective, I think.

      With Itanium, if I load 32,769 into a register, and then shift by that register, the shift behaves exactly as if it did the shift by the full 32,769 and then took the lowest 64 bits – it results in a 0 in the register when shifting left, and either all ones or a 0 when shifting right.

      With 8086, when I load the value I want to shift by into the CX register, CL is implicitly set to CX mod 256. There’s two interpretations you can take for shifts as a result; either it’s shift by the full CL register, or it’s take CX mod 256 (the CL register) and shift by that. The latter is what Raymond’s used and it’s closer to how a C programmer would think of the shift, the former is more what an 8086 assembly programmer would think of.

  • Falcon

    Since you have the 8086 and 68000 in the list, you could also mention that the 80286, with 16-bit registers, used “mod 32” for the shift count as well (it was actually the 80186/80188 that introduced the masking). The 80386 retained that behaviour even for 32-bit shifts.

    I’m guessing that Intel chose to mask the shift count to 5 bits instead of 4 for compatibility with existing 8086/8088 code, so that programs which shifted values by e.g. 16 bits would still get the same results. The 80386 did not face the same constraint for 32-bit code (which wouldn’t run on earlier processors).

    • Simon Farnsworth

      Asking out of hope more than anything else, but do you happen to have any historical data for which month the 80186/80188 came out? The 80286 came out in February 1982, and is well-documented as such because of its later use in the IBM PC/AT, but the 80186 seems to just be documented as coming out in 1982.

      I ask because as far as I can tell, the 80286 and 80186 have near-identical real mode instruction sets; the difference seems to be that the 80286 has protected mode, where the 80186 has internal peripherals instead, and I’m curious whether someone at the time of release of the 80186 would perceive it as an 80286 cut down for embedded applications, or as an enhanced 8086 for embedded applications. If the release dates were very close together, then the latter interpretation is more likely, while if the 80186 is end 1982, then the former interpretation is more likely.

      • Falcon

        I don’t have any more detailed info. The Wikipedia article, for instance, simply describes the two processors as contemporary.

        Given the lack of protected mode and extended address space, the “enhanced 8086” perspective would probably be more accurate.

      • Simon Farnsworth

        From today’s perspective, the 80186 is absolutely an “enhanced 8086”; I’m just wondering if that would have been obvious in 1982, or if 1982 engineers might have perceived the 80186 as a cut-down 80286 due to timing of chip releases.

        It’s fairly obvious that the ‘186 and ‘286 were at least planned concurrently, since their real mode operation is almost identical (some of the instructions that are necessary for protected mode are also useful in real mode, but not present in the ‘186); I’d like a sense of how that near-identical ISA appeared in the 1980s.

      • Jim Moores

        Can’t speak for engineers, but I actually used one of the few 186 based PCs quite a bit, the Research Machines Nimbus PC-186. It was definitely considered an enhanced 8086 (and barely to be honest) to us rather than a cut down 80286. The machine wasn’t actually fully PC compatible and you needed an add-in board and/or a TSR to increase the application compatibility. It was an education focused PC with quite decent graphics for the time (probably similar to Tandy 1000 range) and had custom versions of Windows 1.0 and 2.0, MS Word (for DOS with proper italic/bold/underline support in text mode), Multiplan, etc. that used the better graphics modes.

Feedback