Just for fun: What happens when you shift a register by more than the register size?

Raymond Chen

Just for fun, let’s compare what happens on different processor architectures when you shift a register by more than the register size.

Processor	Register size	Behavior
8086	Any	mod 256 (Note 1)
Alpha AXP	64	mod 64
80386	Any	mod 32
x86-64	≤ 32	mod 32
x86-64	64	mod 64
Intel ia64	64	full value
MIPS	32	mod 32
MIPS	64	mod 64
PowerPC	32	mod 64
PowerPC	64	mod 128
SH-4	32	mod 32 + sign (Note 2)
ARM (Thumb-2)	Any	mod 256
ARM (AArch 64)	32	mod 32
ARM (AArch 64)	64	mod 64
68000	32	mod 64
SPARC	32	mod 32
RISC-V	32	mod 32
RISC-V	64	mod 64

Notes:

On the 8086, the shift amount is given by the 8-bit cl register. The running time of the instruction is proportional to the number of bits shifted, and the processor does not optimize shifts that are larger than the register size, so if you ask to shift by 255 places, it will run a loop 255 times.
On the SH-4, the sign of the shift amount controls the direction of the shift, and the lower 5 bits control how far to shift.

I’ve added some architectures that are of historic or current interest. Do not assume that the presence of an architecture on this list implies that I will someday cover it, and don’t assume that omission of your favorite architecture means that I never will.

From the above table, you can sort of come up with a taxonomy of shifts.

	Unsigned	Signed
mod register size	Alpha AXP, x86-32, x86-64, MIPS, AArch64, SPARC, RISC-V	SH-4
mod 2 × register size	PowerPC, 68000
mod 256	8086, Thumb-2
full value	ia64

For x86-32, I’m kind of cheating and ignoring the registers smaller than 32 bits.

Bonus chatter: The wide variety of behavior when shifting by more than the register size is one of the reasons why the C and C++ languages leave undefined what happens when you shift by more than the bit width of the shifted type.

Author

Raymond Chen

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

9 comments

Discussion is closed. Login to edit/delete existing comments.

alighieri dante September 10, 2023

“The wide variety of behavior when shifting by more than the register size is one of the reasons why the C and C++ languages leave undefined what happens when you shift by more than the bit width of the shifted type.”

How many of the processors you mention existed when C was invented? Or even C++?
Johan Hanson September 9, 2023 · Edited

“Register size” for Motorola 68000 should say “≤ 32”, as the amount is masked to six bits also when shifting only the low 8 or 16 bits.

If you really want to go down this rabbit hole, you could compare shifts of GPR/data registers to shifts of SIMD lanes …
On several architectures, those are different than integer code. x86-64 even has different behaviours for instructions from SSE, AVX2 and AVX-512, and ARM has different behaviours in NEON and SVE.
Dmitry September 4, 2023

So, what’s the difference between 8086/8088 and Itanium? AFAICT, both do not mask the shift counts. The only difference I can see is that for 8086/8088 the count is limited to 8 bits by the instruction formats while Itanium lets one specify larger values while still limiting (not MODding) them, at least for parallel shifts.

I’m not an Itanium expert (just looked through the docs before writing this). I guess it would be nice to have some clarification on what was really meant by ”full value”.
- Simon Farnsworth September 5, 2023
  
  It's a matter of perspective, I think.
  
  With Itanium, if I load 32,769 into a register, and then shift by that register, the shift behaves exactly as if it did the shift by the full 32,769 and then took the lowest 64 bits - it results in a 0 in the register when shifting left, and either all ones or a 0 when shifting right.
  
  With 8086, when I load the value I want to shift by into the CX register, CL is implicitly set to CX mod 256. There's two interpretations you can take for shifts as a result; either it's...
  Read more
  It’s a matter of perspective, I think.
  
  With Itanium, if I load 32,769 into a register, and then shift by that register, the shift behaves exactly as if it did the shift by the full 32,769 and then took the lowest 64 bits – it results in a 0 in the register when shifting left, and either all ones or a 0 when shifting right.
  
  With 8086, when I load the value I want to shift by into the CX register, CL is implicitly set to CX mod 256. There’s two interpretations you can take for shifts as a result; either it’s shift by the full CL register, or it’s take CX mod 256 (the CL register) and shift by that. The latter is what Raymond’s used and it’s closer to how a C programmer would think of the shift, the former is more what an 8086 assembly programmer would think of.
  
  Read less
Falcon September 4, 2023

Since you have the 8086 and 68000 in the list, you could also mention that the 80286, with 16-bit registers, used “mod 32” for the shift count as well (it was actually the 80186/80188 that introduced the masking). The 80386 retained that behaviour even for 32-bit shifts.

I’m guessing that Intel chose to mask the shift count to 5 bits instead of 4 for compatibility with existing 8086/8088 code, so that programs which shifted values by e.g. 16 bits would still get the same results. The 80386 did not face the same constraint for 32-bit code (which wouldn’t run on earlier processors).
- Simon Farnsworth September 5, 2023
  
  Asking out of hope more than anything else, but do you happen to have any historical data for which month the 80186/80188 came out? The 80286 came out in February 1982, and is well-documented as such because of its later use in the IBM PC/AT, but the 80186 seems to just be documented as coming out in 1982.
  
  I ask because as far as I can tell, the 80286 and 80186 have near-identical real mode instruction sets; the difference seems to be that the 80286 has protected mode, where the 80186 has internal peripherals instead, and I'm curious whether someone at...
  Read more
  Asking out of hope more than anything else, but do you happen to have any historical data for which month the 80186/80188 came out? The 80286 came out in February 1982, and is well-documented as such because of its later use in the IBM PC/AT, but the 80186 seems to just be documented as coming out in 1982.
  
  I ask because as far as I can tell, the 80286 and 80186 have near-identical real mode instruction sets; the difference seems to be that the 80286 has protected mode, where the 80186 has internal peripherals instead, and I’m curious whether someone at the time of release of the 80186 would perceive it as an 80286 cut down for embedded applications, or as an enhanced 8086 for embedded applications. If the release dates were very close together, then the latter interpretation is more likely, while if the 80186 is end 1982, then the former interpretation is more likely.
  
  Read less
  - Falcon September 5, 2023
    
    I don’t have any more detailed info. The Wikipedia article, for instance, simply describes the two processors as contemporary.
    
    Given the lack of protected mode and extended address space, the “enhanced 8086” perspective would probably be more accurate.
  - Simon Farnsworth September 8, 2023
    
    From today’s perspective, the 80186 is absolutely an “enhanced 8086”; I’m just wondering if that would have been obvious in 1982, or if 1982 engineers might have perceived the 80186 as a cut-down 80286 due to timing of chip releases.
    
    It’s fairly obvious that the ‘186 and ‘286 were at least planned concurrently, since their real mode operation is almost identical (some of the instructions that are necessary for protected mode are also useful in real mode, but not present in the ‘186); I’d like a sense of how that near-identical ISA appeared in the 1980s.
  - Jim Moores September 9, 2023
    
    Can't speak for engineers, but I actually used one of the few 186 based PCs quite a bit, the Research Machines Nimbus PC-186. It was definitely considered an enhanced 8086 (and barely to be honest) to us rather than a cut down 80286. The machine wasn't actually fully PC compatible and you needed an add-in board and/or a TSR to increase the application compatibility. It was an education focused PC with quite decent graphics for the time (probably similar to Tandy 1000 range) and had custom versions of Windows 1.0 and 2.0, MS Word (for DOS with...
    Read more
    Can’t speak for engineers, but I actually used one of the few 186 based PCs quite a bit, the Research Machines Nimbus PC-186. It was definitely considered an enhanced 8086 (and barely to be honest) to us rather than a cut down 80286. The machine wasn’t actually fully PC compatible and you needed an add-in board and/or a TSR to increase the application compatibility. It was an education focused PC with quite decent graphics for the time (probably similar to Tandy 1000 range) and had custom versions of Windows 1.0 and 2.0, MS Word (for DOS with proper italic/bold/underline support in text mode), Multiplan, etc. that used the better graphics modes.
    
    Read less

Just for fun: What happens when you shift a register by more than the register size?

Author

9 comments

Read next

Why did the 16-bit _lopen and _lcreat function return -1 on failure instead of 0?

Why is kernel32.dll running in user mode and not kernel mode, like its name implies?

Author

9 comments

Read next

Why did the 16-bit _lopen and _lcreat function return -1 on failure instead of 0?

Why is kernel32.dll running in user mode and not kernel mode, like its name implies?

Stay informed