The SuperH-3, part 6: Division

Raymond Chen


The SH-3 does not have a simple “divide two integers please” instruction. Rather, it has a collection of instructions that let you build a division operation yourself.

    DIV0U               ; prepare for unsigned division
    DIV0S   Rm, Rn      ; prepare for signed division Rn ÷ Rm
    DIV1    Rm, Rn      ; generate 1 bit of the quotient Rn ÷ Rm

To begin an integer division operation, you execute either a DIV0U or DIV0S instruction, depending on whether you want a signed or unsigned division.

You then perform a number of DIV1 instructions equal to the number of bits of quotient you need, mixed in with other instructions are outlined in the programmer’s manual. After running the desired number of iterations, the result is in either the Rn register or in the register you accumulated the results into, depending on the specific algorithm you used.

These instructions do not attempt to handle division by zero or division overflow. They will simply generate nonsense results. If preventing division by zero or overflow is important to you, you will have to check for them yourself explicitly.

I’m not going to go into the fine details of how these instructions operate. They use Rm and Rn to record the state of the division, with three additional bits of state recorded in the M, Q, and T flags.

It’s basically magic.

In practice, you won’t see the compiler generate these instructions anyway. Instead, the compiler is going to do one of the following:

  • If dividing by a constant power of 2, use a shift instruction.
  • If dividing by a small constant, multiply by 2³²÷n and extract the high 32 bits of the result.
  • Otherwise, call a helper function in the runtime library.

Phew, that was crazy.

Next time, we’ll return to the relative sanity of bitwise logical operations.

Bonus chatter: If you want to see how one compiler implements division on the SH-3 (and if you are okay with being exposed to GPL source code), you can take a look at how GCC implements division in its runtime library.


Raymond Chen
Raymond Chen

Follow Raymond   

Zak Larue-Buckley 2019-08-12 08:36:16
I'm just suprised there is a hardware divide at all. Thats quite a luxury! How did it compare to a pure software integer divide in terms of cycles?
Simon Clarkstone 2019-08-12 11:24:35
I spent a while wondering why this seemed familiar, then I rememberd it is like the GreenArrays F18A's multiply-step instruction. It does a conditional add based on the bottom bit of a register and a shift of two registers joined together, so that you can multiply numbers by repeating the multiply-step instruction (though with NOPs in-between, because the multiply-step instruction has a preceding delay-slot, as does the addition instruction). The F18A is a weird processor; it's a grid of tiny microprocessors that pass data asynchronously. It was invented by Chuck Moore, the same guy who invented the early Forths (and IIRC distains ANSI Forth as too complex and over-generalised).