Commenter rs asks,
“Why does Windows (historically) return 2 for
MulDiv(1, -0x80000000, -0x80000000)
while Wine returns zero?”
The MulDiv
function multiplies the first two parameters
and divides by the third. Therefore, the mathematically correct
answer for
MulDiv(1, -0x80000000,
-0x80000000)
is 1,
because
a ×
b ÷
b = a for all nonzero b.
So both Windows and Wine get it wrong. I don’t know why Wine gets it wrong, but I dug through the archives to figure out what happened to Windows.
First, some background.
What’s the point of the MulDiv
function anyway?
Back in the days of 16-bit Windows, floating point was very expensive. Most people did not have math coprocessors, so floating point was performed via software emulation. And the software emulation was slow. First, you issued a floating point operation on the assumption that you had a float point coprocessor. If you didn’t, then a coprocessor not available exception was raised. This exception handler had a lot of work to do.
It decoded the instruction that caused
the exception and then emulated the operation.
For example, if the bytes at the point of the exception were
d9 45 08
, the exception handler would have to
figure out that the instruction was fld dword ptr ds:[di][8]
.
It then had to simulate the operation of that instruction.
In this case, it would
retrieve the caller’s di
register,
add 8 to that value,
load four bytes from that address
(relative to the caller’s ds
register),
expand them from 32-bit floating point to 80-bit floating point,
and push them onto a pretend floating point stack.
Then it advanced the instruction pointer three bytes and
resumed execution.
This took an instruction that with a coprocessor would take around 40 cycles (already slow) and ballooned its total execution time to a few hundred, probably thousand cycles. (I didn’t bother counting. Those who are offended by this horrific laziness on my part can apply for a refund.)
It was in this sort of floating point-hostile environment that
Windows was originally developed.
As a result,
Windows has historically avoided using floating point and preferred
to use integers.
And one of the things you often have to do with integers is
scale them by some ratio.
For example, a horizontal dialog unit is ¼ of the average
character width, and a vertical dialog unit is 1/8 of the average
character height.
If you have a value of, say, 15 horizontal dlu, the corresponding
number of pixels is
15 × average character width ÷ 4.
This multiply-then-divide operation is quite common, and that’s
the model that the MulDiv
function is designed to
help out with.
In particular, MulDiv
took care of three
things that a simple
a ×
b ÷
c didn’t.
(And remember, we’re in 16-bit Windows, so a,
b and c are all 16-bit signed values.)
- The intermediate product a × b was computed as a 32-bit value, thereby avoiding overflow.
- The result was rounded to the nearest integer instead of truncated toward zero
- If c = 0 or if the result did not fit in a signed 16-bit integer, it returned INT_MAX or INT_MIN as appropriate.
The MulDiv
function was written in assembly language,
as was most of GDI at the time.
Oh right, the MulDiv
function was exported by GDI
in 16-bit Windows.
Why?
Probably because they were the people who needed the function first,
so they ended up writing it.
Anyway, after I studied the assembly language for the function,
I found the bug.
A shr
instruction was accidentally coded as sar
.
The problem manifests itself only for the denominator
−0x8000
, because that’s the only one whose
absolute value has the high bit set.
The purpose of the sar
instruction was to divide the
denominator by two, so it can get the appropriate rounding behavior
when there is a remainder.
Reverse-compiling back into C, the function goes like this:
int16 MulDiv(int16 a, int16 b, int16 c) { int16 sign = a ^ b ^ c; // sign of result // make everything positive; we will apply sign at the end if (a < 0) a = -a; if (b < 0) b = -b; if (c < 0) c = -c; // add half the denominator to get rounding behavior uint32 prod = UInt16x16To32(a, b) + c / 2; if (HIWORD(prod) >= c) goto overflow; int16 result = UInt32Div16To16(prod, c); if (result < 0) goto overflow; if (sign < 0) result = -result; return result; overflow: return sign < 0 ? INT_MIN : INT_MAX; }
Given that I’ve already told you where the bug is, it should be pretty easy to spot in the code above.
Anyway, when this assembly language function was ported to Win32, it was ported as, well, an assembly language function. And the port was so successful, it even preserved (probably by accident) the sign extension bug.
Mind you, it’s a bug with amazing seniority.
0 comments