The 80386 processor did not support symmetric multiprocessing, yet we discussed atomic operations in our overview of the processor. If the processor doesn’t even support symmetric multiprocessing, why does it matter?
Well, one reason is that the 80386 processor does support asymmetric multiprocessing. Floating point operations are performed by a coprocessor, and the main processor and coprocessor are both accessing the same memory. Another source of competing memory access is from hardware devices that are using Direct Memory Access (DMA).
Even within the processor, you have to worry about races, because you might be racing with yourself.
The 80386 did not support symmetric multiprocessing, but it did support pre-emptive multitasking, which means that any multi-instruction sequence is at risk of being interrupted, and at the worst possible time.
    ; decrement the variable and check against zero
    mov eax, [var]
    dec eax
    mov [var], eax
    je zero
If the thread gets pre-empted between the first and third instructions, then the contents of the variable may be changed by another thread, and the decrement operation becomes non-atomic. To ensure atomicity, you need to force the compiler to generate a single dec instruction, and then to test the flags directly from the decrement.
    ; decrement the variable and check against zero
    dec [var]
    jz zero
There was no way to express this level of detail to compilers of that era, so you had to hide it behind a function call.
And if your operation cannot be expressed in a single instruction, then you’re out of luck. Increment and compare against 10? Compare and exchange if equal? Nope, you can’t do those things, at least not without some help from the operating system.
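For contrast, here is a minimal sketch of what "increment and compare against 10" can look like once a compare-exchange primitive is available (the 80486 added the CMPXCHG and XADD instructions). It is written with C11 atomics rather than anything available to compilers of the 80386 era, and the helper name `increment_and_exceeds` is hypothetical, made up for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: atomically increment *value and report whether
   the new value is greater than limit. */
static bool increment_and_exceeds(atomic_int *value, int limit)
{
    int old = atomic_load(value);

    /* Try to replace old with old + 1. If another thread changed *value
       in the meantime, the call fails, reloads old with the current
       value, and we try again. */
    while (!atomic_compare_exchange_weak(value, &old, old + 1)) {
        /* retry with the refreshed old */
    }

    /* old still holds the value we replaced, so old + 1 is what we wrote. */
    return old + 1 > limit;
}
```

The decision is made from the value this thread actually wrote, not from a second read of the variable, so a pre-emption after the update cannot change the answer.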
Do newer processors have a (single) "decrement and jump if zero" atomic instruction, built into the hardware? Or, a decrement like you mention that can decrement a value at a memory location and set a flag, all interlocked at the CPU level (and multiprocessor-safe)?
I realize that decrementing from a memory location usually involves reading the value into a register, decrementing the register, and writing the value back out. But, if you can specify a...
Do you mean “LOOPZ”? (It decrements CX and jumps to the label if CX is nonzero and ZF is set, although the jump range is limited.)
I was actually looking for an atomic increment or decrement of a memory location, not a register.
Yup, it’s right there in the article. But all you get is the sign, not the value. See also the link at the end of the article.
Oh, right. Thanks.
FYI, if you code in x86 assembly, you can use any flags generated by the LOCK DEC/SUB/ADD/INC instructions after it is executed.
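If you are not writing assembly, a rough equivalent of "decrement and test for zero" can be sketched with C11 atomics. This is only an illustration: the helper name `decrement_and_test_zero` is hypothetical, and whether a compiler lowers it to a LOCK-prefixed DEC, SUB, or XADD is up to the compiler.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: atomically decrement *counter and report whether
   the counter reached zero as a result. */
static bool decrement_and_test_zero(atomic_int *counter)
{
    /* atomic_fetch_sub returns the value held before the subtraction,
       so an old value of 1 means the counter is now 0. */
    return atomic_fetch_sub(counter, 1) == 1;
}
```

The test uses the value returned by the atomic operation itself, which is the C-level counterpart of branching on the flags left by the decrement instead of reading the variable again.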
The self-imposed race conditions between two threads running on a single CPU can be handled without atomics. E.g. given i that another thread can change, `int local_copy_of_i = i+1; i = local_copy_of_i; if (local_copy_of_i > 10) do_something(); else do_something_else();` I believe it will even resolve DMA race conditions.
This wouldn't work if two increments of i are expected to increment i twice. Say i is 0, and your local thread takes local_copy_of_i to be 1. Then you're preempted and another thread does the same, taking its local_copy_of_i to be 1. The other thread writes back the incremented value 1, and then your thread also writes back its local_copy_of_i, which is still 1. Oops - you've incremented i twice from 0 and got 1!
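The interleaving described here can be replayed step by step on a single thread, which makes the lost update deterministic. This is just a sketch: the function name `replay_lost_update` and the variables `copy_a` and `copy_b` (standing in for each thread's local_copy_of_i) are made up for the illustration.

```c
/* Replays the problematic schedule in program order: thread A reads,
   thread B reads and writes, then thread A writes its stale copy. */
static int replay_lost_update(void)
{
    int i = 0;

    int copy_a = i + 1;   /* thread A reads 0 and computes 1 */
    /* thread A is pre-empted here */
    int copy_b = i + 1;   /* thread B also reads 0 and computes 1 */
    i = copy_b;           /* thread B writes back 1 */
    /* thread A resumes */
    i = copy_a;           /* thread A writes back its stale 1 */

    return i;             /* two increments, but i is only 1 */
}
```

The local copy protects the comparison that follows, but it does nothing for the read-modify-write itself, which is where the update is lost.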
Actually, there is nothing preventing the 80386 from supporting SMP, and I think NT 3.1 did support it on the Compaq SystemPro. It was not common, though.
Wasn’t SystemPro asymmetric?
Yes, the SystemPro was asymmetric; it used the 2nd CPU for I/O. I think “80386 processor did not support symmetric multiprocessing” means more “80386 processor did not support symmetric multiprocessing out of the box”. You could build a multiprocessor using 8080s with enough external logic. I’m not an expert on the topic, but I think if a CPU supports external bus masters it should also be possible to build an SMP system with it.
AFAIK the 80386 and 80486 buses were very similar.