Why do we need atomic operations on the 80386, if it doesn’t support symmetric multiprocessing anyway?

Raymond Chen

The 80386 processor did not support symmetric multiprocessing, yet we discussed atomic operations in our overview of the processor. If the processor doesn’t even support symmetric multiprocessing, why does it matter?

Well, one reason is that the 80386 processor does support asymmetric multiprocessing. Floating point operations are performed by a coprocessor, and the main processor and coprocessor are both accessing the same memory. Another source of competing memory access is from hardware devices that are using Direct Memory Access (DMA).

Even within the processor, you have to worry about races, because you might be racing with yourself.

The 80386 did not support symmetric multiprocessing, but it did support pre-emptive multitasking, which means that any multi-instruction sequence is at risk of being interrupted, and at the worst possible time.

    ; decrement the variable and check against zero
    mov     eax, [var]
    dec     eax
    mov     [var], eax
    jz      zero

If the thread gets pre-empted between the first and third instructions, then the contents of the variable may be changed by another thread, and the decrement operation becomes non-atomic. To ensure atomicity, you need to force the compiler to generate a single dec instruction, and then to test the flags directly from the decrement.

    ; decrement the variable and check against zero
    dec     [var]
    jz      zero

There was no way to express this level of detail to compilers of that era, so you had to hide it behind a function call.
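
As a rough sketch of what such a function might look like today, here is a version written with C11 atomics. The name decrement_and_test_zero is hypothetical, and <stdatomic.h> arrived long after the 80386, so treat this as an illustration of the idea rather than what anyone shipped at the time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: atomically decrement *var and report
   whether the decrement brought it to zero. */
bool decrement_and_test_zero(atomic_int *var)
{
    /* atomic_fetch_sub returns the value held before the subtraction,
       so a previous value of 1 means the variable is now zero. */
    return atomic_fetch_sub(var, 1) == 1;
}

/* Usage sketch: only the final decrement reports zero. */
void example_usage(void)
{
    atomic_int refs;
    atomic_init(&refs, 2);
    assert(!decrement_and_test_zero(&refs));  /* 2 -> 1 */
    assert(decrement_and_test_zero(&refs));   /* 1 -> 0 */
}
```

The caller sees only the function. Whether the compiler emits a single decrement and tests the flags is left to the implementation, though compilers targeting x86 can typically do so.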

And if your operation cannot be expressed in a single instruction, then you’re out of luck. Increment and compare against 10? Compare and exchange if equal? Nope, you can’t do those things, at least not without some help from the operating system.
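
For comparison, once a compare-and-exchange primitive is available (the CMPXCHG instruction arrived with the 80486), compound operations like these can be built from a retry loop. Here is a minimal sketch using C11 atomics. The helper increment_if_below is invented for illustration, and "increment only while below the limit" is just one possible reading of "increment and compare against 10".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: increment *var unless it has already reached
   limit. Returns true if the increment happened. */
bool increment_if_below(atomic_int *var, int limit)
{
    int observed = atomic_load(var);
    while (observed < limit) {
        /* The exchange succeeds only if *var still holds 'observed'.
           On failure, 'observed' is refreshed with the current value
           and the loop re-checks it against the limit. */
        if (atomic_compare_exchange_weak(var, &observed, observed + 1)) {
            return true;
        }
    }
    return false;
}

/* Usage sketch: the counter stops at 10. */
void example_usage(void)
{
    atomic_int counter;
    atomic_init(&counter, 9);
    assert(increment_if_below(&counter, 10));   /* 9 -> 10 */
    assert(!increment_if_below(&counter, 10));  /* already at the limit */
}
```

The loop never holds a lock. If another thread changes the variable between the load and the exchange, the exchange fails and the operation is retried with the fresh value.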

12 comments

  • Yuhong Bao

    Actually, there is nothing preventing the 80386 from supporting SMP and NT 3.1 did support it with Compaq SystemPro I think. It is not common though.

      • Piotr Gliźniewicz

        Yes, the SystemPro was asymmetric, it used the 2nd CPU for I/O. I think “80386 processor did not support symmetric multiprocessing” means more “80386 processor did not support symmetric multiprocessing out of the box”. You could build a multiprocessor using 8080s with enough external logic. I’m not an expert on the topic, but I think if a CPU supports external bus masters it should also be possible to build an SMP system with it.

  • Alex Cohn

    The self-imposed race conditions between two threads running on a single CPU can be handled without atomics. E.g. given i that another thread can change, `int local_copy_of_i = i+1; i = local_copy_of_i; if (local_copy_of_i > 10) do_something(); else do_something_else();` I believe it will even resolve DMA race conditions.

    • Murray Colpman

      This wouldn’t work if two increments of i are expected to increment i twice. Say i is 0, your local thread takes local_copy_of_i to be 1. Then you’re preempted and another thread does the same, taking its local_copy_of_i to be 1. The other thread writes back the incremented value 1, and then your thread also writes back the local_copy_of_i which is still 1. Oops – you’ve incremented i twice from 0 and got 1!
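
The interleaving described above can be replayed deterministically on a single thread. This is only a sketch: shared_i stands in for the shared variable i, and the comments mark where the hypothetical threads A and B would be running.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared variable i. */
int shared_i = 0;

/* Replays the lost-update interleaving step by step and
   returns the final value of the shared variable. */
int replay_lost_update(void)
{
    shared_i = 0;

    int local_copy_a = shared_i + 1;  /* thread A reads 0, computes 1 */
    /* thread A is pre-empted here */
    int local_copy_b = shared_i + 1;  /* thread B reads 0, computes 1 */
    shared_i = local_copy_b;          /* thread B writes 1 */
    /* thread A resumes */
    shared_i = local_copy_a;          /* thread A writes 1 over B's update */

    /* Both threads started from the same stale value. */
    assert(local_copy_a == local_copy_b);
    return shared_i;
}
```

Two increments ran, yet the function returns 1 rather than 2, because the second write-back is based on a value read before the first one landed.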

  • David Walker

    Do newer processors have a (single) “decrement and jump if zero” atomic instruction, built into the hardware?  Or, a decrement like you mention that can decrement a value at a memory location and set a flag, all interlocked at the CPU level (and multiprocessor-safe)?
    I realize that decrementing from a memory location usually involves reading the value into a register, decrementing the register, and writing the value back out.  But, if you can specify a memory location (and size) in the instruction, the silicon could be smart enough to decrement with appropriate memory barriers and set a flag, in the same instruction.  It’s not very RISC, but these are not RISC processors.

    • cheong00

      Do you mean “LOOPZ”? (It decrements CX and jumps to the label if CX is nonzero and ZF is set, although the jump range is limited.)

      • David Walker

        I was actually looking for an atomic increment or decrement of a memory location, not a register.

          • Yuhong Bao

            FYI, if you code in x86 assembly, you can use any flags generated by a LOCK DEC/SUB/ADD/INC instruction after it has executed.