Why do we need atomic operations on the 80386, if it doesn’t support symmetric multiprocessing anyway?

Raymond Chen


The 80386 processor did not support symmetric multiprocessing, yet we discussed atomic operations when in our overview of the processor. If the processor doesn’t even support symmetric multiprocessing, why does it matter?

Well, one reason is that the 80386 processor does support asymmetric multiprocessing. Floating point operations are performed by a coprocessor, and the main processor and coprocessor are both accessing the same memory. Another source of competing memory access is from hardware devices that are using Direct Memory Access (DMA).

Even within the processor, you have to worry about races, because you might be racing with yourself.

The 80386 did not support symmetric multiprocessing, but it did support pre-emptive multitasking, which means that any multi-instruction sequence is at risk of being interrupted, and at the worst possible time.

    ; decrement the variable and check against zero
    mov     eax, [var]
    dec     eax
    mov     [var], eax
    je      zero

If the threads gets pre-empted between the first and third instructions, then the contents of the variable may be changed by another thread, and the decrement operation becomes non-atomic. To ensure atomicity, you need to force the compiler to generate a single dec instruction, and then to test the flags directly from the decrement.

    ; decrement the variable and check against zero
    dec     [var]
    jz      zero

There was no way to express this level of detail to compilers of that era, so you had to hide it behind a function call.

And if your operation cannot be expressed in a single instruction, then you’re out of luck. Increment and compare against 10? Compare and exchange if equal? Nope, you can’t do those things, at least not without some help from the operating system.


Comments are closed.

  • Avatar
    Alex Cohn

    The self-imposed race conditions between two threads running on a single CPU can be handled without atomics. E.g. given i that another thread can change, `int local_copy_of_i = i+1; i = local_copy_of_i; if (local_copy_of_i > 10) do_something(); else do_something_else();` I believe it will even resolve DMA race conditions.

    • Avatar
      Murray Colpman

      This wouldn’t work if two increments of i is expected to increment i twice. Say i is 0, your local thread takes local_copy_of_i to be 1. Then you’re preempted and another thread does the same, taking its local_copy_of_i to be 1. The other thread writes back the incremented value 1, and then your thread also writes back the local_copy_of_i which is still 1. Oops – you’ve incremented i twice from 0 and got 1!

  • Avatar
    David Walker

    Do newer processors have a (single) “decrement and jump if zero” atomic instruction, built into the hardware?  Or, a decrement like you mention that can decrement a value at a memory location and set a flag, all interlocked at the CPU level (and multiprocessor-safe)?
    I realize that decrementing from a memory location usually involves reading the value into a register, decrementing the register, and writing the value back out.  But, if you can specify a memory location (and size) in the instruction, the silicon could be smart enough to decrement with appropriate memory barriers and set a flag, in the same instruction.  It’s not very RISC, but these are not RISC processors.