How did MS-DOS decide that two seconds was the amount of time to keep the floppy disk cache valid?

Raymond Chen

MS-DOS 2.0 contained a disk read cache, but not a disk write cache. Disk read caches are important because they avoid having to re-read data from the disk. And you can invalidate the read cache when the volume is unmounted.

But wait, you don’t unmount floppy drives. You just take them out.

IBM PC floppy disk drives of this era did not have lockable doors. You could open the drive door and yank the floppy disk at any time. The specification had provisions for reporting whether the floppy drive door was open, but IBM didn’t implement that part of the specification because it saved them a NAND gate. Hardware vendors will do anything to save a penny.

But that read cache is crucial for performance. Without it, you have to start from scratch at every I/O operation, re-reading the volume table of contents, finding the directory entries, searching the block allocation tables looking for the next free cluster… And a floppy disk is not exactly the fastest storage medium out there, so all of these operations cost seconds of performance.

To avoid having to abandon the cache entirely, the MS-DOS developers did some benchmarking: How fast can a human being swap floppies in an IBM PC floppy drive?

Mark Zbikowski led the MS-DOS 2.0 project, and he sat down with a stopwatch while Aaron Reynolds and Chris Peters tried to swap floppy disks on an IBM PC as fast as they could.

They couldn’t do it in under two seconds.

So the MS-DOS cache validity was set to two seconds. If two disk accesses occurred within two seconds of each other, the second one would assume that the cached values were still good.
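
The rule is simple enough to sketch in a few lines. What follows is only an illustration of the idea, not the actual MS-DOS code (which was not written in Python and whose internals are not described here). The class name `FloppyReadCache`, the `read_sector` callback, and the injectable `clock` are all hypothetical, and the sketch assumes the whole per-drive cache is discarded at once when the window expires.

```python
import time


class FloppyReadCache:
    """Hypothetical per-drive read cache with a time-based validity window."""

    def __init__(self, read_sector, validity_seconds=2.0, clock=time.monotonic):
        self._read_sector = read_sector    # callable: sector number -> bytes
        self._validity = validity_seconds  # two seconds, per the stopwatch test
        self._clock = clock                # injectable so the logic can be tested
        self._sectors = {}                 # sector number -> cached bytes
        self._last_access = None           # time of the most recent access

    def read(self, sector):
        now = self._clock()
        # If the previous access was not within the validity window, someone
        # may have swapped the disk, so nothing in the cache can be trusted.
        if self._last_access is None or now - self._last_access >= self._validity:
            self._sectors.clear()
        self._last_access = now

        if sector not in self._sectors:
            self._sectors[sector] = self._read_sector(sector)
        return self._sectors[sector]
```

Each access restarts the timer, so a burst of back-to-back reads keeps hitting the cache, while a pause long enough for a disk swap forces everything to be re-read.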

I don’t know if the modern two-second cache flush policy is a direct descendant of this original office competition, but I like to think there’s some connection.

11 comments

  • Jonathan Duncan

    2 seconds seems a disappointingly short period of time that would miss many chances. How is the invalidation handled? Is it a per-block thing (each block is invalidated after 2 seconds without repeat reads) or a per-device thing (the entire device cache is invalidated if no reads come from that device in 2 seconds)?

    • Julien Oster

      Given that the invalidation in question here is specifically about floppies being changed without knowledge of the system, it’s almost certainly per-device (or even more coarse). You might want to invalidate individual blocks as well at some point if you have limited memory, but that’s probably better done with a different policy, e.g. least recently used.

      I think I remember that MS-DOS after 2.0 introduced “disk IDs” (just a somewhat random number assigned, usually, during formatting of the disk), probably specifically to be able to just read that ID to get a reasonable guess whether the user changed the disk or not.

  • Rainer Wahnsinn

    I guess I would have been able to beat 2 seconds. I had extensive floppy changing training while installing multiple Windowses and other games

    • Jeremy Richards

      Keep in mind this is almost certainly with regard to 5.25″ floppies not 3.5″ floppies. With the 5.25″ ones, you had to rotate the lock mechanism, pull the floppy, insert the new floppy, hold it in and rotate the lock mechanism again. Also keep in mind that the insertion slot for 5.25″ is much narrower than for the 3.5″ and the disk is less forgiving if you miss. 2 seconds sounds reasonable for 5.25″ disks. I agree you can probably do faster with 3.5″ floppies since that is just push the ejector, remove the old one then shove the new one in.

      I remember I really disliked CD-ROMs when they first came out because the eject button was software controlled (and frequently the disk took many seconds to eject after you pressed the button), rather than a mechanical release like the floppies had.

      • cheong00

        Given that he mentioned the floppy drives of that time did not have a lock, it’s most likely an 8-inch one.

        As far as I remember, by the time of 5.25-inch drives, the rotary lock was almost standard (at least I can’t remember seeing any drive that didn’t have it).

        • Joshua Dye

          I think when talking about locking, he means software locking. He mentioned opening the drive as part of what was timed, and the Wikipedia article says that the IBM PC had 5.25″ drives, which, looking at the pictures, definitely had a latch of some sort, though not the rotary lock I remember on all the 5.25″ drives I dealt with. He also mentioned that they didn’t include the hardware for detecting the door being opened in order to save a NAND gate.

  • David Lewington

    This reminded me of one of the most tedious jobs I had to do in my early career. I had to take 8 backups of an IBM 8100 (I think that’s what it was). Each backup required many 8″ floppy discs and each took just a few minutes to write to. I soon realised that to minimise the amount of time this delightful task was going to take, the time spent changing discs was critical. I had to be ready to remove each disc with one hand and shove in the next one with the other hand. I could hear the heads retracting moments before the disc busy light went off and would get ready to pounce. There was a lock on the disc drive to slow me down too, but I got this down to a fine art and slashed the overall time of the backup.

  • Ivan Kljajic

    Well, maybe the hardware vendor’s stinginess also allowed programmers to be stingy? https://devblogs.microsoft.com/oldnewthing/?p=40473. If two seconds saves some DOS timestamp resolution bits then that leaves more room for more cool stuff? (Though I’m just guessing that the two two second things are related.)

    So the solution is to write fat code that makes hardware vendors give more and charge more!

  • John Elliott

    Fortunately it’s unlikely that if anyone changed floppies in under 2 seconds, it would lead to anything as bad as the Hull Paragon rail crash (when a particular action took 1.6 seconds, and happened to take place within the 1.9 second window when it would have led to disaster).

    IIRC +3DOS on the Spectrum +3 also used a 2-second timeout to assume a disc had been changed after the last file on it was closed. Tended to annoy me, because with 720k discs it took an age to mount the disc again the next time it was accessed, so I preferred to use CP/M where the OS needed to be told explicitly that the disc had changed.