March 17th, 2020

We called it RAID because it kills bugs dead

The history of defect tracking in the Windows team goes back to Windows 1.0, which used a text file.

After Windows 1.01 released, a bunch of people in the apps division got together and threw together a bug tracking database. Because hey, a database, wouldn’t that be neat?

The name was chosen by vote among the team, and the selected name was RAID, which is the name of a brand of insecticide whose advertisements in the United States use the tag line “Kills bugs dead.” The icon for the program was a can of bug spray, naturally.

The letters RAID were retroactively declared to be an acronym for “Reporting and Incidents Database”, but nobody knew that or cared. It was RAID.

After you built a bug query, you could save it for future use, and the file extension was .rdq, short for “RAID Query”.

The name RAID was linguistically productive, because you can “RAID a bug”, which means “File a bug in the project’s RAID database.” The .rdq also could be used as its own noun, meaning the query file. “Can you send me the .rdq for the bugs we are reviewing tomorrow?”

The database was written back in the days of 16-bit computing, so naturally it had a limit of 32,767 bugs. This was sufficient for many years, but eventually products encountered the record limit and had to “roll over” to new databases, where all bugs from the old database that hadn’t yet been closed were copied to a new database (and received new record numbers), and the old database was put into read-only mode.

Naturally, this created confusion when you were reading through some code, and it had a comment like “This fixes bug 3141,” with no indication as to which bug database that bug number refers to.

I think Windows 95 went through three RAID databases during its life.
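The 32,767 ceiling is what you get if record numbers are stored as signed 16-bit integers, since 2^15 − 1 = 32,767. The post doesn't say how RAID actually stored its record numbers, so treat the following as a minimal sketch under that assumption. The names `MAX_RECORD_ID`, `fits_in_int16`, and `next_record_id` are made up for illustration and are not taken from RAID.

```python
import struct

# Assumption: record numbers live in a signed 16-bit field.
# The largest value such a field can hold is 2**15 - 1.
MAX_RECORD_ID = 2**15 - 1


def fits_in_int16(value: int) -> bool:
    """Report whether value can be packed into a signed 16-bit integer."""
    try:
        struct.pack("<h", value)  # "h" is a signed 16-bit short
        return True
    except struct.error:
        return False


def next_record_id(current: int) -> int:
    """Hypothetical allocator: hand out the next bug number or demand a rollover."""
    if current >= MAX_RECORD_ID:
        raise OverflowError("record limit reached; roll over to a new database")
    return current + 1
```

With this model, bug 32,767 is the last one a database can hold, and the next filing attempt is the point at which a project would have to roll over.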

The original authors of RAID had no idea that their little bug tracking database tool would be the primary defect tracking tool across all of Microsoft for multiple decades. If they had known, they might have been too scared to write it. When looking back on the origin of RAID, one of the original developers confessed, “It really wasn’t made to last that long. Sorry!”

Another scalability problem was that by the time the Windows XP project was chugging along, you would get into situations where there were so many people using RAID at once that the server would simply stop accepting new connections. When the ship room convened to go over the state of the Windows project, they sometimes had to call into operations and ask them to kill a few active connections to the back-end database so that the ship room could connect.

It was clear that RAID was being pushed far beyond what it was originally designed for. A new defect tracking system was developed, named Product Studio, because naming your app Something Studio was fashionable at the time.

Product Studio didn’t have a limit of 32,767 records. It used a three-tier architecture for improved reliability and flexibility. It supported file attachments!

Product Studio served as the primary bug-tracking database for many years. But even with its improved architecture, you often ran into cases where the app stopped responding and simply told you “There was an error contacting the middle tier.”

I liked to joke that we should just get rid of that middle tier. It’s always the one that’s causing problems.

Product Studio kept things going until Windows 8, at which point Windows switched to on-premises Team Foundation Server for work item tracking.

The most recent move was in Windows 10, when the Windows team switched to Visual Studio Online for its work item tracking database. Mind you, that doesn’t mean that things have been stable, because the name of the service changed from Visual Studio Online to Visual Studio Team Services, and then again to Azure DevOps Services.

Even Azure DevOps wasn’t big enough to contain all of the Windows work items. Periodically, old work items are archived and moved to another project.¹ But at least the remaining work items don’t get renumbered. They keep their old numbers, thank goodness.

¹ Unfortunately, the archive project renumbers the work items. Fortunately, the original work item is remembered in the title, so you can do a search for originalid:3141 to find the old work item known as number 3141.

 


Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

15 comments

  • Keith Personett

    I was in Critical Problem Resolution for Exchange 1999-2010ish… I remember when we switched from RAID to Product Studio, while the interfaces were somewhat similar, the performance difference was like night and day. There were a lot of bugs initially, but it seemed that the PTT handled them pretty quickly when reported.

    Where I’ve been since leaving MSFT, we use TFS, and it serves our purposes well.

    While SD was kind of a PITA to work with, it worked well… but I would have loved to have something like TFS for the Exchange codebase back then

  • cheong00

    > Naturally, this created confusion when you were reading through some code, and it had a comment like “This fixes bug 3141,” with no indication as to which bug database that bug number refers to.

    It’s easy. If the bugfix is for item 3141 in the third database, just say “This fixes bug 3-3141,” much like how we cite a page as “volume + page number” when discussing books.

    • Keith Personett

      At the time we were writing fixes (in my case, Exchange), we didn’t necessarily know what database number/version it was… We just knew the bug number … RAID went away soon after I started at MSFT, but back then, many of us writing fixes in CPR didn’t know its back-end limitations.

      • Raymond Chen (Microsoft employee, Author)

        Exactly. Nobody remembers how many rollovers there have been. The number 3141 is the number that shows up in queries, reports, email, etc. It’s like nobody prints “206-555-1212 after the area code split of 1995” on their business cards. They just write “206-555-1212”, and then you call that number and it’s the wrong number, because the card was printed in 1996, and the number changed as a result of the area code split of 1997.

    • Ian Yates

      Yeah I would’ve thought so too. And fortunately if you’re referencing a bug because you’re fixing it, then it wouldn’t (shouldn’t!?) be found in a later bug database under a different number.

      • Scarlet Manuka

        Sure, but there’s every possibility that a completely different bug is in the new database with that bug number. So when you look for the bug that was fixed you find some bug that (if you’re *lucky*) is in a completely different part of the application and couldn’t possibly have been fixed by the code you’re looking at.

        If you’re unlucky, the new bug is just close enough to the old one that it’s vaguely plausible that there might be some connection between the bug and the code, and you waste a lot of time trying to puzzle out the details before you realise it was all an illusion.

  • Remy Lebeau

    Funny, when I saw RAID, the first thing that came to mind was Borland’s internal bug tracker, which was also known as RAID. Gee, I wonder where they got that name from…

    • cheong00

      RAID (the insecticide) is sold in Hong Kong; maybe it’s sold in the UK too.

      • Ian Yates

        RAID and Mortein were the two big brands here in Australia when I was growing up. They still are, except I have the luxury of not seeing any TV ads these days so can only judge based on shelf space and eye-level placement when shopping for groceries.

  • Jeffrey Tippet

    > ¹ Unfortunately, the archive project renumbers the work items. Fortunately, the original work item is remembered in the title, so you can do a search for originalid:3141 to find the old work item known as number 3141.

    Employees can also visit https://task.ms/3141 , which routes you to the bug you wanted, even if it was archived and renumbered.

  • Dave Bartolomeo (Microsoft employee)

    The two things I remember most about RAID after all these years are:
    1. Admin access to the RAID database was determined by having specific credentials in the connection string in the .rdq file. If you asked someone “Hey, can you send me the .rdq for the Frobnozzle database?”, and that someone was an admin for that database, there was a pretty good chance they’d accidentally send you the .rdq with the admin credentials, at which point _you_ were now an admin for the Frobnozzle database.
    2. I was mildly disturbed the first time I ran a big query, waited for it for 10 minutes, and then got a message box saying “You have been chosen as the deadlock victim.”

  • Steve Palmer

    This brings back memories. I was the first lead for Product Studio from ’99 until ’05, and it was one of a suite of tools that came out of the newly organised Productivity Tools Team, which was part of the Windows division. The others, which have been mentioned on this blog and elsewhere, were Source Depot, for source control tracking, and LocStudio, which was the primary localisation tool for most of the other products we shipped. When I left, PTT was re-organised into DevDiv and most of my old team went on to build Team Foundation Server using much of the experience gained from working on Source Depot and Product Studio. Frankly, those six years at PTT were the best I spent in my 15 year career at Microsoft.

    RAID, incidentally, wasn’t just one version but at least two. Office cloned the source and built their own custom version, and one of my jobs when we brought RAID into PTT was to merge the Windows and Office versions into one. Office had its own peculiar customisations specific to how they worked, so it wasn’t easy, but the knowledge we gained went into making Product Studio flexible enough for all of the big six divisions.

    (The precursor to Source Depot came from Windows and was written by one of the original NT devs (Steve Wood, I think) in response to requirements to store the NT source. The challenges of getting the Windows source code onto Source Depot were legion and most of the technical designs we made were primarily to meet its requirements. In any case TFS wasn’t up to it so they went from Source Depot to a specially customised version of Git.)

    • Yuhong Bao

      I think SLM dates before NT.

      • Steve Palmer

        I think you’re right. My memory is a bit hazy but I remember SLM quite well in the early days and seeing Steve’s name in the source code function headers.

      • Don Dumitru (Microsoft employee)

        When I joined MSFT in Jan ’94, both SLM and Windows NT existed. So SLM is at least as old as ’94. How much older, I dunno.
