Microsoft Open-Sources GW-BASIC

Rich Turner

We are excited to announce the open-sourcing of Microsoft GW-BASIC on GitHub!

Yes, seriously 😀

Why?

Since re-open-sourcing MS-DOS 1.25 & 2.0 on GitHub last year, we’ve received numerous requests to also open-source Microsoft BASIC.

Well, here we are! 😁

The Source

These sources, as clearly stated in the repo’s readme, are the 8088 assembly language sources from 10th Feb 1983, and are being open-sourced for historical reference and educational purposes. This means we will not be accepting PRs that modify the source in any way.

A little historical context

The GW-BASIC source code being published is dated Feb 10th 1983. That was quite a while ago, so just to set a little historical perspective:

The week this source was created, Men At Work topped the US and UK singles charts with “Down Under”, and Dustin Hoffman starred in the #1 US box-office movie, “Tootsie”. In 1983, “Star Wars Episode VI – Return of the Jedi” was released, as was “WarGames”! Emily Blunt, Kate Mara, Jonah Hill, Chris Hemsworth, and Henry Cavill were born! Ronald Reagan was President of the USA, and Margaret Thatcher was the UK’s Prime Minister.

That same year, Bjarne Stroustrup was in the middle of developing the first version of the C++ programming language, and ARPANET standardized on TCP/IP. Borland announced Turbo Pascal, created by Anders Hejlsberg (who went on to join Microsoft and create J++, C# and TypeScript).

1983 was also the year AT&T released UNIX System V R1, and BSD 4.2 was released, introducing the pseudoterminal for the first time (the progenitor to Windows’ ConPTY we introduced to Windows in 2018 😁)

I was 13, and spent every spare second that I wasn’t finishing my homework or doing my chores, writing BASIC and assembly code on one of the hottest home computers of the time – the BBC Micro sporting 32KB RAM (yes, 32,768 bytes, total!), powered by a 6502 processor running at a BLAZING 2MHz. When not coding, I was usually playing one of the most groundbreaking games of all time: “Elite” by David Braben & Ian Bell.

In 1983, Apple launched the 1MHz 6502-powered Apple IIe for US$1,395 (> $3,500 in 2020). Apple also launched the first retail-available computer with a GUI – the Apple Lisa. The Lisa contained a staggering 1MB RAM, and ran the awesome Motorola 68000 processor at an astounding 5MHz, but it cost $9,995 (> $25,000 in 2020 dollars), so all I could do was peer at it through the window of the one computer store in our town authorized to sell Apple’s products … and dream.

And, in 1983 Microsoft released MS-DOS 2.0 (source here), and GW-BASIC for the IBM PC XT and compatibles.

What IS GW-BASIC?

GW-BASIC was a BASIC interpreter derived from IBM’s Advanced BASIC/BASICA, which itself was a port of Microsoft BASIC.

Microsoft’s various BASIC implementations can trace their origins all the way back to Bill Gates & Paul Allen’s implementation of Microsoft’s first product – a BASIC interpreter for the Altair 8800.

During the late ’70s and ’80s, Microsoft’s BASIC was ported to meet many OEMs’ specific platform and hardware needs, and to several processors popular at that time, including the 8088, 6502, 6809, Z80, and others.

FAQ

Wait – where’s the C source code?

There is no C source code!

Like much software from the 70s and 80s, and just like the source for MS-DOS, the source code of GW-BASIC is 100% assembly language.

Why assembly? Why didn’t developers use higher-level languages like C, or Pascal?

When developing on/for mainframes and minicomputers of the day, developers were sometimes able to use higher-level languages like FORTRAN, LISP, COBOL, RPG, CPL/BCPL, C, etc., but the compilers for such languages were often hugely expensive, rarely generated efficient code, and were generally unavailable for the space- and performance-constrained home and personal computers of the day.

When writing software for early PCs, every single byte and every single instruction mattered, so developers often wrote code entirely in assembly language simply to be able to physically fit their software into the available memory, and to be able to access the computer’s resources and internal workings.

Thus, all the source code for GW-BASIC is pure assembly code, translated on a per-processor/per-machine basis from core/master sources.

This source was ‘translated’?

Each of the assembly source files contains a header stating “This translation created 10-Feb-83 by Version 4.3”.

Since the Instruction Set Architectures (ISAs) of the early processors used in home and personal computers weren’t spectacularly different from one another, Microsoft was able to generate a substantial amount of the code for a port from the sources of a master implementation. (Alas, sorry, we’re unable to open-source the ISA translator.)
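
To give a rough feel for what mnemonic-level translation between similar ISAs can look like, here is a toy sketch in Python. It is not Microsoft’s translator, which is not public. It assumes, purely for illustration, that the master source uses 8080-style mnemonics, and it uses the conventional 8080-to-8086 register correspondence (A→AL, B→CH, C→CL, and so on). The names `REGISTER_MAP` and `translate_line` are hypothetical, and only a handful of register-only instructions are handled.

```python
# Toy 8080-style -> 8086-style source translator.
# Hypothetical illustration only; NOT the tool used to produce GW-BASIC.
# Assumption: the master source uses 8080 mnemonics (not stated in the post).

# Conventional 8080 -> 8086 register correspondence (memory operand M omitted).
REGISTER_MAP = {
    "A": "AL", "B": "CH", "C": "CL", "D": "DH",
    "E": "DL", "H": "BH", "L": "BL",
}


def translate_line(line):
    """Translate one label-free source line; pass unknown forms through as comments."""
    code, _, comment = line.partition(";")
    fields = code.split(None, 1)
    if not fields:
        return line.rstrip()  # blank or comment-only line: keep as-is
    mnemonic = fields[0].upper()
    operands = []
    if len(fields) > 1:
        operands = [op.strip().upper() for op in fields[1].split(",")]

    out = None
    try:
        if mnemonic == "MOV" and len(operands) == 2:
            out = f"MOV {REGISTER_MAP[operands[0]]},{REGISTER_MAP[operands[1]]}"
        elif mnemonic == "MVI" and len(operands) == 2:
            # Move-immediate becomes a plain MOV with an immediate operand.
            out = f"MOV {REGISTER_MAP[operands[0]]},{operands[1]}"
        elif mnemonic in ("ADD", "SUB") and len(operands) == 1:
            # The 8080 accumulator is implicit; the 8086 form names AL.
            out = f"{mnemonic} AL,{REGISTER_MAP[operands[0]]}"
        elif mnemonic in ("INR", "DCR") and len(operands) == 1:
            new_op = "INC" if mnemonic == "INR" else "DEC"
            out = f"{new_op} {REGISTER_MAP[operands[0]]}"
    except KeyError:
        out = None  # operand is not a simple 8-bit register

    if out is None:
        # Anything not covered is flagged for a human to port by hand.
        out = f"; UNTRANSLATED: {code.strip()}"
    if comment:
        out += " ;" + comment.rstrip()
    return out
```

For example, `translate_line("MOV A,B")` yields `MOV AL,CH`. A real translator would also have to deal with labels, memory operands, 16-bit register pairs, and differences in flag behavior, which is presumably why only part of a port could be generated automatically.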

What about other ports?

Many have asked if we can also open-source implementations for processors other than the 808x. Alas, we’re unable to provide sources for these ports and/or customizations.

Enjoy!

We hope you enjoy exploring this fascinating snapshot of what software development looked like during the glorious, exciting, heady days of the ’70s and early ’80s at the dawn of “the personal computer” 😁

Many thanks to Amy, Julia Liuson, Amanda Silver, and our awesome CELA team for their approval and help finding, reviewing, and open-sourcing GW-BASIC.

68 comments


  • theuserbl 0

    Ok, here again my comment.
    I wrote it on May 22, 2020 at 2:13 am, and it still says “Your comment is awaiting moderation.”
    Rich Turner has since answered a lot of comments posted after mine, and mine still has not been approved. I think the reason could be the links it contained.

    Here the comment I posted, without links:

    Please add on the “releases” section of GW-BASIC and MS-DOS on GitHub precompiled … eh… I mean preassembled binaries
    [Link to Github GW-BASIC releases]
    [Link to Github MS-DOS releases]
    With which assembler were GW-BASIC and MS-DOS assembled?
    Which assembler can build it today?

    • Rich Turner (Microsoft employee) 0

      Thanks for re-posting. Your previous posts likely got blocked by our comment filter.

      Sorry – this isn’t a repo for distributing binaries – we’re simply preserving and sharing the source for historical & research purposes.

      Which assembler was used? Good question – depends on the processor architecture being targeted. In the 808x case, likely early versions of Microsoft Assembler – MASM.

      No idea if MASM can still build it – give it a try! 🙂

      • Diomidis Spinellis 0

        Thank you Rich for making this available!

        After making a few changes I was able to assemble all files with MASM 5.10A. You can find them, together with a rough Makefile and a linker script, in this repo. Slightly earlier MASM versions might have also worked, but initially I was moving from one MASM version to the next, trying to find one that would assemble all files without an error. In the end, I fixed a few issues by hand. Sadly, many routines are missing from the source code. It seems to me that this source code was coupled with an OEM vendor-specific file that performed low-level stuff, such as clearing the screen.

  • Luis Alonso Ramos 0

    Awesome!! I remember using GW-BASIC when I was 9 or 10, learning programming in 1991-92.

    10 PRINT "Hi!"
    20 GOTO 100
    100 END

    The most impressive thing for me is the copyright notice in GWMAIN.ASM:

    COPYRIGHT 1975 BY BILL GATES AND PAUL ALLEN

    No Microsoft back then!

  • Shaun Brandt 0

    While most of my personal experience was with QBasic and not GW-BASIC (I didn’t have my first Intel PC until 1995), I’m always happy to see more historical software open sourced. Thanks!

  • John Selbie 0

    Rich,

    Two things that would make this interesting:

    1) Build instructions. Or at least some hints about what tools to use to build gwbasic.exe.

    2) The original sources – not the transpiled output from the ISA translator.

  • Shaun Rossi 0

    Big thanks for doing this! I spent so much time in GW-BASIC, then later moved on to QBasic. I often dump old computer programming books, but I’ve kept my GW-BASIC book all these years. It was so much fun. It will be interesting to poke around the ASM code.

  • James Wilson 0

    Awesome work! Now, what are the chances of publishing the original Altair 4k or 8k BASIC?

  • Anthony Caudill 0

    Why GW instead of Quick?

    I mean, is there any particular reason MS is holding onto that code? (besides the bit that parts of it are still used in VB6…)

    I did a review of QB based on the MSDOS-6 code leak. https://github.com/tcaudilllg/QB-Disassembly . QB had a lot of good ideas… was ahead of its time in many ways and possibly still unrivaled in sheer depth of handholding (which has its uses).

  • Alexei 0

    Looks like the source code is incomplete. For example, where’s EXTRN SETCLR defined?

  • Paul Pacheco 0

    I learned to code 32 years ago in GW-BASIC . This certainly takes me back.

  • Jim Callahan 0

    Could PDF scans of the original programming manuals be posted?
    Perhaps the original IBM Basic manual that came with the original IBM PC
    or the Compaq Microsoft Basic manual that came with Compaq’s original luggable?

    UPDATE:
    I found this manual:
    https://hwiegman.home.xs4all.nl/gw-man/index.html

    I remember programming a hexdump utility in Compaq Basic.
    This would have been in 1986 or 1987,
    I was trying to emulate this style of output:

    0000: 57 69 6B 69 70 65 64 69 61 2C 20 74 68 65 20 66 Wikipedia, the f
    0010: 72 65 65 20 65 6E 63 79 63 6C 6F 70 65 64 69 61 ree encyclopedia
    0020: 20 74 68 61 74 20 61 6E 79 6F 6E 65 20 63 61 6E that anyone can
    0030: 20 65 64 69 74 00 00 00 00 00 00 00 00 00 00 00 edit...........
    https://en.wikipedia.org/wiki/Hex_dump

    I had had a crash course in C and had a copy of K&R;
    so my starting point was the C copy program which
    reads and writes a character in a “while not eof()” loop.

    GW-Basic had a hex$() function, so I knew I could copy the input character to hex.
    https://hwiegman.home.xs4all.nl/gw-man/HEXS.html

    And GW-Basic had an eof() function, but the example used an if and goto!
    https://hwiegman.home.xs4all.nl/gw-man/EOF.html

    But, there was a WHILE loop implemented as While….Wend.
    https://hwiegman.home.xs4all.nl/gw-man/WHILEWEND.html

    So, I could do a K&R C style While not eof() loop to get one character and convert it to hex with the hex$() function
    then for output formatting I could also output a space before looping back for the next character. Then I implemented
    a counter to keep track of the number of characters on the output line. I think I decided to output each input line as a “paragraph”
    and then skip an output line before printing the next input line.

    The finishing touch was the line numbers on the left and the original text on the right hand side.

    The program was very slow, but sped up dramatically when I told it to use buffers.
    I don’t see a buffer option in GW-Basic so it may have been an MS-DOS command.

    It would have been routine to add “files=20” and “buffers=20” to the config.sys file,
    but I thought there was a command line option to request buffered I/O when invoking GW-Basic.

    So, I think I wound up writing a MS-DOS batch file that invoked GW-Basic with the hexdump.bas program which
    prompted the user for a filename if it did not find one on the input line.

    The program eventually worked and its use case was identifying problematic characters that caused the file import utilities
    in Lotus 1-2-3 spreadsheet or dBase database to blow up. Once the offending character was identified;
    another “While not eof()” copy program could be written with an IF statement to replace offending character
    with an acceptable substitute such as a space or simply resume with the next character. Often the offending character
    would be a happy face! I recall one time I was trying to import into dBase a file that had been downloaded from an IBM mainframe
    and it turned out to have a column of null characters and dBase stopped at this column and ignored all the data to the right.
    I ran a fix-up “While not eof()” loop program that replaced the null characters with spaces and read in the entire file into dBase.
    I think it may have been a prank by the mainframe programmers; because they were really surprised when I read the whole thing!

    Now most file input routines are more robust or provide more useful error messages. If it became necessary to dump a file; Linux has a hexdump utility.
    And of course with open source software there are many choices for implementation languages from shell scripts to interpreted languages to compiled languages. But, that’s my how we made do in the “old days” story.
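
The hexdump approach described in the comment above (an offset on the left, hex bytes in the middle, the original text on the right) can be sketched in a few lines of Python. This is a minimal illustration, not a reconstruction of the original Compaq BASIC program. The function names `hexdump` and `replace_nulls` are hypothetical, and the choice to show non-printable bytes as `.` follows the sample output quoted in the comment.

```python
def hexdump(data, width=16):
    """Return a hex dump of `data` (bytes), `width` bytes per output line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Two uppercase hex digits per byte, separated by single spaces.
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Printable ASCII is shown as-is; everything else becomes a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the text column lines up on a short last line.
        lines.append(f"{offset:04X}: {hex_part.ljust(width * 3 - 1)} {text_part}")
    return "\n".join(lines)


def replace_nulls(data):
    """Replace null bytes with spaces, like the fix-up program in the comment."""
    return data.replace(b"\x00", b" ")
```

Calling `hexdump(b"Wikipedia, the f")` reproduces the first line of the sample output quoted in the comment. To dump a file, read it in binary mode first, for example `hexdump(open(path, "rb").read())`, where `path` is whatever file you want to inspect.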
