May 21st, 2020

Microsoft Open-Sources GW-BASIC

Rich Turner
Sr. Program Manager

We are excited to announce the open-sourcing of Microsoft GW-BASIC on GitHub!

Yes, seriously 😀

Why?

Since re-open-sourcing MS-DOS 1.25 & 2.0 on GitHub last year, we’ve received numerous requests to also open-source Microsoft BASIC.

Well, here we are! 😁

The Source

These sources, as clearly stated in the repo’s readme, are the 8088 assembly language sources from 10th Feb 1983, and are being open-sourced for historical reference and educational purposes. This means we will not be accepting PRs that modify the source in any way.

A little historical context

The GW-BASIC source code being published is dated Feb 10th 1983. That was quite a while ago, so just to set a little historical perspective:

The week this source was created, Men At Work topped the US and UK singles charts with “Down Under”, and Dustin Hoffman starred in the #1 US box-office movie, “Tootsie”. In 1983, “Star Wars Episode VI – Return of the Jedi” was released, as was “WarGames”! Emily Blunt, Kate Mara, Jonah Hill, Chris Hemsworth, and Henry Cavill were all born that year, too! Ronald Reagan was President of the USA, and Margaret Thatcher was the UK’s Prime Minister.

That same year, Bjarne Stroustrup was in the middle of developing the first version of the C++ programming language, and ARPANET standardized on TCP/IP. Borland announced Turbo Pascal, created by Anders Hejlsberg (who went on to join Microsoft and create J++, C#, and TypeScript).

1983 was also the year AT&T released UNIX System V R1, and BSD 4.2 was released, introducing the pseudoterminal for the first time (the progenitor of the ConPTY we introduced to Windows in 2018 😁).

I was 13, and spent every spare second that I wasn’t finishing my homework or doing my chores, writing BASIC and assembly code on one of the hottest home computers of the time – the BBC Micro sporting 32KB RAM (yes, 32,768 bytes, total!), powered by a 6502 processor running at a BLAZING 2MHz. When not coding, I was usually playing one of the most groundbreaking games of all time: “Elite” by David Braben & Ian Bell.

In 1983, Apple launched the 1MHz 6502-powered Apple IIe for US$1,395 (> $3,500 in 2020). Apple also launched the first retail-available computer with a GUI – the Apple Lisa. The Lisa contained a staggering 1MB RAM, and ran the awesome Motorola 68000 processor at an astounding 5MHz, but it cost $9,995 (> $25,000 in 2020 dollars), so all I could do was peer at it through the window of the one computer store in our town authorized to sell Apple’s products … and dream.

And, in 1983 Microsoft released MS-DOS 2.0 (source here), and GW-BASIC for the IBM PC XT and compatibles.

What IS GW-BASIC?

GW-BASIC was a BASIC interpreter derived from IBM’s Advanced BASIC/BASICA, which itself was a port of Microsoft BASIC.

Microsoft’s various BASIC implementations can trace their origins all the way back to Bill Gates & Paul Allen’s implementation of Microsoft’s first product – a BASIC interpreter for the Altair 8800.

During the late ’70s and ’80s, Microsoft’s BASIC was ported to meet many OEMs’ specific platform and hardware needs, and to several processors popular at that time, including the 8088, 6502, 6809, Z80, and others.

FAQ

Wait – where’s the C source code?

There is no C source code!

Like much software from the 70s and 80s, and just like the source for MS-DOS, the source code of GW-BASIC is 100% assembly language.

Why assembly? Why didn’t developers use higher-level languages like C, or Pascal?

When developing on/for mainframes and minicomputers of the day, developers were sometimes able to use higher-level languages like FORTRAN, LISP, COBOL, RPG, CPL/BCPL, C, etc., but the compilers for such languages were often hugely expensive, rarely generated efficient code, and were generally unavailable for the space- and performance-constrained home and personal computers of the day.

When writing software for early PCs, every single byte and every single instruction mattered, so developers often wrote code entirely in assembly language simply to be able to physically fit their software into the available memory, and to be able to access the computer’s resources and internal workings.

Thus, all the source code for GW-BASIC is pure assembly code, translated on a per-processor/per-machine basis from core/master sources.

This source was ‘translated’?

Each of the assembly source files contains a header stating “This translation created 10-Feb-83 by Version 4.3”.

Since the Instruction Set Architectures (ISAs) of the early processors used in home and personal computers weren’t spectacularly different from one another, Microsoft was able to generate a substantial amount of the code for a port from the sources of a master implementation. (Alas, sorry, we’re unable to open-source the ISA translator.)

What about other ports?

Many have asked if we can also open-source implementations for processors other than the 808x. Alas, we’re unable to provide sources for these ports and/or customizations.

Enjoy!

We hope you enjoy exploring this fascinating snapshot of what software development looked like during the glorious, exciting, heady days of the ’70s and early ’80s at the dawn of “the personal computer” 😁

Many thanks to Amy, Julia Liuson, Amanda Silver, and our awesome CELA team for their approval and help finding, reviewing, and open-sourcing GW-BASIC.

Author

Geek, Nerd, Hacker. Fan of Rugby, Motorcycles, Skiing, Outdoor activities.

68 comments

  • Dave Mackey

    Been great to see Microsoft embracing open source so fully over the past number of years.

    Like many, I’d love to see the QuickBasic/PDS/VBDOS source code released, though I understand the limitations around third-party code.

    Might be interesting to release just Microsoft code portions. The community might be able to figure out how to build the third-party components via reverse engineering. For example, with DOS, I imagine a lot of the utilities already exist through OSS like FreeDOS and DOSBox.

    If source code can’t be released for these, is there a possibility of releasing the software under a free license? Copies of all are readily available floating around the internet, and Microsoft doesn’t seem interested in prosecuting these hosts of “abandonware” – but would be nice to see an official, legal release from Microsoft. 🙂

  • BERDAH Paul Emile

    15 years ago, I started studying what it takes to run BASIC syntax on any computer.
    I discovered that most microprocessors come with C++ support.
    As BASICA is a function-call language, I tried writing C++ counterparts of the most-used BASIC functions, and it worked.
    I settled on some good practices, like one command per line, defining variables at the beginning of a piece of code, etc.
    It became clear that the solution was to design a code editor that translates BASIC syntax into C++ syntax line by line, in the background.
    The result: a code editor that translates BASIC syntax to C++ while doing some grooming, plus a C++ library of BASIC function equivalents.
    At that point, it is just a matter of choosing the right compiler for the target machine, and that’s all.
    Using that process, I wrote a BASICA-to-JavaScript transcompiler to test pieces of code interactively, treating files as localStorage.

  • Pedro Prado

    I was already happy to read you talking how we simply had to use ASM to do useful things, but you got me totally when you mentioned Elite. I played it a lot on the MSX (popular in Brazil).

    Nice article, Commander Jameson!

    • Rich Turner (Microsoft employee, Author)

      Many thanks Pedro 🙂 Look forward to seeing you on Lave 😀

  • Mariano Hidalgo

    Magnificent article Rich. Thank you!

  • Shafiul Islam

    This is really great! Thanks Rich!

  • Jim Callahan

    Could PDF scans of the original programming manuals be posted?
    Perhaps the original IBM Basic manual that came with the original IBM PC
    or the Compaq Microsoft Basic manual that came with Compaq’s original luggable?

    UPDATE:
    I found this manual:
    https://hwiegman.home.xs4all.nl/gw-man/index.html

    I remember programming a hexdump utility in Compaq Basic.
    This would have been in 1986 or 1987,
    I was trying to emulate this style of output:

    0000: 57 69 6B 69 70 65 64 69 61 2C 20 74 68 65 20 66  Wikipedia, the f
    0010: 72 65 65 20 65 6E 63 79 63 6C 6F 70 65 64 69 61  ree encyclopedia
    0020: 20 74 68 61 74 20 61 6E 79 6F 6E 65 20 63 61 6E   that anyone can
    0030: 20 65 64 69 74 00 00 00 00 00 00 00 00 00 00 00   edit...........
    https://en.wikipedia.org/wiki/Hex_dump

    I had had a crash course in C and had a copy of K&R;
    so my starting point was the C copy program which
    reads and writes a character in a “while not eof()” loop.

    GW-Basic had a hex$() function, so I knew I could copy the input character to hex.
    https://hwiegman.home.xs4all.nl/gw-man/HEXS.html

    And GW-Basic had an eof() function, but the example used an if and goto!
    https://hwiegman.home.xs4all.nl/gw-man/EOF.html

    But, there was a WHILE loop implemented as While….Wend.
    https://hwiegman.home.xs4all.nl/gw-man/WHILEWEND.html

    So, I could do a K&R C style While not eof() loop to get one character and convert it to hex with the hex$() function
    then for output formatting I could also output a space before looping back for the next character. Then I implemented
    a counter to keep track of the number of characters on the output line. I think I decided to output each input line as a “paragraph”
    and then skip an output line before printing the next input line.

    The finishing touch was the line numbers on the left and the original text on the right hand side.
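
The approach described above (read the input, convert each byte to hex, count bytes per output line, then add offsets on the left and the original text on the right) can be sketched in Python. This is not the original GW-BASIC program: the function name `hexdump_lines`, the 16-byte row width, and the choice to print non-printable bytes as dots are all assumptions made for illustration.

```python
def hexdump_lines(data: bytes, width: int = 16):
    """Yield one formatted row per `width` bytes: offset, hex bytes, text.

    Hypothetical helper: the name, row width, and dot substitution are
    illustrative assumptions, not taken from the original program.
    """
    hex_width = width * 3 - 1  # two hex digits per byte plus separators
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Hex column: each byte as two uppercase hex digits.
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Text column: printable ASCII as-is, everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the text lines up on a short final row.
        yield f"{offset:04X}: {hex_part:<{hex_width}}  {text_part}"


# Usage: dump a short byte string that ends in two non-printable bytes.
for line in hexdump_lines(b"Hello, GW-BASIC!\x00\x01"):
    print(line)
```

Slicing the input into fixed-size chunks stands in for the per-line character counter mentioned above; a byte-at-a-time loop with a counter would produce the same rows.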

    The program was very slow, but sped up dramatically when I told it to use buffers.
    I don’t see a buffer option in GW-Basic so it may have been an MS-DOS command.

    It would have been routine to add “files=20” and “buffers=20” to the config.sys file,
    but I thought there was a command line option to request buffered I/O when invoking GW-Basic.

    So, I think I wound up writing a MS-DOS batch file that invoked GW-Basic with the hexdump.bas program which
    prompted the user for a filename if it did not find one on the input line.

    The program eventually worked and its use case was identifying problematic characters that caused the file import utilities
    in Lotus 1-2-3 spreadsheet or dBase database to blow up. Once the offending character was identified;
    another “While not eof()” copy program could be written with an IF statement to replace offending character
    with an acceptable substitute such as a space or simply resume with the next character. Often the offending character
    would be a happy face! I recall one time I was trying to import into dBase a file that had been downloaded from an IBM mainframe
    and it turned out to have a column of null characters and dBase stopped at this column and ignored all the data to the right.
    I ran a fix-up “While not eof()” loop program that replaced the null characters with spaces and read in the entire file into dBase.
    I think it may have been a prank by the mainframe programmers; because they were really surprised when I read the whole thing!
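
The fix-up program described above, a “while not eof” copy loop with an IF that swaps the offending character for a space, might look like this in Python. It is only a sketch: `clean_stream` and its parameters are hypothetical names, and the in-memory streams stand in for the real input and output files.

```python
import io


def clean_stream(src, dst, bad: int = 0x00, substitute: int = 0x20) -> int:
    """Copy src to dst one byte at a time, replacing each `bad` byte.

    Hypothetical helper for illustration. Returns the number of bytes
    that were replaced.
    """
    replaced = 0
    while True:
        byte = src.read(1)
        if not byte:  # an empty read means end of file
            break
        if byte[0] == bad:
            byte = bytes([substitute])
            replaced += 1
        dst.write(byte)
    return replaced


# Usage: a record with a "column" of null characters in the middle.
source = io.BytesIO(b"NAME\x00\x00\x00CITY")
target = io.BytesIO()
count = clean_stream(source, target)
print(count, target.getvalue())
```

Reading one byte per iteration mirrors the character-at-a-time loop in the story; for large files, reading bigger blocks and calling `bytes.replace` would be the more usual choice.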

    Now most file input routines are more robust or provide more useful error messages. If it became necessary to dump a file; Linux has a hexdump utility.
    And of course with open source software there are many choices for implementation languages from shell scripts to interpreted languages to compiled languages. But, that’s my how we made do in the “old days” story.

  • Paul Pacheco

    I learned to code 32 years ago in GW-BASIC. This certainly takes me back.

  • Alexei

    Looks like the source code is incomplete. For example, where’s EXTRN SETCLR defined?
