We are excited to announce the open-sourcing of Microsoft GW-BASIC on GitHub!
Yes, seriously 😀
Why?
Since re-open-sourcing MS-DOS 1.25 & 2.0 on GitHub last year, we’ve received numerous requests to also open-source Microsoft BASIC.
Well, here we are! 😁
The Source
These sources, as clearly stated in the repo’s readme, are the 8088 assembly language sources from 10th Feb 1983, and are being open-sourced for historical reference and educational purposes. This means we will not be accepting PRs that modify the source in any way.
A little historical context
The GW-BASIC source code being published is dated Feb 10th 1983. That was quite a while ago, so just to set a little historical perspective:
The week this source was created, Men At Work topped the US and UK singles charts with “Down Under”, and Dustin Hoffman starred in the #1 US box-office movie, “Tootsie”. In 1983, “Star Wars Episode VI – Return of the Jedi” was released, as was “War Games”! And Emily Blunt, Kate Mara, Jonah Hill, Chris Hemsworth, and Henry Cavill were born! Ronald Reagan was President of the USA, and Margaret Thatcher was the UK’s Prime Minister.
That same year, Bjarne Stroustrup was in the middle of developing the first version of the C++ programming language, and ARPANET standardized on TCP/IP. Borland announced Turbo Pascal, created by Anders Hejlsberg (who went on to join Microsoft and create J++, C#, and TypeScript).
1983 was also the year AT&T released UNIX System V R1, and BSD 4.2 was released, introducing the pseudoterminal for the first time (the progenitor to Windows’ ConPTY we introduced to Windows in 2018 😁)
I was 13, and spent every spare second that I wasn’t finishing my homework or doing my chores, writing BASIC and assembly code on one of the hottest home computers of the time – the BBC Micro sporting 32KB RAM (yes, 32,768 bytes, total!), powered by a 6502 processor running at a BLAZING 2MHz. When not coding, I was usually playing one of the most groundbreaking games of all time: “Elite” by David Braben & Ian Bell.
In 1983, Apple launched the 1MHz 6502-powered Apple IIe for US$1,395 (> $3,500 in 2020). Apple also launched the first retail-available computer with a GUI – the Apple Lisa. The Lisa contained a staggering 1MB RAM, and ran the awesome Motorola 68000 processor at an astounding 5MHz, but it cost $9,995 (> $25,000 in 2020 dollars), so all I could do was peer at it through the window of the one computer store in our town authorized to sell Apple’s products … and dream.
And, in 1983 Microsoft released MS-DOS 2.0 (source here), and GW-BASIC for the IBM PC XT and compatibles.
What IS GW-BASIC?
GW-BASIC was a BASIC interpreter derived from IBM’s Advanced BASIC/BASICA, which itself was a port of Microsoft BASIC.
Microsoft’s various BASIC implementations can trace their origins all the way back to Bill Gates & Paul Allen’s implementation of Microsoft’s first product – a BASIC interpreter for the Altair 8800.
During the late ’70s and ’80s, Microsoft’s BASIC was ported to meet many OEMs’ specific platform and hardware needs, and to several processors popular at that time, including the 8088, 6502, 6809, Z80, and others.
FAQ
Wait – where’s the C source code?
There is no C source code!
Like much software from the ’70s and ’80s, and just like the source for MS-DOS, the source code of GW-BASIC is 100% assembly language.
Why assembly? Why didn’t developers use higher-level languages like C, or Pascal?
When developing on/for mainframes and minicomputers of the day, developers were sometimes able to use higher-level languages like FORTRAN, LISP, COBOL, RPG, CPL/BCPL, C, etc., but the compilers for such languages were often hugely expensive, rarely generated efficient code, and were generally unavailable for the space- and performance-constrained home and personal computers of the day.
When writing software for early PCs, every single byte and every single instruction mattered, so developers often wrote code entirely in assembly language simply to be able to physically fit their software into the available memory, and to be able to access the computer’s resources and internal workings.
Thus, all the source code for GW-BASIC is pure assembly code, translated on a per-processor/per-machine basis from core/master sources.
This source was ‘translated’?
Each of the assembly source files contains a header stating: “This translation created 10-Feb-83 by Version 4.3”
Since the Instruction Set Architectures (ISAs) of the early processors used in home and personal computers weren’t spectacularly different from one another, Microsoft was able to generate a substantial amount of the code for a port from the sources of a master implementation. (Alas, sorry, we’re unable to open-source the ISA translator.)
What about other ports?
Many have asked if we can also open-source implementations for processors other than the 808x. Alas, we’re unable to provide sources for these ports and/or customizations.
Enjoy!
We hope you enjoy exploring this fascinating snapshot of what software development looked like during the glorious, exciting, heady days of the ’70s and early ’80s at the dawn of “the personal computer” 😁
Many thanks to Amy, Julia Liuson, Amanda Silver, and our awesome CELA team for their approval and help finding, reviewing, and open-sourcing GW-BASIC.
Been great to see Microsoft embracing open source so fully over the past number of years.
Like many, I’d love to see the QuickBasic/PDS/VBDOS source code released, though I understand the limitations around third-party code.
Might be interesting to release just Microsoft code portions. The community might be able to figure out how to build the third-party components via reverse engineering. For example, with DOS, I imagine a lot of the utilities already exist through OSS like FreeDOS and DOSBox.
If source code can’t be released for these, is there a possibility of releasing the software under a free license? Copies of all are readily available floating around the internet, and Microsoft doesn’t seem interested in prosecuting these hosts of “abandonware” – but would be nice to see an official, legal release from Microsoft. 🙂
15 years ago, I started studying what it takes to run Basic syntax on any computer.
I discovered that most microprocessors come with C++ support.
As Basica is a function-call language, I tried writing C++ counterparts of the most-used Basic functions, and it worked.
I settled on some good practices, like one command per line, defining variables at the beginning of a piece of code, etc.
It became clear that the solution was to design a code editor that translates Basic syntax into C++ syntax line by line, in the background.
The result: a code editor that translates Basic syntax to C++ while doing some grooming, plus a C++ library of equivalent Basic functions.
At that point it is just a matter of choosing the right compiler for the target machine, and that’s all.
Using that process, I wrote a Basica-to-JavaScript transcompiler to test pieces of code interactively, treating files as localStorage.
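The line-by-line translation idea described above could be sketched roughly as follows. This is only an illustration written in Python, not the commenter's editor: the `basic::` helper names, the tiny set of supported statements, and the function names `translate_expr` and `translate_line` are all assumptions made up for the example.

```python
import re

# Hypothetical names: the "basic::" C++ helpers stand in for a library of
# BASIC-equivalent functions; they are not taken from the comment above.
FUNCTION_MAP = {"HEX$": "basic::hex", "CHR$": "basic::chr", "LEN": "basic::len"}


def translate_expr(expr: str) -> str:
    """Rewrite known BASIC function calls in an expression to C++ helper calls."""
    for name, target in FUNCTION_MAP.items():
        pattern = r"\b" + re.escape(name) + r"\s*\("
        expr = re.sub(pattern, target + "(", expr, flags=re.IGNORECASE)
    return expr.strip()


def translate_line(line: str) -> str:
    """Translate one BASIC line (one command per line) into one C++ statement."""
    # Drop the leading BASIC line number, if there is one.
    stmt = re.sub(r"^\d+\s+", "", line.strip())
    upper = stmt.upper()
    if upper == "REM" or upper.startswith("REM "):
        return "// " + stmt[3:].strip()
    if upper.startswith("PRINT "):
        return f"basic::print({translate_expr(stmt[6:])});"
    # Assignment, with or without LET; variables are assumed declared elsewhere.
    match = re.match(r"(?:LET\s+)?([A-Za-z][A-Za-z0-9]*)\s*=\s*(.+)$", stmt, re.IGNORECASE)
    if match:
        return f"{match.group(1)} = {translate_expr(match.group(2))};"
    raise ValueError(f"unsupported statement: {stmt!r}")


if __name__ == "__main__":
    for source in ["10 REM demo", "20 LET N = 255", "30 PRINT HEX$(N)"]:
        print(translate_line(source))
```

A real translator would need a proper tokenizer (this sketch would also rewrite function names that appear inside string literals) and handling for control flow such as WHILE...WEND and GOTO.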
I was already happy to read you talking how we simply had to use ASM to do useful things, but you got me totally when you mentioned Elite. I played it a lot on the MSX (popular in Brazil).
Nice article, Commander Jameson!
Many thanks Pedro 🙂 Look forward to seeing you on Lave 😀
Very good, I liked it.
Regards, Digital Marketing Team
Magnificent article Rich. Thank you!
This is really great! Thanks Rich!
Good write-up. I definitely appreciate this
https://diji-takip.com/
site. Keep it up!
Could PDF scans of the original programming manuals be posted?
Perhaps the original IBM Basic manual that came with the original IBM PC
or the Compaq Microsoft Basic manual that came with Compaq’s original luggable?
UPDATE:
I found this manual:
https://hwiegman.home.xs4all.nl/gw-man/index.html
I remember programming a hexdump utility in Compaq Basic.
This would have been in 1986 or 1987.
I was trying to emulate this style of output:
0000: 57 69 6B 69 70 65 64 69 61 2C 20 74 68 65 20 66  Wikipedia, the f
0010: 72 65 65 20 65 6E 63 79 63 6C 6F 70 65 64 69 61  ree encyclopedia
0020: 20 74 68 61 74 20 61 6E 79 6F 6E 65 20 63 61 6E   that anyone can
0030: 20 65 64 69 74 00 00 00 00 00 00 00 00 00 00 00   edit...........
https://en.wikipedia.org/wiki/Hex_dump
I had had a crash course in C and had a copy of K&R,
so my starting point was the C copy program which
reads and writes a character in a “while not eof()” loop.
GW-Basic had a hex$() function, so I knew I could convert the input character to hex.
https://hwiegman.home.xs4all.nl/gw-man/HEXS.html
And GW-Basic had an eof() function, but the example used an if and goto!
https://hwiegman.home.xs4all.nl/gw-man/EOF.html
But, there was a WHILE loop implemented as WHILE...WEND.
https://hwiegman.home.xs4all.nl/gw-man/WHILEWEND.html
So, I could do a K&R C-style “While not eof()” loop to get one character and convert it to hex with the hex$() function;
then, for output formatting, I could also output a space before looping back for the next character. Then I implemented
a counter to keep track of the number of characters on the output line. I think I decided to output each input line as a “paragraph”
and then skip an output line before printing the next input line.
The finishing touch was the line numbers on the left and the original text on the right hand side.
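For readers who want to see the shape of such a program today, here is a minimal sketch in Python rather than GW-BASIC. It is not the original hexdump.bas: it slices the input into 16-byte chunks instead of reading one character at a time in a WHILE NOT EOF loop, and the function name `hexdump_lines` and the dot-for-unprintable convention are assumptions chosen to match the sample output above.

```python
def hexdump_lines(data: bytes, width: int = 16):
    """Yield one line per `width` bytes: offset, hex bytes, then printable text."""
    hex_width = width * 3 - 1  # two hex digits per byte plus single-space separators
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Printable ASCII is shown as-is; everything else becomes a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the text column lines up on a short final line.
        yield f"{offset:04X}: {hex_part.ljust(hex_width)}  {text_part}"


if __name__ == "__main__":
    sample = b"Wikipedia, the free encyclopedia that anyone can edit" + b"\x00" * 11
    for line in hexdump_lines(sample):
        print(line)
```

Run as a script, this prints the four-line dump of the Wikipedia sample text, with the trailing null bytes shown as dots.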
The program was very slow, but sped up dramatically when I told it to use buffers.
I don’t see a buffer option in GW-Basic so it may have been an MS-DOS command.
It would have been routine to add “files=20” and “buffers=20” to the config.sys file,
but I thought there was a command line option to request buffered I/O when invoking GW-Basic.
So, I think I wound up writing an MS-DOS batch file that invoked GW-Basic with the hexdump.bas program which
prompted the user for a filename if it did not find one on the input line.
The program eventually worked and its use case was identifying problematic characters that caused the file import utilities
in the Lotus 1-2-3 spreadsheet or dBase database to blow up. Once the offending character was identified,
another “While not eof()” copy program could be written with an IF statement to replace the offending character
with an acceptable substitute such as a space, or to simply skip it and resume with the next character. Often the offending character
would be a happy face! I recall one time I was trying to import into dBase a file that had been downloaded from an IBM mainframe
and it turned out to have a column of null characters and dBase stopped at this column and ignored all the data to the right.
I ran a fix-up “While not eof()” loop program that replaced the null characters with spaces and read in the entire file into dBase.
I think it may have been a prank by the mainframe programmers, because they were really surprised when I read the whole thing!
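The fix-up copy loop described above might look something like this in Python. It is a sketch of the idea, not the original BASIC program: the function name `clean_bytes` and its parameters are made up for illustration, and working on a whole byte string at once is an assumption for brevity.

```python
def clean_bytes(data: bytes, offending: bytes = b"\x00", substitute: bytes = b" ") -> bytes:
    """Copy `data`, replacing each offending byte with `substitute`.

    Passing an empty `substitute` drops the offending bytes instead,
    which corresponds to simply resuming with the next character.
    """
    out = bytearray()
    for b in data:
        if b in offending:
            out += substitute
        else:
            out.append(b)
    return bytes(out)


if __name__ == "__main__":
    # Null characters become spaces, as in the dBase import story.
    print(clean_bytes(b"NAME\x00\x00\x00CITY"))
    # Bytes 0x01 and 0x02 are the smiley faces in IBM PC code page 437.
    print(clean_bytes(b"\x01hello\x02", offending=b"\x01\x02", substitute=b""))
```

For a real file, the same function could be applied to the bytes read from the input and the result written to a new file, leaving the original untouched.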
Now most file input routines are more robust or provide more useful error messages. If it becomes necessary to dump a file, Linux has a hexdump utility.
And of course with open source software there are many choices for implementation languages from shell scripts to interpreted languages to compiled languages. But, that’s my “how we made do in the old days” story.
I learned to code 32 years ago in GW-BASIC. This certainly takes me back.
Looks like the source code is incomplete. For example, where’s EXTRN SETCLR defined?