How “Roslyn” Finally Unshackled Visual Basic From The Tyranny of the Pretty-Lister

Anthony D. Green [MSFT]

UPDATE 2015-04-02: After reading this post be sure to read the follow-up post!

I was chatting with an old Microsoftie a while ago and he let me in on the real story behind Visual Basic’s at times aggressive reformatting of code. It turns out that it didn’t actually start out as a feature but as a consequence of how the IDE was implemented. You see, older computers had significantly less memory available to them than modern machines do. Every byte was precious. If you were to look at the way the VB IDE works today there are separate layers and data structures for representing the syntax of the language as parsed, the semantics of the language as bound, and the characters that appear on screen. But in order to make the original Microsoft BASIC IDEs fit into memory they had to be very efficient with their data storage. So what they did is they would parse a program into a set of data structures for the syntax, then mutate those structures with semantic information to check them for correctness, and then reverse that process to spit characters back out on the screen. This way there was only ever one representation of a program in memory at any time. Because the compiler/IDE simply had no idea of a separation between how something was typed and what it meant, by the time it got to round-trip the data back to the screen the way things were originally written was completely forgotten. This meant that even if you typed diM x      as    InteGER the characters that got spat back out would be DIM x AS INTEGER:

We changed the default casing from UPPERCASE to PascalCase in Visual Basic 1.0

Thus everyone was forced to use a consistent casing and whitespace (it’s not a bug, it’s a feature). And this was the state of affairs for the next three decades.

When I first learned this I was incensed. I had always assumed the pretty-lister was this helpful well-meaning guardian of style consistency. I honestly felt betrayed to learn that it was just another piece of cold soulless software too lazy to care about my uniqueness. Then it hit me: “Computers have more memory now than in the 80s!” The IDE didn’t need to work in this convoluted way anymore. And in fact, we’d made a specific point of designing the “Roslyn” syntax trees to preserve every trivial detail of how source code was written anyway. Our syntax trees are immutable, full-fidelity, and always round-trip-able. This opened up a world of possibilities that we hadn’t considered before. We could stop stomping on user formatting entirely and give our developers back control over how to type keywords.
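To make the contrast concrete, here is a minimal sketch in Python (not VB, and definitely not the actual “Roslyn” API). Every name in it (KEYWORDS, legacy_pretty_list, tokenize, to_source, recase_keywords) is hypothetical and exists only for illustration, and the three-word keyword set is a toy. It assumes that a “full-fidelity” representation simply means keeping the whitespace pieces alongside the words, so that gluing the pieces back together reproduces the original text exactly.

```python
import re

# Toy keyword set for illustration only; not the real VB keyword list.
KEYWORDS = {"dim", "as", "integer"}


def legacy_pretty_list(source):
    """Lossy round trip: casing and spacing as typed are thrown away."""
    words = source.split()  # the original whitespace is forgotten here
    return " ".join(w.upper() if w.lower() in KEYWORDS else w for w in words)


def tokenize(source):
    """Full-fidelity split: whitespace runs are kept as pieces of their own."""
    # A capturing group makes re.split return the separators too,
    # so no character of the input is dropped.
    return tuple(piece for piece in re.split(r"(\s+)", source) if piece)


def to_source(tokens):
    """Reassemble the exact text the tokens were produced from."""
    return "".join(tokens)


def recase_keywords(tokens, casing):
    """Return a NEW tuple with only keyword casing changed; spacing is untouched."""
    return tuple(casing(t) if t.lower() in KEYWORDS else t for t in tokens)


typed = "diM x" + " " * 6 + "as" + " " * 4 + "InteGER"

print(legacy_pretty_list(typed))                               # old behaviour
print(to_source(tokenize(typed)) == typed)                     # exact round trip
print(to_source(recase_keywords(tokenize(typed), str.lower)))  # recase, keep spacing
```

The tokens live in a tuple, so recase_keywords cannot modify the original; it has to build a new one, which is the general idea behind an immutable tree. Because the whitespace is still there, any casing policy becomes a choice applied on top of the text rather than a side effect of forgetting it.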

Of course, this led to total chaos across the project. Older developers on the team would still type the first letter in uppercase in some places. Newer developers who didn’t have the muscle memory would type in lowercase in other places. Some would use sentence casing and only capitalize the first keyword on a line. It turns out that some developers had been leaving the caps lock on for years and just hadn’t noticed. And no one could remember how to deal with compound keywords. Should it be AddHandler, addHandler, Addhandler, or addhandler? The code was a mess! Still recovering from the dreaded Type Inference War of the Summer of 2011, the Brace Style Lunchroom Battles of the Winter of 2012, the Single-Line Ifs Skirmish of May 2013, and the Great Field Naming Convention Civil Wars of 2014-2015, no one was prepared to go through another team-wide debate on style. So we decided to re-institute tyranny (to thunderous applause).

Then some PM said, “Let’s take a step back here. What problem are we really trying to solve?”

We then did some sprinting, some spiking, some prototyping, some mock ups, a focus group, various diagrams, profiling, and several internal alphas and betas, spun up a working group, defined some goals, gathered requirements, measured some KPIs, and spec’d test plans. We uploaded things to the cloud, we downloaded them back from the cloud. We proactively drove consensus while challenging our preconceived notions. We thought outside the box! And then we thought more back inside of the box and finally concluded that the best course of action was still to re-institute the pretty-lister (but now it was a data-driven decision!) with one small change. This time it would force all keywords to be entirely lowercase.

Now, before you recoil let me explain all of the data points that led us to conclude that this change actually makes the most sense:

1. Performance

We discovered that in 93.72529179378492348% of cases, developers were already typing all of the keywords in lowercase. Because the pretty-lister only had to correct casing at all < 7% of the time we increased VB typing responsiveness by 74%. And that’s nothing to sneeze at.

2. Agility

Every now and then we get complaints about official documentation taking too long to come out on MSDN. We chatted with our technical writers and discovered that there is actually a 42% increase in the time it takes to write technical articles about APIs and language features simply because the writers have to scan every document to put in parentheticals for VB specifically to change the casing. How many times have we all seen text like “This method is public (Public in Visual Basic)” or “Write more responsive apps with async (Async in Visual Basic) and await (Await in Visual Basic)”? It might seem petty but across a library as large as MSDN little things like that add up. Once we unified the casing between VB and C# we saw an immediate 30% increase in documentation throughput.

3. Portability

Our team works in both VB and C# and every now and then some code has to be copied from VB to C# and then manually fixed up. VB and C# share many of the same accessibility modifiers (public, private, protected) and other keywords like class, delegate, and interface, and having them already be in lowercase meant that the C# compiler would report fewer initial errors that needed to be fixed up by hand.

4. Readability

Our resident typographic telemetry content experience psychometrician manager lead (it’s a real title, look it up) informed us that due to the PascalCasing and VB’s inherent preference for keywords over symbols the incidence of capitalized words in a VB program is 82% higher than in typical English writing and that this contrast made the VB code editor utterly inappropriate for writing scholastic literary assignments.

5. It just looks cool (particularly in the dark theme), and cool counts

We surveyed every programming language ever made ever and discovered that NINE MILLION PERCENT!!! of them use lowercase keywords. If we wanted VB to attract cooler newer younger hipper developers whose primary form of communication consists of text messages using only lowercase letters, even in the word ‘i’ (don’t you hate that? i know i do), we had to get with the times.

If I squint it looks kinda like Python.

The overwhelming supermajority of the fact-based scientific evidence was incontrovertible: We had a moral obligation to our customers, our colleagues, and ourselves to make the pretty-lister force all keywords to lowercase moving forward.

So, there you have it – the whole story.

The first public preview of this change will be in Visual Studio 2015 RTM to ensure we have ample time to react to your feedback. I understand that it will take some time for some of our long-time users who are familiar with the VS2013 experience to adjust to this. In anticipation of your almost-certainly constructive feedback I have pre-created a UserVoice suggestion about this topic where you can give us your most livid, infuriated, and incendiary seething support. I encourage you all to head over there and use your votes and the freeform text comment box to send us a strong message that we are awesome!

Warmest Regards,

Anthony D. Green, Program Manager, Visual Basic, C#, and F# Languages Team

UPDATE 2015-04-02: If you like this idea then be sure to read the follow-up post!

