We know developers like you take pride in your code! Many of the features in Visual Studio are designed to help you write the code you want. Visual Studio helps you ensure your code compiles and can even help with code styling. Now it can even make sure your spelling is accurate. Visual Studio 17.5 preview 3 introduces the first preview of the Spell Checker for C#, C++ and Markdown files.
Getting Started
The feature will be turned on automatically when working with any C#, C++ or Markdown file. Now, when you’re working with any document supported by the spell checker, Visual Studio will mark any words that it detects as misspelled. Visual Studio will also suggest alternate spellings and help correct them, even doing a contextual rename when those misspellings are identifiers, so your code will still compile. The spell checker can be disabled by unchecking the “Text spell checker” feature under Manage Preview Features. The spell checker can also be enabled or disabled from the menu with the Edit > Advanced > Toggle Text Spell Checker command, or from a button on the main toolbar in Visual Studio.
How do you use it?
When the caret is on a spelling error, the quick actions provide solutions for fixing the spelling mistakes. You can bring up the quick actions with either “Ctrl+.” or “Alt+Enter”. When the context menu comes up, Visual Studio provides three options to handle a spelling issue.
If any of the dictionaries provide spelling suggestions, Visual Studio will provide them. If multiple dictionaries provide suggestions, the suggestions will be grouped by dictionary. For strings and comments, choosing one of these suggestions will do a single, in-place replacement. For identifiers in a C++ or a C# document, accepting a suggestion will perform a Refactor/Rename, updating all instances of the identifier to make sure the code compiles.
You can also choose to ignore the spelling issue. When you choose to ignore the issue, Visual Studio creates an exclusion.dic file in your AppData directory on your local machine. Once a word has been ignored, it will be ignored across all instances of Visual Studio for you.
How does it work?
If you’re interested in the details, this section will get into the specifics of how the spell checker works. Many of these behaviors can be customized, and we’ll cover that in the next section. Since C#, C++ and Markdown all use English as the language for their keywords, Visual Studio will always use the “English (United States)” or “en-us” dictionary for spell checking. Visual Studio will also ask the instance of Windows for the display language it’s using, and if it’s not “en-us”, it will use that dictionary as well.
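As a rough illustration of the default dictionary selection described above, here is a minimal Python sketch. The function name `default_dictionaries` and its string parameter are hypothetical. The sketch does not query Windows for anything; it only models the rule as stated, and is not how Visual Studio itself is implemented.

```python
def default_dictionaries(display_language: str) -> list[str]:
    """Model the default rule: always en-us, plus the display language if it differs."""
    base = "en-us"
    # Normalize casing so "en-US" and "en-us" compare equal.
    language = display_language.strip().lower()
    if not language or language == base:
        return [base]
    return [base, language]


print(default_dictionaries("en-US"))  # only the en-us dictionary
print(default_dictionaries("fr-FR"))  # en-us plus the display language
```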
Feedback from early users of this feature informed us that developers wanted to know about errors in the documents they were currently working with. In direct response to this feedback, the spell checker will only scan documents that are open.
The chart below shows some of the heuristics that the spell checker applies when scanning a code document:
What’s in the code | What Visual Studio checks | Why? |
Hello | Hello, hello | Always check for both proper and common nouns |
HelloWorld | Hello, hello, World, world | Medial capitals are commonly used to mark word boundaries |
Hello.World | Hello, hello, World, world | Punctuation is used as a word boundary |
_Hello123 | Hello, hello | Leading or trailing numbers or punctuation are stripped |
Hello2World | Hello, hello, World, world | Medial numbers, like punctuation, are used as a word boundary |
btnWorld | World, world | Fragments of 3 characters or fewer are ignored |
helloworld | Helloworld, helloworld | No indicator to identify word boundaries |
If none of the dictionaries in use recognizes a token, we consider the word misspelled and flag the token as a spelling error. This will show up with a severity of “Message” in the Error List with a “SPELL” Code.
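The heuristics in the chart above can be approximated with a short Python sketch. This is only an illustration that reproduces the chart's rows, not the actual Visual Studio implementation. The function name `words_to_check`, the regular expression, the ASCII-only matching, and the treatment of runs of capitals as a single fragment are all assumptions made for this example.

```python
import re

# A fragment is either a run of capitals not followed by a lowercase letter
# (assumed here to be an acronym such as "XML"), or an optional capital
# followed by lowercase letters.
_FRAGMENT = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def words_to_check(token: str, min_length: int = 4) -> list[str]:
    """Return the proper-noun and common-noun forms of each fragment in a token."""
    candidates = []
    # findall skips digits and punctuation, so they act as word boundaries.
    for fragment in _FRAGMENT.findall(token):
        if len(fragment) < min_length:
            continue  # fragments of 3 characters or fewer are ignored
        candidates.append(fragment.capitalize())
        candidates.append(fragment.lower())
    return candidates


for token in ["Hello", "HelloWorld", "Hello.World", "_Hello123",
              "Hello2World", "btnWorld", "helloworld"]:
    print(token, "->", words_to_check(token))
```

Each candidate would then be looked up in the active dictionaries; that lookup is outside the scope of this sketch.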
Customizing the Spell Checker
The default behavior is designed to get users started and help with the initial experience. Once users are ready to start working in a collaborative environment, the spell checker has options for customizability.
We chose editorconfig for configuration to allow users to control the spell checker behavior in their repository. By configuring .editorconfig, users can establish coding standards they expect to be followed and maintain consistency that would be difficult through other methods.
Here are the switches you can configure in editorconfig:
spelling_languages = _language_[,_language_] (Example: = en-us,fr-fr)
- This lists the languages for Visual Studio to use. In this example, Visual Studio would only use the en-us and fr-fr dictionaries when checking for spelling issues. Note that the fr-fr language pack must be installed on the user’s machine or Visual Studio will incorrectly flag any French words as spelling errors.
spelling_checkable_types = strings,identifiers,comments (Example: = identifiers,comments)
- This controls what Visual Studio should check. In this example, Visual Studio would check identifiers and comments for misspelled words but wouldn’t check inside strings.
spelling_error_severity = error OR warning OR information OR hint (Example: = error)
- This controls the severity Visual Studio will assign to spelling errors in the error list. In this example, spelling errors will be displayed as errors.
spelling_exclusion_path = absolute OR relative path to exclusion dictionary. (Example: = .\exclusion.dic)
- This allows you to create your own exclusion dictionary to specify words you consider to be correctly spelled. In this example, the first time the spell checker is run against any file in the solution, Visual Studio will check for an exclusion.dic file in the same directory as the .sln file (for a C# project) or in the root directory (for a C++ directory). If no file exists, the spell checker will create one. Then, whenever the user chooses to ignore a word, it will be added to this exclusion.dic file. Visual Studio will consider any word that appears in this exclusion.dic file as a correctly spelled word. Note that the exclusion.dic file must be UTF16 with BOM encoding to work correctly.
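Because the encoding requirement is easy to get wrong when creating an exclusion dictionary by hand, the following Python sketch writes one with a byte order mark. The helper names are hypothetical. The one-word-per-line layout, the little-endian byte order, and the Windows line endings are assumptions for this example rather than documented requirements.

```python
import codecs
from pathlib import Path


def write_exclusion_dictionary(path, words):
    """Write words one per line as UTF-16 with a byte order mark."""
    unique_words = sorted({word.strip() for word in words if word.strip()})
    text = "".join(word + "\r\n" for word in unique_words)
    # Assumption: little-endian byte order and Windows line endings.
    Path(path).write_bytes(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))


def read_exclusion_dictionary(path):
    """Read the words back; the utf-16 codec uses the BOM to pick the byte order."""
    return Path(path).read_text(encoding="utf-16").splitlines()


if __name__ == "__main__":
    write_exclusion_dictionary("exclusion.dic", ["nuget", "args", "alloc"])
    print(read_exclusion_dictionary("exclusion.dic"))
```

Writing the BOM bytes explicitly keeps the output identical on every platform, whereas Python's plain `utf-16` codec picks the machine's native byte order when writing.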
What do you think?
We’re really excited to hear how the spell checker is helping developers feel better about their code. Features like these come from feedback from our community and we appreciate all the feedback. We’d love to know what you think, and we invite folks to join in the conversation over on Developer Community. Let us know what improvements you feel would help make the spell checker even better. Thanks for being an important part of the Visual Studio family!
False positive on C# keywords when a line is commented:
typeof is marked as a spelling error.
Agreed. This is a significant problem that I’m going to admit was one I hadn’t considered in the original design. When a big chunk of C# code is commented out, it’s going to move from “excluded by the spell checker since it’s code” to “checked by the spell checker since it’s now in a comment” and a large portion of that will be C# keywords.
We’ve got a prototype in development right now that adds C# and C++ specific dictionaries when checking C# and C++ files, and that should fix the issue you’re seeing (I’ll test it on that branch, but I’ve got high confidence that it’ll work). So, stay tuned.
Visual Studio will check for an exclusion.dic file in the same directory as the .sln file (for a C# project) or in the root directory (for a C++ directory). If no file exists, the spell checker will create one.
VS 17.5 doesn’t create it for me :(, at least not where my sln is.
Where is it supposed to be?
The “ignores” are kept between re-openings of VS, so the list must be stored somewhere…
BTW: Finally, great to see this integrated! I was more than happy to drop the extension I’ve been using until now 🙂
Got it 🙂
For me it was under: \AppData\Local\Microsoft\VisualStudio\17.0_d17ec89c
What about text in Git or TFVC commit/check-in messages?
This is something we’re looking into. It’s been a very frequent request. The only reason it didn’t make it into the initial pass is that the editor already has a mechanism (the Quick Info system) that we could hook into for suggestions. For commit/check-in messages, we’ll need to implement that context menu from scratch. There’s also a complexity around editorconfig: when editing a file, it is deterministic as to which editorconfig file(s) should be applied and what information should be pulled. For the commit/check-in messages, that’s not as clear. We need to figure out what information we’ll use.
That’s not to say we can’t do it, it’s just to say that we have some things we need to work through.
My next 20 commits message: “fix a typo” 😂
After a quick look I saw some issues with codebases using a mix of German and English (It’s pretty common to generally use English for Identifiers except for subject-specific concepts).
It’s pretty common to replace non-ASCII characters with their ASCII counterparts when using German for identifiers (ä -> ae, ö -> oe, ü -> ue, ß -> ss), this is not supported yet.
Compound nouns are generally written attached to each other in German while they are normally written apart in English (e.g. house number is Hausnummer in German). So compound nouns usually end up as one compound in CamelCase like DeleteHausnummerIfContains13(). The existing dictionary contains common compound words, but it’s very common to create new ones (also in English). It’s pretty tedious to add combinations of words to the repository-specific dictionary.
We’re somewhat at the mercy of what the spell checking API does in these cases… so I’m curious what the API does. I’m fairly sure Microsoft Teams uses the same API, do you know if Teams handles these words correctly, or does it flag them as misspelled as well?
One case that we’re looking at in English is concatenated words, like “helloworld”. Without a medial capital, we don’t have a good way to know that there are two words here. The spell checking API will suggest “hello world” as a possible suggestion, but if we provide that as a replacement, the resulting code wouldn’t compile. We considered that if the spell checking API provided “hello world” as a suggestion for an identifier, we could change that to “helloWorld” as a possible recommendation, but should we suggest “helloWorld”, “HelloWorld”, “hello_world” or something else? That opened an interesting can of worms for us that we weren’t ready to tackle in the first round here.
I think what I’m trying to tease out is whether the case you’re describing is different from the normal behavior of the spell checking API for the German language, or something special in how developers tend to work with German/English codebases. If it’s the latter, then we’d need to handle this as a special case, and that’s going to be somewhat tricky.
I’m assuming this is built (at least for C#) using Roslyn… so wondering why VB is left off of the supported list????
To peek under the hood, for the first implementation, we had to ask individual languages to provide ranges of code that should be spell checked. The Editor asks the language service “Tell me what is an identifier, comment or string” and then, based on the flags in editorconfig, determines what it should spell check. This means that it was an amount of work on the individual language to provide that tagging.
It wasn’t a LARGE amount of work, but it was non-zero. For the first preview, we went with the two most common programming languages, C# and C++, to get the widest usage and see how folks feel about the implementation. We also put in Markdown as an experiment since we could tell it “Check everything!” and see how it worked. Once we see how folks feel about the behavior and get it working… and prove out that it’s got value, we intend to expand this to other languages. We’re also hoping that the work needed for other languages will be fairly small, so it’ll be quick to onboard other languages. I would really like to see this in JavaScript/TypeScript, VB and Python… but it’ll simply be whichever languages have the cycles to provide the tags the editor needs.
I had a lot of false positives:
– Some plurals not working
– Lot of nouns not known (nuget, microsoft, netcore for example)
– Lots of software ‘slang’ not known: ‘alloc’ for example
I’m not sure why plurals aren’t working. I’m curious what dictionaries you’re using and which words are working incorrectly. I’m curious if you type them here as a comment if they get flagged as misspelled words. For instance, here in my comment, when I typed “nuget”, it gets flagged as a misspelled word… and since I’m using Edge, I *THINK* it’s using the same API (but I could be wrong on that).
Proper nouns are only going to show up as spelled correctly if they’re in the dictionary. I’m not surprised that “Microsoft” shows up in our dictionary. I could argue that nuget or netcore probably should, but for those, you’re probably going to want to just add them yourself. In my experience, most city names and many proper names are in the dictionary, but many company and product names are not.
Software ‘slang’ is one we’re working on right now and we’ve got some work items in the pipeline to improve that experience. My pet peeve for that one is “args”, which really shouldn’t be getting flagged. I don’t know exactly when that particular fix will go in, and for the short term, you can always add that to a local exclusion dictionary, but I agree that software languages do have more extensive dictionaries and there’s value in going beyond the simple spoken dictionaries.
What happened to all of the other comments here?
https://web.archive.org/web/20230120064509/https://devblogs.microsoft.com/visualstudio/visual-studio-spell-checker-preview-now-available/
Anyway, as I previously commented, the Developer Community link doesn’t work. I get a 403. I want to provide feedback, but it’s kind of hard when I can’t view the issue.
I sincerely apologize, I failed to set the ticket as public before the blog post went live. I’ve corrected the ticket and you should now be able to post comments.
The link should be: https://developercommunity.visualstudio.com/t/Feedback-on-the-Preview-Spell-Checking-E/10252795
Sorry about the confusion.
What is the format for the exclusion.dic files? One word per line?
It’s one word per line, but the file must be in the specific file format (UTF16 with BOM encoding). We’re using the spell-checking APIs provided by Bing/Office and they’re very particular about the file format.
I found it much easier to use .editorconfig in a junk project, put all the words I want to add in a comment and then ignore all those words. That created the file and added all the words in the correct format for me. I’ve got a feature request to see if we can identify if an exclusion dictionary has the wrong encoding and offer to fix it, but I don’t know if that will get approved or not.
Thanks for the explanation about why it needs to be UTF-16 with a BOM. I was going to ask why it couldn’t just be UTF-8, but that explains it.
Personally, the UTF-16 with BOM is a first-class pain in the backside. Until I fully understood it, I kept annoying my developers telling them that the feature had bugs in it, because it kept breaking.
We’re looking into ways to work around the problem, and we THINK we’ve got a way to remove the encoding requirement. Hopefully in a future version, we can remove that requirement, but I don’t want to make any promises until we know for sure.