February 29th, 2008

IntelliSense, Part 2 (The Future)

Hi, Jim Springfield again.  This post covers our current work to fundamentally change how we implement IntelliSense and code browsing for C++ in Visual Studio 10.  I previously covered the history of IntelliSense and outlined many of the problems that we face; see http://blogs.msdn.com/vcblog/archive/2007/12/18/intellisense-history-part-1.aspx for that post and more detailed information.

As Visual C++ has evolved over the years, there has been tension between getting code information quickly and getting it accurately.  We have moved from fast and not very accurate to sometimes fast and mostly accurate in Visual Studio 2008.  The main reason for the slowness has been the need to reparse .cpp files when a header is changed.  For large projects with some common headers, this can mean reparsing just about everything when a header is edited or a configuration or compile option is changed.  We are mostly accurate, except that we capture only one parse of a header file even though it could be parsed differently depending on the .cpp that includes it (e.g. different #defines, compile options, etc.).

For Visual Studio 10, which is the next release after Visual Studio 2008, we are going to do a lot of things differently.  For one, the NCB file is being eliminated.  The NCB file was very similar to a BSC file, and the IDE needed to load the entire thing into memory in order to use it.  It was very hard to add new features to it (e.g. template support was bolted on), and some lookups required walking through a lot of information.  Instead, we will be using SQL Server Compact for our data store.  We did a lot of prototyping to make sure that it was the right choice, and it exceeded our expectations.  Using Compact will allow us to easily make changes to our schema, change indexes if needed, and avoid loading the entire thing into memory.  We currently have this implemented and we are seeing better performance and lower memory usage.

SQL Server Compact is an in-process version of SQL Server that uses a single file for storage.  It was originally developed for Windows CE and is very small and efficient, while retaining the flexibility of SQL.

Also, there is a new parser for populating this store.  This parser will perform a file-based parse of the files in the solution in a way that is independent of any configuration or compile options and does not look inside included headers.  Because of this, a change to a header will not cause a reparse of all the files that include it, which avoids one of the fundamental problems today.  The parser is also designed to be extremely resilient: it will be able to handle ambiguous code and mismatched braces or parentheses, and it supports a “hint” file.  Due to the nature of C/C++ macros and because we aren’t parsing into headers, there is a good bit of code that would otherwise be misunderstood.  The hint file will contain definitions for certain macros that fundamentally change the parsing, and therefore the understanding, of a file.  As shipped, the hint file will contain all known macros of this type from the Windows SDK, MFC, ATL, etc.  This can be extended or modified, and we are hoping to be able to identify potential macros in the source code.  Since we will be looking at every header, we want to be able to propose recommended additions to the hint file.
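To make the macro problem concrete, here is a small sketch of the kind of code that a file-based parser could misread without a hint.  The macro names (BEGIN_NAMESPACE, END_NAMESPACE, DECLARE_METHOD) are hypothetical and chosen only for illustration; they are not taken from the Windows SDK, MFC, or ATL, and nothing here describes the actual hint file format.

```cpp
// Hypothetical macros, normally defined in some header that the
// file-based parser never opens.
#define BEGIN_NAMESPACE(name) namespace name {
#define END_NAMESPACE }
#define DECLARE_METHOD(ret, name) ret name

// Without knowing the definitions above, a parser sees what looks like
// a function call followed by a struct and a stray identifier.  With
// them, it sees a namespace containing a struct with one method.
BEGIN_NAMESPACE(shapes)

struct Circle {
    double radius;
    DECLARE_METHOD(double, area)() const {
        return 3.14159265358979 * radius * radius;
    }
};

END_NAMESPACE
```

If the parser is told what these three macros expand to, it can record shapes::Circle::area correctly without ever reading the header that defines them.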

However, this style of parsing means that we don’t get completely accurate information in all cases, and accuracy is especially desirable in the IntelliSense scenarios of auto-complete, parameter info, and quick info.  To handle this, we will be doing a full parse of a translation unit (i.e. a .cpp file) when it is opened in the IDE.  This parse will be done in the fullest sense possible and will use all compile options and other configuration settings.  All headers will be included and parsed in the exact context in which they are being used.  We believe we can do this initial parse for most translation units very quickly, with most not taking more than a second or two.  It should be comparable to how long it takes to actually compile that translation unit today, although since we won’t be doing code generation and optimization, it should be faster than that.  Additionally, this parse will be done in the background and won’t block use of the IDE.  As changes are made to the .cpp or included headers, we will be tracking the edits made and incrementally reparsing only those bits that need to be parsed.
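As a rough illustration of the edit-tracking idea, the sketch below marks only the regions of a file that overlap an edit as needing a reparse.  The Region type, the line-range granularity, and the mark_dirty function are all assumptions made for this example; the post does not say how the real implementation divides a file or tracks edits.

```cpp
#include <vector>

// A hypothetical unit of reparsing: a span of lines in one file.
struct Region {
    int first_line;
    int last_line;
    bool needs_reparse;
};

// Marks every region that overlaps the edited line range
// [edit_first, edit_last] and returns how many regions were newly marked.
int mark_dirty(std::vector<Region>& regions, int edit_first, int edit_last) {
    int marked = 0;
    for (Region& r : regions) {
        bool overlaps = r.first_line <= edit_last && edit_first <= r.last_line;
        if (overlaps && !r.needs_reparse) {
            r.needs_reparse = true;
            ++marked;
        }
    }
    return marked;
}
```

An edit confined to one region leaves the others untouched, so only that region would be handed back to the parser.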

The database created from the file-based parse will be used for ClassView, CodeModel, NavBar, and other browsing-based operations.  In the case of “Find All References”, the database will be used to identify possible candidates, and the full parser will then be used to positively identify the actual references.
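The two-step “Find All References” flow can be sketched as a cheap name lookup followed by a precise check.  Everything named here is hypothetical: Occurrence stands in for a row in the database, and the refers_to_target callback stands in for the full parser, whose real interface the post does not describe.

```cpp
#include <functional>
#include <string>
#include <vector>

// One textual occurrence of a name, as a file-based index might record it.
struct Occurrence {
    std::string file;
    int line;
    std::string name;
};

// Step 1: match on the name alone.  This is fast but can return
// occurrences that refer to a different symbol with the same name.
std::vector<Occurrence> find_candidates(const std::vector<Occurrence>& index,
                                        const std::string& name) {
    std::vector<Occurrence> out;
    for (const Occurrence& o : index) {
        if (o.name == name) out.push_back(o);
    }
    return out;
}

// Step 2: keep only the candidates that the stand-in for the full
// parser confirms as references to the target symbol.
std::vector<Occurrence> find_all_references(
    const std::vector<Occurrence>& index,
    const std::string& name,
    const std::function<bool(const Occurrence&)>& refers_to_target) {
    std::vector<Occurrence> confirmed;
    for (const Occurrence& o : find_candidates(index, name)) {
        if (refers_to_target(o)) confirmed.push_back(o);
    }
    return confirmed;
}
```

The point of the split is that the expensive check runs only on the handful of candidates, not on every file in the solution.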

We have also been thinking about some longer-term ideas that build on this.  One is using a full SQL Server to store information about source code, which multiple people can use.  It would allow you to look up code that isn’t even on your machine.  For example, you could do a “goto definition” in your source and be taken to a file that isn’t even on your machine.  This could be integrated with TFS so that the store is automatically updated as code is checked in, potentially even allowing you to query how the code has changed over time.  Another idea would be to populate a SQL database with information from a full build of your product.  This would include very detailed information (like a BSC file), but shared among everyone and covering all parts of an application.  This would be very useful, as you could identify all callers of a method when you are about to make a change to it.

What would you do if you had this type of information?  Let us know!

Jim Springfield

Visual C++ Architect
