June 9th, 2010

Exploring the Visual C++ Browse Database

Hello, this is Jim Springfield.  This post will cover some low-level details about how we represent information in the browse database in VS 2010.  As I’ve mentioned in a previous post, we are using SQL Server Compact Edition (SSCE) for storing information about all of the C, C++, and IDL files in your solution.  I will show some SQL examples that illustrate how to mine this database for information about your code.

NOTE: The particular database schema we use in VS2010 may very well change in future versions, so the examples I show may not work in future versions of VS.

Opening the SDF file

The database file (SDF) can normally be found in the same directory as your solution file (SLN), although it may get relocated if your SLN is on a network share or a flash drive.  You can open the SDF using several different tools.  SSMS (SQL Server Management Studio) is a good one if you have access to it.  However, you can also open it in Visual Studio itself.  Before opening the SDF to play with, it is best to close the associated solution. 

1.       Within VS, go to the “Server Explorer” window, right click on “Data Connections”, and select “Add Connection…”.

2.       Select “Microsoft SQL Server Compact 3.5” for the Data Source and click “Continue”.

3.       Click the “Browse…” button and then navigate to the SDF file to open.

4.       If the SDF file is large (i.e. > 250MB), you will need to set an option before opening.  To do this, click the “Advanced…” button and set “Max Database Size” to something larger than the SDF you are opening.  4091MB is the largest value allowed and you can just use that if you wish.

5.       Finally, click the “OK” button and VS should open your SDF file.

Learning your way around

If you expand the new node for your SDF in Server Explorer, you will see a “Tables” node.  Expand that and you can see all of the tables that are currently defined.  The “code_items” table contains information on every definition and declaration that occur in your source.  I don’t have space to cover what all of the tables do, but take a look around.  Don’t expect to see data in the refs or symbols tables as those are there for some possible future use. 

Note:  Don’t try to make changes to the schema or indexes and expect that to persist.  When we open the SDF for actual browsing use, we do a consistency check and if anything is not “correct”, we delete the SDF and rebuild it.

Take a look at the “code_item_kinds” table.  You can see the contents by right-clicking and selecting “Show Table Data”.  There should be 59 entries here.  These values are used in the “code_items” table and you can use them in queries to find certain types of code items.

Creating a query

Right click on the “code_items” table and select “New Query”.  You will get a query window with some tools that help you build and run queries.  A window will popup asking you to add a table.  Just click close as you will just be copying queries into the query window for now.

Try this query first to get your feet wet.  To run a query, click the red exclamation icon in the toolbar or press “Ctrl+R”.

select * from code_items where kind=1

This query returns all code items that are C++ classes.  If you look at all of the columns, most should ma ke sense.  Note that the database gives start and end position information for the entire class and just the name portion of the code item.  It is hard to see what file a code item comes from, however, as the code_items table uses a file_id and not a filename.  To see the filename as well, try the following query which does a join with the files table to get the filename.

SELECT     f.name, ci.*

FROM       code_items AS ci

           INNER JOIN files AS f ON ci.file_id = f.id

WHERE     (ci.kind = 1)

One thing to notice is that this query returns information on code items that occur in your own code as well as in the SDK and other headers.  You can return information for files that are explicitly in your solution by using the “config_files” table.  This table has a column that indicates whether a file is implicit or explicit.  A file may have multiple entries in the config_files table as a file can be used in multiple configs/projects.  The “DISTINCT” keyword prevents returning duplicate copies of code items.

SELECT   DISTINCT f.name, ci.*

FROM     code_items AS ci

         INNER JOIN files AS f ON ci.file_id = f.id

         INNER JOIN config_files AS cf ON ci.file_id = cf.file_id

WHERE    (cf.implicit = 0) AND (ci.kind = 1)

The parent_id in the code_items table refers to another item in the code_items table.  Using this information, you can get some parent/child information.  The id of 0 is the global namespace.  So, to get all functions in the global namespace you could do this.

SELECT    *

FROM      code_items

WHERE     (kind = 22) AND (parent_id = 0)

You can also join the code_items table to itself to find a set of code_items whose parent matches some set of criteria.  The following example finds all functions whose parent code_item is named “ATL”.

SELECT     ci1.*

FROM         code_items AS ci1 INNER JOIN

             code_items AS ci2 ON ci2.id = ci1.parent_id

WHERE     (ci1.kind = 22) AND (ci2.name = ‘ATL’)

I have only scratched the surface of what you can do to mine the SDF for interesting information about your source code.  There are many other ways to leverage the data contained in the SDF to gather information about your source code.  All of our browsing features for C/C++ are implemented on top of the SDF.  We do take advantage of some caching, prepared commands, and the special “table direct” mode that SSCE provides in order to increase performance, but everything comes from the SDF at some point.

The query processor in SSCE is limited in some ways, but you could even replicate the data from the SDF into a full SQL database and perform even more complex queries than are allowed by SSCE.

 

Jim Springfield
Visual C++ Architect

Category
C++

Author

0 comments

Discussion are closed.

Feedback