{"id":39202,"date":"2022-03-22T10:23:01","date_gmt":"2022-03-22T17:23:01","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/dotnet\/?p=39202"},"modified":"2022-03-22T10:23:01","modified_gmt":"2022-03-22T17:23:01","slug":"go-to-definition-improvements-for-external-source-in-roslyn","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/dotnet\/go-to-definition-improvements-for-external-source-in-roslyn\/","title":{"rendered":"Go To Definition improvements for external source in Roslyn"},"content":{"rendered":"<p>In Visual Studio 17.1 there are a number of enhancements to <a href=\"https:\/\/docs.microsoft.com\/visualstudio\/ide\/navigating-code?view=vs-2022#go-to-definition\">Go To\nDefinition<\/a>\n(and Go To Implementation, and Go To Base, etc.) allowing you to\nnavigate to source code that isn&#8217;t in your current solution. In previous\nversions of Visual Studio when invoking Go To Definition on a symbol\nRoslyn would check if the symbol is defined in your current project or\nany referenced projects, and if a reference was found we would navigate\nyou to that symbol. If a reference was not found we would use\n<a href=\"https:\/\/github.com\/icsharpcode\/ILSpy\">ILSpy<\/a> to decompile part of the\nreferenced DLL and navigate you to the decompiled source of that symbol.<\/p>\n<p>Decompilation is a great way to get a feel for the shape of an API that\nyou&#8217;re referencing, but it does have some downsides, the biggest of\nwhich is that a decompilation can only be created from the IL that is in\nthe referenced DLL. This means things like comments, variable names and\nother parts of source code that aren&#8217;t represented in IL simply can&#8217;t be\nseen. Sometimes even the code you do see is different, due to compiler\nlowering. For example, when a simple interpolated string like\n$&#8221;{items.Count}&#8221; is compiled, the IL is indistinguishable from if you\nhad written string.Format(&#8220;{0}&#8221;, items.Count). 
These types of\ndifferences don&#8217;t change behavior, but they might be important to\nunderstanding the code.<\/p>\n<p>In the latest release we&#8217;ve added a few more steps to the process so\nthat where possible, we will now show you the real source code for the\nsymbol, matching exactly what the compiler used to create the DLL.<\/p>\n<h2>Finding the PDB file<\/h2>\n<p>The first step to locating the real source of something is to find the\nPDB file. PDB stands for &#8220;Program Database&#8221; and it is the file format\nused to store extra information about a library to help with debugging\nand other scenarios.<\/p>\n<p>The easiest location to find the PDB file, and the first one that is\nchecked, is right next to the DLL on your disk. Whilst the contents of\nthe PDB file don&#8217;t affect anything that happens at runtime, it&#8217;s very\ncommon for them to exist for debug builds, and not too uncommon for\nrelease builds too.<\/p>\n<p>The next easiest location for the PDB file is embedded within the DLL\nitself. For the portable PDB format it is possible to specify\n&lt;DebugType&gt;embedded&lt;\/DebugType&gt; in your csproj file and instead of\nwriting the PDB file to disk, the PDB file itself will be embedded in\nthe DLL. This has the advantage of easy distribution of debug\ninformation, at the cost of a small increase in file size.<\/p>\n<p>Fortunately, on disk and embedded PDBs are relatively easy to find using\nthe helpful\n<a href=\"https:\/\/github.com\/dotnet\/runtime\/blob\/main\/src\/libraries\/System.Reflection.Metadata\/src\/System\/Reflection\/PortableExecutable\/PEReader.cs#L700\">TryOpenAssociatedPortablePdb<\/a>\nmethod that is part of System.Reflection.Metadata.PEReader in .NET.<\/p>\n<p>If that method fails however, then we have to work a little harder, and\ntry to find the PDB on a symbol server. A symbol server is a system that\nstores and indexes PDBs for later download and use, normally by the\ndebugger. 
There are various ways to control a symbol server search and\nthese can be configured in the Tools &gt; Options &gt; Debugging &gt; Symbols\npage in Visual Studio. By default, you should see entries for the\nMicrosoft Symbol Server and NuGet Symbol Server, but you might need to\nenable them. You can also add symbol servers from an Azure DevOps\ninstance, or any other private symbol server that might be available to\nyou.<\/p>\n<p>It is important that the right PDB is downloaded for the specific build\nof the DLL that is being referenced, and so the search uses various\npieces of information that are pulled out of the DLL. This information\nhelps the debugger locate the right PDB on the symbol server and\nvalidate that it is the one that matches the DLL.<\/p>\n<p>So how does Roslyn find this information in order to tell the debugger?\nFor that we need to take a slight diversion into the world of metadata.<\/p>\n<h2>Metadata<\/h2>\n<p>A DLL is, to simplify, made up of two parts: firstly, the executable\ncode written in IL, and secondly the metadata, which is information\nabout that IL and the DLL in general that the runtime needs in order to\nunderstand the IL. The metadata stores, for example, the names\nof every field and method in the DLL, which types they are part of, or\nreturn, or take as parameters etc. You can think of the metadata as a\nlittle database in the middle of a DLL, which has some tables in it\ndescribing the code the DLL came from. 
We can see these tables in ILSpy\n(pictured), or you can use an online tool like\n<a href=\"https:\/\/penet.azureedge.net\/\">PeNet<\/a> which shows things in a little\nmore of a &#8220;raw&#8221; format.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2022\/03\/image1.png\" alt=\"Screenshot of ILSpy showing the Debug Directory metadata table\" \/><\/p>\n<p>In the above screenshot the Debug Directory table is\nhighlighted, along with the row within it that has a Type of &#8220;CodeView&#8221;.\nThis is where most of the information we need comes from. Clicking the\n&#8220;Debug&#8221; button on the left in PeNet will also show you the\nCodeView entry itself; however, it doesn&#8217;t decode the Type column, so pick\nthe tool you prefer to use.<\/p>\n<p>Another useful tool for viewing metadata is mdv which is a console\napplication you can build from source here:\n<a href=\"https:\/\/github.com\/dotnet\/metadata-tools\">https:\/\/github.com\/dotnet\/metadata-tools<\/a><\/p>\n<p>Roslyn <a href=\"https:\/\/github.com\/dotnet\/roslyn\/blob\/release\/dev17.1-vs-deps\/src\/VisualStudio\/Core\/Def\/PdbSourceDocument\/SourceLinkService.cs#L39\">iterates through the debug\ndirectory<\/a>,\ngathers all of the information that the debugger needs from the\nCodeView and PdbChecksum entries, and passes it to the debugger, which\nworks its magic. The debugger implements the symbol server protocol (see\n<a href=\"https:\/\/github.com\/dotnet\/symstore\/tree\/main\/docs\/specs\">https:\/\/github.com\/dotnet\/symstore\/tree\/main\/docs\/specs<\/a>) to retrieve\nthe PDB from the symbol server. It also uses local caches configured in\ndebug configuration to speed up the lookup next time it&#8217;s asked for\nthe same PDB. 
You can see the output of the search either from the\nModules tool window when the debugger is active by right clicking on an\nentry and selecting Show Symbol Load Information, or after using Go To\nDefinition you can find the &#8220;Navigate to External Sources&#8221; category in\nthe Output Window and some information will be shown there. Note that\nRoslyn only logs in depth symbol search information if the search fails,\nwhereas the Modules window always has it available.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2022\/03\/image2.png\" alt=\"Screenshot of the Output pane of Visual Studio showing the Navigate to External Sources category\" \/><\/p>\n<h2>Finding the source code<\/h2>\n<p>Now that we have the PDB file, we have access to a lot more information\nabout where this DLL came from, and we can go ahead and try to find the\noriginal source code. As with PDB files themselves, the most\nstraightforward place the source code can be is either on the local\ndisk, or embedded, and so those are the two places checked first, and if\nneither of those are fruitful then we use another API provided by the\ndebugger and try to download the source via <a href=\"https:\/\/docs.microsoft.com\/dotnet\/standard\/library-guidance\/sourcelink\">Source\nLink<\/a>.<\/p>\n<p>Before we can use either of these methods to find the source code though\nwe need to know which source file we&#8217;re looking for, and for that we\nneed to go back to the metadata tables, both from the DLL file but also\nfrom the PDB this time, since we now have that information.<\/p>\n<h2>Finding the Document record<\/h2>\n<p>When Roslyn tries to find source for a Go To Definition command it is\nessentially saying &#8220;Find the source for this <a href=\"https:\/\/github.com\/dotnet\/roslyn\/blob\/main\/src\/Compilers\/Core\/Portable\/Symbols\/ISymbol.cs#L24\">ISymbol<\/a>&#8220;.\nA symbol is the representation of the thing that 
we&#8217;re trying to go to,\nbe it a type, method, property, etc. Each symbol that comes from\nmetadata holds a\n<a href=\"https:\/\/github.com\/dotnet\/roslyn\/blob\/main\/src\/Compilers\/Core\/Portable\/Symbols\/ISymbol.cs#L64\">MetadataToken<\/a>\nwhich you can think of as the key to the database table in the metadata\nthat holds information about the symbol.<\/p>\n<p>Let&#8217;s say we&#8217;re trying to navigate to the definition of a method called\n&#8220;ReadEntities&#8221; which has a MetadataToken of 0x06000038. In order to\nconserve space, like a lot of things in metadata, these tokens pack two\npieces of information into a single four-byte number: the first byte,\n06, means this token is for the Method table, and the remaining three\nbytes, 000038, mean it is for the 56th row of that table (as 38 in\nhexadecimal is 56 in decimal).<\/p>\n<p>Looking in ILSpy we can see the information that is stored for this\nmethod:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2022\/03\/image3.png\" alt=\"Screenshot of ILSpy showing the Method metadata table\" \/><\/p>\n<p>This screenshot also reveals another detail of metadata, which is that\nit describes runtime concepts rather than language concepts. For\nexample, C# constructors are a language concept, but to the runtime they\nare the same as methods, albeit methods that happen to be called .ctor.\nSimilarly, you can see get_Notes, which is the property getter for the\nNotes property on a type, but again to the runtime it&#8217;s just another\nmethod.<\/p>\n<p>Now that we have found the method row, we can continue to dig into the\nmetadata to find the associated Document row, which comes from the PDB\nmetadata. Exactly how we do this is straightforward, but detailed. 
You\ncan <a href=\"https:\/\/github.com\/dotnet\/roslyn\/blob\/release\/dev17.1-vs-deps\/src\/Features\/Core\/Portable\/PdbSourceDocument\/SymbolSourceDocumentFinder.cs#L14\">read the\ncode<\/a>\nif you&#8217;re interested but the general idea is that, again in the\ninterests of space saving, we only store info for a particular document\nonce. In the past that meant only for methods that have bodies where\nbreakpoints can be placed, as there is a pre-existing concept in\nportable PDBs to store that info which is already in use. For navigation\nthough we need a little more info, so we added the ability to store\n<a href=\"https:\/\/github.com\/dotnet\/roslyn\/blob\/main\/src\/Dependencies\/CodeAnalysis.Debugging\/PortableCustomDebugInfoKinds.cs#L24\">document\ninfo<\/a>\nfor types that would otherwise have no document info recorded, like\ninterfaces. What that means in practice is that if we can&#8217;t find\ndocument info for a method, or we&#8217;re looking at a field etc. we check\nthe containing type, and if we can&#8217;t find document info for a type, we\ncheck for one of the methods it contains (and vice versa!)<\/p>\n<p>For methods we look for the corresponding row in the\nMethodDebugInformation table, which links to the Document table:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2022\/03\/image4.png\" alt=\"Screenshot of ILSpy showing the MethodDebugInformation metadata table\" \/><\/p>\n<p>Unlike a normal relational database, there isn&#8217;t a field to refer to the\nMethod that the row is talking about, the table just uses the same ID as\nthe Method table itself, so row 56 of the MethodDebugInformation is for\nrow 56 of the Method table. Here we can see a Document field, with the\nvalue 0x3000000D. 
Once again this is a metadata token, pointing to row 13 of the\nDocument table (table 30, with 00000D in hexadecimal being 13 in decimal), and we have our result.<\/p>\n<p>For types, the same theory is used except we use the\nCustomDebugInformation table, which is a more general-purpose table, so\nfinding the record means looking up not only the ID it refers to, but\nalso a record type. It then similarly points to a row in the Document\ntable, and we have the info we need.<\/p>\n<h2>Reading the Document record<\/h2>\n<p>The document record is where we finally find the information needed to\nload the original source file, in one of three different formats.\nFirstly, and again the simplest case, it could contain the full path to\nthe source file as it was when the DLL was originally compiled, and the\nsource code could be on disk at that location. This is not very\nlikely to be helpful in general, but for very controlled enterprise\nenvironments, or independent developers who reference their own\npackages, it might be just enough.<\/p>\n<p>Secondly, the document record could contain a relative path to the source\nfile, with an associated CustomDebugInformation record that stores the\nactual source for the file, compressed and embedded in the PDB itself.\nIn this case Roslyn will <a href=\"https:\/\/github.com\/dotnet\/roslyn\/blob\/release\/dev17.1-vs-deps\/src\/Features\/Core\/Portable\/PdbSourceDocument\/PdbSourceDocumentLoaderService.cs#L61\">read, decompress and write it to a temp\nfile<\/a>\nso that it can be navigated to. 
Enabling source embedding in a project\ncan be as easy as adding &lt;EmbedAllSources&gt;true&lt;\/EmbedAllSources&gt; to\na .csproj, again sacrificing file size for ease of distribution.<\/p>\n<p>Finally, the Document record could contain a path and the\nCustomDebugInformation could contain a Source Link map, in JSON format,\nwhich tells the system how to map the path to a URL where it can\ndownload the source file from a source control repository.<\/p>\n<h2>Source Link<\/h2>\n<p>Source Link stores a mapping from a relative folder path to an absolute\nrepository URL. For the venerable Newtonsoft.Json library the Source\nLink information can be seen below:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2022\/03\/image5.png\" alt=\"Screenshot of ILSpy showing the CustomDebugInformation metadata table\" \/><\/p>\n<p>This maps any path starting with &#8220;\/_\/&#8221; to the URL shown. Notice how the\nURL contains a commit hash, which ensures that the right source will be\nshown for the exact build being referenced. There are a number of Source\nLink packages that can be referenced by libraries, with each one\nunderstanding a different repository provider (e.g. GitHub, Azure DevOps,\nGitLab etc.).<\/p>\n<p>The document info for a document from this PDB contains a normalized\npath, where the common directory prefix has been replaced with &#8220;\/_\/&#8221; as per\nthe map above.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2022\/03\/image6.png\" alt=\"Screenshot of ILSpy showing the Document metadata table\" \/><\/p>\n<p>Once the relative file path and base URL are known, downloading the file\nis straightforward, though once again we rely on services provided by\nthe Visual Studio Debugger to do so. 
This ensures that any source code\ndownloaded for the purposes of navigation is automatically available for\ndebugging, so features like breakpoints etc. will work as you would\nexpect. This also means authentication and caching are handled centrally.<\/p>\n<h2>Navigation<\/h2>\n<p>Now that we know where the symbol comes from, and we have a file on\ndisk, the navigation can happen. Downloading symbols and\nsource files could take some time, so for now there are various timeouts\nin place to ensure you&#8217;re not left hanging when you want to see some\nsource code. If you see the decompilation for something that you think\nshould have Source Link support, or you notice a timeout in the output\nwindow, you can always try again later and the download might have\nfinished in the background!<\/p>\n<p>Of course, all of the PDBs and source files have various checksums,\nhashes and other checks to ensure that the source you&#8217;re seeing is the\nactual source that was used to compile the DLL, so there are other\nreasons that could prevent this from working, but they will be noted in the\noutput window. 
At the end of the day we are striving for accuracy so\neven a decompilation is better than an incorrect copy of the original\nsource, even if it does have variable names and comments.<\/p>\n<h2>Give us your feedback<\/h2>\n<p>This isn&#8217;t the end of the improvements to this area, and we already have\nideas for more improvements like:<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/dotnet\/roslyn\/issues\/55834\">Supporting native PDBs, as well as portable<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/dotnet\/roslyn\/issues\/55834\">Finding the implementation DLL for a reference DLL that doesn&#8217;t have PDB info<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/dotnet\/roslyn\/issues\/55834\">Being able to switch between Embedded, Source Link or decompiled source at will<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/dotnet\/roslyn\/issues\/55834\">Better support for deep navigation for indirectly referenced DLLs<\/a><\/li>\n<\/ul>\n<p>We would love to get your feedback on the new Go To Definition behavior\nso\u00a0please give it a try\u00a0and let us know what you think!\u00a0You can share\nyour feedback with us by creating an issue on Roslyn&#8217;s open source repo\non\u00a0<a href=\"https:\/\/github.com\/dotnet\/roslyn\">GitHub<\/a>.\u00a0We appreciate your\nfeedback!\u00a0Also be sure to let us know if you like this in-depth\ntechnical type of blog post, or if you went cross-eyed halfway through.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>An in depth look at improvements to Go To Definition (and Go To Implementation, and Go To Base, etc.) 
allowing you to navigate to source code that isn&#8217;t in your current solution, but instead comes from external dependencies.<\/p>\n","protected":false},"author":38824,"featured_media":39203,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[685,196,195,3012,7196,646],"tags":[120],"class_list":["post-39202","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-dotnet","category-dotnet-core","category-dotnet-framework","category-internals","category-debugging","category-visual-studio","tag-roslyn"],"acf":[],"blog_post_summary":"<p>An in depth look at improvements to Go To Definition (and Go To Implementation, and Go To Base, etc.) allowing you to navigate to source code that isn&#8217;t in your current solution, but instead comes from external dependencies.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/39202","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/users\/38824"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/comments?post=39202"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/39202\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media\/39203"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media?parent=39202"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/categories?post=39202"},{"taxonomy":"post_tag","embeddabl
e":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/tags?post=39202"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}