{"id":242271,"date":"2023-04-13T07:09:04","date_gmt":"2023-04-13T14:09:04","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/visualstudio\/?p=242271"},"modified":"2023-04-13T09:41:46","modified_gmt":"2023-04-13T16:41:46","slug":"improving-visual-studio-performance-with-the-new-instrumentation-tool","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/visualstudio\/improving-visual-studio-performance-with-the-new-instrumentation-tool\/","title":{"rendered":"Improving Visual Studio performance with the new Instrumentation Tool"},"content":{"rendered":"<h3>Summary<\/h3>\n<p>With the release of Visual Studio 2022 version 17.6 we are shipping our new and improved <a href=\"https:\/\/learn.microsoft.com\/visualstudio\/profiling\/instrumentation?view=vs-2022\" target=\"_blank\" rel=\"noopener\">Instrumentation Tool<\/a> in the Performance Profiler. Unlike the CPU Usage tool, the Instrumentation tool gives exact timing and call counts which can be super useful in spotting blocked time and average function time. To show off the tool let\u2019s use it to improve performance in Visual Studio.<\/p>\n<h3>Wait, I thought Visual Studio already had an Instrumentation tool?<\/h3>\n<p>If you were thinking, \u201cDoesn&#8217;t Visual Studio already have an Instrumentation Tool\u201d then you were thinking right! So, what\u2019s\u2026 new? Well, a whole list of things<\/p>\n<ul>\n<li><strong>Faster and less resources:<\/strong> The tool is significantly faster and uses less disk space, clone the repo and check out the measurements for yourself.\nSample App: <a href=\"https:\/\/github.com\/karpinsn\/ScrabbleFinderDotNet\">ScabbleFinderDotNet<\/a>\n<img decoding=\"async\" class=\"alignnone size-full wp-image-242402\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/04\/InstrumentationToolPerformance.png\" alt=\"Graph of instrumentation tool performance. 35x smaller file and 7x less overhead\" width=\"769\" height=\"283\" srcset=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/04\/InstrumentationToolPerformance.png 769w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/04\/InstrumentationToolPerformance-300x110.png 300w\" sizes=\"(max-width: 769px) 100vw, 769px\" \/><\/li>\n<li><strong>Enhanced targeting for .NET:<\/strong> The tool has better targeting for .NET scenarios so you can scope your instrumentation all the way down to specific functions to lower overhead and get better data.<\/li>\n<li><strong>Flame Graph:<\/strong> The flame graph helps you graphically see where the most amount of time is spent in your application and quickly narrow on areas to improve.<\/li>\n<li><strong>Better error handling:<\/strong> The tool helps with common issues such as missing the \u201c\/profiler\u201d linker flag for C++ projects. If you do hit issues that you can\u2019t figure out, we are ready to help over at <a href=\"https:\/\/developercommunity.visualstudio.com\/home\">Developer Community<\/a>.<\/li>\n<\/ul>\n<h3>Let\u2019s go mine for performance!<\/h3>\n<p>To start, I have a console application called AnalyzerBench which takes in a diagsession from the Performance Profiler and runs our analysis backend on it then exits. This lets me have a repeatable benchmark that I can measure and see the effects of my changes. The diagsession I have is from tracing every allocation of Visual Studio startup, over 4.3 million allocations, with the .NET Object Allocation tool. Running the Instrumentation tool from the Performance Profiler (Alt+F2) I get the following dialog:\n<a href=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/TargetedInstrumentationDialog.png\"><img decoding=\"async\" class=\"alignnone size-full wp-image-242277\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/TargetedInstrumentationDialog.png\" alt=\"Targeted instrumentation dialog of Visual Studio Instrumentation tool\" width=\"987\" height=\"557\" srcset=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/TargetedInstrumentationDialog.png 987w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/TargetedInstrumentationDialog-300x169.png 300w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/TargetedInstrumentationDialog-768x433.png 768w\" sizes=\"(max-width: 987px) 100vw, 987px\" \/><\/a><\/p>\n<p>This lets me select what projects I want to instrument, allowing me to target my instrumentation to reduce overhead. In my case I want to see our analysis for the .NET Allocation tool, so I select our DataWarehouse and the DotNetAllocAnalyzer projects but not AnalyzerBench since I don\u2019t care about that. If you are unsure of what you want to profile, the CPU Usage tool is great for getting a general idea of where time is being spent and then you can investigate deeper with the Instrumentation tool targeting specific areas. After running the tool, I see the following:\n<a href=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolSummaryPage.png\"><img decoding=\"async\" class=\"alignnone size-full wp-image-242276\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolSummaryPage.png\" alt=\"Summary view of Visual Studio Instrumentation tool\" width=\"1024\" height=\"768\" srcset=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolSummaryPage.png 1024w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolSummaryPage-300x225.png 300w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolSummaryPage-768x576.png 768w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/p>\n<p>The top functions show us which functions are taking the most amount of time and the hot path shows us the most expensive code path. My go-to is to open the details panel and switch over to the flame graph which shows the following.\n<a href=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolFlameChart.png\"><img decoding=\"async\" class=\"alignnone size-full wp-image-242273\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolFlameChart.png\" alt=\"Flame chart from Visual Studio Instrumentation Tool\" width=\"1024\" height=\"324\" srcset=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolFlameChart.png 1024w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolFlameChart-300x95.png 300w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolFlameChart-768x243.png 768w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/p>\n<p>Looking at the flame graph I see a <code>System.Threading.Monitor.Enter<\/code> taking ~20% of the time which is interesting. By right clicking on the node, I can cross reference to where this is happening in the call tree.\n<a href=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolLockContention.png\"><img decoding=\"async\" class=\"alignnone size-full wp-image-242274\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolLockContention.png\" alt=\"Call tree and source view of Visual Studio Instrumentation tool showing lock contention\" width=\"800\" height=\"600\" srcset=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolLockContention.png 800w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolLockContention-300x225.png 300w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolLockContention-768x576.png 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/a><\/p>\n<p>Turns out our <code>Monitor.Enter<\/code> function is showing as a hot function and its parent <code>ImportDataSource<\/code> is taking ~17% of the total time. In the call tree you\u2019ll notice there are a few new columns with more hidden in the context menu of the header. Since the instrumentation tool gives us exact call counts, we can calculate stats like min, max, and average function time. Not only does Instrumentation give us exact call counts, but it also measures wall clock time. This lets us see non-CPU bound issues like contention. In this case, based on the call count we can see that there are three calls to <code>ImportDataSource<\/code> and they spend about 5s on average waiting on the lock. What is really happening is that the first call gets the lock and the other two wait ~8s for the first data source to complete before they can import. This means 2 thread pool threads are synchronously blocked and can\u2019t do any work which could lead to thread pool starvation.<\/p>\n<p>To fix this we could investigate using some parallel data structures so that the lock isn\u2019t needed or change the method signature so that it is async and use <a href=\"https:\/\/learn.microsoft.com\/dotnet\/api\/system.threading.semaphoreslim.waitasync?view=net-7.0\" target=\"_blank\" rel=\"noopener\">SemaphoreSlim.WaitAsync<\/a> so that we at least don\u2019t block thread pool threads. As both changes are a bit more involved, we can add a TODO to the code and come back later.<\/p>\n<p>Switching back to the flame graph, the next thing jumping out at me is <code>List.Sort<\/code>, from a glance it looks like we are spending ~20% of our time sorting some data. Again, right-clicking on the node and cross referencing over to the call tree we can see our details stats. This shows we are calling sort over 24K times, spending 20 seconds sorting data!\n<a href=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolSortIssue.png\"><img decoding=\"async\" class=\"alignnone size-full wp-image-242275\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolSortIssue.png\" alt=\"Call tree and source view of Visual Studio Instrumentation tool showing time spent in sort\" width=\"800\" height=\"600\" srcset=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolSortIssue.png 800w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolSortIssue-300x225.png 300w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolSortIssue-768x576.png 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/a><\/p>\n<p>In this code we are sorting the allocations for each unique type which is needed to quickly filter when selecting time ranges on the graph in the profiler. For the most part, though these allocations should be sorted as we get a callback for each allocation and write it to our diagsession file. If there are many allocations at the same time on different threads they might be out of order, but this is probably something we can check for when adding to the list and then only sort the unsorted lists here. Making this change and rerunning the instrumentation tool we can see that now we have nearly removed all the time spent here, and in the case of this trace we didn\u2019t have to sort anything as all of the allocations were already sorted in the file.<\/p>\n<pre class=\"prettyprint language-cs language-csharp\"><code class=\"language-cs language-csharp\">\/\/\/ &lt;summary&gt;\r\n\/\/\/ Add an allocation instance to the current type\r\n\/\/\/ &lt;\/summary&gt;\r\n\/\/\/ &lt;param name=\"allocationObject\"&gt;allocation object&lt;\/param&gt;\r\ninternal void AddAllocation(AllocationObject allocationObject)\r\n{\r\n    if (this.Allocations.Count &gt; 0)\r\n    {\r\n        this.allocationsSorted &amp;= allocationObject.AllocTimeNs &gt;= this.Allocations[this.Allocations.Count - 1].AllocTimeNs;\r\n    }\r\n\r\n    this.Allocations.Add(allocationObject);\r\n}\r\n\r\n\/\/\/ &lt;summary&gt;\r\n\/\/\/ Finalizes the data for fast retrieval\r\n\/\/\/ &lt;\/summary&gt;\r\npublic void FinalizeData()\r\n{\r\n    if (!this.allocationsSorted)\r\n    {\r\n        this.Allocations.Sort(TypeObject.comparer);\r\n        this.allocationsSorted = true;\r\n    }\r\n}\r\n<\/code><\/pre>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolFixedSort.png\"><img decoding=\"async\" class=\"alignnone size-full wp-image-242272\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolFixedSort.png\" alt=\"Call tree and source view in Visual Studio instrumentation tool showing improved sort performance\" width=\"1037\" height=\"775\" srcset=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolFixedSort.png 1037w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolFixedSort-300x224.png 300w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolFixedSort-1024x765.png 1024w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2023\/03\/InstrumentationToolFixedSort-768x574.png 768w\" sizes=\"(max-width: 1037px) 100vw, 1037px\" \/><\/a><\/p>\n<p>A small change to track the sort state of the lists here has trimmed 20 seconds off a 110 second trace resulting in ~20% performance improvement.<\/p>\n<h3>Conclusion<\/h3>\n<p>The new Instrumentation tool is awesome (at least I think so \ud83d\ude0a) and a little performance investigation can go a long way. In less than an hour of profiling and poking at the code I improved load performance of the .NET Allocation tool by ~20%. Let me know what you found while profiling your code and what factor of improvements you were able to achieve with the new Instrumentation tool!<\/p>\n<p><div  class=\"d-flex justify-content-center\"><a class=\"cta_button_link btn-primary mb-24\" href=\"https:\/\/www.surveymonkey.com\/r\/SZZW665\" target=\"_blank\">Talk with us<\/a><\/div><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Summary With the release of Visual Studio 2022 version 17.6 we are shipping our new and improved Instrumentation Tool in the Performance Profiler. Unlike the CPU Usage tool, the Instrumentation tool gives exact timing and call counts which can be super useful in spotting blocked time and average function time. To show off the tool [&hellip;]<\/p>\n","protected":false},"author":54885,"featured_media":255385,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[155],"tags":[9,6839,6743,6815,6803],"class_list":["post-242271","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-visual-studio","tag-debug","tag-developer-productivity","tag-profiling","tag-visual-studio-2022","tag-visual-studio-profiler"],"acf":[],"blog_post_summary":"<p>Summary With the release of Visual Studio 2022 version 17.6 we are shipping our new and improved Instrumentation Tool in the Performance Profiler. Unlike the CPU Usage tool, the Instrumentation tool gives exact timing and call counts which can be super useful in spotting blocked time and average function time. To show off the tool [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/posts\/242271","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/users\/54885"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/comments?post=242271"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/posts\/242271\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/media\/255385"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/media?parent=242271"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/categories?post=242271"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/tags?post=242271"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}