{"id":15025,"date":"2017-10-23T09:41:13","date_gmt":"2017-10-23T16:41:13","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/dotnet\/?p=15025"},"modified":"2021-09-29T16:38:07","modified_gmt":"2021-09-29T23:38:07","slug":"net-core-performance-profiling-with-intel-vtune-amplifier-2018","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/dotnet\/net-core-performance-profiling-with-intel-vtune-amplifier-2018\/","title":{"rendered":".NET Core Performance Profiling with Intel\u00ae VTune\u2122 Amplifier 2018"},"content":{"rendered":"<p>Last Updated: 2018-04-30<\/p>\n<blockquote><p>This post was written by\u00a0<a href=\"https:\/\/github.com\/vkvenkat\">Varun Venkatesan<\/a>, <a href=\"https:\/\/github.com\/litian2025\">Li Tian<\/a>, and\u00a0<a href=\"https:\/\/github.com\/jarodrig\">Juan Rodriguez<\/a>, engineers at <a href=\"https:\/\/www.intel.com\">Intel<\/a>. They are excited to share .NET Core-specific enhancements that Intel has made to VTune Amplifier 2018. We&#8217;re excited to have\u00a0a new tool to use to help make .NET Core faster on Intel chips.<\/p><\/blockquote>\n<p>Intel has been a strong partner in the development and advancement of Microsoft\u2019s .NET ecosystem, starting with our co-sponsorship (along with Hewlett-Packard) of the <a href=\"http:\/\/www.ecma-international.org\/memento\/TC49-TG3.htm\">ECMA TC39\/TG3<\/a> Common Language Infrastructure standardization process; through co-developing and optimizing several .NET Framework releases for scalability and performance; and moving into a new phase of investment in cross-platform, open source .NET for our joint customers.<\/p>\n<p>Our objective is to ensure .NET delivers the best power\/performance, scalable, and robust experiences on Intel Architecture. 
If you are a .NET Core developer who is interested in understanding how efficient your managed code execution is at a processor architecture\/micro-architecture level, then read on.<\/p>\n<h2>Executive Summary<\/h2>\n<p><a href=\"https:\/\/software.intel.com\/en-us\/intel-vtune-amplifier-2018-release-notes\">Intel\u00ae VTune\u2122 Amplifier 2018 was released<\/a> in September 2017 and includes a preview feature for profiling Just-In-Time (JIT) compiled .NET Core code on Microsoft Windows* and Linux* operating systems. Note that previous versions of VTune Amplifier supported profiling of JIT-compiled code for .NET Framework. This post is intended to help developers identify and fix performance bottlenecks in their .NET Core applications using this preview feature. We also present some real-world scenarios where we used VTune Amplifier to identify performance issues.<\/p>\n<p style=\"vertical-align: baseline;margin: 0in 0in 18.75pt 0in\">Note that VTune is a commercial product. In some cases, you may be eligible to obtain a free copy of VTune under specific terms. To see if you qualify, please refer to <span style=\"color: black\"><a href=\"https:\/\/software.intel.com\/en-us\/qualify-for-free-software\">https:\/\/software.intel.com\/en-us\/qualify-for-free-software<\/a>.<\/span><\/p>\n<p><strong><em>Update: <\/em><\/strong><a href=\"https:\/\/software.intel.com\/en-us\/intel-vtune-amplifier-2018-update-2-release-notes-what-s-new\"><em>VTune Amplifier 2018 Update 2<\/em><\/a><em> is now available and includes full feature support for Advanced Hotspots analysis for .NET Core applications running on Linux and Windows systems in the Launch Application mode. 
The environment variables used in prior releases to enable this as a preview feature are no longer needed. Additional information is available in the instructions below.<\/em><\/p>\n<h2>Background<\/h2>\n<p>Developers who used previous versions of VTune Amplifier to profile their .NET Core applications saw unresolved managed modules and functions, as shown in the figure below.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/vtune-previous-21.1.png\" alt=\"\" width=\"914\" height=\"291\" class=\"aligncenter size-full wp-image-15165\" \/><\/p>\n<p>VTune Amplifier 2018 addresses this issue and also provides assembly-level hot spots for managed functions.<\/p>\n<p>Here is the software configuration we used for this post:<\/p>\n<ul>\n<li>Windows Server 2016 version 1607 (we validated on Windows 10 Pro version 1607 too)<\/li>\n<li>Ubuntu* 14.04 (we validated on Ubuntu 16.04 too)<\/li>\n<li>.NET Core 2.0<\/li>\n<\/ul>\n<p>Native profiling with VTune Amplifier on macOS* is not currently available.<\/p>\n<h2>Profiling a .NET Core application on Windows<\/h2>\n<p>This section shows how to use VTune Amplifier 2018 to profile a sample .NET Core application on Windows.<\/p>\n<ul>\n<li>Install <a href=\"https:\/\/software.intel.com\/en-us\/intel-vtune-amplifier-xe\">VTune Amplifier 2018<\/a>.<\/li>\n<li>Install the <a href=\"https:\/\/www.microsoft.com\/net\/core#windowscmd\">.NET Core 2.0 SDK<\/a>.<\/li>\n<li>Open a new command window for the dotnet environment variables to take effect. 
Verify that .NET Core 2.0 was installed successfully by running \u201cdotnet --version\u201d.<\/li>\n<li>Run the command &#8220;dotnet new console -o listadd&#8221; to create a new skeleton project with the following structure:<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/skeleton-project.png\" alt=\"\" width=\"883\" height=\"363\" class=\"aligncenter size-full wp-image-15045\" srcset=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/skeleton-project.png 883w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/skeleton-project-300x123.png 300w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/skeleton-project-768x316.png 768w\" sizes=\"(max-width: 883px) 100vw, 883px\" \/><\/p>\n<ul>\n<li>Replace the contents of Program.cs in the \u201clistadd\u201d folder with C# code that adds the elements of an integer List, available <a href=\"https:\/\/gist.github.com\/anonymous\/e271a4225c38eadeff02e023ba4d314f\">here<\/a>.<\/li>\n<li>Add the following property to the PropertyGroup section of the csproj file to enable Source-Assembly mapping in VTune Amplifier (currently available only for Windows):<\/li>\n<\/ul>\n<pre style=\"padding-left: 60px\"><strong>&lt;DebugType&gt;pdbonly&lt;\/DebugType&gt;<\/strong><\/pre>\n<ul>\n<li>Run the command \u201cdotnet build -c Release\u201d to create \u201clistadd.dll\u201d in the \u201cC:\\listadd\\bin\\Release\\netcoreapp2.0\u201d folder.<\/li>\n<li>Now run the sample app: dotnet C:\\listadd\\bin\\Release\\netcoreapp2.0\\listadd.dll<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/run-sample-app.png\" alt=\"\" width=\"1121\" height=\"185\" class=\"size-full wp-image-15055 alignnone\" 
srcset=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/run-sample-app.png 1121w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/run-sample-app-300x50.png 300w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/run-sample-app-768x127.png 768w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/run-sample-app-1024x169.png 1024w\" sizes=\"(max-width: 1121px) 100vw, 1121px\" \/><\/p>\n<ul>\n<li>Next let\u2019s use VTune Amplifier 2018 to profile the sample app. First, create a file called &#8220;environment.cmd&#8221; with the following contents:<\/li>\n<\/ul>\n<pre style=\"padding-left: 60px\">set CORECLR_ENABLE_PROFILING=1\nset CORECLR_PROFILER={AA5E4821-E3B1-479c-B7FF-5AD047D22CED}<\/pre>\n<p style=\"padding-left: 60px\">Run the command \u201cenvironment.cmd\u201d to set up the environment for VTune Amplifier.<\/p>\n<p style=\"padding-left: 60px\">Note: You can also set these variables persistently with setx, as shown below, instead of running \u201cenvironment.cmd\u201d each time; the values apply to command windows opened afterwards.<\/p>\n<pre style=\"padding-left: 60px\">setx CORECLR_ENABLE_PROFILING 1\nsetx CORECLR_PROFILER {AA5E4821-E3B1-479c-B7FF-5AD047D22CED}<\/pre>\n<p style=\"padding-left: 60px\">When this preview feature becomes generally available in future VTune Amplifier releases, this environment setting will no longer be needed.<\/p>\n<p style=\"padding-left: 60px\"><strong><em>Update: The above environment variables (CORECLR_ENABLE_PROFILING &amp; CORECLR_PROFILER) no longer need to be set as of VTune 2018 Update 2.<\/em><\/strong><\/p>\n<ul>\n<li>Launch VTune Amplifier with administrator privileges.<\/li>\n<li>Create a new project, right-click the project name, and select \u201cNew Analysis\u201d.<\/li>\n<li>Use the \u201cLaunch Application\u201d mode as the target type in the \u201cAnalysis Target\u201d tab. 
Fill in the \u201cApplication\u201d and \u201cApplication parameters\u201d fields:\n<ul>\n<li>Application: C:\\Program Files\\dotnet\\dotnet.exe<\/li>\n<li>Application Parameters: C:\\listadd\\bin\\Release\\netcoreapp2.0\\listadd.dll<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Note: The location of dotnet.exe shown above may differ in your environment; you can find it with \u201cwhere dotnet\u201d.<\/p>\n<ul>\n<li>Click \u201cChoose Analysis\u201d.<\/li>\n<li>Select \u201cAdvanced Hotspots\u201d mode under \u201cAlgorithm Analysis\u201d in the \u201cAnalysis Type\u201d tab.<\/li>\n<li>Click \u201cStart\u201d.<\/li>\n<li>After data collection is completed, select the \u201cBottom-up\u201d tab. Then select \u201cProcess\/Module\/Function\/Thread\/Call Stack\u201d in the Grouping drop-down.<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/bottom-up-tab.png\" alt=\"\" width=\"925\" height=\"188\" class=\"aligncenter size-full wp-image-15065\" srcset=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/bottom-up-tab.png 925w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/bottom-up-tab-300x61.png 300w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/bottom-up-tab-768x156.png 768w\" sizes=\"(max-width: 925px) 100vw, 925px\" \/><\/p>\n<ul>\n<li>Expand \u201cdotnet.exe\u201d and then \u201clistadd.dll\u201d. This displays the managed function in our sample application \u2013 \u201cListSample::Program::ListAdd\u201d.<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/vtune-after-2.1.png\" alt=\"\" width=\"926\" height=\"215\" class=\"aligncenter size-full wp-image-15166\" \/><\/p>\n<ul>\n<li>Double-click the \u201cListSample::Program::ListAdd\u201d function. 
The source-level profile is displayed by default.<\/li>\n<li>To view the source and assembly profiles side by side, click the \u201cAssembly\u201d button at the top. Developers can then look at the snippets of code contributing the most to overall time and work on optimizing their code.<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/assembly.png\" alt=\"\" width=\"968\" height=\"900\" class=\"aligncenter size-full wp-image-15075\" srcset=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/assembly.png 968w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/assembly-300x279.png 300w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/assembly-768x714.png 768w\" sizes=\"(max-width: 968px) 100vw, 968px\" \/><\/p>\n<h2>Profiling a .NET Core application on Linux<\/h2>\n<p>This section shows how to use VTune Amplifier 2018 to profile a sample .NET Core application on Linux.<\/p>\n<ul>\n<li>Install <a href=\"https:\/\/software.intel.com\/en-us\/intel-vtune-amplifier-xe\">VTune Amplifier 2018<\/a>.<\/li>\n<li>Install the <a href=\"https:\/\/www.microsoft.com\/net\/core#linuxubuntu\">.NET Core 2.0 SDK<\/a>.<\/li>\n<li>Verify that .NET Core 2.0 was installed successfully by running \u201cdotnet --version\u201d.<\/li>\n<li>Run the command &#8220;dotnet new console -o listadd&#8221; to create a new skeleton project with the following structure:<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/skeleton-project-linux.1.png\" alt=\"\" width=\"1118\" height=\"168\" class=\"aligncenter size-full wp-image-15085\" \/><\/p>\n<ul>\n<li>Then replace the contents of Program.cs in the \u201clistadd\u201d folder with C# code that adds the elements of an integer List, available <a 
href=\"https:\/\/gist.github.com\/anonymous\/e271a4225c38eadeff02e023ba4d314f\">here<\/a>.<\/li>\n<li>Run the command \u201cdotnet build -c Release\u201d to create \u201clistadd.dll\u201d in the \u201c~\/listadd\/bin\/Release\/netcoreapp2.0\u201d folder.<\/li>\n<li>Now run the sample app: dotnet ~\/listadd\/bin\/Release\/netcoreapp2.0\/listadd.dll<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/run-sample-app-linux.1.png\" alt=\"\" width=\"1122\" height=\"165\" class=\"aligncenter size-full wp-image-15086\" \/><\/p>\n<ul>\n<li>Next let\u2019s use VTune Amplifier 2018 to profile the sample app. First, create a file called \u201cenvironment.sh\u201d with the following contents:<\/li>\n<\/ul>\n<pre style=\"padding-left: 90px\">echo 0 | sudo tee \/proc\/sys\/kernel\/watchdog\necho 0 | sudo tee \/proc\/sys\/kernel\/yama\/ptrace_scope\necho 0 | sudo tee \/proc\/sys\/kernel\/kptr_restrict\n<strong>export AMPLXE_EXPERIMENTAL=coreclr<\/strong>\ncd \/opt\/intel\/vtune_amplifier\nsudo &#8211;sh `source amplxe-vars.sh; amplxe-gui`<\/pre>\n<p style=\"padding-left: 60px\">Run the command \u201cchmod +x environment.sh\u201d followed by \u201c.\/environment.sh\u201d to launch VTune Amplifier with sudo privileges.<\/p>\n<p style=\"padding-left: 60px\">Note: When this preview feature becomes generally available in future VTune Amplifier releases, the environment setting will no longer be needed.<\/p>\n<p style=\"padding-left: 60px\"><strong><em>Update: The above environment variable (AMPLXE_EXPERIMENTAL) no longer needs to be set as of VTune 2018 Update 2.<\/em><\/strong><\/p>\n<ul>\n<li>Create a new VTune Amplifier project. Right-click the project and select \u201cNew Analysis\u201d.<\/li>\n<li>Use the \u201cLaunch Application\u201d mode as the target type in the \u201cAnalysis Target\u201d tab. 
Fill in the \u201cApplication\u201d and \u201cApplication parameters\u201d fields:\n<ul>\n<li>Application: \/usr\/bin\/dotnet<\/li>\n<li>Application Parameters: \/home\/perftest\/listadd\/bin\/Release\/netcoreapp2.0\/listadd.dll<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p style=\"padding-left: 60px\">Note: The locations of dotnet and listadd.dll shown above may differ in your environment. You can find the location of dotnet with \u201cwhich dotnet\u201d.<\/p>\n<ul>\n<li>Click \u201cChoose Analysis\u201d.<\/li>\n<li>Select \u201cAdvanced Hotspots\u201d mode under \u201cAlgorithm Analysis\u201d.<\/li>\n<li>Click \u201cStart\u201d.<\/li>\n<li>After data collection is completed, select the \u201cBottom-up\u201d tab. Then select \u201cProcess\/Module\/Function\/Thread\/Call Stack\u201d in the Grouping drop-down.<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/bottom-up-tab-linux-3.png\" alt=\"\" width=\"1011\" height=\"193\" class=\"aligncenter size-full wp-image-15095\" \/><\/p>\n<ul>\n<li>Expand \u201cdotnet\u201d and then \u201clistadd.dll\u201d. This displays the managed function in our application \u2013 \u201cProgram::ListAdd\u201d.<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/list-add-linux-2.1.png\" alt=\"\" width=\"1009\" height=\"193\" class=\"aligncenter size-full wp-image-15175\" \/><\/p>\n<ul>\n<li>Double-click the \u201cProgram::ListAdd\u201d function. A prompt to search for sources is displayed. Source-Assembly mapping is not yet enabled for Linux, so select \u201cShow Assembly\u201d to look at the JIT-generated code. 
Developers can then investigate the snippets of code contributing the most to overall time and work on optimizing their code.<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/assembly-linux.png\" alt=\"\" width=\"846\" height=\"705\" class=\"aligncenter size-full wp-image-15107\" srcset=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/assembly-linux.png 846w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/assembly-linux-300x250.png 300w, https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/assembly-linux-768x640.png 768w\" sizes=\"(max-width: 846px) 100vw, 846px\" \/><\/p>\n<h2>Real-world scenarios<\/h2>\n<h3>Scenario 1: C# optimizations<\/h3>\n<p>Let\u2019s start with the C# sample application referenced in the above instructions. VTune Amplifier shows that the majority of the CPU time is spent on the following statement:<\/p>\n<pre>foreach (int item in candidateList)<\/pre>\n<p>This can be rewritten as a for loop to avoid the overhead of the enumerator, as explained <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/ms973839.aspx\">here<\/a>. Replace the contents of Program.cs with the C# code available <a href=\"https:\/\/gist.github.com\/anonymous\/adc5364fac5f76234685829e12bf7e0e\">here<\/a>.<\/p>\n<p>We profiled the sample application with VTune Amplifier before and after the above change. 
The application ran for 2.667s<sup>1<\/sup> before the change:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/sample1-before1.1.png\" alt=\"\" width=\"744\" height=\"217\" class=\"aligncenter size-full wp-image-15185\" \/><\/p>\n<p>The application ran for 0.924s<sup>1<\/sup> after the change, a 65% reduction in run time over the original, achieved by avoiding the enumerator.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/sample1-after1.png\" alt=\"\" width=\"738\" height=\"192\" class=\"aligncenter size-full wp-image-15195\" \/><\/p>\n<p>The above is a simple illustration of how VTune Amplifier can be used to optimize .NET Core applications. Now let\u2019s take a look at a real-world scenario where we used VTune Amplifier to optimize .NET Core itself.<\/p>\n<h3>Scenario 2: Vector Min\/Max optimizations<\/h3>\n<p>Let\u2019s now look at a sample application that exercises Vector Min\/Max operations, available <a href=\"https:\/\/gist.github.com\/anonymous\/5eb20b3977a5b46eef7b3b4b493e5c22\">here<\/a>. We used VTune Amplifier for performance analysis to ensure JIT code quality.<\/p>\n<p>Here is the source-assembly mapping for Vector.Min and Vector.Max:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/sample2-assembly-before-2.png\" alt=\"\" width=\"960\" height=\"406\" class=\"aligncenter size-full wp-image-15135\" \/><\/p>\n<p>We noticed that the JIT code was not efficient because the Intel\u00ae Advanced Vector Extensions (Intel\u00ae AVX) forms of the integer min\/max instructions introduced in Intel\u00ae Streaming SIMD Extensions 4.1 (Intel\u00ae SSE4.1) were not being used. We added this support for the Vector&lt;T&gt; Min\/Max intrinsics, which led to more efficient code generation. 
Based on this work, we submitted a <a href=\"https:\/\/github.com\/dotnet\/coreclr\/pull\/9567\">PR<\/a> to CoreCLR, which was later merged, resulting in improved Vector&lt;T&gt; code quality.<\/p>\n<p>Here is the source-assembly mapping for Vector.Min &amp; Vector.Max after our PR was merged into the .NET Core repository:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/sample2-assembly-after.png\" alt=\"\" width=\"963\" height=\"367\" class=\"aligncenter size-full wp-image-15136\" \/><\/p>\n<p>The application ran for 8.189s<sup>2<\/sup> before our PR:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/sample2-before1-3.png\" alt=\"\" width=\"758\" height=\"191\" class=\"aligncenter size-full wp-image-15197\" \/><\/p>\n<p>The application ran for 5.353s<sup>2<\/sup> after our PR, a 35% reduction in run time over the original due to more efficient code generation:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2017\/10\/sample2-after1.1.png\" alt=\"\" width=\"732\" height=\"189\" class=\"aligncenter size-full wp-image-15196\" \/><\/p>\n<p>.NET developers can use VTune Amplifier to uncover similar performance bottlenecks in their applications.<\/p>\n<h2><strong>Summary<\/strong><\/h2>\n<p>The preview feature of VTune Amplifier 2018 for .NET Core JIT code profiling helps developers quickly locate performance hot spots in their applications, shortening the turnaround for optimizing them.<\/p>\n<h2><strong>References<\/strong><\/h2>\n<p>VTune Amplifier Product page: <a href=\"https:\/\/software.intel.com\/en-us\/intel-vtune-amplifier-xe\">https:\/\/software.intel.com\/en-us\/intel-vtune-amplifier-xe<\/a><\/p>\n<p>For more details on using the VTune Amplifier, see the product <a 
href=\"https:\/\/software.intel.com\/en-us\/vtune-amplifier-help\">online help<\/a>.<\/p>\n<p>For more complete information about compiler optimizations, see our\u00a0<a href=\"https:\/\/software.intel.com\/en-us\/articles\/optimization-notice#opt-en\">Optimization Notice<\/a>.<\/p>\n<div style=\"font-size: 9pt\">\n<p>No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.<\/p>\n<p>This document contains information on products, services and\/or processes in development.\u00a0 All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.<\/p>\n<p>The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.<\/p>\n<p>Copies of documents which have an order number and are referenced in this document may be obtained by calling\u00a0<span>1-800-548-4725<\/span>\u00a0or by visiting\u00a0<a href=\"http:\/\/www.intel.com\/design\/literature.htm\"><strong>www.intel.com\/design\/literature.htm<\/strong><\/a>.<\/p>\n<p>Intel, the Intel logo, Xeon, VTune Amplifier are trademarks of Intel Corporation in the U.S. and\/or other countries.<\/p>\n<p>*Other names and brands may be claimed as the property of others<\/p>\n<p>\u00a9 Intel Corporation.<\/p>\n<p>\u00a7\u00a0(1) As measured by using the VTune Amplifier on the ListAdd application provided in this document\n\u00a7 Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. 
Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.\n\u00a7 Configurations: Ran VTune Amplifier with the ListAdd sample on an Intel\u00ae server powered by Intel(R) Xeon(R) Platinum 8170 CPU @2.1GHz with 192 GB RAM running Windows Server 2016\n\u00a7 For more information go to\u00a0<a href=\"http:\/\/www.intel.com\/benchmarks\"><strong>www.intel.com\/benchmarks<\/strong><\/a><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Last Updated: 2018-04-30 This post was written by\u00a0Varun Venkatesan, Li Tian, and\u00a0Juan Rodriguez, engineers at Intel. They are excited to share .NET Core-specific enhancements that Intel has made to VTune Amplifier 2018. We&#8217;re excited to have\u00a0a new tool to use to help make .NET Core faster on Intel chips. Intel has been a strong partner [&hellip;]<\/p>\n","protected":false},"author":336,"featured_media":58792,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[685],"tags":[],"class_list":["post-15025","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-dotnet"],"acf":[],"blog_post_summary":"<p>Last Updated: 2018-04-30 This post was written by\u00a0Varun Venkatesan, Li Tian, and\u00a0Juan Rodriguez, engineers at Intel. They are excited to share .NET Core-specific enhancements that Intel has made to VTune Amplifier 2018. We&#8217;re excited to have\u00a0a new tool to use to help make .NET Core faster on Intel chips. 
Intel has been a strong partner [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/15025","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/users\/336"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/comments?post=15025"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/15025\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media\/58792"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media?parent=15025"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/categories?post=15025"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/tags?post=15025"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}