{"id":476,"date":"2021-12-08T13:27:50","date_gmt":"2021-12-08T21:27:50","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/?p=476"},"modified":"2022-06-29T11:12:06","modified_gmt":"2022-06-29T18:12:06","slug":"perfetto-tooling-for-analyzing-android-linux-and-chromium-browser-performance-microsoft-performance-tools-linux-android","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/perfetto-tooling-for-analyzing-android-linux-and-chromium-browser-performance-microsoft-performance-tools-linux-android\/","title":{"rendered":"Perfetto tooling for analyzing Android, Linux, and Chromium browser performance &#8211; Microsoft-Performance-Tools-Linux-Android"},"content":{"rendered":"<h2>Introduction<\/h2>\n<p>In the last <a href=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/new-tools-for-analyzing-android-linux-and-chromium-browser-performance\/\">blog post<\/a>, we introduced the cross platform open-source .NET Core <a href=\"https:\/\/github.com\/microsoft\/Microsoft-Performance-Tools-Linux-Android\">Microsoft-Performance-Tools-Linux-Android<\/a> tooling. Recently, we just released version 1.2 adding Perfetto support, which we will cover here.<\/p>\n<p><img decoding=\"async\" width=\"276\" height=\"80\" class=\"wp-image-478\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/a-picture-containing-logo-description-automatical-1.png\" alt=\"A picture containing logo Description automatically generated\" \/><\/p>\n<p><a href=\"https:\/\/perfetto.dev\">Perfetto<\/a> is Google\u2019s open-source tracing ecosystem covering Linux kernel tracing (and user-mode) and built into Android. Perfetto is best-in-class for Android tracing. 
The Perfetto ecosystem covers System Profiling, In-App Tracing, Trace Viewer, and Trace Analysis.<\/p>\n<p>The Perfetto ecosystem is Google\u2019s equivalent of the Microsoft Event Tracing for Windows (ETW) ecosystem. We worked with Google to complete their vision of making Perfetto <a href=\"https:\/\/perfetto.dev\/docs\/analysis\/trace-processor\">trace processor<\/a> truly available cross platform, including now on Windows!<\/p>\n<p>With this trace processor support in place, we were able to add value building on top of this great core, to create the Perfetto Microsoft-Performance-Tools-Linux-Android tooling. Among other things, the tooling newly exposes the Perfetto trace processor via .NET Core &amp; C#, does unique post-processing and analysis, and optionally exposes the data to Windows Performance Analyzer (WPA).<\/p>\n<p>In the rest of this blog post, we will walk you through analyzing a simple example trace on <a href=\"https:\/\/android-developers.googleblog.com\/2021\/10\/android-12-is-live-in-aosp.html\">Android Open Source Project (AOSP) 12<\/a>. I will give you a quick tour to get started capturing &amp; analyzing a scenario, otherwise this blog post would get too long.<\/p>\n<p>Android 12, just recently released, has some exciting new trace collection features that devs will love! We have found these similar tracing capture features quite useful for Windows analysis.<\/p>\n<ul>\n<li><a href=\"https:\/\/perfetto.dev\/docs\/reference\/trace-config-proto#PerfEventConfig.CallstackSampling\">CPU Sampling \/w callstacks<\/a> to see at the exact function level of detail what code is executing on the CPU<\/li>\n<li><a href=\"https:\/\/perfetto.dev\/docs\/data-sources\/cpu-scheduling#scheduling-wakeups-and-latency-analysis\">CPU Scheduling Wakeup and latency analysis<\/a> allowing you to see quickly the longest single or cumulative latency delays between threads. E.g. 
a thread was blocked for x ms and was woken up (alerted) by another process\/thread.<\/li>\n<\/ul>\n<p>If you take a trace on an older Android OS, such as Android 10 (Q) or Android 11, then you still get great tracing such as: CPU, GPU, Power, Memory, Android Apps &amp; Svcs, and Chrome. See <a href=\"https:\/\/ui.perfetto.dev\/#!\/record\/cpu\">record new trace<\/a>.<\/p>\n<p>In this post we will use the following tools and walk through each of them:<\/p>\n<ul>\n<li><a href=\"https:\/\/developer.android.com\/studio\/\">Android Studio<\/a> with an Android 12 target<\/li>\n<li><a href=\"https:\/\/developer.android.com\/studio\/command-line\/adb\">Android Debug Bridge (adb)<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/microsoft\/Microsoft-Performance-Tools-Linux-Android\/releases\">Microsoft-Performance-Tools-Linux-Android<\/a><\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/p\/windows-performance-analyzer-preview\/9n58qrw40dfw\">Windows Performance Analyzer (Preview)<\/a><\/li>\n<\/ul>\n<h2>Perfetto \u2013 Trace Capture<\/h2>\n<ol>\n<li>First off, we will need a target device, real or virtual, to capture a trace on. So that anyone can follow along on multiple OSes, we will be using <a href=\"https:\/\/developer.android.com\/studio\/\">Android Studio<\/a> and an Android 12 target in Android Virtual Device Manager. 
Once the device image is downloaded you should see something like this<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"2362\" height=\"418\" class=\"wp-image-479\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-application-description-1.png\" alt=\"Graphical user interface, application Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-application-description-1.png 2362w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-application-description-1-300x53.png 300w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-application-description-1-1024x181.png 1024w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-application-description-1-768x136.png 768w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-application-description-1-1536x272.png 1536w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-application-description-1-2048x362.png 2048w\" sizes=\"(max-width: 2362px) 100vw, 2362px\" \/><\/p>\n<ol start=\"2\">\n<li>Any project which gets the VM to run should be good, but I chose the Primary\/Detail Flow template in Android Studio<\/li>\n<li>I hit \u2018Run app\u2019 and the Android Emulator running Android 12 starts and launches the example app. 
After hitting the launcher home button, we are back on the home screen ready to start a trace and launch a scenario.<\/li>\n<li>In general, with tracing you will need to decide on what scenario you are trying to capture and then somewhat customize the tracing to be appropriate to that scenario.<\/li>\n<li>I decided for this scenario that I will simply capture Chrome launch from the home screen.<\/li>\n<li>There are various methods to capture tracing on Android documented <a href=\"https:\/\/perfetto.dev\/docs\/quickstart\/android-tracing\">here<\/a>. To get started faster capturing a trace, I am a fan of the trace capture GUIs, of which there are at least two:\n<ol>\n<li><a href=\"https:\/\/developer.android.com\/studio\/debug\/dev-options\">On-Device Developer Options<\/a> Debugging -&gt; System Tracing (Record system activity and analyze it later to improve performance).\n<ol>\n<li>Under System Tracing you can choose Tracing Categories and record a trace. Traces will be recorded under \/data\/misc\/perfetto-traces\/, and can be pulled off a device with \u2018adb pull\u2019.<\/li>\n<li>If you do not need advanced tracing configs, this works well, say, for the current version of Windows Subsystem for Android (WSA), where we can simply pick the categories that we want to trace. It is available via the WSA Settings app -&gt; Manage developer settings.<\/li>\n<\/ol>\n<\/li>\n<li><a href=\"https:\/\/ui.perfetto.dev\/#!\/record\">Perfetto Record Trace GUI<\/a>, which supports generating trace configs and connecting over USB to a device. We will use the GUI to get a trace config. We will not connect over USB since this is a local emulator.<\/li>\n<\/ol>\n<\/li>\n<li>Since we want to capture CPU Sampling, this is currently an advanced configuration that is not yet fully supported by the <a href=\"https:\/\/ui.perfetto.dev\/#!\/record\/cpu\">Perfetto Record Trace GUI<\/a>. 
Therefore, we will use an advanced and customized configuration, but start with the GUI; in order to make our life easier<\/li>\n<li>I do like to start easy with a base configuration applicable to the scenario and enabled by the GUI. I picked these settings under Probes:\n<ol>\n<li>CPU\n<ol>\n<li>Scheduling Details<\/li>\n<li>CPU Frequency and idle states<\/li>\n<\/ol>\n<\/li>\n<li>GPU\n<ol>\n<li>GPU Frequency<\/li>\n<li>GPU Memory<\/li>\n<\/ol>\n<\/li>\n<li>Power\n<ol>\n<li>Battery drain &amp; power rails<\/li>\n<\/ol>\n<\/li>\n<li>Memory\n<ol>\n<li>Kernel meminfo<\/li>\n<li>Low memory killer<\/li>\n<li>Per process stats<\/li>\n<li>Virtual memory stats<\/li>\n<\/ol>\n<\/li>\n<li>Android apps &amp; svcs\n<ol>\n<li>Event log (logcat)<\/li>\n<\/ol>\n<\/li>\n<li>Chrome (I am only tracing Chrome because it is applicable to this scenario)\n<ol>\n<li>Task scheduling<\/li>\n<li>Web content rendering, layout and compositing<\/li>\n<li>UI rendering &amp; surface compositing<\/li>\n<li>Input events<\/li>\n<li>Navigation &amp; Loading<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"1368\" height=\"1248\" class=\"wp-image-480\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-application-descr-1.png\" alt=\"Graphical user interface, text, application Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-application-descr-1.png 1368w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-application-descr-1-300x274.png 300w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-application-descr-1-1024x934.png 1024w, 
https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-application-descr-1-768x701.png 768w\" sizes=\"(max-width: 1368px) 100vw, 1368px\" \/><\/p>\n<p>Figure 2 &#8211; Perfetto Record UI &#8211; CPU Probes<\/p>\n<ol start=\"9\">\n<li>Under \u2018Recording settings\u2019 I increased the \u2018Max duration\u2019 from 10s to 1m.<\/li>\n<li>From here you can choose \u2018Recording Command\u2019 to get the current Perfetto recording configuration.<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"1338\" height=\"1194\" class=\"wp-image-481\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-description-automa-2.png\" alt=\"Graphical user interface, text Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-description-automa-2.png 1338w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-description-automa-2-300x268.png 300w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-description-automa-2-1024x914.png 1024w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-description-automa-2-768x685.png 768w\" sizes=\"(max-width: 1338px) 100vw, 1338px\" \/><\/p>\n<p>Figure 3 &#8211; Perfetto UI &#8211; Recording command<\/p>\n<ol start=\"11\">\n<li>Copy this into a text editor as we will be adding a bit of custom configuration. Only keep the text between the EOFs. 
I have not found the adb command to work reliably as-is.<\/li>\n<li>Near the end of the file, but before the final duration_ms setting, I added a configuration that samples the CPUs of the entire system every ~1 ms. The frequency is set to 1000 Hz, i.e. one sample every 1\/1000 s = 1 ms. See <a href=\"https:\/\/perfetto.dev\/docs\/reference\/trace-config-proto#PerfEventConfig\">PerfEventConfig<\/a> &amp; <a href=\"https:\/\/perfetto.dev\/docs\/reference\/trace-config-proto#PerfEventConfig.CallstackSampling\">PerfEventConfig.CallstackSampling<\/a> for the full syntax.<\/li>\n<\/ol>\n<pre class=\"prettyprint\">data_sources: {\r\n    config {\r\n        name: \"linux.perf\"\r\n        perf_event_config {\r\n            timebase {\r\n                frequency: 1000\r\n            }\r\n            callstack_sampling {\r\n                kernel_frames: true\r\n            }\r\n        }\r\n    }\r\n}<\/pre>\n<ol start=\"13\">\n<li>Save the advanced tracing config as the filename <strong>perfetto_trace_config<\/strong>.<\/li>\n<li>To execute the advanced tracing and transfer the resultant trace file off the device we need to download\/use <a href=\"https:\/\/developer.android.com\/studio\/command-line\/adb\">Android Debug Bridge (adb)<\/a>.<\/li>\n<li>With the device running and adb downloaded, list the devices\n<ol>\n<li>\n<pre class=\"prettyprint\">adb devices<\/pre>\n<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"672\" height=\"100\" class=\"wp-image-483\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-description-automa-3.png\" alt=\"Graphical user interface, text Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-description-automa-3.png 672w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-text-description-automa-3-300x45.png 300w\" 
sizes=\"(max-width: 672px) 100vw, 672px\" \/><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ol start=\"2\">\n<li>If you don\u2019t see your device present, you can use \u2018adb connect\u2019<\/li>\n<\/ol>\n<\/li>\n<\/ul>\n<ol start=\"16\">\n<li>Upload the tracing configuration to the device. Note: You can add \u2018-s DEVICE_NAME\u2019 if you have multiple devices\n<ol>\n<li>\n<pre class=\"prettyprint\">adb push perfetto_trace_config \/data\/local\/tmp<\/pre>\n<\/li>\n<\/ol>\n<\/li>\n<li>Connect to the shell\n<ol>\n<li>\n<pre class=\"prettyprint\">adb shell<\/pre>\n<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"908\" height=\"64\" class=\"wp-image-484\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/word-image-3.png\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/word-image-3.png 908w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/word-image-3-300x21.png 300w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/word-image-3-768x54.png 768w\" sizes=\"(max-width: 908px) 100vw, 908px\" \/><\/p>\n<ol start=\"18\">\n<li>Inside the adb shell, start tracing with the advanced tracing command\n<ol>\n<li>\n<pre class=\"prettyprint\">cat \/data\/local\/tmp\/perfetto_trace_config | perfetto -o \/data\/misc\/perfetto-traces\/perfetto_trace.pftrace --txt -c -<\/pre>\n<\/li>\n<li>Tracing will now start.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"1200\" height=\"32\" class=\"wp-image-485\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/word-image-4.png\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/word-image-4.png 1200w, 
https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/word-image-4-300x8.png 300w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/word-image-4-1024x27.png 1024w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/word-image-4-768x20.png 768w\" sizes=\"(max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<ol start=\"19\">\n<li>Now we are ready to start capturing our scenario. Execute a scenario you want to capture a trace for. For this example, I will simply capture Chrome launch from the home screen.\n<ol>\n<li>Click on the Chrome icon in the Android Emulator<\/li>\n<\/ol>\n<\/li>\n<li>Once complete with executing the scenario, stop the trace with Ctrl-C<\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<ol start=\"21\">\n<li>Exit the Shell and transfer\/pull the trace off the device\n<ol>\n<li>\n<pre class=\"prettyprint\">exit<\/pre>\n<\/li>\n<li>\n<pre class=\"prettyprint\">adb pull \/data\/misc\/perfetto-traces\/perfetto_trace.pftrace c:\\temp<\/pre>\n<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<h2>Trace Analysis with Microsoft-Performance-Tools-Linux-Android and WPA<\/h2>\n<ol>\n<li>We will be using <a href=\"https:\/\/www.microsoft.com\/en-us\/p\/windows-performance-analyzer-preview\/9n58qrw40dfw\">WPA (Preview)<\/a> UI for the rest of the analysis and screenshots, so install it from the Microsoft Store.\n<ol>\n<li>Note: Older versions of WPA are not compatible with the SDK and the toolkit<\/li>\n<\/ol>\n<\/li>\n<li>Download the Microsoft-Performance-Tools-Linux-Android 1.2 toolset (or later) from GitHub <a href=\"https:\/\/github.com\/microsoft\/Microsoft-Performance-Tools-Linux-Android\/releases\">releases<\/a><\/li>\n<li>Extract the zip file and navigate to Microsoft-Performance-Tools-Linux-Android\\Launcher\\Windows\n<ol>\n<li>Given the release came from the Internet, you may need to unblock the .bat or .ps1 file using right-click properties 
unblock<\/li>\n<\/ol>\n<\/li>\n<li>Double-click LaunchWpaPerfToolsLinuxAndroid.bat which will launch the WPA UI pre-configured to load the plugins<\/li>\n<li>Once WPA is loaded, click Help -&gt; About and you should see a bunch of plugins pre-loaded including PerfettoTraceDataSource and PftraceDataSource<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"644\" height=\"902\" class=\"wp-image-487\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/text-letter-description-automatically-generated-1.png\" alt=\"Text, letter Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/text-letter-description-automatically-generated-1.png 644w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/text-letter-description-automatically-generated-1-214x300.png 214w\" sizes=\"(max-width: 644px) 100vw, 644px\" \/><\/p>\n<ol start=\"6\">\n<li>From here you should be able to open Perfetto trace files from the File -&gt; Open menu<\/li>\n<li>Let\u2019s open our trace we already pulled from the device &#8211; perfetto_trace.pftrace<\/li>\n<li>Once open the toolset will show a progress bar loading the trace and display tables once the trace load is complete\n<ol>\n<li>Under the hood, the plugin runs Google\u2019s trace_processor_shell.exe and executes many advanced queries against the trace transferring data via protobuf.<\/li>\n<li>The queries are then joined together as appropriate, post processed, and enhanced with additional calculations &amp; metadata to make the trace analysis experience more useful.<\/li>\n<li>You can see detailed queries and trace load information in the WPA Diagnostic Console<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<h2>Analyzing our example Android 12 trace<\/h2>\n<ol>\n<li>Recall that the scenario I chose to execute\/trace was stock AOSP 12 image simply launching Chrome and 
navigating to Wikipedia<\/li>\n<li>Once the trace is loaded you will see various top-level graph categories on the left in Graph Explorer<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"400\" height=\"672\" class=\"wp-image-488\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/calendar-description-automatically-generated-with-1.png\" alt=\"Calendar Description automatically generated with medium confidence\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/calendar-description-automatically-generated-with-1.png 400w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/calendar-description-automatically-generated-with-1-179x300.png 179w\" sizes=\"(max-width: 400px) 100vw, 400px\" \/><\/p>\n<ol start=\"3\">\n<li>I like to start with a top-level view of what the system is doing which is usually what work is on the CPU. You can double click or drag CPU Scheduler events<\/li>\n<li>In CPU Scheduler events I prefer to change the View Preset to \u201cUtilization by Process, Thread\u201d and the \u201cChart Type\u201d to \u201cStacked Lines\u201d. 
I also zoom into the graph so that the data fully covers the width of the chart, and so that the CPU % calculations are correct.\n<ol>\n<li>Here we can easily see the top processes: traced_perf, com.android.chrome, etc<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p><img decoding=\"async\" class=\"wp-image-489\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/chart-description-automatically-generated-1.png\" alt=\"Chart Description automatically generated\" width=\"2047\" height=\"307\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/chart-description-automatically-generated-1.png 2500w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/chart-description-automatically-generated-1-300x45.png 300w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/chart-description-automatically-generated-1-1024x154.png 1024w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/chart-description-automatically-generated-1-768x115.png 768w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/chart-description-automatically-generated-1-1536x231.png 1536w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/chart-description-automatically-generated-1-2048x307.png 2048w\" sizes=\"(max-width: 2047px) 100vw, 2047px\" \/><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ol start=\"2\">\n<li>You can easily drill into each process to see which thread is consuming CPU<\/li>\n<li>There is some neat WaitDuration information added to show how long a thread was waiting or blocked for. 
This wait duration time is calculated based on <a href=\"https:\/\/perfetto.dev\/docs\/data-sources\/cpu-scheduling#scheduling-wakeups-and-latency-analysis\">when a thread is woken up<\/a> and is an in-depth topic we can explore another time.<\/li>\n<\/ol>\n<\/li>\n<\/ul>\n<ol start=\"5\">\n<li>Now we may want to see in detail what functions\/stacks were executing on the CPU during this time, which is where the Android 12+ CPU Sampling comes in handy. If you don\u2019t have the CPU samples available, then you are stuck with seeing only the name of the thread consuming CPU and hoping there are other useful logs available that might give context on the work being done.<\/li>\n<li>Expand the Perfetto \u2013 System node in Graph Explorer so that we can see \u201cCPU Sampling Events\u201d. Again, double click or drag to add to the Analysis tab.<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"292\" height=\"722\" class=\"wp-image-490\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/calendar-description-automatically-generated-1.png\" alt=\"Calendar Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/calendar-description-automatically-generated-1.png 292w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/calendar-description-automatically-generated-1-121x300.png 121w\" sizes=\"(max-width: 292px) 100vw, 292px\" \/><\/p>\n<ol start=\"7\">\n<li>In CPU Sampling, I switched the View Preset to \u201cBy Process, Thread, Stack\u201d and as expected we see our top processes again: traced_perf and com.android.chrome.<\/li>\n<\/ol>\n<p><img decoding=\"async\" class=\"wp-image-491\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/a-picture-containing-application-description-auto-1.png\" alt=\"A picture containing application 
Description automatically generated\" width=\"2139\" height=\"279\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/a-picture-containing-application-description-auto-1.png 2500w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/a-picture-containing-application-description-auto-1-300x39.png 300w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/a-picture-containing-application-description-auto-1-1024x134.png 1024w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/a-picture-containing-application-description-auto-1-768x100.png 768w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/a-picture-containing-application-description-auto-1-1536x200.png 1536w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/a-picture-containing-application-description-auto-1-2048x267.png 2048w\" sizes=\"(max-width: 2139px) 100vw, 2139px\" \/><\/p>\n<ol start=\"8\">\n<li>Once expanded (not shown), interesting things show up here like the traced_perf stack-unwinding thread is performing a lot of Maps parsing. However, since we are interested in Chrome right now we will ignore this and expand the Chrome process. If you can expand enough (or right keyboard arrow) you will see top stacks start to expand. The top occurring stacks are shown at the top of the table; thus ensuring you are looking at the most important data first.<\/li>\n<li>For example, we see 48 samples corresponding to ~48ms of CPU time spent at the beginning of the trace showing the callstack involved in the launcher and starting Chrome. 
In addition, simply by selecting rows in the table, you also get auto-highlighting on the graph where those samples are included on the CPU Scheduling graph!<\/li>\n<\/ol>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/12\/WPAPerfettoStack.jpg\"><img decoding=\"async\" class=\"alignnone size-full wp-image-508\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/12\/WPAPerfettoStack.jpg\" alt=\"Image WPAPerfettoStack\" width=\"1248\" height=\"592\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/12\/WPAPerfettoStack.jpg 1248w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/12\/WPAPerfettoStack-300x142.jpg 300w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/12\/WPAPerfettoStack-1024x486.jpg 1024w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/12\/WPAPerfettoStack-768x364.jpg 768w\" sizes=\"(max-width: 1248px) 100vw, 1248px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<ol start=\"10\">\n<li>Almost done! Each process can do its own in process business logic logging to help provide detail and context about what and <em>why<\/em> it is doing the work it\u2019s doing. Here, Chrome is logging detailed info out via Perfetto due to our trace configuration. You can load up these events in the \u201cGeneric Events\u201d graph and get both a visual as well as text representation of the data. Open \u201cGeneric Events\u201d under \u201cPerfetto \u2013 Events\u201d.<\/li>\n<li>Here you can see we zoomed in at the start of Chrome launch and the CrBrowserMain thread is loading profiles via the Profile::CreateProfile function. 
You can see how long the operation took and see the sampled CPU callstacks matched on the same timeline.<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"2500\" height=\"1202\" class=\"wp-image-492\" src=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-description-automaticall-1.png\" alt=\"Graphical user interface Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-description-automaticall-1.png 2500w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-description-automaticall-1-300x144.png 300w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-description-automaticall-1-1024x492.png 1024w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-description-automaticall-1-768x369.png 768w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-description-automaticall-1-1536x738.png 1536w, https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-content\/uploads\/sites\/64\/2021\/11\/graphical-user-interface-description-automaticall-1-2048x985.png 2048w\" sizes=\"(max-width: 2500px) 100vw, 2500px\" \/><\/p>\n<ol start=\"12\">\n<li>You can explore other graphs such as: Logcat events, FTrace events, CPU Frequency scaling, Process Memory, and System Memory. These show up because our trace capture configuration specified that they be collected.<\/li>\n<\/ol>\n<h2>Wrap-up<\/h2>\n<p>Hopefully you can use these powerful Perfetto tools on Android &amp; Chrome to gain insight into what the system is doing and how your code is performing. 
These are powerful tools you can use as a dev to improve perf!<\/p>\n<p>We walked through how to configure trace capturing and added some new Android 12+ goodness with CPU Sampling.<\/p>\n<p>If the traced system is running inside, say, a Windows VM, such as Windows Subsystem for Android (WSA), you can optionally co-load an Android Perfetto trace in the same timeline as the Windows ETW trace.<\/p>\n<h2>Bonus \/ Next Steps &#8211; Deeper dive into the WPA UI<\/h2>\n<p>You may be interested in these other blog posts covering WPA in more detail. The great thing about the integration with WPA is that the Microsoft-Performance-Tools-Linux-Android plugins are considered first class right along with Windows ETW support.<\/p>\n<p>This means that almost every powerful WPA feature covered in these blog posts applies equally, with the same experience, to the Microsoft-Performance-Tools-Linux-Android plugins. Enjoy!<\/p>\n<h5><a href=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wpa-intro\/\">Windows Performance Analyzer \u201cWPA\u201d Intro<\/a><\/h5>\n<h5><a href=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wpa-table-graph-configurations-part-1\/\">Windows Performance Analyzer \u2013 Table &amp; Graph Configurations (Part 1)<\/a><\/h5>\n<h5><a href=\"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wpa-table-graph-configurations-part-2\/\">WPA: Table &amp; Graph Configurations (Part 2)<\/a><\/h5>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction In the last blog post, we introduced the cross platform open-source .NET Core Microsoft-Performance-Tools-Linux-Android tooling. Recently, we just released version 1.2 adding Perfetto support, which we will cover here. Perfetto is Google\u2019s open-source tracing ecosystem covering Linux kernel tracing (and user-mode) and built into Android. Perfetto is best-in-class for Android tracing. 
The Perfetto ecosystem [&hellip;]<\/p>\n","protected":false},"author":77202,"featured_media":76,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[39,38,2],"tags":[43,44,26,14],"class_list":["post-476","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-linux-android","category-microsoft-performance-tools","category-windows-performance-analyzer","tag-android","tag-perfetto","tag-performance","tag-windows-performance-analyzer"],"acf":[],"blog_post_summary":"<p>Introduction In the last blog post, we introduced the cross platform open-source .NET Core Microsoft-Performance-Tools-Linux-Android tooling. Recently, we just released version 1.2 adding Perfetto support, which we will cover here. Perfetto is Google\u2019s open-source tracing ecosystem covering Linux kernel tracing (and user-mode) and built into Android. Perfetto is best-in-class for Android tracing. 
The Perfetto ecosystem [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-json\/wp\/v2\/posts\/476","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-json\/wp\/v2\/users\/77202"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-json\/wp\/v2\/comments?post=476"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-json\/wp\/v2\/posts\/476\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-json\/wp\/v2\/media\/76"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-json\/wp\/v2\/media?parent=476"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-json\/wp\/v2\/categories?post=476"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/performance-diagnostics\/wp-json\/wp\/v2\/tags?post=476"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}