{"id":4773,"date":"2008-11-12T12:43:00","date_gmt":"2008-11-12T12:43:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/vcblog\/2008\/11\/12\/pogo\/"},"modified":"2019-02-18T18:53:58","modified_gmt":"2019-02-18T18:53:58","slug":"pogo","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/cppblog\/pogo\/","title":{"rendered":"POGO"},"content":{"rendered":"<p class=\"MsoBodyText\"><font face=\"Verdana\"><b><span>Pogo aka PGO aka Profile Guided Optimization<\/span><\/b><span> <\/span><\/font><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">My name is Lawrence Joel and I am a Software Developer Engineer in Testing working with the C\/C++ Backend Compiler group. &nbsp;For today&#8217;s topic I want to blog about a pretty cool compiler optimization called Profile Guided Optimization (PGO, or Pogo as we in the C\/C++ team like to call it). &nbsp;The tool is available for Microsoft Visual C\/C++ 2005 and up. &nbsp;In this post I will describe what PGO is, how it can improve your application, and how to use it. <\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><font face=\"Verdana\"><b><span>What is PGO?<\/span><\/b><span> <\/span><\/font><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">PGO is an approach to optimization where the compiler uses profile information to make better optimization decisions for the program. &nbsp;Profiling is the process of gathering information about how the program is used at run time. &nbsp;In a nutshell, PGO bases its optimizations on user scenarios, whereas static optimizations rely on the source file structure. <\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><font face=\"Verdana\"><span>PGO takes a three-phase approach.<span>&nbsp; <\/span>The first phase is the instrumentation phase (see Figure 1).<span>&nbsp; <\/span>In this phase, <\/span>the linker takes the CIL files (these are produced by the front-end compiler with the \/GL flag, e.g. 
<i>cl.exe foo.cpp \/GL<\/i>) and passes the modules to the C\/C++ Backend Compiler.<span>&nbsp; <\/span>The Backend Compiler then inserts probe instructions wherever necessary.<span>&nbsp; <\/span>A .pgd file is created alongside the executable; this is a database file that is used in the later phases.<span>&nbsp; <\/span>Note that the instrumented executable is larger because of the probes.<\/font><\/p>\n<p class=\"MsoBodyText\"><img decoding=\"async\" height=\"217\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/9\/2019\/02\/Pogo%201.jpg\" width=\"440\" align=\"left\"><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\"><strong><\/strong><\/font><\/span>&nbsp;<\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\"><strong><\/strong><\/font><\/span>&nbsp;<\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\"><strong><\/strong><\/font><\/span>&nbsp;<\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\"><strong><\/strong><\/font><\/span>&nbsp;<\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\"><strong><\/strong><\/font><\/span>&nbsp;<\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\"><strong><\/strong><\/font><\/span>&nbsp;<\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\"><strong><\/strong><\/font><\/span>&nbsp;<\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\"><strong><a class=\"\" href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/9\/2019\/02\/Pogo%201.jpg\">Figure 1: Instrumentation Phase<\/a><\/strong>&nbsp;<\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">The second phase is the training phase.<span>&nbsp; <\/span>This is where you run the executable under different scenarios.<span>&nbsp; <\/span>The probes record runtime information and save the data to a .pgc file.<span>&nbsp; <\/span>After each run an <i>appname<\/i>!<i>#<\/i>.pgc file is created (where appname is the name of the running application and # is 1 + 
the number of <i>appname<\/i>!<i>#<\/i>.pgc files in the directory).<span>&nbsp; <\/span>For example, in Figure 2, each scenario run of the executable collects method call information and records it in a .pgc file.<span>&nbsp; <\/span>Note: to use PGO effectively, make sure that your scenarios provide good coverage of your application.<\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span><img decoding=\"async\" height=\"373\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/9\/2019\/02\/POGO%202.jpg\" width=\"544\"><\/span><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\"><strong><a class=\"\" href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/9\/2019\/02\/POGO%202.jpg\">Figure 2: Training Phase<\/a><\/strong><\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">The third phase is the PG Optimization phase (see Figure 3).<span>&nbsp; <\/span>In this phase the .pgc files are merged into the .pgd file, which the C\/C++ Backend Compiler then uses to make better optimization decisions on the code and thus produce a more efficient executable.<\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/9\/2019\/02\/POGO%203.jpg\"><\/span><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\"><strong><a class=\"\" href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/9\/2019\/02\/POGO%203.jpg\">Figure 3: PG Optimization Phase<\/a><\/strong><\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><font face=\"Verdana\"><b><span>What advantages does Pogo provide?<\/span><\/b><span> <\/span><\/font><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">Pogo optimizes the most commonly touched areas in a program. 
&nbsp;The compiler has a better idea of the application&#8217;s common inputs and control flow. &nbsp;Here is a partial list of the optimizations that PGO provides: <\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span>\u00b7<\/span><span><font face=\"Verdana\">&nbsp;&nbsp;&nbsp; &nbsp;Inlining \u2013 By weighting method calls by the number of calls per execution, the compiler can make better inlining decisions. <\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span>\u00b7<\/span><span><font face=\"Verdana\">&nbsp;&nbsp;&nbsp; &nbsp;Virtual Call Speculation \u2013 If a particular derived type is often passed into a method, then its override method can be inlined.&nbsp; This helps by limiting the number of calls made through the vtable. <\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span>\u00b7<\/span><span><font face=\"Verdana\">&nbsp;&nbsp;&nbsp; &nbsp;Basic Block Reordering \u2013 This optimization finds the most executed paths and places the basic blocks of those paths spatially closer together.<span>&nbsp; <\/span>This improves locality, making better use of the instruction cache and branch prediction.<span>&nbsp; <\/span>Also, code that is not used during the training phase is moved to the bottommost section.<span>&nbsp; <\/span>Doing this together with \u201cfunction layout\u201d described below can significantly reduce the working set (the number of pages used in a given time interval) of sizeable applications.<\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span>\u00b7<\/span><span><font face=\"Verdana\">&nbsp;&nbsp;&nbsp; &nbsp;Size\/Speed Optimization &#8211; With profile information, the compiler can find out how frequently each function is used. &nbsp;With this information the compiler can optimize for speed the functions that are used more frequently and optimize for size the functions that are used less frequently. 
<\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span>\u00b7<\/span><span><font face=\"Verdana\">&nbsp;&nbsp;&nbsp; &nbsp;Function Layout &#8211; Functions are placed in the same section if the profile scenarios show that they are mainly used together. <\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span>\u00b7<\/span><span><font face=\"Verdana\">&nbsp;&nbsp;&nbsp; &nbsp;Conditional Branch Optimization &#8211; if\/else blocks are one example. &nbsp;If the condition is more often false than true, it is better to place the else block before the if block. <\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">&nbsp;<\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><font face=\"Verdana\"><b><span>How to use PGO?<\/span><\/b><span> <\/span><\/font><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">Here are the steps for standard usage of PGO: <\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span>1.<\/span><span><font face=\"Verdana\">&nbsp;&nbsp;&nbsp; &nbsp;Compile the source code files that you want to be profiled with the \/GL flag. <\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span>2.<\/span><span><font face=\"Verdana\">&nbsp;&nbsp;&nbsp; &nbsp;Link all the files with \/LTCG:PGINSTRUMENT (or \/LTCG:PGI)<\/font><\/span><span>.<span>&nbsp; <\/span><\/span><span><font face=\"Verdana\">This creates a .PGD file alongside your executable file. 
Note that when you link with \/LTCG:PGI, some optimizations may be overridden to make way for the instrumentation.<span>&nbsp; <\/span>This affects the optimizations that you request with \/Ob, \/Os or \/Ot.<\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span>3.<\/span><span><font face=\"Verdana\">&nbsp;&nbsp;&nbsp; &nbsp;Train the application by running it with different scenarios<\/font><\/span><span>.<\/span><span><font face=\"Verdana\"> <\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span>4.<\/span><span><font face=\"Verdana\">&nbsp;&nbsp;&nbsp; &nbsp;Re-link the files with \/LTCG:PGOPTIMIZE <span>&nbsp;<\/span>(or \/LTCG:PGO) to produce an optimized image of the application. <\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">You might find yourself in a situation where you updated the source files after the .PGD file was created.<span>&nbsp; <\/span>If you were to re-link the object files with \/LTCG:PGO, the profile information would be ignored.<span>&nbsp; <\/span>If you made only small changes to the source files, it would be too costly to repeat the process of creating a new .PGD file and new .pgc files.<span>&nbsp; <\/span>To overcome this problem you can instead re-link the files with \/LTCG:PGUPDATE (or \/LTCG:PGU).<span>&nbsp; <\/span>This flag allows the linker to compile the new source code using the original .PGD file.<\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">Another useful part of PGO is the ability to manage your .pgc files.<span>&nbsp; <\/span>Visual Studio provides a tool called Pgomgr, which allows you to set priorities on the trained scenarios. 
For example, an ATM software company notices that the most common transactions performed on its software are withdrawals and deposits.<span>&nbsp; <\/span>It would be in its best interest to give those transactions a higher priority than the other transactions made on its software.<span>&nbsp; <\/span>This can be done by running the following: <i>pgomgr \/merge:2 appname!1.pgc appname.pgd<\/i>.<span>&nbsp; <\/span>This gives appname!1.pgc a weight of 2.<span>&nbsp; <\/span>The default weight for a .pgc file is 1.<span>&nbsp; <\/span>When the files are re-linked with \/ltcg:pgo or \/ltcg:pgu, appname!1.pgc will have higher priority than the other .pgc scenarios.<\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">If you want to gather profile information only within an interval of execution or time, there are a couple of ways to go about it.<span>&nbsp; <\/span>There is a tool called Pgosweep that interrupts a running program, stores the current profile information in a new .pgc file, and clears the information from the runtime data structure.<span>&nbsp; <\/span>For example, if you have an application that does not end and you want to differentiate between its daytime behavior and its nighttime behavior, you can do the following: <i>pgosweep app.exe daytime.pgc<\/i>.<span>&nbsp; <\/span>Another approach is a helper function called PgoAutoSweep.<span>&nbsp; <\/span>PgoAutoSweep helps when you want to partition profile information from within the running program.<span>&nbsp; <\/span>The example below was taken from MSDN\u2019s Walkthroughs in Visual C++ 2008, \u201cWalkthrough: Using Profile-Guided Optimizations\u201d (current link: <\/font><a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/xct6db7f.aspx\"><font face=\"Verdana\" color=\"#000080\">http:\/\/msdn.microsoft.com\/en-us\/library\/xct6db7f.aspx<\/font><\/a><font face=\"Verdana\">).<span>&nbsp; <\/span>The example below will create two .PGC files.<span>&nbsp; 
<\/span>The first contains data that describes the runtime behavior until count is equal to 3, and the second contains the data collected after this point until application termination.<\/font><\/span><\/p>\n<table class=\"MsoNormalTable\" cellSpacing=\"0\" cellPadding=\"0\" border=\"1\">\n<tbody>\n<tr>\n<td class=\"\" vAlign=\"top\" width=\"717\">\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">#include &lt;stdio.h&gt;<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">#include &lt;windows.h&gt;<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">#include &lt;pgobootrun.h&gt;<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">&nbsp;<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">int count = 10;<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">int g = 0;<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">&nbsp;<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">void func2(void)<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">{<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp; <\/span>printf(&quot;hello from func2 %d\\n&quot;, count);<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp; <\/span>Sleep(2000);<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">}<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">&nbsp;<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">void func1(void)<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">{<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp; <\/span>printf(&quot;hello from func1 %d\\n&quot;, count);<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp; <\/span>Sleep(2000);<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">}<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">int main(void) 
<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">{<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp; <\/span>while (count--)<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp; <\/span>{<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span>if (g)<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span>func2();<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span>else<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span>func1();<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span>if (count == 3) <\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span>{<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span>PgoAutoSweep(&quot;func1&quot;);<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span>g = 1;<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span>}<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp; <\/span>}<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\"><span>&nbsp;&nbsp;&nbsp; <\/span>PgoAutoSweep(&quot;func2&quot;);<\/font><\/p>\n<p class=\"MsoNoSpacing\"><font face=\"Verdana\">}<\/font><\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">Note: To build the example I had to 
write <i>cl app.cpp \/GL <b>&#8220;%VSPATH%\\VC\\lib\\pgobootrun.lib&#8221;<\/b><\/i>, where %VSPATH% is the path to your latest Microsoft Visual Studio program directory.<\/font><\/span><\/p>\n<p class=\"MsoBodyText\"><span><font face=\"Verdana\">For more information on PGO please read Kang Su\u2019s excellent article under MSDN\u2019s Unmanaged C++ Articles titled \u201cProfile-Guided Optimization with Microsoft Visual C++ 2005\u201d.&nbsp; Current link: <\/font><a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/aa289170.aspx\"><font face=\"Verdana\" color=\"#000080\">http:\/\/msdn.microsoft.com\/en-us\/library\/aa289170.aspx<\/font><\/a><font face=\"Verdana\">.<span>&nbsp; <\/span>For information on PGO usage you can look at MSDN\u2019s C\/C++ Build Tools section \u201cProfile-Guided Optimizations\u201d page. Current link: <\/font><\/span><a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/e7k32f4k.aspx\"><font face=\"Verdana\" color=\"#000080\">http:\/\/msdn.microsoft.com\/en-us\/library\/e7k32f4k.aspx<\/font><\/a><span><font face=\"Verdana\"> .<\/font><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pogo aka PGO aka Profile Guided Optimization My name is Lawrence Joel and I am a Software Developer Engineer in Testing working with the C\/C++ Backend Compiler group. 
&nbsp;For today&#8217;s topic I want to blog about a pretty cool compiler optimization called Profile Guided Optimization (PGO or Pogo as we in the C\/C++ team would [&hellip;]<\/p>\n","protected":false},"author":289,"featured_media":35994,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-4773","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cplusplus"],"acf":[],"blog_post_summary":"<p>Pogo aka PGO aka Profile Guided Optimization My name is Lawrence Joel and I am a Software Developer Engineer in Testing working with the C\/C++ Backend Compiler group. &nbsp;For today&#8217;s topic I want to blog about a pretty cool compiler optimization called Profile Guided Optimization (PGO or Pogo as we in the C\/C++ team would [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/4773","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/users\/289"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/comments?post=4773"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/4773\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media\/35994"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media?parent=4773"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/categories?post=4773"},{"taxonomy":"post_tag","emb
eddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/tags?post=4773"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}