{"id":4153,"date":"2009-12-01T11:02:00","date_gmt":"2009-12-01T11:02:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/vcblog\/2009\/12\/01\/gl-and-pgo\/"},"modified":"2019-02-18T18:45:43","modified_gmt":"2019-02-18T18:45:43","slug":"gl-and-pgo","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/cppblog\/gl-and-pgo\/","title":{"rendered":"\/GL and PGO"},"content":{"rendered":"<p class=\"MsoNormal\"><font size=\"3\"><font face=\"Calibri\">Hi, I\u2019m Lin Xu, a Program Manager working on the C++ compiler.<\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font size=\"3\"><font face=\"Calibri\">Recently<span>,<\/span> we collated performance numbers from our testing passes over this release cycle. We track many different benchmarks closely for all of the architectures and switch options (\/O1, \/O2, \/GL, \/PGO). We also track these across multiple CPU models. (Yes, this is quite a big matrix. Look for an upcoming blog post from the QA team to learn more.)<\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font face=\"Calibri\" size=\"3\">We\u2019re pretty excited about the improvements we made for this release in code quality. 
(Read Ten\u2019s recent post about it <\/font><a href=\"http:\/\/blogs.msdn.com\/vcblog\/archive\/2009\/11\/02\/visual-c-code-generation-in-visual-studio-2010.aspx\"><font face=\"Calibri\" size=\"3\">here<\/font><\/a><font size=\"3\"><font face=\"Calibri\">.) As I looked at the numbers, one thing jumped out at me: to really take advantage of these improvements, applications need to be compiled with <b>\/GL<\/b> and, if possible, <b>PGO<\/b>.<\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font face=\"Calibri\" size=\"3\">If you aren\u2019t familiar with <b>PGO<\/b>, you can read Lawrence\u2019s blog post on <b>P<\/b>rofile <b>G<\/b>uided <b>O<\/b>ptimization <\/font><a href=\"http:\/\/blogs.msdn.com\/vcblog\/archive\/2008\/11\/12\/pogo.aspx\"><font face=\"Calibri\" color=\"#0000ff\" size=\"3\">here<\/font><\/a><font size=\"3\"><font face=\"Calibri\">.<\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font size=\"3\"><font face=\"Calibri\">I\u2019ve summarized some of our data comparing VS2010 Beta2 with VS2008 SP1. Here is a comparison of integer benchmark performance with various switches on x86 and x64: <\/font><\/font><\/p>\n<p class=\"MsoNormal\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/9\/2019\/02\/GL%20and%20PGO%20picture.png\"><\/p>\n<p class=\"MsoNormal\"><font size=\"3\"><font face=\"Calibri\">These particular graphs are based on a benchmark suite similar to SPEC CPU 2006, but our benchmarks include real-world code as well. We build and measure the performance of many Microsoft products, including SQL, Windows and Office. <\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font size=\"3\"><font face=\"Calibri\">Let\u2019s say you currently compile release builds with the <b>\/O2 <\/b>switch in VS2008. 
If you moved to VS2010, you might see on x64:<\/font><\/font><\/p>\n<p class=\"MsoListParagraph\"><span><span><font size=\"3\">\u00b7<\/font><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span><\/span><\/span><font size=\"3\"><font face=\"Calibri\">10% faster code if you turned on <b>\/GL<\/b>,<\/font><\/font><\/p>\n<p class=\"MsoListParagraph\"><span><span><font size=\"3\">\u00b7<\/font><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span><\/span><\/span><font size=\"3\"><font face=\"Calibri\">16% faster code if you turned on <b>\/GL and PGO<\/b>,<\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font size=\"3\"><font face=\"Calibri\">and on x86,<\/font><\/font><\/p>\n<p class=\"MsoListParagraph\"><span><span><font size=\"3\">\u00b7<\/font><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span><\/span><\/span><font size=\"3\"><font face=\"Calibri\">7% faster code if you turned on <b>\/GL<\/b>,<\/font><\/font><\/p>\n<p class=\"MsoListParagraph\"><span><span><font size=\"3\">\u00b7<\/font><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/span><\/span><\/span><font size=\"3\"><font face=\"Calibri\">13% faster code if you turned on <b>\/GL and PGO<\/b>.<\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font size=\"3\"><font face=\"Calibri\">For the last couple of releases, new VC++ projects have had <b>\/GL<\/b> turned on for release builds. However, the settings of upgraded projects are not changed. So whether you use the Visual Studio build system or your own custom build system, go ahead and check that you are specifying <b>\/GL<\/b> for your release builds!<\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font size=\"3\"><font face=\"Calibri\">The other recommendation I have is to use <b>PGO<\/b>. Doing so requires a larger investment (you need to identify scenarios and create training data), but it can improve the performance of your app above and beyond <b>\/GL<\/b>. 
<b>PGO works best on medium or larger applications.<\/b> Small applications may see little benefit from <b>PGO<\/b>, depending on the application\u2019s workload. <\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font face=\"Calibri\" size=\"3\">We recently created training data and turned on <b>PGO<\/b> for part of the C++ IntelliSense engine in Visual Studio 2010 and saw ~25% better performance in some scenarios. When we turned on <b>PGO<\/b> for the compiler, we measured a ~10% speedup in compiler throughput. Again, you can learn how to turn on <b>PGO<\/b> in your own builds in Lawrence\u2019s blog post <\/font><a href=\"http:\/\/blogs.msdn.com\/vcblog\/archive\/2008\/11\/12\/pogo.aspx\"><font face=\"Calibri\" color=\"#0000ff\" size=\"3\">here<\/font><\/a><font size=\"3\"><font face=\"Calibri\">. <\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font size=\"3\"><font face=\"Calibri\"><b>\/GL<\/b> shouldn\u2019t increase your build time significantly, but note that <b>\/GL<\/b> is not compatible with Edit and Continue (<b>\/ZI<\/b>) or incremental linking (the linker option <b>\/INCREMENTAL<\/b>). You can read some quick tips about <b>\/GL<\/b> in an earlier blog post <\/font><\/font><a href=\"http:\/\/blogs.msdn.com\/vcblog\/archive\/2009\/02\/24\/quick-tips-on-using-whole-program-optimization.aspx\"><font face=\"Calibri\" color=\"#0000ff\" size=\"3\">here<\/font><\/a><font size=\"3\"><font face=\"Calibri\">.<\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font face=\"Calibri\" size=\"3\">As Lawrence describes in his <\/font><a href=\"http:\/\/blogs.msdn.com\/vcblog\/archive\/2008\/11\/12\/pogo.aspx\"><font face=\"Calibri\" color=\"#0000ff\" size=\"3\">blog post<\/font><\/a><font size=\"3\"><font face=\"Calibri\">, with <b>PGO<\/b> your application is built twice: once for the instrumented build and once for the final optimized build. 
This means build times increase more significantly, but as I\u2019ve noted above, the performance increase is also more significant. <\/font><\/font><\/p>\n<p class=\"MsoNormal\"><font size=\"3\"><font face=\"Calibri\">So, if your application is CPU bound, I hope that these numbers will convince you to take a second look at your release build settings and turn on <b>\/GL <\/b>and <b>PGO<\/b>! <\/font><\/font><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hi, I\u2019m Lin Xu, a Program Manager working on the C++ compiler. Recently, we collated performance numbers from our testing passes over this release cycle. We track many different benchmarks closely for all of the architectures and switch options (\/O1, \/O2, \/GL, \/PGO). We also track these across multiple CPU models. (Yes, this is quite [&hellip;]<\/p>\n","protected":false},"author":289,"featured_media":35994,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[60,20,61],"class_list":["post-4153","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cplusplus","tag-gl","tag-pgo","tag-profile-guided-optimization"],"acf":[],"blog_post_summary":"<p>Hi, I\u2019m Lin Xu, a Program Manager working on the C++ compiler. Recently, we collated performance numbers from our testing passes over this release cycle. We track many different benchmarks closely for all of the architectures and switch options (\/O1, \/O2, \/GL, \/PGO). We also track these across multiple CPU models. 
(Yes, this is quite [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/4153","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/users\/289"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/comments?post=4153"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/4153\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media\/35994"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media?parent=4153"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/categories?post=4153"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/tags?post=4153"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}