{"id":733,"date":"2014-03-25T14:39:00","date_gmt":"2014-03-25T14:39:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/vcblog\/2014\/03\/25\/linker-enhancements-in-visual-studio-2013-update-2-ctp2\/"},"modified":"2021-10-04T17:19:34","modified_gmt":"2021-10-04T17:19:34","slug":"linker-enhancements-in-visual-studio-2013-update-2-ctp2","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/cppblog\/linker-enhancements-in-visual-studio-2013-update-2-ctp2\/","title":{"rendered":"Linker Enhancements in Visual Studio 2013 Update 2 CTP2"},"content":{"rendered":"<p>For developer scenarios, linking takes the lion&#8217;s share of the application&#8217;s build time. From our investigation we know that the Visual C++ linker spends a large fraction of its time in preparing, merging and finally writing out debug information. This is especially true for non-<a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/0zza0de8.aspx\">Whole Program Optimization<\/a> scenarios.<\/p>\n<p>In <a href=\"http:\/\/go.microsoft.com\/fwlink\/?LinkId=390521\">Visual Studio 2013 Update 2 CTP2<\/a>, we have added a set of features which help improve link time significantly as measured by products we build here in our labs (AAA Games and Open source projects such as Chromium):<\/p>\n<ul style=\"margin-left: 38pt\">\n<li><strong>Remove unreferenced data and functions<\/strong> (<em>\/Zc:inline<\/em>). This can help all of your projects.<\/li>\n<li><strong>Reduce time spent generating PDB files<\/strong>. This applies mostly to binaries with medium to large amounts of debug information.<\/li>\n<li><strong>Parallelize code-generation and optimization build phase<\/strong> (<em>\/cgthreads<\/em>). This applies to medium to large binaries generated through LTCG.<\/li>\n<\/ul>\n<p>Not all of these features are enabled by default. 
Keep reading for more details.<\/p>\n<h2>Remove unreferenced data and functions (\/Zc:inline)<\/h2>\n<p>As part of our analysis we found that we were unnecessarily bloating object files by emitting symbol information even for unreferenced functions and data. This gave the linker additional, useless input that linker optimizations would eventually throw away.<\/p>\n<p>Applying <em>\/Zc:inline<\/em> on the compiler command line makes the compiler remove these unreferenced functions and data itself, producing less input for the linker and improving end-to-end linker throughput.<\/p>\n<p><strong>New Compiler Switch:<\/strong> <em>\/Zc:inline[-]<\/em> &#8211; remove unreferenced function or data if it is COMDAT or has internal linkage only (off by default)<\/p>\n<p><strong>Throughput Impact:<\/strong> Significant (double-digit (%) link improvements seen when building products like Chromium)<\/p>\n<p><strong>Breaking Change:<\/strong> Yes (possibly). For code that does not conform to the C++11 standard, turning on this feature could in some cases produce an unresolved external symbol error as shown below, but the workaround is very simple. 
Take a look at the example below:<\/p>\n<p style=\"padding-left: 30px\">&nbsp;&nbsp;&nbsp;<a href=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/10\/4705.linksample.png\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/10\/4705.linksample.png\" alt=\"Image 4705 linksample\" width=\"713\" height=\"348\" class=\"alignnone size-full wp-image-29132\" srcset=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/10\/4705.linksample.png 713w, https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/10\/4705.linksample-300x146.png 300w\" sizes=\"(max-width: 713px) 100vw, 713px\" \/><\/a><\/p>\n<p>If you are using VS2013 RTM, this sample program will compile (<em>cl \/O2 x.cpp xfunc.cpp<\/em>) and link successfully. However, if you compile and link with VS2013 Update 2 CTP2 with <em>\/Zc:inline<\/em> enabled (<em>cl \/O2 \/Zc:inline x.cpp xfunc.cpp<\/em>), the sample will choke and produce the following error message:<\/p>\n<pre>     xfunc.obj : error LNK2019: unresolved external symbol \"public: void __thiscall x::xfunc1(void)\" <br \/>         (?xfunc1@x@@QAEXXZ) referenced in function _main\r\n     x.exe : fatal error LNK1120: 1 unresolved externals<\/pre>\n<p>There are three ways to fix this problem.<\/p>\n<ol>\n<li>Remove the &#8216;inline&#8217; keyword from the declaration of function &#8216;xfunc&#8217;.<\/li>\n<li>Move the definition of function &#8216;xfunc&#8217; into the header file &#8220;x.h&#8221;.<\/li>\n<li>Simply include &#8220;x.cpp&#8221; in xfunc.cpp.<\/li>\n<\/ol>\n<p><strong>Applicability:<\/strong> All but <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/0zza0de8.aspx\">LTCG\/WPO<\/a> and some (debug) scenarios should see significant speed up.<\/p>\n<h2>Reduce time spent generating PDB files<\/h2>\n<p>This feature is about improving type merging speed significantly by increasing the size of our 
internal data structures (hash tables and such). For larger PDBs this will increase the size by at most a few MB but can reduce link times significantly. Today, this feature is enabled by default.<\/p>\n<p><strong>Throughput Impact:<\/strong> Significant (double-digit (%) link improvements for AAA games)<\/p>\n<p><strong>Breaking Change:<\/strong> No<\/p>\n<p><strong>Applicability:<\/strong> All but <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/0zza0de8.aspx\">LTCG\/WPO<\/a> scenarios should see significant speed up.<\/p>\n<h2>Parallelize code-generation and optimization build phase (\/cgthreads)<\/h2>\n<p>This feature parallelizes (through multiple threads) the code-generation and optimization phase of the compilation process. By default today, we use four threads for the code-generation and optimization phase. With machines getting more resourceful (CPU, IO etc.), having a few extra build threads can&#8217;t hurt. This feature is especially useful and effective when performing a <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/0zza0de8.aspx\">Whole Program Optimization (WPO)<\/a> build.<\/p>\n<p>There are already multiple levels of parallelism that can be specified for building an artifact. The <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/bb651793.aspx\">\/m<\/a> or <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/bb651793.aspx\">\/maxcpucount<\/a> switch specifies the number of msbuild.exe processes that can be run in parallel, whereas the <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/bb385193.aspx\">\/MP<\/a> or Multiple Processes compiler flag specifies the number of cl.exe processes that can simultaneously compile the source files.<\/p>\n<p>The <em>\/cgthreads<\/em> flag adds another level of parallelism: it specifies the number of threads used for the code-generation and optimization phase within each individual cl.exe process. 
If <em>\/cgthreads<\/em>, <em>\/MP<\/em> and <em>\/m<\/em> are all set too high, it is quite possible to bring the build system to its knees and make it unusable, so <strong>use with caution<\/strong>!<\/p>\n<p><strong>New Compiler Switch:<\/strong> <em>\/cgthreadsN<\/em>, where N is the number of threads used for optimization and code generation. &#8216;N&#8217; can be any value from 1 to 8.<\/p>\n<p><strong>Breaking Change:<\/strong> No. This switch is currently <strong>not supported<\/strong>, but we are considering making it a supported feature, so your feedback is important!<\/p>\n<p><strong>Applicability:<\/strong> This should make a definite impact for <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/0zza0de8.aspx\">Whole Program Optimization<\/a> scenarios.<\/p>\n<h2>Wrapping Up<\/h2>\n<p>This blog should give you an overview of a set of features we have enabled in the latest CTP which should help improve link throughput. Our current focus has been on slightly larger projects, and as a result these wins should be most noticeable for larger projects such as Chrome and others.<\/p>\n<p>Please give them a shot and let us know how it works out for your application. It would be great if you folks can post before\/after numbers on linker throughput when trying out these features.<\/p>\n<p>If your link times are still painfully slow, please email me, Ankit, at <a href=\"mailto:aasthan@microsoft.com\">aasthan@microsoft.com<\/a>. We would love to know more!<\/p>\n<p>Thanks to C++ MVP Bruce Dawson, Chromium developers and the Kinect Sports Rivals team for validating that our changes had a positive impact in real-world scenarios.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For developer scenarios, linking takes the lion&#8217;s share of the application&#8217;s build time. 
From our investigation we know that the Visual C++ linker spends a large fraction of its time in preparing, merging and finally writing out debug information. This is especially true for non-Whole Program Optimization scenarios. In Visual Studio 2013 Update 2 CTP2, [&hellip;]<\/p>\n","protected":false},"author":264,"featured_media":35994,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[65,25],"class_list":["post-733","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cplusplus","tag-compiler","tag-linker"],"acf":[],"blog_post_summary":"<p>For developer scenarios, linking takes the lion&#8217;s share of the application&#8217;s build time. From our investigation we know that the Visual C++ linker spends a large fraction of its time in preparing, merging and finally writing out debug information. This is especially true for non-Whole Program Optimization scenarios. 
In Visual Studio 2013 Update 2 CTP2, [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/733","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/users\/264"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/comments?post=733"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/733\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media\/35994"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media?parent=733"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/categories?post=733"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/tags?post=733"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}