A compiler can optimize away data or a function only if it can prove that the data or function will never be referenced. In a non-LTCG compile (i.e. a build with Whole Program Optimization (WPO) disabled), the compiler’s visibility is limited to a single module (.obj), so for data and functions with global scope, the compiler can never know whether other modules will use them. As a result, the compiler can never optimize them away.
The linker has a good view of all the modules that will be linked together, so it is in a good position to optimize away unused global data and unreferenced functions. The linker, however, operates at the section level, so if unreferenced data or functions are mixed with other data or functions in a section, the linker cannot extract and remove them. To equip the linker to remove unused global data and functions, we need to put each global data item or function in a separate section, and we call these small sections “COMDATs”.
(/Gw) Compiler Switch
Today, the (/Gy) compiler switch instructs the compiler to package only individual functions as packaged functions, or COMDATs, each with its own section header information. This enables function-level linkage and the linker optimizations ICF (folding together identical COMDATs) and REF (eliminating unreferenced COMDATs). In VS2013, we have introduced a new compiler switch (/Gw) which extends these benefits (i.e. linker optimizations) to data as well.
For further understanding, let us take a look at the example below. Feel free to try it out yourself:
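A minimal sketch of such a pair of source files follows. Only the names `foo`, `globalRefCount`, foo.cpp and bar.cpp come from the text; the function bodies, the helper `bar`, and the compile command in the comment are assumptions made for illustration, and the two files are shown in one listing for brevity.

```cpp
// Assumed build step (illustrative): cl /c /Gy /Gw foo.cpp bar.cpp

// ---- foo.cpp (hypothetical contents) ----
// Global data that only foo() touches.
int globalRefCount = 0;

// Never called from anywhere in the program, so the linker's REF
// optimization can drop it once it is packaged as a COMDAT (/Gy).
int foo() {
    return ++globalRefCount;
}

// ---- bar.cpp (hypothetical contents) ----
// The only function the program actually uses; it references neither
// foo() nor globalRefCount. An entry point such as
// `int main() { return bar(); }` is assumed to live in this file too.
int bar() {
    return 0;
}
```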
Figure 1: Linker optimizations (i.e. REF) triggered from using the /Gy compiler flag
If a user compiles the code snippets in figure 1 (foo.cpp and bar.cpp) with the /Gy compiler flag and then links them with the linker optimizations enabled (link /opt:ref /map foo.obj bar.obj), the resulting map file shows that function ‘foo’ has been removed. However, the global data ‘globalRefCount’ still appears in the map file. As mentioned before, /Gy only instructs the compiler to package individual functions as COMDATs, not data. Supplying the /Gw compiler flag in addition to /Gy packages both data and functions as COMDATs, allowing the linker to remove both function ‘foo’ and ‘globalRefCount’.
(/Gw) with LTCG (Whole Program Optimization)
With LTCG enabled, the compiler’s visibility extends beyond a single module, so it may not be obvious what a user gains from enabling this feature in WPO builds. For example, if you compile the example depicted in figure 1 with WPO, the compiler can optimize away both the function ‘foo’ and the data entity ‘globalRefCount’. However, if that example is changed slightly to what is depicted in figure 2, compiling with WPO alone does not help. Once the address of a global variable is taken, it is very hard for the compiler to prove that the global is not read or written by other functions through pointers, and the compiler gives up optimizing such scenarios even with WPO enabled.
But with the help of /Gw, the linker can still remove unreferenced data entities here, because the linker’s REF optimization is not blocked by things like a taken address. The linker knows precisely whether a data entity is referenced, because any reference to global data shows up as a linker fixup (COFF relocation), regardless of whether the address is taken. The example in figure 2 may look hand-made, but it translates easily to real-world code.
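The address-taken variant might look roughly like the sketch below. The helper `increment` and all function bodies are hypothetical, chosen only so that the address of `globalRefCount` escapes into another function. Whether a given compiler version actually gives up on this exact code is not guaranteed; the listing only illustrates the shape of the problem.

```cpp
// ---- foo.cpp (hypothetical contents) ----
int globalRefCount = 0;

// Hypothetical helper: writes through a pointer, so the compiler has to
// reason about every object the pointer could refer to.
void increment(int* counter) {
    ++*counter;
}

// Still never called, but it now takes the address of the global.
// The only reference to globalRefCount is a relocation inside foo's
// own COMDAT, which is what the linker looks at.
void foo() {
    increment(&globalRefCount);
}

// ---- bar.cpp (hypothetical contents) ----
// Uses neither foo() nor globalRefCount; an entry point calling bar()
// is assumed to live here as well.
int bar() {
    return 0;
}
```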
Figure 2: Address of a global variable is taken
With WPO-enabled builds, and only with them, we also benefit from the linker’s ICF optimization (link /ltcg /map /opt:icf foo.obj bar.obj /out:example.exe) in addition to REF when /Gw is on. In the example depicted in figure 3, without /Gw there will be two identical arrays, ‘const int data1[]’ and ‘const int data2[]’, in the final image. If we enable /Gw, ‘data1’ and ‘data2’ will be folded together. Please note that the ICF optimization is applied only to identical COMDATs that are read-only and whose address is not taken. If a data item’s address is never taken, breaking address uniqueness through ICF cannot lead to any observable difference, so the optimization is valid and conformant to the standard.
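As a rough illustration of data that qualifies for folding, consider the sketch below. The array names `data1` and `data2` come from the text; the element values and the reader functions `readFirst` and `readSecond` are assumptions made for the example.

```cpp
#include <cstddef>

// Two read-only arrays with byte-for-byte identical contents.
const int data1[] = {10, 20, 30, 40};
const int data2[] = {10, 20, 30, 40};

// Hypothetical readers: they only load element values. No pointer to
// either array is stored or compared, so a program cannot tell whether
// the two arrays occupy the same storage.
int readFirst(std::size_t i) {
    return data1[i % (sizeof(data1) / sizeof(data1[0]))];
}

int readSecond(std::size_t i) {
    return data2[i % (sizeof(data2) / sizeof(data2[0]))];
}
```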
Figure 3: Linker ICF Optimization for Data COMDAT
Wrap Up
To summarize, the ‘/Gw’ compiler switch enables the linker optimizations (REF and ICF) to work on unreferenced and identical data COMDATs as well. For folks already taking advantage of function-level linkage, this should be fairly easy to understand. We have seen double-digit percentage gains in size reduction when enabling this feature for binaries that make up some high-volume Microsoft products, so I would encourage you to try it out as well and get back to us. At this point you should have everything you need to get started! Additionally, if you would like us to blog about some other compiler technology, please let us know; we are always interested in learning from your feedback.