Converting to MSBuild
My name is Andreea Isac and I’ve been a member of the Visual C++ Libraries team for a year and a half. Since almost all my teammates (Martyn Lovell, Ale Contenti, George Mileka, Alan Chan) have already posted, I won’t repeat what our team is responsible for!
I have been involved for many months in converting our builds. We started this work as a feature crew, with one person responsible for each area that we (Visual C++) own: IDE, front end, back end, OpenMP, CRT, ATL, MFC, … Ramping up wasn’t easy, since we were not familiar with the msbuild technology we were about to embrace and because we seemed to face many challenges that no one had hit before, due to our native source base and interop with managed code. VC++ has a reputation for having the most complex builds, and we’re the most numerous nmake fans in the division.
But once most of the decisions had been made and the obstacles overcome, with great help from the msbuild team, the work became smoother and easier to estimate. All VC++ projects were successfully converted, except for CRT, which was by far the most complex of all and took 3 times longer. I owned the CRT conversion and remember even now the moment when I chose doom! <g>. Alan and I had to choose between CRT and ATL/MFC. I said CRT first. The truth, however, is that I have not regretted my decision to this day. The effort involved was huge, especially since I didn’t have any significant previous build experience, but the satisfaction of learning the old and new technology, of understanding such a complex build, of cleaning, reorganizing and transforming (although not as much as I would have liked, but time is always a constraint) a 20-year-old build… paid for all the effort.
The conversion work was then on hold for around 2 months, during a very intensive servicing milestone. We scaled back work on our build conversion to focus on other Orcas priorities for a while, which made this even harder. But since this is such a valuable step, I am back to finish the conversion work, this time owning the VC++ builds entirely. Because this conversion lasted so many months before being integrated into the main branch, many significant build changes have been made (in the old technology) since the last sync, and I am now responsible for converting them and preparing the final integration.
We are not the only beneficiaries of this conversion work. You will also be able to rebuild CRT, ATL and MFC with msbuild, just as we’ve always supported rebuilding them. For our internal builds, we’ve been using the internal C++ msbuild support, but as soon as the external support (including tasks and targets for native code) is finalized, those builds will be available to customers. Not with Orcas, unfortunately.
Now back to our internal builds. Before .NET, VC++ was using huge and complex makefiles. Nmake was the perfect technology to use at that time. Once managed code came along and the entire product grew in size, the dependencies between components became more complex. So a new build model was designed and implemented, but still on top of nmake. The concepts of passes and traversals were introduced. A build would traverse the tree depth-first, looking in “dirs” files, and build the leaves defined by “sources” files, which wrap and interact very conveniently with the nmake scripts. A single traversal could not satisfy all the existing dependencies, so 3 passes were needed: a generation pass (builds idl, generates headers and assembly attributes, preprocesses), a compiling pass (compiles source and resource files, assembles assembler files, builds static and import libs, creates tlb) and a linking pass (links dlls and executables, performs various tasks like copying or zipping files, …). Since VC++ practices self builds, the concept of phases was introduced. In the first phase (BOOT), our entire code base is built against a frozen set of tools (from previous builds). In the second phase (SELF), the same code base is rebuilt with the tools obtained in the BOOT phase. More dependencies are expressed by building some components in particular phases and not in others.
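The pass model described above can be illustrated with a small Python sketch. It is only a toy model of the idea, not the real build: the Node class, the pass names and the example tree are all hypothetical, and phases (BOOT/SELF) are left out to keep it short.

```python
from dataclasses import dataclass, field

# Hypothetical names for the three passes described in the text.
PASSES = ("generate", "compile", "link")

@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)  # plays the role of a "dirs" file
    work: dict = field(default_factory=dict)      # plays the role of a "sources" leaf: pass -> what it builds

def traverse(node, build_pass, log):
    """Walk the tree depth-first; each node does only the work it has for this pass."""
    for child in node.children:
        traverse(child, build_pass, log)
    if build_pass in node.work:
        log.append((build_pass, node.name, node.work[build_pass]))

def build(root):
    """Traverse the whole tree once per pass, in pass order."""
    log = []
    for build_pass in PASSES:
        traverse(root, build_pass, log)
    return log

# Made-up tree: two leaves under one root.
tree = Node("root", children=[
    Node("crt", work={"generate": "headers", "compile": "objs and libs", "link": "dll"}),
    Node("atl", work={"compile": "objs and libs"}),
])

log = build(tree)
for entry in log:
    print(entry)
```

The point of the sketch is that the full tree is walked three times, so every leaf finishes its generation work before any leaf starts compiling, and every leaf finishes compiling before anything is linked.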
Converting the code base of a division to msbuild can’t happen instantly, so the msbuild team provided enough support for interoperating with the old build. Msbuild supports the passes and phases model, and it can be run wrapped by the old build so that a whole tree can be converted gradually, starting with the leaves and moving towards bigger sub-trees. Msbuild also provides hooking points so that custom targets can be run at the right time, before or after the right pass or phase.
Msbuild on its own brings a new traversal technique and a new way of specifying dependencies, making it possible to build in a single pass without breaking dependencies and thus improving performance a lot. But until the multiprocessor build engine is finalized, the old build will still wrap all the msbuild conversions of leaves in our division.
Since I converted the CRT build from scratch, I am more experienced in converting makefiles to msbuild than in converting the old build’s sources files or vcprojs. I can share with you some of the difficulties I encountered, although it’s very unlikely that you will ever face such scenarios… These are the only weaknesses I could find in msbuild compared with nmake, and they exist only because there has been no reason to make those specific improvements.
- You can’t build source files together in the same nativeproj, producing the same binary directly, if they have different flags, defines or includes. The workaround is to compile the source files into objs in as many nativeprojs as there are distinct scenarios, and then link the produced objs into the final binary in one more nativeproj. There are only a few cases in which source files with different settings can share a nativeproj: different additional options, managed versus native, precompiled header settings and a few more.
- You can’t benefit from inference rules with msbuild. If your makefiles have targets (libs, dlls, exes) that depend on huge lists of objs obtained through inference and complex rules (source files with the same name and different extensions, possibly placed both in a root directory and in a subdirectory), it can be hard to work out by hand exactly which source files to include, and which extension and path belong to the source file that wins the rule.
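For the second difficulty, a small script can take over some of the manual deduction. The Python sketch below assumes you have already written down your makefile’s rule precedence as an ordered list of (directory, extension) pairs. That list, the file names and the helper name are all hypothetical, and the sketch does not try to reproduce nmake’s own precedence logic.

```python
def winning_source(stem, existing_files, rule_order):
    """Return the first existing source file for `stem`, trying rules in priority order.

    rule_order is a list of (directory, extension) pairs, highest priority first;
    an empty directory string means the root directory.
    """
    for directory, extension in rule_order:
        prefix = directory + "/" if directory else ""
        candidate = prefix + stem + extension
        if candidate in existing_files:
            return candidate
    return None  # no rule can produce this obj

# Made-up source tree: memcpy exists both in the root and in a subdirectory.
existing = {"memcpy.c", "intel/memcpy.asm", "strlen.c"}

# Made-up precedence, transcribed by hand from a makefile.
rules = [("intel", ".asm"), ("intel", ".c"), ("", ".asm"), ("", ".c")]

for stem in ("memcpy", "strlen", "abs"):
    print(stem, "->", winning_source(stem, existing, rules))
```

Running something like this over the list of objs gives an explicit list of source files to paste into the converted project, instead of relying on inference.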
Other than that, there are a few more fancy nmake features that are not supported in msbuild, but they didn’t prove to be useful or needed: string preprocessing on item groups (msbuild supports this for properties, and many of the scenarios can be solved by defining metadata and performing the proper transforms), selecting a particular item in a group, using regular expressions in addition to wildcards, and runtime macros.
The rest of the nmake features have either a direct counterpart or an indirect way of achieving the same result, for example:
– nmake replacing modifiers –> CharacterReplace task
– tokenize macro modifiers –> custom separators when expanding the items in a group, e.g. @(MyItemGroup, '+')
– makefile directives –> conditions can be applied to any msbuild element: property, property group, item group and group of item groups, targets, tasks, imports, …
– filename components macro modifiers –> predefined metadata of items
– other macro modifiers –> dynamic creation of item groups using transforms
– @ modifier –> a read lines task
– Concepts of interaction with the command line and build environment, response files, reusing or defining new targets and dependencies are common to both technologies
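To make the item-related entries in this list more concrete, here is a rough Python model of what expanding an item group with a separator, reading predefined filename metadata and applying a transform amount to. It is only an analogy under simplifying assumptions: the function names are hypothetical, only three metadata names are modeled, and the file names are made up.

```python
from pathlib import PureWindowsPath

def item_metadata(path):
    """Model a few predefined metadata values of an item (its filename components)."""
    p = PureWindowsPath(path)
    return {"Filename": p.stem, "Extension": p.suffix, "Identity": path}

def transform(items, template):
    """Rough analogue of a transform: expand %(Name) placeholders once per item."""
    results = []
    for item in items:
        text = template
        for key, value in item_metadata(item).items():
            text = text.replace("%(" + key + ")", value)
        results.append(text)
    return results

def join(items, separator=";"):
    """Rough analogue of expanding an item group with a chosen separator."""
    return separator.join(items)

# Made-up source items.
sources = ["src\\memcpy.c", "src\\strlen.asm"]

# Derive one obj name per source from its Filename metadata.
objs = transform(sources, "obj\\%(Filename).obj")

print(join(sources, "+"))
print(join(objs))
```

The same item list can thus be rendered in different shapes (a plus-separated list, a list of derived obj names) without any string surgery on the list itself, which is what the macro modifiers were used for in nmake.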
Plus, msbuild brings a better incremental mechanism, better logging, IntelliSense, and more readability, extensibility and transparency in the build code. There is also (internally, for the moment) a clever engine for tracking and analyzing transitive dependencies for tasks without involving the targets themselves, a feature that will likely also ship in a future release.
Being able to define metadata on the items in a group and to process it with transforms in a target adds enormous power. Also in a future release, item groups will be modifiable in place (adding new metadata, changing existing metadata and excluding items), without the need to create intermediate item groups and without writing an exponential number of conditions on metadata to ensure items are not included multiple times. This is fabulous.
Batching works great on tasks (using metadata of the items involved in the task) and on targets as well (using transforms of outputs based on inputs). This means that a target performs work only on the out-of-date pairs of inputs and outputs. If only one input changes, you won’t see all the outputs being rebuilt unless they all depend on it. You can read more about batching here.
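The effect of working only on out-of-date pairs can be sketched in a few lines of Python. This is a simplified model, not msbuild’s actual algorithm: it assumes a one-to-one mapping between inputs and outputs and uses made-up integer timestamps instead of real file times.

```python
def out_of_date(pairs, mtime):
    """Return the (input, output) pairs whose output is missing or older than its input."""
    stale = []
    for src, dst in pairs:
        if dst not in mtime or mtime[dst] < mtime[src]:
            stale.append((src, dst))
    return stale

# Made-up modification times: larger means more recent.
mtime = {
    "a.c": 10, "a.obj": 12,  # output newer than input: up to date
    "b.c": 20, "b.obj": 15,  # input changed after the output was built
    "c.c": 5,                # output has never been built
}

pairs = [("a.c", "a.obj"), ("b.c", "b.obj"), ("c.c", "c.obj")]

stale = out_of_date(pairs, mtime)
for src, dst in stale:
    print("rebuild", dst, "from", src)
```

Only b.obj and c.obj are selected here; a.obj is left alone because its input has not changed since it was produced.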
Working with tasks is so cool. Not only can you derive from or override existing tasks, but there is debugging support for them too. If you are wondering why you should use tasks instead of launching tools from the command line, you can read this post.
Overall, I believe converting old builds to msbuild is valuable, and our case proved it. The conversion was a useful dogfooding experience; the builds are more readable than ever, consistent across the entire division, faster in certain scenarios, and they can benefit from better VS integration.
If you have any questions or need guidance converting various scenarios, I can be of help. Also, there’s likely to be a series of blog entries on the MSBuild blog in the future about the devdiv conversion.
Software Developer Engineer