A tool to find duplicate copies in a build



As part of our builds, quite a few projects copy files to the binaries directory or other locations.  These can be anything from image files to test scripts.  To have our builds complete more quickly, we use the multi-process option (/maxcpucount) of msbuild to build projects in parallel.

This all sounds normal, so what’s the problem?  In a large team, people will sometimes inadvertently add statements to different project files that copy files to the same destination.  When those project files have no references to each other, directly or indirectly, msbuild may build them in parallel.  If it does happen to run those projects in parallel on different nodes and the copies happen at the same time, the build breaks because one copy succeeds and one fails.  Since the timing is not going to be the same on every build, the result is random build breaks.  Build breaks suck.  They drain the productivity of the team and are frustrating.

Whether the build is continuous integration or gated check-in, these breaks may happen randomly.  They are most likely to happen on incremental builds, where copying accounts for a much larger share of the build time than it does in a clean build.  Tracking them down as they happen is painful.

So, I wrote a simple tool to find cases in the log where more than one copy operation has the same destination.  The comment in the header explains what the code is looking for.  Running it on the normal-verbosity msbuild logs from a clean build ensures that all of the copies are in the log for analysis.  We also build what we call partitions separately (a partition is a subset of the source and is typically a top-level directory in the branch), so the number of log files is a multiple of the number of partitions being built.
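If you want to try the idea without the tool, here is a minimal Python sketch of the same kind of check.  It is not the tool described above, just an illustration.  It assumes the log contains lines of the form Copying file from "source" to "destination". as written by the msbuild Copy task, and it assumes destinations are absolute paths; relative destinations would first have to be resolved against each project's directory.  The function name find_duplicate_copies is made up for this example.

```python
import ntpath
import re
from collections import defaultdict

# Assumed log line format (msbuild Copy task):
#   Copying file from "C:\src\a\logo.png" to "C:\bin\logo.png".
COPY_LINE = re.compile(r'Copying file from "(?P<src>[^"]+)" to "(?P<dst>[^"]+)"')


def find_duplicate_copies(log_lines):
    """Return {destination: [sources]} for destinations copied to more than once.

    Destinations are compared the way Windows compares paths: separators are
    normalized and case is ignored. Relative paths are NOT resolved here.
    """
    sources_by_dest = defaultdict(list)
    for line in log_lines:
        match = COPY_LINE.search(line)
        if match is None:
            continue
        dest = ntpath.normcase(ntpath.normpath(match.group("dst")))
        sources_by_dest[dest].append(match.group("src"))
    # Keep only destinations that were the target of more than one copy.
    return {dest: srcs for dest, srcs in sources_by_dest.items() if len(srcs) > 1}


if __name__ == "__main__":
    import sys

    # Usage: python find_dup_copies.py partition1.log partition2.log ...
    lines = []
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as log:
            lines.extend(log)
    for dest, srcs in sorted(find_duplicate_copies(lines).items()):
        print(dest)
        for src in srcs:
            print("    from " + src)
```

Passing every partition's log in one run matters, because the two conflicting copies may come from projects in different partitions.  Not every hit is a bug, since the same project may legitimately copy a file twice, but each one is worth a look.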

In our internal builds, we record multiple log files for our builds, including minimal, normal, and detailed.  When there’s a problem, you can start with the smaller build logs and increase to the more verbose logging as needed.
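For anyone who wants a similar setup, the sketch below shows one way to ask msbuild for several logs in a single run, using its numbered file logger switches (/fl1 through /fl9 with matching /flp parameters).  This is not our actual build configuration; the function name, the log file names, and the solution name in the usage line are made up for illustration.

```python
def msbuild_args(project, verbosities=("minimal", "normal", "detailed")):
    """Build an msbuild command line that writes one log file per verbosity.

    The project path and log file names are hypothetical placeholders.
    """
    if len(verbosities) > 9:
        raise ValueError("msbuild supports at most nine numbered file loggers")
    args = ["msbuild", project, "/maxcpucount"]
    for n, verbosity in enumerate(verbosities, start=1):
        # /flN attaches file logger N; /flpN sets that logger's parameters.
        args.append(f"/fl{n}")
        args.append(f"/flp{n}:LogFile=build.{verbosity}.log;Verbosity={verbosity}")
    return args


if __name__ == "__main__":
    # Print the command line rather than running it.
    print(" ".join(msbuild_args("Example.sln")))
```

The normal-verbosity log from such a run is the one to feed to a duplicate-copy check.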

I’m posting this for any of you who might run into the same thing.  Keep in mind that there are other things, such as antivirus software, that can interfere with the build process and result in errors for files being copied.

Buck Hodges
