{"id":910,"date":"2024-05-14T13:25:59","date_gmt":"2024-05-14T20:25:59","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/?p=910"},"modified":"2024-10-14T10:07:09","modified_gmt":"2024-10-14T17:07:09","slug":"copy-on-write-performance-and-debugging","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/copy-on-write-performance-and-debugging\/","title":{"rendered":"Copy-on-Write performance and debugging"},"content":{"rendered":"<p>This is a follow-up to our previous coverage of Dev Drive and copy-on-write (CoW) linking. See our previous articles from <a href=\"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/dev-drive-and-copy-on-write-for-developer-performance\/\">May 24, 2023<\/a>, <a href=\"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/dev-drive-is-now-available\/\">October 13, 2023<\/a>, and <a href=\"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/copy-on-write-in-win32-api-early-access\/\">November 2, 2023<\/a>.<\/p>\n<p>Dev Drive was released in Windows 11 in October, 2023, and will be part of Windows Server 2025 this fall. <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows-server\/get-started\/whats-new-windows-server-2025#block-cloning-support\">Server 2025<\/a> and Windows 11 24H2 ship with an enhancement to automatically use copy-on-write linking (CoW-in-Win32). 
Here, we&#8217;ll cover the results of several months of repo build performance testing for several large internal codebases, provide some information on determining whether a file is a CoW link, and share a few tips we found from adding Dev Drive to thousands of Dev Box VMs for daily developer use.<\/p>\n<h2>Repo build performance<\/h2>\n<p>Let&#8217;s start with the chart:<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-content\/uploads\/sites\/72\/2024\/04\/DevDriveCoWWins202404.png\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-content\/uploads\/sites\/72\/2024\/04\/DevDriveCoWWins202404-1024x576.png\" alt=\"Chart of Dev Drive + CoW Wins across internal repos ranging from largest reduction with Large C# to smallest with Large C++\" width=\"640\" height=\"360\" class=\"alignnone size-large wp-image-914\" srcset=\"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-content\/uploads\/sites\/72\/2024\/04\/DevDriveCoWWins202404-1024x576.png 1024w, https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-content\/uploads\/sites\/72\/2024\/04\/DevDriveCoWWins202404-300x169.png 300w, https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-content\/uploads\/sites\/72\/2024\/04\/DevDriveCoWWins202404-768x432.png 768w, https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-content\/uploads\/sites\/72\/2024\/04\/DevDriveCoWWins202404-1536x864.png 1536w, https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-content\/uploads\/sites\/72\/2024\/04\/DevDriveCoWWins202404.png 1600w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<p>The highest win of 43% did not replicate for all the repos under test. However, many repos did get a reduction of 10% or more. Several patterns stand out when comparing to the underlying repo code:<\/p>\n<ul>\n<li>Repos containing C# with deep project-to-project dependencies cause MSBuild to copy assemblies many times. 
These can get a significant benefit from CoW linking.<\/li>\n<li>Repos that perform lots of additional copying to create microservice layouts as part of the build output also get a strong benefit.<\/li>\n<li>Repos heavy in C++ showed only a small win except where they were copying files for microservice layouts. C++ builds in MSBuild do not by default copy output files over and over again, and MSVC tends to generate fewer, larger files, where Dev Drive&#8217;s reduced file I\/O overhead is less effective.<\/li>\n<li>Two repos with low benefit had a project dependency graph with a lot of initial parallelism that was noticeably faster but with a near-linear chain of large projects at the end that reduced the effect of speeding up I\/O in each project. <sup id=\"fnref:1\"><a href=\"#fn:1\" rel=\"footnote\">1<\/a><\/sup><\/li>\n<\/ul>\n<p>Test methodology notes: Tests for each repo were run on NTFS and Dev Drive partitions on the same Dev Box VM. NuGet and other package caches were placed in the same partition as the source code. All repos used the <a href=\"https:\/\/github.com\/microsoft\/MSBuildSdks\/tree\/main\/src\/CopyOnWrite\">Microsoft.Build.CopyOnWrite<\/a> SDK and, where applicable, an upgraded <a href=\"https:\/\/github.com\/microsoft\/MSBuildSdks\/tree\/main\/src\/Artifacts\">Microsoft.Build.Artifacts<\/a> SDK. CoW-in-Win32 was not available at the time of testing, and may produce different results when released this fall. Five or more iterations were run per test case, with the first one dropped to avoid measuring a cold disk cache. Measurements were of the build phase only, with package restore and inline tests separated out. All builds were run with clean repo and output directories to ensure a full build. 
Typical build times per iteration were selected to be about 20 minutes, which for large repos usually meant building a specific subdirectory.<\/p>\n<h2>How to determine whether a file is a CoW link<\/h2>\n<p>CoW links are also known as <em>block clones<\/em>, where blocks of data on disk are referred to from multiple file entries. <code>fsutil<\/code> contains subcommands that let us view files from a block clone point of view. Let&#8217;s take a look at a block clone of an assembly copied from a package to my MSBuild output directory:<\/p>\n<pre><code>&gt; dir Azure.Core.dll\nVolume in drive D is DevDrive\n02\/26\/2024  08:24 AM           400,936 Azure.Core.dll\n\n&gt; fsutil file queryExtentsAndRefCounts Azure.Core.dll\nVCN: 0x0        Clusters: 0x61       LCN: 0x1297cf7  Ref: 0x4\nVCN: 0x61       Clusters: 0x1        LCN: 0x15337ac  Ref: 0x1\n<\/code><\/pre>\n<p>This shows that there are 97 clusters (0x61) corresponding to the main 400 KB body of the assembly. Note the <code>Ref: 0x4<\/code>, which means the underlying block at Logical Cluster Number 0x1297cf7 has 4 block clones on the disk volume, of which this is one. The last cluster with one reference holds the block clone reference metadata, which means for every cloned file there is one cluster actually used for tracking purposes.<\/p>\n<h2>Using ProcMon with Dev Drive<\/h2>\n<p>ProcMon uses an included filter driver whose name, e.g. <code>ProcMon24<\/code>, changes over time. Attach the filter driver as follows:<\/p>\n<pre><code>fsutil devdrv query\n# Take note of the current allow-list of filters\nfsutil devdrv setfiltersallowed ProcMon24,&lt;other filters comma delimited&gt;\nfsutil volume dismount &lt;dev drive letter&gt;:\n<\/code><\/pre>\n<p>You can generally leave <code>ProcMon24<\/code> in the allow list, as it is only attached to the volume when ProcMon is in use. 
Our internal Dev Box images are generated with the filter always added.<\/p>\n<h2>Using Microsoft Performance Recorder (Xperf) with Dev Drive<\/h2>\n<p>Attach the FileInfo filter driver:<\/p>\n<pre><code>fsutil devdrv query\n# Take note of the current allow-list of filters\nfsutil devdrv setfiltersallowed FileInfo,&lt;other filters comma delimited&gt;\nfsutil volume dismount &lt;dev drive letter&gt;:\n<\/code><\/pre>\n<p>Then measure. After measurement, it&#8217;s important to disable FileInfo because it is always attached to the Dev Drive when allowed, which slows performance.<\/p>\n<pre><code>fsutil devdrv setfiltersallowed &lt;original filters comma delimited&gt;\nfsutil volume dismount &lt;dev drive letter&gt;:\n<\/code><\/pre>\n<h2>Finding and fixing leaked CoW references<\/h2>\n<p>Dev Drive, which is based on ReFS, allows only 8176 clones of a data block. If you have a file that fails to copy because of an error related to too many clones, e.g. <a href=\"https:\/\/github.com\/microsoft\/CopyOnWrite\/blob\/37d86909ec35e6ef97087e8873a1cb6bde27b6bc\/lib\/MaxCloneFileLinksExceededException.cs#L16C21-L16C56\"><code>MaxCloneFileLinksExceededException<\/code> from the CoW library<\/a> used in the Microsoft.Build.CopyOnWrite and Microsoft.Build.Artifacts SDKs, winerror <code>ERROR_BLOCK_TOO_MANY_REFERENCES<\/code> = 347, or NTSTATUS <code>STATUS_BLOCK_TOO_MANY_REFERENCES<\/code> = 0xC000048C, you might have too many actual references, or you might need to clean up orphaned references. We ran into this problem on one machine that had run continuous CoW builds for weeks under a prerelease CoW-in-Win32 implementation, so we don&#8217;t expect this to appear in the wild very often.<\/p>\n<p>In an elevated console or PowerShell, run the following, where <code>x:<\/code> is the drive letter of your Dev Drive:<\/p>\n<pre><code>refsutil leak x: \/s %TEMP%\\ReFSRepair.tmp\n<\/code><\/pre>\n<p>This will scan and fix dangling references. 
You can add the <code>\/d<\/code> parameter to detect but not fix these references.<\/p>\n<p>Example output from a volume with a significant number of orphaned references:<\/p>\n<pre><code>C:\\temp&gt;refsutil leak d: \/s .\\refs.tmp\nCreating volume snapshot on drive \\\\?\\Volume{7e38c41b-1cbc-4abd-8c4a-2c5ca0eed7c7}...\nCreating the scratch file...\nBeginning volume scan... This may take a while...\nBegin leak verification pass 1 (Cluster leaks)...\nEnd leak verification pass 1. Found 1270060 leaked clusters on the volume.\n\nBegin leak verification pass 2 (Reference count leaks)...\nEnd leak verification pass 2. Found 10822697373 leaked references on the volume.\n\nBegin leak verification pass 3 (Compacted cluster leaks)...\nEnd leak verification pass 3.\n\nBegin leak verification pass 4 (Remaining cluster leaks)...\nEnd leak verification pass 4. Fixed 10823967433 leaks during this pass.\n\nBegin leak verification pass 5 (Hardlink leaks)...\nEnd leak verification pass 5. Fixed 0 hardlinks, and 0 posix deleted files\/dirs during this pass.\n\nFinished.\nFound leaked clusters: 1270060\nFound reference leaks: 10822697373\nTotal cluster fixed  : 10823967433\n<\/code><\/pre>\n<h2>Conclusion<\/h2>\n<p>Copy-on-write will be on by default for Dev Drive in the 24H2 Windows operating system release wave. Dev Drive and CoW will be available in the Server SKU for the first time starting in Server 2025 later this year. These releases will make many builds on Windows notably faster, particularly C# builds. 
CoW-in-Win32 will avoid the need to integrate the CoW SDKs or modify other build engines or tools.<\/p>\n<p>In the intervening months, consider integrating the <a href=\"https:\/\/github.com\/microsoft\/MSBuildSdks\/tree\/main\/src\/CopyOnWrite\"><code>CopyOnWrite<\/code> SDK<\/a> into your MSBuild repo and creating a Dev Drive partition on your development machine.<\/p>\n<p>We hope you find your build performance notably faster!<\/p>\n<div class=\"footnotes\">\n<hr \/>\n<ol>\n<li id=\"fn:1\">\n<p>We recommended that the repo owners try <a href=\"https:\/\/github.com\/dfederm\/ReferenceTrimmer\">ReferenceTrimmer<\/a> to see if any parallelism could be recovered by removing unneeded project dependencies.&#160;<a href=\"#fnref:1\" rev=\"footnote\">&#8617;<\/a><\/p>\n<\/li>\n<\/ol>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>This is a follow-up to our previous coverage of Dev Drive and copy-on-write (CoW) linking. See our previous articles from May 24, 2023, October 13, 2023, and November 2, 2023. Dev Drive was released in Windows 11 in October, 2023, and will be part of Windows Server 2025 this fall. Server 2025 and Windows 11 [&hellip;]<\/p>\n","protected":false},"author":119305,"featured_media":725,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"image","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-910","post","type-post","status-publish","format-image","has-post-thumbnail","hentry","category-engineering-at-microsoft","post_format-post-format-image"],"acf":[],"blog_post_summary":"<p>This is a follow-up to our previous coverage of Dev Drive and copy-on-write (CoW) linking. See our previous articles from May 24, 2023, October 13, 2023, and November 2, 2023. Dev Drive was released in Windows 11 in October, 2023, and will be part of Windows Server 2025 this fall. 
Server 2025 and Windows 11 [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-json\/wp\/v2\/posts\/910","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-json\/wp\/v2\/users\/119305"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-json\/wp\/v2\/comments?post=910"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-json\/wp\/v2\/posts\/910\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-json\/wp\/v2\/media\/725"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-json\/wp\/v2\/media?parent=910"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-json\/wp\/v2\/categories?post=910"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/engineering-at-microsoft\/wp-json\/wp\/v2\/tags?post=910"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}