{"id":107327,"date":"2022-10-27T07:00:00","date_gmt":"2022-10-27T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=107327"},"modified":"2022-10-27T06:32:30","modified_gmt":"2022-10-27T13:32:30","slug":"20221027-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20221027-00\/?p=107327","title":{"rendered":"Why am I seeing two WRITE requests at the same offset from a single call to WriteFile?"},"content":{"rendered":"<p>A customer was doing a little performance analysis and found an oddity: A single non-extending write request at the application layer was turning into <i>two<\/i> write requests at the I\/O layer, both at the same offset:<\/p>\n<table class=\"cp3\" style=\"border-collapse: collapse;\" border=\"1\" cellspacing=\"0\" cellpadding=\"3\">\n<tbody>\n<tr>\n<th>Op<\/th>\n<th>File<\/th>\n<th>Offset<\/th>\n<th>Length<\/th>\n<th>Flags<\/th>\n<th>Priority<\/th>\n<th>Status<\/th>\n<\/tr>\n<tr>\n<td valign=\"top\">IRP_MJ_WRITE<\/td>\n<td valign=\"top\">test.txt<\/td>\n<td align=\"right\" valign=\"top\">69,632<\/td>\n<td align=\"right\" valign=\"top\">61,440<\/td>\n<td valign=\"top\">Non-cached, Write Through<\/td>\n<td valign=\"top\">Normal<\/td>\n<td valign=\"top\">SUCCESS<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">IRP_MJ_WRITE<\/td>\n<td valign=\"top\">test.txt<\/td>\n<td align=\"right\" valign=\"top\">69,632<\/td>\n<td align=\"right\" valign=\"top\">61,440<\/td>\n<td valign=\"top\">Non-cached, Write Through<br \/>\nPaging, Synchronous Paging<\/td>\n<td valign=\"top\">Normal<\/td>\n<td valign=\"top\">SUCCESS<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Friend-of-the-blog Malcolm Smith observed that the first write is non-cached. One possibility is that the first write is a flush of previously-dirty data due to a cached write or a writable memory-mapped view. The system then follows up with the second write, which is triggered by the application-level write.<\/p>\n<p>However, if nobody else is writing to the file at the time the test is being run, then that scenario is ruled out.<\/p>\n<p>Another possibility is that the file is compressed. In that case, the application-level write goes into the system cache, and then is flushed. This looks like two write operations from the file system&#8217;s point of view, which is what the log is watching. But really, only one write is issued to the physical drive.<\/p>\n<p>The customer confirmed that they are writing to a compressed file.<\/p>\n<p>Malcolm explained that NTFS compression is rather expensive.<\/p>\n<p>The idea behind NTFS compression is that the file is broken up into 64KB chunks, with each chunk compressed separately,\u00b9 and each chunk is managed independently.<\/p>\n<p>This means that a simple write operation that isn&#8217;t a full chunk explodes into a sequence of operations:<\/p>\n<ul>\n<li>Read the enclosing chunk<\/li>\n<li>Decompress the enclosing chunk<\/li>\n<li>Update the uncompressed chunk to incorporate the newly-written data<\/li>\n<li>Compress the modified chunk<\/li>\n<li>Find space on the disk for the modified chunk<\/li>\n<li>Write the modified chunk to disk<\/li>\n<li>Release the space that the old chunk occupied<\/li>\n<\/ul>\n<p>One consequence of this is that compressed files are <i>pathologically fragmented<\/i>. The location of each chunk is unlikely to be correlated with the location of any other chunk in the file, especially after a bunch of updating write operations have occurred. Every compressed chunk winds up stored in a random location on the disk.<\/p>\n<p>Furthermore, all this activity entails a lot of updates to the NTFS metadata, which is not just additional work, but it creates additional synchronization bottlenecks. In particular, a write to a compressed file cannot overlap with another write or read to that file, since the write has the metadata lock. For a non-compressed file, non-extending writes can coexist with reads and other non-extending writes, since none of these operations update file location metadata. They&#8217;re just writing to the sectors that hold the data.<\/p>\n<p>NTFS compression can be used to reduce disk space requirements, but it is not well-suited to data that is constantly being modified. And if you&#8217;re studying performance issues, compressed files are going to show up as a bottleneck.<\/p>\n<p>The customer thanked Malcolm for his assistance, and noted that they were doing their performance analysis on their development system, not a production system, and that explains the unexpected presence of file compression.<\/p>\n<p><b>Bonus reading<\/b>: <a title=\"The Alpha AXP, epilogue: A correction about file system compression on the Alpha AXP\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20170831-00\/?p=96915\"> The Alpha AXP, epilogue: A correction about file system compression on the Alpha AXP<\/a>.<\/p>\n<p>\u00b9 Or at least, you <i>hope<\/i> that the chunk can be compressed. If you&#8217;re unlucky, the chunks won&#8217;t compress, and you went to all this extra effort and got nothing for it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It started as one write, but turned into a lot of other things.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[26],"class_list":["post-107327","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-other"],"acf":[],"blog_post_summary":"<p>It started as one write, but turned into a lot of other things.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/107327","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=107327"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/107327\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=107327"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=107327"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=107327"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}