{"id":12153,"date":"2010-12-01T07:00:00","date_gmt":"2010-12-01T07:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2010\/12\/01\/how-do-i-delete-bytes-from-the-beginning-of-a-file\/"},"modified":"2010-12-01T07:00:00","modified_gmt":"2010-12-01T07:00:00","slug":"how-do-i-delete-bytes-from-the-beginning-of-a-file","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20101201-00\/?p=12153","title":{"rendered":"How do I delete bytes from the beginning of a file?"},"content":{"rendered":"<p>\nIt&#8217;s easy to append bytes to the end of a file:\nJust open it for writing, seek to the end, and start writing.\nIt&#8217;s also easy to delete bytes from the end of a file:\nSeek to the point where you want the file to be truncated and call\n<code>SetEndOfFile<\/code>.\nBut how do you delete bytes from the beginning of a file?\n<\/p>\n<p>\nYou can&#8217;t, but you sort of can, even though you can&#8217;t.\n<\/p>\n<p>\nThe underlying abstract model for storage of file contents is in the\nform of a chunk of bytes, each indexed by the file offset.\nThe reason appending bytes and truncating bytes is so easy\nis that doing so doesn&#8217;t alter the file offsets of any other\nbytes in the file.\nIf a file has ten bytes and you append one more,\nthe offsets of the first ten bytes stay the same.\nOn the other hand, deleting bytes from the front or middle of a file\nmeans that all the bytes that came after the deleted bytes\nneed to &#8220;slide down&#8221; to close up the space.\nAnd there is no &#8220;slide down&#8221; file system function.\n<\/p>\n<p>\nOne reason for the absence of a &#8220;slide down&#8221; function is that\ndisk storage is typically not byte-granular.\nStorage on disk is done in units known as <i>sectors<\/i>,\na typical sector size being 512 bytes.\nAnd the storage for a file\nis allocated in units of sectors,\nwhich we&#8217;ll call <i>storage chunks<\/i> for lack of a better term.\nFor example, a 
5000-byte file occupies ten sectors of storage.\nThe first 512 bytes go in sector&nbsp;0,\nthe next 512 bytes go in sector&nbsp;1,\nand so on, until the last 392 bytes go into sector&nbsp;9,\nwith the last 120 bytes of sector&nbsp;9 lying unused.\n(There are exceptions to this general principle, but they\nare not important to the discussion,\nso there&#8217;s no point bringing them up.)\n<\/p>\n<p>\nTo append ten bytes to this file, the file system can just\nstore them after the last byte of the existing contents,\nleaving 110 bytes of unused space instead of 120.\nSimilarly, to truncate those ten bytes back off,\nthe logical file size can be set back to 5000,\nand the extra ten bytes are &#8220;forgotten.&#8221;\n<\/p>\n<p>\nIn theory, a file system could support truncating an\nintegral number of storage chunks\noff the front of the file by updating its internal\nbookkeeping about file contents without having to\nmove data physically around the disk.\nBut in practice, no popular file system implements this,\nbecause, as it turns out,\nthe demand for the feature isn&#8217;t high enough to warrant\nthe extra complexity. 
(Remember: Minus 100 points.)\n<\/p>\n<p>\nBut what&#8217;s this &#8220;you sort of can&#8221; tease?\nAnswer: Sparse files.\n<\/p>\n<p>\nYou can use an NTFS sparse file\nto decommit the storage for the data at the start of the file,\neffectively &#8220;deleting&#8221; it.\nWhat you&#8217;ve really done is set the bytes to logical zeroes,\nand if there are any whole storage chunks in that range, they can\nbe decommitted and don&#8217;t occupy any physical space on the drive.\n(If somebody tries to read from decommitted storage chunks, they just\nget zeroes.)\n<\/p>\n<p>\nFor example, consider a\n1<a HREF=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2009\/06\/11\/9725386.aspx\">MB<\/a>\nfile on a disk that uses 64KB storage chunks.\nIf you decide to decommit the first 96KB of the file,\nthe first storage chunk of the file will be returned to the drive&#8217;s\nfree space,\nand the first 32KB of the second storage chunk will be set to zero.\nYou&#8217;ve effectively &#8220;deleted&#8221; the first 96KB of data off the front\nof the file, but the file offsets haven&#8217;t changed.\nThe byte at offset 98,304 is still at offset 98,304 and did not\nmove to offset zero.\n<\/p>\n<p>\nNow, a minor addition to the file system would get you that\nmagic &#8220;deletion from the front of the file&#8221;:\nAssociated with each file would be a 64-bit value representing\nthe <i>logical byte zero<\/i> of the file.\nFor example, after you decommitted the first 96KB of the file above,\nthe <i>logical byte zero<\/i> would be 98,304,\nand all file offset calculations on the file would be biased by\n98,304 to convert from logical offsets to physical offsets.\nFor example, when you asked to see byte 10, you would actually get\nbyte 98,314.\n<\/p>\n<p>\nSo why not just do this?\nThe <i>minus 100 points<\/i> rule applies.\nThere are a lot of details that need to be worked out.\n<\/p>\n<p>\nFor example, suppose somebody has opened the file and seeked\nto file position 102,400.\nNext, 
you attempt to delete 98,304 bytes from the front of the file.\nWhat happens to that other file pointer?\nOne possibility is that the file pointer offset stays at 102,400,\nand now it points to the byte that used to be at offset\n200,704.\nThis can result in quite a bit of confusion, especially\nif that file handle was being written to:\nThe program writing to the handle issued two consecutive\nwrite operations, and the results ended up 96KB apart!\nYou can imagine the exciting data corruption scenarios that would\nresult from this.\n<\/p>\n<p>\nOkay, well another possibility is that the file pointer offset\nmoves by the number of bytes you deleted from the front of the file,\nso the file handle that was at 102,400 now shifts to file position 4096.\nThat preserves the consecutive read and consecutive write patterns,\nbut it completely messes up another popular pattern:\n<\/p>\n<pre>\nlong oldPos = ftell(fp); \/\/ remember current position\nfseek(fp, newPos, SEEK_SET);\n... do stuff ...\nfseek(fp, oldPos, SEEK_SET); \/\/ restore original position\n<\/pre>\n<p>\nIf bytes are deleted from the front of the file during the\n<i>do stuff<\/i> portion of the code, the attempt to restore\nthe original position will restore the wrong position\nsince it didn&#8217;t take the deletion into account.\n<\/p>\n<p>\nAnd this discussion still completely ignores the issue of\nfile locking.\nIf a region of the file has been locked, what happens when\nyou delete bytes from the front of the file?\n<\/p>\n<p>\nIf you really like this <i>simulate deleting\nfrom the front of the file by decommitting bytes from the\nfront and applying an offset to future file operations<\/i>\ntechnique, you can do it yourself.\nJust keep track of the magic offset and apply it to all your\nfile operations.\nAnd I suspect the fact that you can simulate the operation\nyourself is a major reason why the feature doesn&#8217;t exist:\nTime and effort are better spent adding features that applications\ncouldn&#8217;t simulate on their 
own.\n<\/p>\n<p>\n[Raymond is currently away; this message was pre-recorded.]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It&#8217;s easy to append bytes to the end of a file: Just open it for writing, seek to the end, and start writing. It&#8217;s also easy to delete bytes from the end of a file: Seek to the point where you want the file to be truncated and call SetEndOfFile. But how do you delete [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[26],"class_list":["post-12153","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-other"],"acf":[],"blog_post_summary":"<p>It&#8217;s easy to append bytes to the end of a file: Just open it for writing, seek to the end, and start writing. It&#8217;s also easy to delete bytes from the end of a file: Seek to the point where you want the file to be truncated and call SetEndOfFile. 
But how do you delete [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/12153","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=12153"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/12153\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=12153"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=12153"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=12153"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}