{"id":19123,"date":"2009-02-17T10:00:00","date_gmt":"2009-02-17T10:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2009\/02\/17\/why-doesnt-the-file-system-have-a-function-that-tells-you-the-number-of-files-in-a-directory\/"},"modified":"2009-02-17T10:00:00","modified_gmt":"2009-02-17T10:00:00","slug":"why-doesnt-the-file-system-have-a-function-that-tells-you-the-number-of-files-in-a-directory","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20090217-00\/?p=19123","title":{"rendered":"Why doesn&#8217;t the file system have a function that tells you the number of files in a directory?"},"content":{"rendered":"<p>There are any number of bits of information you might want to query from the file system, such as the number of files in a directory or <a href=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2007\/10\/29\/5750353.aspx#5781390\"> the total size of the files in a directory<\/a>. Why doesn&#8217;t the file system keep track of these things?<\/p>\n<p> Well, of course, one answer is that it certainly couldn&#8217;t keep track of every possible fragment of information anybody could possibly want, because that would be an infinite amount of information. But another reason is simply a restatement of the principle we learned last time: <a href=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2009\/02\/16\/9425124.aspx\"> Because the file system doesn&#8217;t keep track of information it doesn&#8217;t need<\/a>. <\/p>\n<p> The file system doesn&#8217;t care how many files there are in the directory. It also doesn&#8217;t care how many bytes of disk space are consumed by the files in the directory (and its subdirectories). Since it doesn&#8217;t care, it doesn&#8217;t bother maintaining that information, and consequently it avoids all the annoying problems that come with attempting to maintain the information. 
<\/p>\n<p> For example, one thing I noticed about many of the proposals for maintaining the size of a directory in the file system is that very few of them addressed the issue of hard links. Suppose a directory contains two hard links to the same underlying file. Should that file be double-counted? If a file has 200 hard links, then a change to the size of the file would require updating the size field in 200 directories, not <a> just one<\/a> as one commenter postulated. (Besides, the file size isn&#8217;t kept in the directory entry anyway.) <\/p>\n<p> <a href=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2007\/10\/29\/5750353.aspx#5766417\"> Another issue most people ignored was security<\/a>. If you&#8217;re going to keep track of the recursive directory size, you have to make sure to return values consistent with <i>each user&#8217;s<\/i> permissions. If a user does not have permission to see the files in a particular directory, you&#8217;d better not include the sizes of those files in the &#8220;recursive directory size&#8221; value when that user goes asking for it. That would be an information disclosure security vulnerability. Now all of a sudden that single 64-bit value is a complicated set of values, each with a different ACL that controls which users each value applies to. And if you change the ACL on a file, the file system would have to update the file sizes for each of the directories that contain the file, because the change in ACL may result in a file becoming visible to one user and invisible to another. <\/p>\n<p> Yet another cost many people failed to take into account is just the amount of disk I\/O, particularly writes, that would be required. Generating additional write I\/O is a bad idea in general, particularly on media with a limited number of write cycles like USB thumb drives. 
One commenter did note that <a href=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2007\/10\/29\/5750353.aspx#5777186\"> this metadata could not be lazy-written<\/a> because a poorly-timed power outage would result in the cached value being out of sync with the actual value. <\/p>\n<p> Indeed the added cost of all the metadata writes is one of the reasons why <a href=\"http:\/\/blogs.technet.com\/filecab\/archive\/2006\/11\/07\/disabling-last-access-time-in-windows-vista-to-improve-ntfs-performance.aspx\"> Windows Vista no longer updates the Last Access time by default<\/a>. <\/p>\n<p> <b>Bonus chatter<\/b>: My colleague <a href=\"http:\/\/blogs.msdn.com\/aaron_margosis\/\"> Aaron Margosis<\/a> points out a related topic over on the <a href=\"http:\/\/blogs.msdn.com\/ntdebugging\/\"> ntdebugging<\/a> blog: <a href=\"http:\/\/blogs.msdn.com\/ntdebugging\/archive\/2008\/07\/03\/ntfs-misreports-free-space.aspx\"> <i>NTFS Misreports Free Space?<\/i><\/a> on the difficulties of accurate accounting, especially in the face of permissions which don&#8217;t grant you total access to the drive. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>There are any number of bits of information you might want to query from the file system, such as the number of files in a directory or the total size of the files in a directory. Why doesn&#8217;t the file system keep track of these things? 
Well, of course, one answer is that it certainly [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-19123","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>There are any number of bits of information you might want to query from the file system, such as the number of files in a directory or the total size of the files in a directory. Why doesn&#8217;t the file system keep track of these things? Well, of course, one answer is that it certainly [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/19123","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=19123"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/19123\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=19123"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=19123"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=191
23"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}