June 12th, 2019

If you can use GUIDs to reference files, why not use them to remember “recently used” files so they can survive renames and moves?

You can ask for a GUID identifier for a file, and use that GUID to access the file later. You can even recover a (perhaps not the) file name from the GUID.

David Trapp wishes programs would use GUIDs to reference files so that references to recently used files can survive renames and moves.

Be careful what you wish for.

It is a common pattern to save a file by performing two steps.

  • Create a temporary file with the new contents.
  • Rename the original file to a *.bak or some other name.
  • Rename the temporary file to the original name.
  • (optional) Delete the *.bak file.

Programs use this multi-step process so that the old copy of the file remains intact until the new file has been saved successfully. Once that’s done, they swap the new file into place.

Unfortunately, this messes up your GUID-based accounting system.

If you tracked the file by its GUID, then here’s what you see:

  • Create a temporary file, which gets a new GUID.
  • Rename the original file. It retains its GUID but has a new name.
  • Rename the temporary file. It retains its GUID but has a new name.

The GUID that you remembered does not refer to the new file; it refers to the old file. Even worse, if the program took the optional step of deleting the renamed original, you now have a GUID that refers to a deleted file, which means that when you try to open it, the operation will fail.
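The effect can be reproduced with a minimal sketch. It assumes the `(st_dev, st_ino)` pair reported by `os.stat` is an acceptable stand-in for the GUID discussed here (it is a different identifier, but it follows the file across renames in the same way). The helper names `file_identity`, `save_by_rename`, and `demo`, and the file name `report.txt`, are made up for illustration.

```python
import os
import tempfile


def file_identity(path):
    """Return (device, inode) as a stand-in for a persistent file identifier."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def save_by_rename(path, new_contents, keep_backup=True):
    """Save with the create-temp / rename / rename pattern; return the backup path."""
    directory = os.path.dirname(os.path.abspath(path))
    backup = path + ".bak"
    # Create a temporary file with the new contents (it gets a new identity).
    fd, temp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(new_contents)
    # Rename the original out of the way (it keeps its identity).
    os.replace(path, backup)
    # Rename the temporary file to the original name (it keeps its identity too).
    os.replace(temp, path)
    # Optionally delete the renamed original.
    if not keep_backup:
        os.remove(backup)
    return backup


def demo():
    with tempfile.TemporaryDirectory() as d:
        doc = os.path.join(d, "report.txt")
        with open(doc, "w") as f:
            f.write("version 1")
        # What an identifier-based "recently used" list would remember.
        remembered = file_identity(doc)
        backup = save_by_rename(doc, "version 2")
        return {
            "name_still_matches": file_identity(doc) == remembered,
            "backup_matches": file_identity(backup) == remembered,
        }
```

After the save, the remembered identity belongs to the `.bak` file, and the file the user thinks of as `report.txt` has a different one. With `keep_backup=False`, no file has the remembered identity at all.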

Programs can avoid this problem by using the ReplaceFile function to promote the temporary file. The ReplaceFile function preserves the file identifier, among other things.
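As a rough sketch of what that call could look like from a script, the following wraps the Win32 `ReplaceFileW` entry point with `ctypes`. The wrapper name `replace_file` is hypothetical, the flags argument is simply passed as zero, and the function only does real work on Windows; treat it as an illustration, not as a recommended implementation.

```python
import ctypes
import sys


def replace_file(replaced, replacement, backup=None):
    """Promote `replacement` over `replaced` by calling Win32 ReplaceFileW.

    `backup` is an optional path that receives the old contents.
    """
    if sys.platform != "win32":
        raise NotImplementedError("ReplaceFile is a Windows-only API")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    fn = kernel32.ReplaceFileW
    fn.argtypes = [
        ctypes.c_wchar_p,  # lpReplacedFileName
        ctypes.c_wchar_p,  # lpReplacementFileName
        ctypes.c_wchar_p,  # lpBackupFileName (may be NULL)
        ctypes.c_uint32,   # dwReplaceFlags
        ctypes.c_void_p,   # lpExclude (reserved)
        ctypes.c_void_p,   # lpReserved
    ]
    fn.restype = ctypes.c_int  # BOOL
    if not fn(replaced, replacement, backup, 0, None, None):
        raise ctypes.WinError(ctypes.get_last_error())
```

A save routine would write the new contents to a temporary file and then call something like `replace_file("report.txt", "report.tmp", "report.bak")` in place of the two renames (the file names here are placeholders).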

In practice, use of the ReplaceFile function is not as widespread as you probably would like, so using only GUIDs to track files will technically track the file, but may not track the file you intend. After all, people still think of the file name, not the GUID, as the identifier for a file.


Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

11 comments


  • Joe Beans

    The GUID should just be a secret backup in case the filename isn't found. Every time you open the file, you also scan the latest GUID value for that filename. This keeps the user in charge while still getting the full benefit from the GUID. I never treat the GUID as a primary key for a user-managed file, in fact I would use the GUID to enforce the integrity of an internal application file AGAINST the...

    • David Trapp

      Yes, this sounds like it would work even in the current state of affairs!

  • Ji Luo

    If I recall correctly (which might not be the case), a previous version of the documentation of `ReplaceFile` function says that the object identifier is not preserved... Aha, there must be a bug in the documentation, it says "The replacement file assumes the name of the replaced file and its identity" and "... also preserves the following attributes of the original file: ... Object identifier" and "The resulting file has the same file ID as...

    • David Trapp

      Regarding the recommended way to store a LRU list: I didn't realize that - I only knew that this is what Windows uses in the global "last used documents" list (which I also don't use simply because most of the time programs I use don't write anything there and even if they do, the list is too short and ends up containing a bunch of TXT files I opened in the shell for example).

      It sounds...

      • Ji Luo

        Of course, use the `IPersistStream` interface of `CLSID_ShellLink`, as demonstrated in a previous entry. I believe Jump Lists are stored that way, too. But perhaps storing files isn’t that bad. You could store the MRU list in your AppData folder. This avoids cluttering user-visible folders as well as the search index.

  • Henrik Andersson

    Today Raymond shows that he can math by offering a two step procedure with four entries.
    And in a more serious tone, this seems like a thing that the usual hack for this exact situation should be accounting for. In addition to migrating the timestamp, why not the guid too? Oh well, it’s probably too late to change, some program would probably segfault if this changed.

    • David Walker

      That’s what I was saying (although less elegantly).  Sure, after renaming a file with all of the steps, then “migrate” the GUID too.  If the underlying file system supports GUIDs.  🙂

  • David Walker

    I don't think he's proposing using the GUID as a file name.  Here's how this would be implemented:

    * Create a temporary file, which gets a new GUID (stored internally somewhere in the file)
    * Rename the original file
    * Rename the new file to have the original file name
    * Save the GUID from the original file
    * Delete the original file
    * Reassign the GUID in the new file (which has the original file...

    • Ji Luo

      The concept of file IDs might not exist on the underlying file system, and different FSes might have different notions of IDs, which makes it difficult to present such an interface at the level of Win32 (they’re at a lower level than Win32). `ReplaceFile` (as well as other Win32 file APIs) abstracts away this.

      • David Walker

        It always annoys me to hear a statement like "the concept of file IDs might not exist on the underlying file system".  Sure, the concept of file IDs might not exist on the underlying file system.  But then again, it might!  And if that concept exists on whatever file system you are using, then SUPPORT the implementation.  If that concept does not exist on the underlying file system, then you can't do this, and you...

      • David Trapp

        My wish was more of hypothetical nature - it's clear that in the current state of the ecosystem this won't really work, but if an ID based concept had been more prominent from the start and filenames and paths had been just a user-facing display name/organizational structure - meaning that applications would just not work properly if they did that rename-delete thing without somehow preserving the ID - then possibly things could have ended up...
