When would CopyFile succeed but produce a file filled with zeroes?

Raymond Chen


A customer reported that on very rare occasions, they use the CopyFile function to copy a file from a network location to the local system, the function reports success, but when they go to look at the file, it’s filled with zeroes instead of actual file data. They wondered what could cause this. They suspected that the computer may have rebooted and wanted to know whether file contents are flushed to disk under those conditions.

When the CopyFile function returns success, it does not mean that the data has reached physical media. It means that the data was written to the cache. The customer didn’t specify whether the reboot was a crash or an organized restart. If the system crashed, then any unwritten data in the cache is lost. On the other hand, if the system reboots through the normal shutdown-and-reboot process, then the cache will be written out as part of the shutdown.

The customer wondered whether passing the COPY_FILE_NO_BUFFERING flag would cover the crash scenario.

A member of the file system team explained that the COPY_FILE_NO_BUFFERING flag tells the system not to keep data in memory, but rather send it straight to the device. This flag was recommended for large files (which don’t fit in RAM anyway), but it’s a bad idea for small files, since every file will have to wait for the device to respond before the system can move on to the next file. You’ll have to experiment to find the breakeven point for your specific data set and device.

Note, however, that the COPY_FILE_NO_BUFFERING flag doesn’t solve the problem.

For example, the system might crash while the file copy is still in progress. The Copy­File function creates a 4GB file (say), but manages to copy only 1GB of data into it before crashing. The other 3GB was never copied even though the file claims to be 4GB in size.
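To see why a partially copied file can look complete, here is a small Python sketch. It is not what CopyFile does internally. It assumes only what the example above describes: the destination gets its final size before all of the data arrives. The function names are made up for illustration, and the sizes are tiny so the sketch runs instantly.

```python
import os


def simulate_interrupted_copy(path, full_size, bytes_copied):
    """Give the file its final size, then write only the first part of the data."""
    with open(path, "wb") as f:
        f.truncate(full_size)            # the file now reports full_size bytes
        f.write(b"\xAB" * bytes_copied)  # the "copy" stops here, as if by a crash


def zero_tail_length(path):
    """Count the zero bytes at the end of the file (reads the whole file)."""
    with open(path, "rb") as f:
        data = f.read()
    return len(data) - len(data.rstrip(b"\x00"))
```

The reported size matches the source, but everything past the point where writing stopped reads back as zeroes, so checking the size alone would not catch the problem.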

Another possibility is that all the file data makes it to the device, but the metadata does not get written to the device before the crash. The CopyFile function returned success, but all of the bookkeeping didn’t make it to the device.

Even if you call Flush­File­Buffers, the system could crash before Flush­File­Buffers returns.

One possible way to address these problems is to copy the file to a temporary name, flush the file buffers, and then rename the file to its final name. The downside of this is that it forces synchronous writes to the device, which slows down your overall workflow, so it’s not a cheap algorithm to use.
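The copy, flush, and rename sequence might look something like this in Python. This is a portable sketch rather than Win32 code: os.fsync stands in for FlushFileBuffers (the Python documentation says it uses _commit on Windows), os.replace stands in for the rename, and the function name and temporary-file naming are invented for the example.

```python
import os
import shutil
import tempfile


def copy_then_rename(src, dst):
    """Copy src to a temporary name next to dst, flush it, then rename it to dst."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # Create the temporary file in the destination directory so that the
    # final rename does not have to cross volumes.
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir, prefix=".copy-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp, open(src, "rb") as source:
            shutil.copyfileobj(source, tmp)
            tmp.flush()             # drain Python's own buffer into the OS
            os.fsync(tmp.fileno())  # ask the OS to push the data to the device
        os.replace(tmp_path, dst)   # the final name appears only after the flush
    except BaseException:
        # Don't leave a half-written temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the system crashes before the rename, the worst case is a leftover temporary file, which the program can recognize by its name and delete. Note that the sketch does not flush the rename itself, so after a crash the file may still be missing under its final name. What it is meant to avoid is a file that has the final name but not the data.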

But let’s step back. Is there a way to avoid slowing down the common case just because there’s a rare problem case?¹

A higher-level solution may be in order: The next time your program runs, you can detect that it did not shut down properly. When that happens, hash the file contents and compare the result to the expected value. This moves the expensive operation to the rare case, allowing the file copy to proceed at normal speed.
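Here is one way that idea could be sketched in Python. Everything in it is an assumption made for illustration: the marker-file scheme for detecting an unclean exit, the choice of SHA-256, the function names, and the premise that the program has expected hashes available (say, published alongside the files on the network share).

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def begin_session(marker_path):
    """Return True if the previous session ended without calling end_session."""
    previous_was_unclean = os.path.exists(marker_path)
    with open(marker_path, "w") as marker:
        marker.write("running")
        marker.flush()
        os.fsync(marker.fileno())  # one small flush per session, not per file
    return previous_was_unclean


def end_session(marker_path):
    """Record an orderly exit by deleting the marker."""
    if os.path.exists(marker_path):
        os.remove(marker_path)


def files_needing_recopy(marker_path, expected_hashes):
    """On an unclean restart, return the paths whose contents fail verification."""
    if not begin_session(marker_path):
        return []  # common case: nothing gets hashed
    bad = []
    for path, expected in expected_hashes.items():
        if not os.path.exists(path) or sha256_of(path) != expected:
            bad.append(path)
    return bad
```

The program would call files_needing_recopy at startup, copy again whatever it returns, and call end_session on an orderly exit. After a clean exit, the next startup does no hashing at all.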

¹ We’re assuming that system crashes are rare. If they’re not rare, then I think you have bigger problems.


10 comments


  • George Gonzalez

    Long ago, but after all the unreliable hard disks in the IBM ATs, we were all getting a bit complacent about checking for errors after every printf(), and I wondered how important it was to check for file write errors. Somewhere in MSDN I found a page that actually enumerated all the possible ways a write API could fail. Memory is fuzzy but I think they listed about NINETEEN different and unusual ways to fail. So I resolved to start the new year with checking after every IO operation.
      

  • Dimitrios Kalemis

    You write: “On the other hand, if the reboots through the normal shutdown-and-reboot process, then the cache will be written out as part of the shutdown.”
    I have a question about that. Is what you wrote always true? If during the normal shutdown process (for some reason, like the hardware being too slow at the sector level access) the cache takes too long to be written out, then Windows will kill the cache-writing-out process (in order for the shutdown to continue). If such a thing happens, then even the normal shutdown process is not a guarantee. What is your opinion?

    • Simon Farnsworth

      Windows won’t kill the cache flush. It will be allowed to finish no matter how long it takes, unless the user reboots in the middle of the cache flush manually (e.g. by pulling the power cable because it’s taking too long, or because the battery goes flat).
      As long as everything is under software control, a standard shutdown will write out the entire cache as part of the process.

      • toasterking

        But even if Windows does not kill the cache flush, a server with a hardware watchdog timer could reset the system if the shutdown takes too long.

        • Simon Farnsworth

          Windows should still be tickling the watchdog to stop the watchdog timer resetting the system until it’s (at least) synced all data to disk and unmounted filesystems so that they’ll be clean on next boot.

          • toasterking

            I guess it depends on how the watchdog timer is implemented.  I worked with a server whose timer was tickled by a service, and services should be stopped prior to the final cache flush.  I admittedly don’t know if this resembles a typical implementation.

  • cheong00

    Actually the “cure” is to use externally powered storage, so even when there’s an unexpected shutdown, the cached data on the storage device would still be written to disk.
    If you have unstable power, then you probably want to know that UPS is a thing.

    • Ben Voigt

      That only helps if the data is in the disk-side cache.  Raymond’s talking about caching in system RAM, not the disk-side cache (we know because 1 GB and larger files can’t fit disk-side)

      • cheong00

        When you use external storage with a large buffer and independent power (such as a SAN), the recommendation is always to disable the write cache to minimize the possibility of data loss. With those 15k RPM SCSI disks (which also have buffers themselves) running RAID 10, you’d rarely find the performance bottleneck on the storage side.

  • Martin Mitáš

    I fully agree system crashes can be (from the app’s POV) in most cases treated as events that simply never occur. It isn’t the responsibility of any user space app to guarantee stability of the system as a whole.
    But writing some data to an external/unpluggable device or a remote filesystem living somewhere on a network is not in that category. So the question of how the app can know whether it successfully got the data to the destination is still relevant even when we take system crashes out of the equation.
    Is there a reasonable way to reliably detect when such a problem happens (1) without knowing any details about the target filesystem and (2) without any slowdown due to the synchronized output?