Some time ago, we learned why the module timestamps in Windows 10 are so nonsensical: Because they aren’t timestamps any more. They are a hash of the resulting binary.
But why not invent a new field called, say, UniqueValue for the hash, rather than putting it in the timestamp field?
https://t.co/iPc0RdM9vc
yes, stupid decision imho; could use a diff. field for that— Adam (@Hexacorn) February 15, 2024
Well, for one thing, that would be a breaking change. If you take a binary produced by a linker that puts the hash in a new field and run it on an older system, the older system will ignore the hash and use the timestamp, so the hash does nothing.
But wait, why are we gathered here in the first place? The reason for using a hash instead of a timestamp is to permit reproducible builds, and the Wikipedia page specifically calls out timestamps for scorn:
According to the Reproducible Builds project, timestamps are “the biggest source of reproducibility issues.”
If you put a timestamp in the binary, then it’s no longer reproducible: Making no changes and rebuilding will produce a different binary because the timestamp will be different.
If we want a reproducible build, we simply have to get rid of the timestamp.
Remember what the timestamp is used for: It’s used by the module loader to detect whether precalculated addresses of imported functions should be trusted: When the addresses are precalculated (by binding), the timestamp of the module that was used as the basis of the precalculation is recorded by the importing module. When the loader loads the importing module, it checks whether that timestamp matches the timestamp recorded in the module from which the functions are being imported. If it matches, then the precalculated values are used. If it doesn’t match, then the precalculated values are ignored and new values are calculated from scratch.
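Here is a minimal sketch of that check in Python. It models the decision only, not the real loader: the Module and BoundImport types and the resolve_imports function are hypothetical names invented for illustration, and the addresses are made-up numbers.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Module:
    name: str
    timestamp: int            # value stored in the module's timestamp field
    exports: Dict[str, int]   # function name -> address in this build


@dataclass
class BoundImport:
    dll_name: str
    recorded_timestamp: int        # timestamp of the DLL that binding was done against
    precalculated: Dict[str, int]  # function name -> address computed at bind time


def resolve_imports(bound: BoundImport, dll: Module) -> Dict[str, int]:
    """Trust the precalculated addresses only if the timestamps match."""
    if bound.recorded_timestamp == dll.timestamp:
        # Same module that binding was done against: reuse the saved values.
        return dict(bound.precalculated)
    # Different module: ignore the saved values and look everything up again.
    return {name: dll.exports[name] for name in bound.precalculated}


# Hypothetical example data.
bound = BoundImport("helper.dll", 0x1111, {"DoThing": 0x1000})
same_dll = Module("helper.dll", 0x1111, {"DoThing": 0x1000})
newer_dll = Module("helper.dll", 0x2222, {"DoThing": 0x2000})
```

Notice that the comparison is for equality only. Nothing in it cares whether the value is a plausible date, which is why an arbitrary number that changes whenever the module changes can do the same job.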
Okay, so maybe we can use some other source as the timestamp, rather than the timestamp of the build itself. How about the timestamp of the most recent commit?
That still doesn’t work because you can build multiple binaries from the same source code. Any precalculated values from a debug build will not be correct for a release build, and vice versa. Any switches that affect code generation must change the timestamp because the resulting binary is different and in particular the addresses of exported functions may change.
Okay, but maybe we can start with the timestamp and, say, hash the compiler switches into a 16-bit value that gets added to the original timestamp. That way, you still get a pseudo-timestamp that is within a day of the actual timestamp.
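The proposal above could be sketched roughly like this in Python. The function name pseudo_timestamp is hypothetical, and using SHA-256 to squeeze the switches down to 16 bits is an assumption made for the sake of the example; any stable hash would do.

```python
import hashlib
from typing import List


def pseudo_timestamp(commit_time: int, switches: List[str]) -> int:
    """Commit time plus a 16-bit hash of the compiler switches (a sketch)."""
    digest = hashlib.sha256("\0".join(switches).encode("utf-8")).digest()
    offset = int.from_bytes(digest[:2], "little")  # 0 through 65535 seconds
    return (commit_time + offset) & 0xFFFFFFFF     # the field is 32 bits wide


commit = 1_700_000_000  # hypothetical time of the most recent commit
debug_stamp = pseudo_timestamp(commit, ["/Od", "/Zi"])
release_stamp = pseudo_timestamp(commit, ["/O2"])
```

The offset is at most 65535 seconds, a little over 18 hours, which is where "within a day" comes from. Note that the commit time feeds directly into the result, so moving it by even one minute moves every stamp derived from it.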
But now you’ve swung the pendulum too far the other way. Previously, the problem was that the timestamp didn’t change when it should have. Now the problem is that the timestamp changes when it didn’t need to. Maybe you made a commit to a README.md file. This isn’t even part of the source code, but it’ll change the “most recent commit” timestamp. Okay, so maybe you look only at commits that modify source code. But now you add a new enum to a header file (say, windows.h) that is included by every component, but only one component actually takes advantage of it. The change to the header file will update the “most recent source code commit” timestamp of every component, even though only one of the components actually changed as a result of the new enum. The other components are binary identical, or would be if it weren’t for the timestamp.
The way to get the timestamp to change when the binary changes, but only when the binary changes, is to make the timestamp depend only on the binary itself (minus the timestamp field).¹
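To make the idea concrete, here is a sketch in Python that blanks out the timestamp field, hashes what remains, and writes the low 32 bits of the hash back into the field. This is not the linker's actual algorithm, which the article doesn't describe: the choice of SHA-256 is an assumption, the function names are hypothetical, and a real implementation would presumably have to account for other build-dependent fields too. The field location (four bytes at offset 8 from the PE signature) comes from the PE file format.

```python
import hashlib
import struct


def timestamp_field_offset(image: bytes) -> int:
    """Find the 4-byte TimeDateStamp field in the COFF header of a PE image."""
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    # Signature (4 bytes), Machine (2), NumberOfSections (2), then TimeDateStamp.
    return e_lfanew + 8


def content_stamp(image: bytes) -> int:
    """Hash the image with the timestamp field zeroed; keep 32 bits (assumed scheme)."""
    off = timestamp_field_offset(image)
    blanked = image[:off] + b"\0\0\0\0" + image[off + 4:]
    return int.from_bytes(hashlib.sha256(blanked).digest()[:4], "little")


def stamp_image(image: bytes) -> bytes:
    """Return a copy of the image whose timestamp field holds the content hash."""
    off = timestamp_field_offset(image)
    return image[:off] + struct.pack("<I", content_stamp(image)) + image[off + 4:]


def make_fake_image(payload: bytes, build_time: int) -> bytes:
    """Hypothetical helper: a tiny buffer with just enough PE structure for the demo."""
    header = bytearray(0x100)
    header[0:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x80)       # e_lfanew
    header[0x80:0x84] = b"PE\0\0"                    # PE signature
    struct.pack_into("<I", header, 0x88, build_time) # TimeDateStamp
    return bytes(header) + payload


monday_build = make_fake_image(b"same code", 1_700_000_000)
tuesday_build = make_fake_image(b"same code", 1_700_086_400)
```

Stamping monday_build and tuesday_build gives byte-for-byte identical results, because the only thing that differed between them was the field that got overwritten.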
Bonus chatter: Making the timestamp a hash of the binary contents simplifies the process of determining which binaries were affected by a change: Look for binaries whose timestamp hashes changed. Not only does this make things easier for the servicing team (to identify which binaries need to be included in the next monthly update), it’s also handy as part of your regular workflow: If you change a header file with the intention of fixing an issue in one component, and several dozen files changed timestamps, then that’s a signal that what you thought was a change with very limited scope turned out to have a much larger scope than you thought, and maybe you should figure out what unintended consequences your change precipitated.²
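As a rough illustration of that workflow, assuming you have already collected each binary's timestamp field from the build before your change and the build after it (the function name and the sample values below are made up):

```python
from typing import Dict, List


def changed_binaries(before: Dict[str, int], after: Dict[str, int]) -> List[str]:
    """List binaries whose timestamp hash differs between two builds.

    Binaries that appear only in the second build also count as changed.
    """
    return sorted(
        name for name, stamp in after.items() if before.get(name) != stamp
    )


# Hypothetical stamps gathered from two builds.
before = {"shell.dll": 0x5A1B2C3D, "helper.dll": 0x0BADF00D, "widget.dll": 0x12345678}
after = {"shell.dll": 0x5A1B2C3D, "helper.dll": 0xFEEDFACE, "widget.dll": 0x12345678}
affected = changed_binaries(before, after)
```

If you expected the list to contain one name and it contains several dozen, that is your cue to go looking for the unintended consequences.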
¹ This is a tautology, but sometimes it helps to state the tautology explicitly.
² One common example of this is adding a new method to a COM interface. This causes the IID to change, which in turn causes every module that produces or consumes that interface to change. What you thought was a simple change to one binary ended up pulling a dozen binaries into the next monthly patch. Instead, you should create a new interface for your new method and leave the original interface alone.
If binding decides the precomputed values must be correct once the timestamp check passes, that would make me very worried. If the PE timestamp is the actual time, I’m worried about the build machine’s clock rewinding due to an NTP sync happening between two builds. If the PE timestamp is a hash, a 32-bit hash is too short for me to sleep at night when the stakes are quite high (a misbound DLL will crash upon the first call into that DLL).
I’ve been thinking that the last commit date is good enough, because symbol binding is obsolete anyway: ASLR is more important than the load speedup.
If the infrastructure is in place, you can have your cake and eat it too: when someone kicks off a build, look at the build graph to identify the exact list of inputs the requested build depends on, check to see which of those have changed, and if and only if that set is non-empty, run the impacted build steps with normal timestamps. If nothing has changed, pull the already built results from cache. However, for that to work in practice, you need 1) a code base that uses a fully hermetic build system with explicit dependencies at every step and 2) a strong enough guarantee that any binaries that should ever be “the same” will be built from the same cache.
FWIW, a build system along the lines of Bazel makes both of those attainable, as long as you are willing to limit yourself to dependencies set up to use it or are willing to “impedance match” for dependencies that don’t.
I love it when people jump straight to calling something “stupid” without bothering to understand the issue, then offer their advice which wouldn’t work at all.
I would suggest just changing all the debug tools to refer to this field as “BinaryHash” or some such name and display it as an opaque hex value…
…but probably somebody, somewhere, has written complicated scripting logic that tries to parse and consume this field, so that would be backwards incompatible. Rats.
I love your work, Raymond. Keep it up 🙂
I learn something new from you every day.
Raymond is talking about reproducible builds, but does the Visual C++ documentation describe the /Brepro flag anywhere? Should we use, or rely on, an undocumented feature?
You should never depend on undocumented features. /Brepro is only designed for use with non-executable, resource-only MUI files?
/Brepro doesn’t work when you’re trying to compare hashes of the same sources compiled by different machines, and it only seems to work with locally built executables because the linker doesn’t delete any caches and reuses previously compiled objects. The second you delete the local .vs and obj cache directories, even local builds of the same sources generate different hashes, which is not how reproducibility is supposed to work.
There are also other issues, such as the import/export tables of an executable breaking reproducible builds: they contain ‘Hint’ fields whose values the linker copies from the current operating system, and these values change almost every 2 weeks with every Windows update. There’s also a separate issue with /Brepro hashing uninitialized/invalid import/export table memory while hashing the image.
If the project contains any resources, such as an application manifest, these are patched/hacked into executables during the build by an external cvtres.exe process invoked by msbuild and linker.exe. This process uses random GUIDs for the temporary directory, which are added to the PDB symbols and later hashed by /Brepro, breaking the reproducible build and the resulting hash.
The /MP (Build with multiple processes) option also doesn’t synchronize access to the PDB, executable, or object files, so the order of strings and other objects in the file and PDB changes and breaks the reproducible hash.
You also need other flags like /d1nodatetime and /d1trimfile with /Brepro to remove some other information breaking reproducibility, but they don’t solve issues with cvtres.exe or the two issues with the import/export table. The best project to experiment with different options is a project created by the Microsoft dotnet team named ZeroSharp, and removing all imports was the only way to make /Brepro work.
Moral of the story is don’t use undocumented functionality because it won’t do what you think it does or will change and start doing something you don’t know about.
OK, I get it … Raymond is talking about reproducible builds, so … how do I achieve a reproducible build of my C++ application? Any links to Microsoft documentation?