We saw some time ago that strings in Win32 resources are grouped in bundles of 16. Why not have each string be a separate resource? Why are they bundled at all? And why bundles of 16? Why not 15 or 8 or 32?
Recall how resources worked in 16-bit Windows. To load a resource, you allocate a selector, load the resource data from disk into the memory that selector describes, and then use the selector to access the resource. The selector remains in memory afterward, but it is marked as “discardable”, so that it can be freed when the system comes under memory pressure.
Windows 1.0’s system requirements did not include a hard drive. You could run it off a two-floppy system.¹ Reducing I/O had a noticeable effect on performance, so issuing a separate I/O for each string was going to be inefficient.
On top of that, Windows 1.0 had a limit of 4096 selectors, so putting each string in its own resource would drain the system of selectors.
On the other hand, you don’t want to load all the strings at once, because that means doing a large I/O transfer for data, most of which will never be used. It also increases memory consumption, because all of the strings get loaded into memory, creating additional memory pressure; and when the strings are discarded (which happens more often because of that increased memory pressure), you lose all of your strings at once.
The decision to bundle strings in groups of 16 was an attempt to balance these two competing performance issues. There’s nothing magical about the number 16. It was a convenient number that gave you a decent amount of batching while still keeping the batches from getting too large.
Grouping your strings into bundles became a performance game similar to segment tuning, where you wanted to put strings that were used at the same time into the same bundle to maximize the value of each I/O operation.
Although 32-bit Windows doesn’t allocate resources to segments the way that 16-bit Windows did, the bundling design was nevertheless carried forward. One reason is that bundling expanded the range of string resource IDs from a 16-bit value to a 20-bit value, so the highest string resource ID went from 65535 to a bit over a million. And even though strings no longer occupy a segment, there is still overhead in the file format to describe each resource. Strings tend to be short, so this overhead ends up being significant: you don’t want a four-character string to come with 24 bytes of overhead. There is still a small memory benefit to not wasting slots in a bundle, though it is not as severe as it was in the 16-bit days.
¹ That’s what I did back in the day. The company I worked for at the time had one computer with a hard drive, and wow that hard drive made a big difference.
The 20-bit string IDs would be very nice and useful, if we could actually use them.
LoadString rejects values above 65535, and the Microsoft Resource Compiler silently truncates all string IDs to their lower 16 bits. But the MinGW resource compiler gladly compiles them, and a simple custom LoadString implementation (needed anyway due to security mitigations) will load them. I have this request open, but I'm not holding my breath.
This bundling of string resources has the unfortunate side effect of causing the “CVT1100: duplicate resource” error, even if you do not actually have two string resources with the same ID.
Consider this example:
File resource1.rc
File resource2.rc
The resources are different, but trying to have both files in the same VC++ project results in the CVT1100 error.
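The elided files were presumably something along these lines (hypothetical IDs; the point is only that 100 and 101 fall into the same bundle of 16):

```
// resource1.rc  (hypothetical contents)
STRINGTABLE
BEGIN
    100 "First string"   // ID 100 -> bundle 7, slot 4
END

// resource2.rc  (hypothetical contents)
STRINGTABLE
BEGIN
    101 "Second string"  // ID 101 -> bundle 7, slot 5
END
```

Each .rc file compiles to a .res containing a complete 16-slot bundle with resource ID 7, so the converter sees two resources with the same type, ID, and language and reports CVT1100. The usual workaround is to keep all the STRINGTABLE entries for a given bundle (or, more simply, for the whole project) in a single .rc file.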