Last time, we finished with a nice table of how Windows synthesizes each of the three text clipboard formats from the other two. We saw that the CF_LOCALE clipboard format plays an important role in the conversion. But when most people set text on the clipboard, they don’t specify an explicit CF_LOCALE format. What happens in that case?
If the code that puts the initial text on the clipboard does not specify an explicit CF_LOCALE, then the system synthesizes one by using the LCID associated with the user’s current keyboard layout. This is what 16-bit Windows did, and 32-bit Windows carried this policy forward for backward compatibility.
This means that if you typed the text with the US-International keyboard, then the system will use the US-English LCID as the CF_LOCALE, and it will therefore use 1252 as the ANSI code page and 437 as the OEM code page.
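To make the rule concrete, here is a minimal sketch in Python that models it. This is not the Windows implementation and calls no Windows API: the function name `synthesize_text_formats` and the table `DEFAULT_CODE_PAGES` are hypothetical, the table is a tiny hand-written subset, and replacing unmappable characters with `?` is an assumption made to keep the model simple. Null terminators are also left out.

```python
# A toy model of the default-locale rule. Everything here is illustrative;
# the names are hypothetical and nothing calls into Windows.

US_ENGLISH_LCID = 0x0409
HEBREW_LCID = 0x040D

# LCID -> (default ANSI code page, default OEM code page).
# A tiny hand-written subset, just enough for the example.
DEFAULT_CODE_PAGES = {
    US_ENGLISH_LCID: (1252, 437),
    HEBREW_LCID: (1255, 862),
}


def synthesize_text_formats(text, keyboard_lcid):
    """Model what ends up on the clipboard when the app sets only Unicode text.

    The keyboard layout's LCID stands in for the missing CF_LOCALE, and that
    LCID picks the code pages used for the two 8-bit text formats.
    """
    ansi_cp, oem_cp = DEFAULT_CODE_PAGES[keyboard_lcid]
    return {
        "CF_LOCALE": keyboard_lcid,
        # Assumption: unmappable characters become '?'.
        "CF_TEXT": text.encode(f"cp{ansi_cp}", errors="replace"),
        "CF_OEMTEXT": text.encode(f"cp{oem_cp}", errors="replace"),
    }


if __name__ == "__main__":
    formats = synthesize_text_formats("café", US_ENGLISH_LCID)
    for name, value in formats.items():
        print(name, value)
```

With the US-English LCID, the same `é` comes out as byte `0xE9` in the code page 1252 version and as byte `0x82` in the code page 437 version, which is why the choice of locale matters to whoever reads the 8-bit formats.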
You might say, “But that makes no sense, does it? Suppose I highlight some text in Hebrew and copy it to the clipboard. Shouldn’t that be set with a Hebrew LCID?”
Should it?
We’ll start studying all the multi-locale issues next time.
Indeed, my first thought was: how does this interact with tools such as AppLocale which let you set the locale per-process?