Last time, we started our exploration of how Windows synthesizes text clipboard formats by looking at the conversion between CF_TEXT and CF_OEMTEXT. Today, we’ll look at what happens when CF_UNICODETEXT enters the picture.
The introduction of CF_UNICODETEXT means that we now have three clipboard text formats, and therefore six possible conversions. The four new conversions are
- CF_UNICODETEXT to/from CF_TEXT.
- CF_UNICODETEXT to/from CF_OEMTEXT.
These conversions are done with the assistance of the CF_LOCALE clipboard format, which contains an LCID: a 32-bit integer that encodes a primary language (such as German), a sublanguage (such as Swiss-German), and a sort rule (such as phone book). None of these details are directly relevant to character set conversion. The locale is used because both the ANSI and OEM code pages can be derived from the locale, so only one value needs to be recorded.¹
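To make the "encodes a primary language, a sublanguage, and a sort rule" part concrete, here is a minimal sketch of how those three fields are packed into the 32 bits. It reproduces the bit layout behind the Windows macros `PRIMARYLANGID`, `SUBLANGID`, and `SORTIDFROMLCID` in portable C++. The names `LcidParts` and `DecodeLcid` are made up for this illustration and are not part of any Windows header.

```cpp
#include <cstdint>

// Hypothetical helper type: the three fields packed into a classic LCID.
struct LcidParts {
    std::uint16_t primaryLanguage; // bits 0-9   (e.g. 0x07 = LANG_GERMAN)
    std::uint16_t subLanguage;     // bits 10-15 (e.g. 0x02 = SUBLANG_GERMAN_SWISS)
    std::uint16_t sortId;          // bits 16-19 (e.g. 0x1  = SORT_GERMAN_PHONE_BOOK)
};

// Hypothetical helper: splits an LCID the same way the Windows macros
// PRIMARYLANGID, SUBLANGID, and SORTIDFROMLCID do.
constexpr LcidParts DecodeLcid(std::uint32_t lcid)
{
    return LcidParts{
        static_cast<std::uint16_t>(lcid & 0x3FF),
        static_cast<std::uint16_t>((lcid >> 10) & 0x3F),
        static_cast<std::uint16_t>((lcid >> 16) & 0xF),
    };
}
```

For example, `DecodeLcid(0x00010407)` yields German, the default German sublanguage, and the phone book sort, while `DecodeLcid(0x00000807)` yields German, Swiss-German, and the default sort.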
The system converts to/from CF_UNICODETEXT via the code page obtained from the LCID:
- LOCALE_IDEFAULTANSICODEPAGE when converting to/from CF_TEXT.
- LOCALE_IDEFAULTCODEPAGE when converting to/from CF_OEMTEXT.
Putting all of this into a chart gives us
| To ↓ / From → | CF_TEXT | CF_OEMTEXT | CF_UNICODETEXT |
|---|---|---|---|
| CF_TEXT | nop | OemToAnsi | WC2MB(ANSI CP) |
| CF_OEMTEXT | AnsiToOem | nop | WC2MB(OEM CP) |
| CF_UNICODETEXT | MB2WC(ANSI CP) | MB2WC(OEM CP) | nop |
In the above table, “ANSI CP” means “the code page reported by calling GetLocaleInfo with the LCID in the CF_LOCALE clipboard format, and the LOCALE_IDEFAULTANSICODEPAGE locale attribute”. Similarly for “OEM CP”, using LOCALE_IDEFAULTCODEPAGE instead of LOCALE_IDEFAULTANSICODEPAGE.
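If you prefer code to charts, the table can be written as a two-dimensional lookup. This is only a sketch of the chart, not of how the clipboard is actually implemented: the `TextFormat` enumeration and the `SynthesisStep` function are hypothetical names, and the strings are the chart's own labels rather than function calls.

```cpp
#include <string>

// Hypothetical stand-ins for the three text clipboard formats.
enum class TextFormat { Text, OemText, UnicodeText };

// Hypothetical helper: returns the chart entry for synthesizing `to`
// when the clipboard contains `from`. Rows are "to", columns are "from".
inline std::string SynthesisStep(TextFormat to, TextFormat from)
{
    static const char* const chart[3][3] = {
        // from:  CF_TEXT           CF_OEMTEXT       CF_UNICODETEXT
        { "nop",            "OemToAnsi",     "WC2MB(ANSI CP)" }, // to CF_TEXT
        { "AnsiToOem",      "nop",           "WC2MB(OEM CP)"  }, // to CF_OEMTEXT
        { "MB2WC(ANSI CP)", "MB2WC(OEM CP)", "nop"            }, // to CF_UNICODETEXT
    };
    return chart[static_cast<int>(to)][static_cast<int>(from)];
}
```

For instance, `SynthesisStep(TextFormat::UnicodeText, TextFormat::OemText)` returns `"MB2WC(OEM CP)"`: the OEM text is widened to Unicode using the OEM code page derived from the locale.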
That’s great, we have all the answers in a table. But that table raises more questions!
We’ll start answering questions next time.
¹ This CF_LOCALE clipboard format existed in 16-bit Windows as well, but it wasn’t really used for anything. The people who added Unicode support to the clipboard realized, “Hey, the thing we need is already here! We just have to start using it.”
And if an application sets CF_UNICODETEXT on the clipboard, how is CF_LOCALE filled? GetThreadLocale? GetThreadUILanguage?