Last time, we encountered a mystery where the synthesis of CF_OEMTEXT from CF_TEXT did not use AnsiToOem. Today we will begin the investigation.
Recall that we have a table showing how Windows synthesizes each of the various text formats from the other two. But in the case where the clipboard has two formats available, and you ask for the third, there are two ways that the third format could be synthesized: It could convert the first, or it could convert the second. How does Windows decide?
The preference table is
| To get | First try | Then try | And then try |
|---|---|---|---|
| CF_TEXT | CF_TEXT | CF_UNICODETEXT | CF_OEMTEXT |
| CF_OEMTEXT | CF_OEMTEXT | CF_UNICODETEXT | CF_TEXT |
| CF_UNICODETEXT | CF_UNICODETEXT | CF_TEXT | CF_OEMTEXT |
In words, first look for a perfect match. If that’s not available, then try (in order) CF_UNICODETEXT, then CF_TEXT, then CF_OEMTEXT. (One of those last three checks is redundant with the perfect match check.)
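The lookup order in the preference table can be sketched in a few lines of portable C. This is only an illustration of the rule as described here, not the actual Windows implementation: the `TextFormat` enum, its `FMT_*` values, and the `pick_source` function are hypothetical names made up for the example, and the `available` array stands in for asking the clipboard which formats are present.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the three text clipboard formats.
   These are NOT the real CF_* values from the Windows headers. */
typedef enum { FMT_TEXT, FMT_OEMTEXT, FMT_UNICODETEXT, FMT_NONE } TextFormat;

/* Decide which available format to synthesize `wanted` from.
   available[f] is nonzero if format f is on the clipboard.
   `wanted` must be one of the three real formats, not FMT_NONE. */
static TextFormat pick_source(TextFormat wanted, const int available[3])
{
    /* Fallback order from the preference table: Unicode, then ANSI, then OEM. */
    static const TextFormat fallback[] = { FMT_UNICODETEXT, FMT_TEXT, FMT_OEMTEXT };
    size_t i;

    /* A perfect match always wins. */
    if (available[wanted]) {
        return wanted;
    }

    /* Otherwise take the first fallback that is present. One of these
       checks repeats the perfect-match check and can never succeed here. */
    for (i = 0; i < sizeof fallback / sizeof fallback[0]; i++) {
        if (available[fallback[i]]) {
            return fallback[i];
        }
    }
    return FMT_NONE; /* no text format on the clipboard at all */
}
```

For example, if only the ANSI and OEM formats are available and the caller asks for Unicode, `pick_source` chooses the ANSI format, which matches the last row of the table.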
Combining that with our previous table produces this conversion table with priorities:
| To get | First try | Then try | And then try |
|---|---|---|---|
| CF_TEXT | CF_TEXT | CF_UNICODETEXT + WC2MB(ANSI CP) | CF_OEMTEXT + OemToAnsi |
| CF_OEMTEXT | CF_OEMTEXT | CF_UNICODETEXT + WC2MB(OEM CP) | CF_TEXT + AnsiToOem |
| CF_UNICODETEXT | CF_UNICODETEXT | CF_TEXT + MB2WC(ANSI CP) | CF_OEMTEXT + MB2WC(OEM CP) |
Again, “ANSI CP” means “the code page reported by calling GetLocaleInfo with the LCID in the CF_ clipboard format, and the LOCALE_ locale attribute”. Similarly for “OEM CP”, using LOCALE_ instead of LOCALE_.
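One way to keep the combined table straight is to write it down as a lookup keyed by the format you want and the format you are converting from. The sketch below does only that: it transcribes the table and performs no real conversion. The `TextFormat` and `Conversion` enums and the `conversion_step` function are hypothetical names invented for this illustration; the `CONV_*` labels simply mirror the table's shorthand.

```c
/* Hypothetical stand-ins for the three text clipboard formats
   (not the real CF_* values). */
typedef enum { FMT_TEXT, FMT_OEMTEXT, FMT_UNICODETEXT } TextFormat;

/* Labels for the conversion steps named in the table. */
typedef enum {
    CONV_NONE,        /* source already is the wanted format */
    CONV_WC2MB_ANSI,  /* WC2MB(ANSI CP) */
    CONV_WC2MB_OEM,   /* WC2MB(OEM CP)  */
    CONV_MB2WC_ANSI,  /* MB2WC(ANSI CP) */
    CONV_MB2WC_OEM,   /* MB2WC(OEM CP)  */
    CONV_OEM_TO_ANSI, /* OemToAnsi      */
    CONV_ANSI_TO_OEM  /* AnsiToOem      */
} Conversion;

/* Which step the table lists for producing `wanted` from `source`. */
static Conversion conversion_step(TextFormat wanted, TextFormat source)
{
    /* Rows are indexed by wanted format, columns by source format,
       both in enum order: TEXT, OEMTEXT, UNICODETEXT. */
    static const Conversion steps[3][3] = {
        /* want TEXT        */ { CONV_NONE,        CONV_OEM_TO_ANSI, CONV_WC2MB_ANSI },
        /* want OEMTEXT     */ { CONV_ANSI_TO_OEM, CONV_NONE,        CONV_WC2MB_OEM  },
        /* want UNICODETEXT */ { CONV_MB2WC_ANSI,  CONV_MB2WC_OEM,   CONV_NONE       },
    };
    return steps[wanted][source];
}
```

Note that this lookup says nothing about priority. Which source gets used when two are available is decided by the order of the columns in the table.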
If you stare at this table, you might notice something odd, possibly even disturbing. And that is part of the answer to the mystery. We’ll talk about it next time.