December 16th, 2025

Why is the Windows clipboard taking the scenic route when converting from CF_TEXT to CF_OEM­TEXT?

Our investigation of why the CF_OEMTEXT clipboard format is not being created from CF_TEXT via AnsiToOem led us to the realization that the Windows clipboard automatic text conversion diagram is non-commutative: following different paths between the same two formats can produce different results. The conversion we’re observing is consistent with Windows declining to convert CF_TEXT directly to CF_OEMTEXT, and instead converting from CF_TEXT to CF_UNICODETEXT, and from there to CF_OEMTEXT.

It took a day before I realized what was going on. Let’s look at the graph again.

  CF_TEXT        ⇄ CF_UNICODETEXT   (via CF_LOCALE)
  CF_TEXT        ⇄ CF_OEMTEXT       (via LOCALE_USER_DEFAULT)
  CF_UNICODETEXT ⇄ CF_OEMTEXT       (via CF_LOCALE)

If we start with CF_TEXT and somebody asks for CF_UNICODE­TEXT, it will be converted via the CF_LOCALE clipboard format to Unicode (which for Windows means UTF-16LE). And then if we ask for CF_OEM­TEXT, the diagram above shows that Windows will prefer to convert from CF_UNICODE­TEXT, so the string ends up being converted through CF_UNICODE­TEXT after all.

The fact that the conversion diagram is path-dependent means that what you get is now influenced by what other applications read from the clipboard. After our test program copied text to the clipboard in ANSI format, the fact that another program requested CF_UNICODE­TEXT influences what a future CF_OEM­TEXT request will produce.
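
To make the path dependence concrete, here is a toy Python model. It is only a sketch under stated assumptions: the `ToyClipboard` class, the `LOCALES` table, and the choice of code pages (1252/437 for en-US, 1251/866 for ru-RU) are invented for illustration, and the real clipboard does its conversions inside Windows with its own functions. The model starts with CF_TEXT only, caches whatever it synthesizes, and prefers CF_UNICODETEXT as the source for CF_OEMTEXT when it is present.

```python
# Toy model of clipboard text synthesis. Not the real Windows implementation.
# Hypothetical locale table: name -> (ANSI code page, OEM code page).
LOCALES = {
    "en-US": ("cp1252", "cp437"),
    "ru-RU": ("cp1251", "cp866"),
}


class ToyClipboard:
    """Holds CF_TEXT and synthesizes other text formats on demand."""

    def __init__(self, ansi_bytes, cf_locale, user_default):
        self.formats = {"CF_TEXT": ansi_bytes}
        self.cf_locale = cf_locale
        self.user_default = user_default

    def get(self, fmt):
        # A synthesized format is cached, so later requests can see it.
        if fmt not in self.formats:
            self.formats[fmt] = self._synthesize(fmt)
        return self.formats[fmt]

    def _synthesize(self, fmt):
        ansi_cp, oem_cp = LOCALES[self.cf_locale]
        if fmt == "CF_UNICODETEXT":
            return self.formats["CF_TEXT"].decode(ansi_cp)
        if fmt == "CF_OEMTEXT":
            if "CF_UNICODETEXT" in self.formats:
                # Scenic route: Unicode is present, so it is preferred.
                return self.formats["CF_UNICODETEXT"].encode(oem_cp)
            # Direct route: AnsiToOem-style, using the user default locale.
            user_ansi, user_oem = LOCALES[self.user_default]
            return self.formats["CF_TEXT"].decode(user_ansi).encode(user_oem)
        raise KeyError(fmt)


def read_oem(peek_unicode_first):
    clip = ToyClipboard(b"\xc4", cf_locale="ru-RU", user_default="en-US")
    if peek_unicode_first:
        clip.get("CF_UNICODETEXT")  # some other program reads Unicode first
    return clip.get("CF_OEMTEXT")


direct = read_oem(peek_unicode_first=False)  # b"\x8e"
scenic = read_oem(peek_unicode_first=True)   # b"\x84"
print(direct, scenic)
```

In this model, the same CF_TEXT byte yields a different CF_OEMTEXT byte depending solely on whether somebody asked for CF_UNICODETEXT first.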

And it’s not like the interloper program can do anything about it. There is no way to ask, “Hey, like, I know that you say that you have CF_UNICODE­TEXT, but do you really have it? Or are you just pretending to have it?”

And how do these interloper programs know when you changed the clipboard? Because they have registered clipboard format listeners. But what program do I have that has registered a clipboard format listener?

And then it occurred to me: I have Clipboard History enabled.

Clipboard History is waking up when the first test program copies ANSI text to the clipboard and reading out CF_UNICODETEXT to add to the clipboard history. This triggers the conversion to CF_UNICODETEXT, and the result is cached back onto the clipboard. This means that when you ask for CF_OEMTEXT, the clipboard sees that it has a choice of CF_TEXT and CF_UNICODETEXT, and from the diagram we see that it prefers converting from CF_UNICODETEXT.

I turned off Clipboard History, and the query for CF_OEM­TEXT started behaving as expected: The OEM text was generated by applying Ansi­To­Oem to the CF_TEXT contents.

So I came to two conclusions.

First, if you care about OEM text, then you should set your CF_LOCALE to LOCALE_USER_DEFAULT to avoid path-dependent conversions.
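
The reasoning behind the first conclusion can be sketched in Python. This is a simplified model, not Windows code: the `LOCALES` table and the helper names `oem_direct`, `oem_via_unicode`, and `routes_agree` are hypothetical, and real conversions may also differ in details such as best-fit character mapping. The point is only that when CF_LOCALE and the user default locale select the same code pages, the direct route and the route through Unicode use identical conversions.

```python
# Hypothetical locale table: name -> (ANSI code page, OEM code page).
LOCALES = {
    "en-US": ("cp1252", "cp437"),
    "ru-RU": ("cp1251", "cp866"),
}


def oem_direct(ansi_bytes, user_default):
    """Model of CF_TEXT -> CF_OEMTEXT using the user default locale."""
    ansi_cp, oem_cp = LOCALES[user_default]
    return ansi_bytes.decode(ansi_cp).encode(oem_cp)


def oem_via_unicode(ansi_bytes, cf_locale):
    """Model of CF_TEXT -> CF_UNICODETEXT -> CF_OEMTEXT using CF_LOCALE."""
    ansi_cp, oem_cp = LOCALES[cf_locale]
    unicode_text = ansi_bytes.decode(ansi_cp)
    return unicode_text.encode(oem_cp)


def routes_agree(ansi_bytes, cf_locale, user_default):
    """True when both routes produce the same OEM bytes."""
    return oem_direct(ansi_bytes, user_default) == oem_via_unicode(ansi_bytes, cf_locale)


sample = b"caf\xe9"  # "café" in code page 1252
matched = routes_agree(sample, cf_locale="en-US", user_default="en-US")
mismatched = routes_agree(sample, cf_locale="ru-RU", user_default="en-US")
print(matched, mismatched)  # True False
```

With matching locales the result no longer depends on which route was taken, so it no longer matters whether another program read CF_UNICODETEXT in the meantime.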

Second, pretty much nobody cares about OEM text.

Next time, we’ll look at another consequence of the above diagram.

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

3 comments

  • Igor Levicki

    > Clipboard History is waking up when the first test program copies ANSI text to the clipboard and reading out CF_UNICODE­TEXT to add to the clipboard history. This triggers the conversion CF_UNICODE­TEXT, and the result is cached back onto the clipboard.

    That’s just horrible design of Clipboard History.

    When I place something on the clipboard I don’t expect it to be silently mutated by some OS component.

  • Simon Farnsworth

    This path dependency looks like a consequence of history, and therefore something that might matter to backwards compatibility in niche use cases.

    I'm assuming that CF_TEXT and CF_OEMTEXT existed back in the Windows 1.0 days - certainly Win16 era, before Unicode was a thing at all, and the respective conversions existed then, too. This gives you a nice simple table: "If I have CF_TEXT, and the consumer wants CF_OEMTEXT, use AnsiToOem, if I have CF_OEMTEXT and want CF_TEXT, use OemToAnsi". You might well handle the conversion by adding the missing format on demand, and thus caching the fact that a conversion...

    • Stephan Leclercq

      It seems to me that this dependency is a consequence of the KISS principle trying to avoid unavoidable complexity, and failing miserably as usual ;-)

      The natural solution is that clipboard should not synthesize additional content based on whatever is already present in the clipboard, but based on what the user (ie program that called OpenClipboard()) initially provided as data. If the user provides CF_TEXT, then the clipboard synthesizes CF_UNICODETEXT on demand from the clipboard history, then someone asks for CF_OEMTEXT, the synthesis should consider only CF_TEXT because that's the only thing that was initially provided.

      The implementation would be...
