How does Windows synthesize `CF_OEMTEXT` from `CF_TEXT` and vice versa?

Raymond Chen

Windows has three built-in text formats for the clipboard:

CF_UNICODETEXT: UTF-16 text.
CF_TEXT: 8-bit text in ANSI code page.
CF_OEMTEXT: 8-bit text in OEM code page.

If you don’t provide all three formats, then the system will synthesize the missing ones from the ones you have. How does this work?

Believe it or not, we’re going to spend the rest of the week on this topic.

One thing to note is that the synthesis is done on demand. It is only when somebody asks for, say, CF_TEXT and the clipboard realizes that it has only CF_OEMTEXT, that the clipboard creates the CF_TEXT on the fly. Once done, the result is cached so that it doesn’t have to be converted again.
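The on-demand behavior can be modeled in a few lines. What follows is a minimal Python sketch, not the actual Windows implementation: the `ToyClipboard` class, its method names, and the choice of code pages 1252 (ANSI) and 437 (OEM) are all assumptions made for illustration.

```python
class ToyClipboard:
    """Toy model of on-demand clipboard format synthesis (not the real Windows code)."""

    # Assumed code pages for illustration; the real ones depend on system configuration.
    CODECS = {"CF_TEXT": "cp1252", "CF_OEMTEXT": "cp437"}

    def __init__(self):
        self.formats = {}      # format name -> bytes
        self.conversions = 0   # how many times a conversion actually ran

    def set_data(self, fmt, data):
        # This toy holds one provided format at a time; new data discards old conversions.
        self.formats = {fmt: data}

    def get_data(self, fmt):
        if not self.formats:
            raise KeyError(fmt)
        if fmt not in self.formats:
            # Nothing is converted until somebody asks for the missing format.
            source_fmt, source = next(iter(self.formats.items()))
            text = source.decode(self.CODECS[source_fmt])
            self.formats[fmt] = text.encode(self.CODECS[fmt], errors="replace")
            self.conversions += 1
        return self.formats[fmt]


clip = ToyClipboard()
clip.set_data("CF_OEMTEXT", b"caf\x82")  # "cafe" with e-acute, in code page 437
first = clip.get_data("CF_TEXT")         # converted now, on the first request
second = clip.get_data("CF_TEXT")        # served from the cached result
print(first, clip.conversions)
```

In this model the second request does no work: the converted bytes were stored alongside the original the first time they were requested.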

Today’s conversion is the one that looks easiest on the surface: Converting CF_TEXT to CF_OEMTEXT and back.

To convert CF_TEXT to CF_OEMTEXT, Windows uses the AnsiToOem function. And to convert the other way, it uses the OemToAnsi function.

These are legacy function names; the modern names are CharToOem and OemToChar. But I used the legacy names because that’s what they were called in 16-bit Windows, and that’s how the conversion was done in 16-bit Windows.
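To get a feel for what such a conversion does to the bytes, here is a rough Python sketch that uses the standard codecs as stand-ins for AnsiToOem and OemToAnsi. It assumes the ANSI code page is 1252 and the OEM code page is 437, and the helper names `ansi_to_oem` and `oem_to_ansi` are hypothetical. The real functions use whatever code pages the system is configured with and may map unrepresentable characters differently, so treat this as an approximation only.

```python
ANSI_CODEC = "cp1252"  # assumed ANSI code page (Western European)
OEM_CODEC = "cp437"    # assumed OEM code page (US)


def ansi_to_oem(data: bytes) -> bytes:
    """Rough stand-in for AnsiToOem: re-encode ANSI bytes in the OEM code page."""
    return data.decode(ANSI_CODEC).encode(OEM_CODEC, errors="replace")


def oem_to_ansi(data: bytes) -> bytes:
    """Rough stand-in for OemToAnsi: re-encode OEM bytes in the ANSI code page."""
    return data.decode(OEM_CODEC).encode(ANSI_CODEC, errors="replace")


ansi = "caf\u00e9".encode(ANSI_CODEC)   # e-acute is byte 0xE9 in code page 1252
oem = ansi_to_oem(ansi)                 # same text, but e-acute is byte 0x82 in code page 437
round_trip = oem_to_ansi(oem)           # converts back to the original ANSI bytes

# A character with no OEM equivalent does not survive the round trip.
euro = "\u20ac".encode(ANSI_CODEC)      # the euro sign is byte 0x80 in code page 1252
lossy = oem_to_ansi(ansi_to_oem(euro))  # this sketch substitutes "?" for it
```

Because the two code pages cover different sets of characters, the conversion is not always reversible, even in this simplified model.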

This was anticlimactic, but we’re just getting started. And when we get to the end, we’ll see that what looks like a simple answer is actually quite complicated.

Back in the days of 16-bit Windows, ANSI text and OEM text were the only two clipboard text formats, so there were only two possible conversions (one in each direction). Things get more complicated with the introduction of CF_UNICODETEXT, which we’ll look at next time.


11 comments


Jeremy Saldate December 9, 2025

Hi Raymond — thanks for diving into this! I’m curious: once the clipboard “synthesizes” a missing format (e.g. converting CF_TEXT → CF_OEMTEXT) and caches it, does the system ever re-validate or re-convert it if the underlying ANSI ⇒ OEM rules (or the code-page assumptions) change? In other words — could caching inadvertently produce incorrect results if a different code page or locale is later applied than when the conversion was first done?
GL December 8, 2025

My guess is the complexity comes from “when there’re two sources, use which” (because the two sources could be malformed in different ways), and the consequences of caching so “who is ground-truth” could change when you start with 1 version, then query the other 2 in different order, or that Windows preemptively does all the conversion if any conversion needs to be done, etc.
skSdnW December 8, 2025 · Edited

If UTF-8 is selected as the ANSI codepage, you get it from CF_TEXT. If your app uses UTF-8 everywhere then it probably knows how to convert from UTF-16 already.

(I have no idea why my replies don’t attach correctly to the person I’m replying to)
- Jyrki Vesterinen December 9, 2025 · Edited
  
  Application manifest can also specify that ANSI codepage is UTF-8 irrespective of what the user has selected.
Igor Levicki December 8, 2025 · Edited

Hey Raymond, when will they be adding CF_UTF8TEXT?

Also, when will Microsoft Teams stop putting large images as giant Base64 encoded sausages on the clipboard?

A soul-crushing exercise: Try copying a large image from Microsoft Teams to another messenger app (Telegram, Viber, Signal, WhatsApp, etc), watch how it results in 1,000+ gibberish text messages when you paste it, and then have “fun” selecting and deleting them all.

IMO, that’s total Clipboard API abuse and should be blocked on the OS level. The least that could be done is to give user the control of the size of text pasted to clipboard and also an option to treat base64 encoded image as actual image by decoding it and placing it on the clipboard in the png or jpg format.

- Georg Rottensteiner December 8, 2025
  
  The clipboard is a mighty tool in the hands of the skilled.
  
  I wonder, will you also delve into IMHO not really official custom clipboard formats?
  
  For example “MSDEVColumnSelect” which is used by some programs (Visual Studio, TextPad) to signify column selection mode in the pasted text.
- Georg Rottensteiner December 8, 2025
  
  Hear, Hear!
  
  When I found out about that abomination I had to add base64-decoding for clipboard “images” from Teams to my paint app.
  - Tom Lint December 10, 2025
    
    That’s what you get when you allow web developers to develop ‘applications’ for the desktop. They think in web terms, and haven’t the slightest clue as to how a desktop application is supposed to work.
- Antonio Rodríguez December 8, 2025 · Edited
  
  I agree that this is very inconvenient, and I’d consider it a bug in Teams (private clipboard formats were invented to solve this problem). But, how could this be controlled at the OS level without crippling the user? If a large amount of text is placed in the clipboard, probably it’s the user who wanted to move it from one application to another (maybe they are copying a chapter of a book or something like that). The OS shouldn’t restrict the clipboard just because some apps abuse its use.
  
  And no, the solution isn’t showing an alert interrupting the workflow, which only serves to teach users to blindly click “Yes” (“Stupid Windows! I already chose the Copy command! Why on Earth do I have to confirm it?”). Neither is silently corrupting the copied data if it is “too long” (also, where do we put the length limit?).
  
  - Igor Levicki December 9, 2025
    
    The OS should absolutely prevent such API abuse. You don’t see that kind of crap on iOS and macOS.
    
    If you are placing an image on the clipboard, place it in an image format so that all programs past and present which support images can paste it, not as a 2+MB base64-encoded data URI eldritch horror.
    
    99% of users have no clue what a data URI is. They will try to paste the image into another app which doesn’t support that, and they will either corrupt a document, or crash it and lose data. The worst scenario is that some of their friends will get a bazillion chat messages whose sending can’t be stopped until done, and which can’t be deleted in any reasonable amount of time aside from nuking the whole chat.
    
    TL;DR: Responsible software engineers stop to think about what they are doing and how it will affect the users; they don’t just blindly implement whatever they are told by their line manager. If you blindly do what you’re told, then you are just a mercenary without any ethics, not a proper software engineer. Worse yet, if it was your idea, then you are a monster.
    