Why does my string consist of this Korean character repeated over and over?

Raymond Chen

A customer reported that their program would sometimes print Chinese text instead of the actual desired value. Your initial reaction is probably, “Oh, I bet I know what’s going on. They’re displaying an ANSI string as if it were Unicode, amirite?”

And then you look at the screen shot.

췍췍췍췍췍췍췍

Okay, first of all, that’s not Chinese text. That’s Korean.

But I’ll forgive that error, because to the uninitiated, Chinese, Japanese, and Korean characters look alike: They are all monospace complex symbols. Of course, once you’ve become initiated, you can instantly tell them apart. The hard part is the initiation.

If you look more closely, you may even recognize the character as Unicode code point U+CDCD.

And that’s the key to the puzzle.

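The byte 0xCD is a common fill byte. In debug builds, the Visual C++ runtime fills freshly allocated heap memory with it to mark the memory as uninitialized. Read two of those bytes as a single UTF-16 code unit, and you get 0xCDCD.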
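If you want to convince yourself of the connection between the fill byte and the character, here is a minimal sketch. It does not depend on the debug heap at all: it simulates the fill pattern by hand and then looks at the same bytes as UTF-16 code units. The function names `make_fill_pattern` and `as_utf16` are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Simulate a buffer that has been filled with 0xCD and never
// overwritten with real data. (The fill is done by hand here;
// no debug heap is involved.)
std::vector<unsigned char> make_fill_pattern(std::size_t byte_count)
{
    return std::vector<unsigned char>(byte_count, 0xCD);
}

// View raw bytes as UTF-16 code units, which is what a routine
// printing a wide string would see.
std::vector<char16_t> as_utf16(const std::vector<unsigned char>& bytes)
{
    assert(bytes.size() % sizeof(char16_t) == 0);
    std::vector<char16_t> units(bytes.size() / sizeof(char16_t));
    if (!bytes.empty()) {
        std::memcpy(units.data(), bytes.data(), bytes.size());
    }
    return units;
}
```

Fourteen fill bytes come out as seven code units, each equal to 0xCDCD. Since both bytes of each unit are the same, the byte order of the machine makes no difference.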

Therefore, the reason for the Korean character repeated over and over is that your so-called string is actually just uninitialized heap memory. Follow the money backward to the function which was supposed to fill it with data, and debug why that function failed. (While you’re at it, you might also want to add error checking, so that when that function fails, you don’t run ahead with uninitialized data.)
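The error-checking advice might look something like the sketch below. Everything in it is hypothetical: `get_display_name` stands in for whatever function was supposed to fill the buffer, the name "Contoso" and the fallback text are placeholders, and the 0xCDCD pre-fill merely imitates what the debug heap would have left behind.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical producer: writes a null-terminated name into `buffer`.
// Returns false and leaves the buffer untouched if the name does not fit.
bool get_display_name(char16_t* buffer, std::size_t capacity)
{
    static const char16_t name[] = u"Contoso";
    constexpr std::size_t needed = sizeof(name) / sizeof(name[0]); // includes terminator
    if (buffer == nullptr || capacity < needed) {
        return false;
    }
    for (std::size_t i = 0; i < needed; ++i) {
        buffer[i] = name[i];
    }
    return true;
}

// Caller that checks the result instead of running ahead with
// whatever happens to be in the buffer.
std::u16string display_name_or_fallback(std::size_t capacity)
{
    // Imitate the debug fill pattern so a failure would be visible.
    std::vector<char16_t> buffer(capacity, char16_t(0xCDCD));
    if (!get_display_name(buffer.data(), buffer.size())) {
        return u"(unknown)";
    }
    return std::u16string(buffer.data());
}
```

With a capacity of 16, the caller gets the real name back. With a capacity of 4, the producer fails and the caller returns the fallback text rather than a run of fill characters.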

Author

Raymond Chen

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

9 comments


Robert Lim October 15, 2019

There’s a hidden bonus if you can read Korean: the way the character is pronounced is very similar to “check” in English!
GL October 10, 2019

This reminds me of an ancient joke: A newbie of C programming language yanked the power plug after he sees “烫烫烫” (lit. hot hot hot) printed on screen, in fear of the computer being overheated. “烫”, in code page 936, is 0xCCCC — that’s uninitialized stack.
cheong00 October 7, 2019

I’m a bit troubled by the suggestion that the customer was shipping debug build of their software.
- Scarlet Manuka October 8, 2019
  
  Not necessarily. They might have found this error during QA, for example.
  
  OK, yes, probably they were shipping the debug build. I’m sure I’ve seen articles — whether here or on The Daily WTF I don’t know — about companies who shipped the debug build, because the release build kept crashing but the debug build worked fine (for certain values of “worked” and “fine”).
  
  Why take the time and trouble to fix your errors, when you can just get the system to probably mask them for you?
  - Raymond Chen Author October 9, 2019
    
    The customer found this problem during development, even before handing over to QA. Developers use debug builds, so the problem was consistent there.
Paul Topping October 7, 2019

There’s money in uninitialized data? Who knew?
- Piotr Siódmak October 7, 2019
  
  there is money if it leads to an exploit which allows you to plant malware on the victim’s computer
Simon Clarkstone October 7, 2019

Round here, my font doesn’t support that character, so it’s a tiny box with “CD CD” in it, even easier to decode (if one thinks of looking).
