Why does my string consist of this Korean character repeated over and over?

Raymond Chen


A customer reported that their program would sometimes print Chinese text instead of the actual desired value. Your initial reaction is probably, “Oh, I bet I know what’s going on. They’re displaying an ANSI string as if it were Unicode, amirite?”

And then you look at the screenshot.

췍췍췍췍췍췍췍

Okay, first of all, that’s not Chinese text. That’s Korean.

But I’ll forgive that error, because to the uninitiated, Chinese, Japanese, and Korean characters look alike: They are all monospace complex symbols. Of course, once you’ve become initiated, you can instantly tell them apart. The hard part is the initiation.

If you look more closely, you may even recognize the character as Unicode code point U+CDCD.

And that’s the key to the puzzle.

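The byte 0xCD is a common fill byte. The Visual Studio debug C runtime fills freshly allocated heap memory with it, so that uninitialized heap memory is easy to recognize. Read two of those bytes as a single UTF-16 code unit and you get 0xCDCD, which is the code point U+CDCD.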
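If you want to convince yourself, here is a small sketch (mine, not part of the customer's program) that takes a run of 0xCD bytes and decodes it as UTF-16LE, the encoding Windows uses for Unicode strings. The helper name `decode_fill` is made up for illustration.

```python
# Fill byte the debug heap writes into freshly allocated memory.
FILL_BYTE = 0xCD

def decode_fill(byte_value: int, count: int, encoding: str) -> str:
    """Decode `count` copies of `byte_value` as text in `encoding`."""
    return bytes([byte_value] * count).decode(encoding)

# Seven UTF-16 code units occupy fourteen bytes of memory.
garbage = decode_fill(FILL_BYTE, 14, "utf-16-le")
print(garbage)                                # 췍췍췍췍췍췍췍
print({f"U+{ord(ch):04X}" for ch in garbage})  # {'U+CDCD'}
```

Each pair of fill bytes becomes one code unit with the value 0xCDCD, so the length of the garbage is simply half the number of uninitialized bytes.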

Therefore, the reason for the Korean character repeated over and over is that your so-called string is actually just uninitialized heap memory. Follow the money backward to the function which was supposed to fill it with data, and debug why that function failed. (While you’re at it, you might also want to add error checking, so that when that function fails, you don’t run ahead with uninitialized data.)
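As a rough illustration of that error checking, here is a sketch that simulates the situation in Python. Everything in it is hypothetical: `new_debug_buffer` merely imitates a debug-heap allocation, and `load_name` stands in for whatever function was supposed to fill the buffer. The only point is that the caller looks at the result before treating the buffer as a string.

```python
from typing import Optional

FILL_BYTE = 0xCD  # debug-heap fill byte

def new_debug_buffer(size: int) -> bytearray:
    """Imitate a fresh debug-heap allocation: every byte is 0xCD."""
    return bytearray([FILL_BYTE] * size)

def load_name(buffer: bytearray, source: Optional[str]) -> bool:
    """Hypothetical filler. Writes null-terminated UTF-16LE text into
    `buffer` and returns True, or returns False and leaves it untouched."""
    if source is None:
        return False  # simulated failure
    data = (source + "\0").encode("utf-16-le")
    if len(data) > len(buffer):
        return False  # text does not fit
    buffer[:len(data)] = data
    return True

def display_name(source: Optional[str]) -> str:
    buffer = new_debug_buffer(14)
    # Check the result instead of running ahead with uninitialized data.
    if not load_name(buffer, source):
        return "<unavailable>"
    # Keep only the text before the null terminator.
    return buffer.decode("utf-16-le").split("\0", 1)[0]
```

Without the check, the failure case would decode the untouched buffer and print the repeated Korean character from the screenshot.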


8 comments

  • Simon Clarkstone

    Round here, my font doesn’t support that character, so it’s a tiny box with CD CD in it, even easier to decode (if one thinks of looking).

    • Scarlet Manuka

      Not necessarily. They might have found this error during QA, for example.

      OK, yes, probably they were shipping the debug build. I’m sure I’ve seen articles — whether here or on The Daily WTF I don’t know — about companies who shipped the debug build, because the release build kept crashing but the debug build worked fine (for certain values of “worked” and “fine”).

      Why take the time and trouble to fix your errors, when you can just get the system to probably mask them for you?

  • GL

    This reminds me of an ancient joke: A newcomer to the C programming language yanked the power plug after seeing “烫烫烫” (literally “hot hot hot”) printed on the screen, afraid that the computer was overheating. In code page 936, “烫” is 0xCCCC, and 0xCC is the fill byte for uninitialized stack memory.

  • Robert Lim

    There’s a hidden bonus if you can read Korean: the way the character is pronounced is very similar to “check” in English!
