July 26th, 2021

Diagnosing why your batch file prints a garbage character, one character, and nothing more

You’ve written a batch file, and you try to execute it, but instead of running, it just prints some weird garbage character, then the first character of the batch file, and then that’s it.

Here’s the batch file:

@echo off
echo Hello, world.

And here’s what happens when you run it:

C:\>■@
'■@' is not recognized as an internal or external command,
operable program or batch file.
C:\>

What’s going on here?

Put on your thinking cap.

The file was saved in UTF-16LE format with a byte order mark. The leading garbage character was the byte order mark being interpreted in the ANSI code page.

But wait, you say. The UTF-16LE byte order mark is two bytes long: 0xFF and 0xFE. Read in a single-byte code page, that's two characters. Why did only one garbage character print?

Because character 0xFF is invisible.

The next Unicode character in the batch file is the at-sign, which in UTF-16LE is encoded as a @ followed by a null byte. The @ is read from the batch file, but the null causes the command processor to think it reached the end of the file.

That means that the batch file is treated as if it consisted of a single line. And that explains the error message.
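
To see the bytes involved, here is a minimal Python sketch. It builds the same two-line batch file in memory, encodes it as UTF-16LE with a byte order mark, and cuts the data at the first null byte. The cut is a simplified stand-in for what the command processor does, not its actual parsing code, and the two code pages used for decoding (1252 and 437) are assumptions chosen for illustration; which one applies depends on the system.

```python
import codecs

# The batch file from the article, with CRLF line endings (an assumption
# about how the editor saved it).
text = "@echo off\r\necho Hello, world.\r\n"

# Saved as UTF-16LE with a byte order mark: FF FE, then two bytes per character.
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
print(data[:4].hex(" "))  # ff fe 40 00

# Simplified model: a byte-oriented reader that treats the first null byte
# as the end of the file. This is an illustration, not cmd.exe's real logic.
first_line = data.split(b"\x00", 1)[0]
print(first_line)  # b'\xff\xfe@'

# The surviving bytes as seen through two single-byte code pages.
print(ascii(first_line.decode("cp1252")))  # '\xff\xfe@'
print(ascii(first_line.decode("cp437")))   # '\xa0\u25a0@'
```

Only three bytes survive the cut: the two bytes of the byte order mark and the at-sign. In code page 437, 0xFE decodes to U+25A0 (a black square) and 0xFF to U+00A0 (a no-break space).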

Save the batch file as ANSI rather than UTF-16LE, and that will fix it.
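
If re-saving from the editor is inconvenient, the conversion can be scripted. The following is a rough Python sketch, not an official tool: the function name `resave_as_single_byte` is hypothetical, and it assumes that "ANSI" means Windows-1252, which is not true on every system. It only touches files that begin with the UTF-16LE byte order mark.

```python
import codecs
import tempfile
from pathlib import Path


def resave_as_single_byte(path, encoding="cp1252"):
    """Rewrite a UTF-16LE file (with BOM) in a single-byte encoding.

    Hypothetical helper. The default "cp1252" is an assumption about what
    "ANSI" means on the target machine.
    """
    raw = Path(path).read_bytes()
    # The UTF-32LE BOM also begins with FF FE, so rule that out first.
    if raw.startswith(codecs.BOM_UTF32_LE):
        return False
    if not raw.startswith(codecs.BOM_UTF16_LE):
        return False  # no UTF-16LE BOM; leave the file alone
    text = raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    # Encoding happens before the write, so a character with no equivalent
    # in the target code page raises UnicodeEncodeError and the file is untouched.
    Path(path).write_bytes(text.encode(encoding))
    return True


# Usage: create a UTF-16LE copy of the article's batch file, then convert it.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "hello.cmd"
    demo.write_bytes(
        codecs.BOM_UTF16_LE + "@echo off\r\necho Hello, world.\r\n".encode("utf-16-le")
    )
    converted = resave_as_single_byte(demo)
    result = demo.read_bytes()

print(converted, result)
```

After the conversion, the file starts directly with the at-sign and contains no null bytes.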

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

13 comments

  • Matthew van Eerde

    > Save the batch file as ANSI

    Ew. Is there no way to get batch to work with Unicode? I suppose PowerShell is where it’s at nowadays

    • Me Gusta

      Set Windows to use UTF-8 as the non-Unicode codepage.
      This is still classed as beta, but it is one of the nicest Windows 10 features. It is also much easier these days since applications have generally gone towards Unicode.

      • Antonio Rodríguez · Edited

        The problem is that it breaks many Win32 applications developed with Windows 9x compatibility in mind (Windows 9x only supported "ANSI", or Windows-1252). And they aren't just "legacy" applications. There are many out there, many more than you would think.

        A local (and thus better) solution is to select the UTF-8 codepage in CMD by executing "chcp 65001". If you want to make it the default, you can add it as a modifier to the shortcut...

    • Antonio Rodríguez

      If you want to create a full application with nice multilingual messages in several languages, yes, CMD can't do that. But maybe a PowerShell script isn't a good choice, either, even if it supports Unicode/UTF-8 natively. Batch files are for automating file management and administrative tasks, nothing less, nothing more. IMHO their advantages over other programming languages (simplicity and immediacy) outweigh their compromises for these tasks. Just use the right tool for the right job...

      • Andy Cadley

        I think I’d be hard pressed to think of anything these days where the “right tool for the job” was a batch file over a PowerShell script.

      • Antonio Rodríguez

        Simple: you have been writing your own batch files for 25 years, you have a nice library of "reusable"* code, you know CMD like the palm of your hand, and everything you need is covered by it and, perhaps, some command line tools (like those included in WSL or GNU/Win32). Why learn a completely new programming language when your clients do not require you to use it?

        *"reusable" up to the point where CMD lets you...

  • Yuri Khan

    Nitpicker’s corner. Last time I was on Windows:

    * Batch files normally used OEM encoding rather than ANSI.
    * In OEM encodings I was familiar with, 0xFF was not totally invisible. It was the equivalent of Unicode U+00A0 NO-BREAK SPACE.

    • Neil Rashbrook

      And of course being a space is the real reason that it doesn't print, as the command processor simply skips it when attempting to interpret the file. This is most easily demonstrated by prefixing a space byte to the file, which does not change the output, then instead (e.g.) a 'z' byte to the file, at which point it will output (but who still creates batch files in ?)

    • Antonio Rodríguez

      No, batch files don't use OEM (or even ANSI) by default. Batch files are executed in whatever codepage is active in the terminal window where the command interpreter runs. If you type chcp 1252 at the command prompt, batch files will "use" ANSI (or, more specifically, Windows-1252). For example, if you redirect a batch file's output to a file and it contains non-ASCII characters, they won't show correctly in Notepad unless you select the OEM font....

  • Roger Hågensen

    I’ll assume a batch file written as UTF-8 without a BOM will work okay?!

    • skSdnW · Edited

      cmd.exe does not really like it if you chcp to UTF-8 and batch file processing becomes unstable.

      • Me Gusta · Edited

        No, but if you just set Windows 10 to use UTF-8 as the default codepage, cmd.exe works well. Sure, this is a beta feature, but it is rather stable.
        If you just type chcp without giving it a new codepage, it will display 65001 as the active codepage. Under this I have used batch files which use BOM-less UTF-8 and output text successfully. This includes things like using echo to output emoji. The problem then...

    • 紅樓鍮

      I guess rather that cmd interprets the batch file in whatever the current code page is set to. (Of course if the batch file contains only ASCII-range characters then you can save it in any non-UTF-16 encoding and it will work.)