February 22nd, 2008

Why you should never use TCHAR in C++ header files

Andrew Arnott
Principal Software Engineer

In C++, TCHAR, LPCTSTR, CString, and other types discreetly change to their char/WCHAR and CStringA/CStringW counterparts depending on whether UNICODE is defined when your source code is compiled.  Cool.  By conscientiously using _countof(x) instead of sizeof(x) where appropriate and TCHARs everywhere instead of char/WCHAR, you can write code that compiles with ANSI or UNICODE support as easily as flipping a switch.  But these tricks are not safe in .h header files.
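
To make the _countof(x) point concrete, here is a minimal portable sketch. CountOf is a hypothetical stand-in for the Microsoft _countof macro so the snippet builds outside Visual C++, and the two buffer names are made up for illustration.

```cpp
#include <cstddef>

// Hypothetical stand-in for MSVC's _countof: the number of elements in a
// real array. Passing a pointer fails to compile instead of miscounting.
template <typename T, std::size_t N>
constexpr std::size_t CountOf(const T (&)[N]) noexcept { return N; }

char    narrowBuffer[16];  // what TCHAR buffer[16] becomes in an ANSI build
wchar_t wideBuffer[16];    // what TCHAR buffer[16] becomes in a UNICODE build

// For char, bytes and elements happen to coincide, which hides the bug.
static_assert(sizeof(narrowBuffer) == CountOf(narrowBuffer), "1 byte per char");

// For wchar_t, sizeof reports bytes, so it over-counts the characters.
static_assert(sizeof(wideBuffer) == 16 * sizeof(wchar_t), "sizeof counts bytes");
static_assert(CountOf(wideBuffer) == 16, "CountOf counts characters");
```

Code that passes sizeof(buffer) as a character count works in the ANSI build and overruns the buffer in the UNICODE build, which is why the element count is the safer habit.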

Consider the following very simple scenario:

tool.h:

void SayHello(CString recipient);

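tool.cpp:

//#define UNICODE
#include <atlstr.h>   // CString (MFC projects get it from their afx headers)
#include <tchar.h>    // _T, _tprintf
#include <stdio.h>
#include "tool.h"
void SayHello(CString recipient)
{
   // _tprintf maps to printf or wprintf so that it matches the _T() format string
   _tprintf(_T("Hello, %s"), (LPCTSTR)recipient);
}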

Built into a standalone app, these two files compile with or without the UNICODE symbol defined, and the right thing happens.

But now consider that you compiled this not into a standalone .exe but into a .lib or .dll, and shared the .h header file with some consuming application (which might belong to an external customer).  That header file will be #include’d into their app, which may or may not define the UNICODE symbol.  Suppose you defined UNICODE and the consuming app did not.  The header file will silently adjust to the consuming app’s ANSI-style strings.  If you are lucky, the mismatch surfaces as an unresolved external at link time, because C++ decorated names encode parameter types.  But wherever the character type is not part of the decorated name (an extern "C" export, a TCHAR member inside a struct, a virtual method on an interface), the C++ compiler and linker will be perfectly happy.  You won’t know there’s a problem with the app passing ANSI strings to your UNICODE library until runtime, when you get random data corruption and application crashes.
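
The failure mode can be imitated in portable C++ without any Windows headers. The sketch below is only an illustration under stated assumptions: ReinterpretAsWide is a hypothetical helper that copies the bytes of a narrow string into wide-character storage without converting them, which is roughly the view a UNICODE-built library gets of a buffer handed over by an ANSI-built caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: places the raw bytes of a narrow (ANSI) string into
// wide-character storage unchanged, imitating a callee that believes the
// caller's char buffer holds wchar_t data.
std::wstring ReinterpretAsWide(const std::string& narrow) {
    const std::size_t unit = sizeof(wchar_t);  // 2 on Windows, usually 4 elsewhere
    std::wstring wide((narrow.size() + unit - 1) / unit, L'\0');
    assert(wide.size() * unit >= narrow.size());  // storage covers every byte
    std::memcpy(&wide[0], narrow.data(), narrow.size());
    return wide;
}
```

Feeding it "Hello, world" yields a shorter string of unrelated code units rather than the greeting. A real mismatch is worse, because the callee also goes looking for a wide terminator that the narrow buffer never promised to contain.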

How do you avoid this?  Well, there are two ways, really.  If you already have an extensive library written with header files that use TCHARs and their cousins, just compile and ship both an ANSI and a UNICODE version of your library (toolA.lib and toolU.lib).  Then instruct your customer to link to the one that matches their own UNICODE setting.
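
If you go the two-library route, the header itself can pick the matching library, so the customer cannot choose the wrong one by accident. What follows is a sketch rather than anything from a real library: TOOL_LIB_NAME and ToolLibraryName are hypothetical names, and #pragma comment(lib, ...) is specific to the Microsoft compiler, so it is guarded by _MSC_VER.

```cpp
// Choose the library that matches the character set of the including project.
#if defined(UNICODE) || defined(_UNICODE)
#define TOOL_LIB_NAME "toolU.lib"
#else
#define TOOL_LIB_NAME "toolA.lib"
#endif

#ifdef _MSC_VER
// Microsoft-specific: records a request for the linker to pull in this library.
#pragma comment(lib, TOOL_LIB_NAME)
#endif

// Hypothetical helper so the choice can be inspected, for example in a log line.
constexpr const char* ToolLibraryName() { return TOOL_LIB_NAME; }
```

Because the selection keys off the same symbol that drives TCHAR, the header and the library it links against stay in agreement by construction.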

But if you’re writing a library from scratch, consider using explicitly ANSI or explicitly UNICODE character types in your .h files and in the method signatures of your .cpp files.  Then use whatever generic TCHARs make sense within the implementation of your methods.  So for example:

tool.h:

void SayHello(CStringW recipient);

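tool.cpp:

//#define UNICODE
#include <atlstr.h>   // CStringW, CW2T (MFC projects get them from their afx headers)
#include <tchar.h>    // TCHAR, _T
#include <stdio.h>
#include "tool.h"
void SayHello(CStringW recipient)
{
   // CW2T is a conversion class, so the USES_CONVERSION macro is not needed
   wprintf(L"Hello, %s", (LPCWSTR)recipient);

   // Call some ATL/MFC function that is only available in TCHARs
   TCHAR szBuffer[] = _T("Some TCHAR buffer");
   SomeImplementationSpecificMethod(szBuffer, CW2T(recipient));
}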

This will compile with or without the UNICODE symbol defined, but the method signature will always be UNICODE, and the internals of the method implementation will automatically adjust as appropriate for UNICODE or ANSI support.
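
The same shape can be tried out in portable C++ without ATL. This is a rough analog under stated assumptions, not the ATL code: TString and ToInternal are hypothetical stand-ins for the TCHAR family and for CW2T, and the narrow conversion uses the standard wcstombs, which only handles characters representable in the current C locale.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cwchar>
#include <string>

#if defined(UNICODE) || defined(_UNICODE)
typedef std::wstring TString;  // hypothetical stand-in for a TCHAR-based string
#else
typedef std::string TString;
#endif

// Implementation detail: adapts the fixed wide type to the build's own
// character type. Returns an empty string if a character is not representable.
TString ToInternal(const std::wstring& wide) {
#if defined(UNICODE) || defined(_UNICODE)
    return wide;
#else
    std::string narrow(wide.size() * MB_CUR_MAX, '\0');
    const std::size_t written =
        std::wcstombs(&narrow[0], wide.c_str(), narrow.size());
    if (written == static_cast<std::size_t>(-1)) return std::string();
    assert(written <= narrow.size());
    narrow.resize(written);
    return narrow;
#endif
}

// Public signature: always wide, however the caller or the library is built.
// Returns the number of characters printed.
int SayHello(const std::wstring& recipient) {
    const TString internal = ToInternal(recipient);
#if defined(UNICODE) || defined(_UNICODE)
    return std::wprintf(L"Hello, %ls\n", internal.c_str());
#else
    return std::printf("Hello, %s\n", internal.c_str());
#endif
}
```

A caller writes SayHello(L"Bob") in either configuration; only the body of the function changes with the UNICODE setting, never the type that crosses the library boundary.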

