February 9th, 2017

Suspicious memory leak in std::basic_string

A customer asked for assistance debugging a memory leak. Their memory leak detection tool was reporting a leak in the following call stack:

ntdll!RtlAllocateHeap
Contoso!malloc
Contoso!Allocator<unsigned short>::allocate
Contoso!std::basic_string<unsigned short,std::char_traits<unsigned short>,Allocator<unsigned short>,_STL70>::_Copy
Contoso!std::basic_string<unsigned short,std::char_traits<unsigned short>,Allocator<unsigned short>,_STL70>::_Grow
Contoso!std::basic_string<unsigned short,std::char_traits<unsigned short>,Allocator<unsigned short>,_STL70>::assign
Contoso!std::basic_string<unsigned short,std::char_traits<unsigned short>,Allocator<unsigned short>,_STL70>::assign
Contoso!std::basic_string<unsigned short,std::char_traits<unsigned short>,Allocator<unsigned short>,_STL70>::operator=
Contoso!ConfigurationImpl::validate

“The Configuration­Impl object itself is not being leaked. Just the string inside it.”

The Visual C++ team reported that there are no known memory leaks in STL70. However, the code above is using a custom allocator, so they asked to see more of the customer’s code.

And they found the smoking gun, but it wasn’t in the allocator. It was in the class constructor.

ConfigurationImpl::ConfigurationImpl()
{
    // Initialize all members to zero.
    memset(this, 0, sizeof(ConfigurationImpl));
}

basic_string, like all STL objects, is non-POD. A POD type is roughly¹ something that can be declared in C as a plain old struct, such as struct Pod { int x; int y; };. POD types can be treated as a blob of bytes that you can manipulate with memset, memcpy, and such. Non-POD types, on the other hand, are those with things like constructors, destructors, virtual methods, all that fancy C++ stuff. You cannot treat them as just a blob of bytes because they have other fancy behaviors attached, and treating them as a blob of bytes bypasses (and may even damage) those fancy behaviors.

In this case, using memset to zero out a basic_string wipes out all the work that was performed by the basic_string constructor and results in the dreaded undefined behavior. Maybe undefined behavior manifests itself as a memory leak. Maybe it manifests itself as a crash. Maybe it manifests itself as time travel.

In practical terms, what you have there is memory corruption. When you have memory corruption, crazy things can happen. So don’t corrupt memory.

The customer thanked us for our assistance and fixed their code.

¹This is a simplified discussion, so don’t haul out your language-lawyer pitchforks.

Topics
Code

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

0 comments

Discussion are closed.