std::string now supports Address Sanitizer

Nicole Mazzuca

When using the Microsoft C++ Standard Library in debug mode (/MTd or /MDd), the library works hard to make sure programmers avoid many access violation bugs. Each container has a custom “wrapped” iterator which, on every access, checks that it is still valid and isn’t an end iterator, and, when doing arithmetic, checks that the result is still in bounds.
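To picture what such a checked iterator does, here is a minimal sketch. The names CheckedIntIter and checked_begin are hypothetical and invented for this illustration. The real debug iterators in the Microsoft STL are implemented differently, and they report failures through the debug runtime rather than by throwing exceptions.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical illustration: an iterator over ints that remembers the
// bounds of the range it came from and checks them on every operation.
class CheckedIntIter {
public:
  CheckedIntIter(const int* first, const int* last, const int* pos)
      : first_(first), last_(last), pos_(pos) {
    assert(first_ <= pos_ && pos_ <= last_);
  }

  // Dereferencing the end iterator is an error.
  int operator*() const {
    if (pos_ == last_) {
      throw std::out_of_range("dereferenced end iterator");
    }
    return *pos_;
  }

  // Arithmetic may land anywhere in [first, last], but not outside it.
  CheckedIntIter operator+(std::ptrdiff_t n) const {
    if (n > last_ - pos_ || n < first_ - pos_) {
      throw std::out_of_range("iterator moved out of bounds");
    }
    return CheckedIntIter(first_, last_, pos_ + n);
  }

  int operator[](std::ptrdiff_t n) const { return *(*this + n); }

private:
  const int* first_;
  const int* last_;
  const int* pos_;
};

// Hypothetical helper: a checked iterator to the start of a vector<int>.
inline CheckedIntIter checked_begin(const std::vector<int>& v) {
  return CheckedIntIter(v.data(), v.data() + v.size(), v.data());
}
```

With a six-element vector, checked_begin(v)[5] returns the last element, while checked_begin(v)[6] lands on the end position and throws when dereferenced. A raw pointer from v.data() carries none of this bookkeeping.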

However, as soon as you leave the world of iterators for raw pointers, these checks can no longer do anything:

#include <vector>

int vector_iterators() {
  std::vector<int> v{0, 1, 2, 3, 4, 5};
  return v.begin()[6]; // error at runtime!
}

int vector_data() {
  std::vector<int> v{0, 1, 2, 3, 4, 5};
  return v.data()[6]; // no error reported! undefined behavior!
}

Enter Address Sanitizer (ASan). Add the -fsanitize=address option to your build with either cl or clang, and the compiler will insert checks to make certain that accessed memory is in scope and has not been deallocated.

// compile with -fsanitize=address
int check_stack() {
  int v[] = {0, 1, 2, 3, 4, 5};
  return v[6]; // ASan out-of-bounds access error reported!
}

int check_heap() {
  int *heap_v = new int[6]{0, 1, 2, 3, 4, 5};
  int result = heap_v[6]; // ASan out-of-bounds access error reported!
  delete[] heap_v; // arrays allocated with new[] must be freed with delete[]
  return result;
}

int check_vector() {
  std::vector<int> v{0, 1, 2, 3, 4, 5};
  return v.data()[6]; // ASan container-overflow access error reported!
}

Checking stack and raw heap memory has worked since we initially implemented the Address Sanitizer feature. When you use containers like std::vector or std::string, ASan will also make certain you don’t access memory that’s outside the underlying allocation. However, because containers are library code, by default ASan will not prevent you from accessing memory that’s past the container’s in-use elements but still inside the bounds of the allocation.

In microsoft/STL#2071, the standard library team added support for ASan container-overflow annotations to std::vector, meaning that our standard library keeps the annotations for std::vector’s buffer up to date manually, using ASan’s __sanitizer_annotate_contiguous_container API. This means that check_vector above correctly reports a container-overflow error, and we find more bugs in people’s code!
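To give a feel for what keeping the annotations up to date involves, here is a minimal sketch of a fixed-capacity buffer that calls __sanitizer_annotate_contiguous_container whenever its size changes. ToyIntBuffer and the TOY_HAS_ASAN macro are hypothetical names made up for this example, and this is not how std::vector is implemented in microsoft/STL. The sketch assumes the buffer returned by new[] is 8-byte aligned, and it compiles to plain code with no annotations when ASan is not enabled.

```cpp
#include <cassert>
#include <cstddef>

// Detect ASan: GCC and MSVC define __SANITIZE_ADDRESS__, while Clang
// exposes __has_feature(address_sanitizer).
#if defined(__SANITIZE_ADDRESS__)
#define TOY_HAS_ASAN 1
#elif defined(__has_feature)
#if __has_feature(address_sanitizer)
#define TOY_HAS_ASAN 1
#endif
#endif

#ifdef TOY_HAS_ASAN
extern "C" void __sanitizer_annotate_contiguous_container(
    const void* beg, const void* end, const void* old_mid,
    const void* new_mid);
#endif

// Hypothetical container: [data, data + size) is in use, and
// [data + size, data + capacity) is allocated but off limits.
class ToyIntBuffer {
public:
  explicit ToyIntBuffer(std::size_t capacity)
      : data_(new int[capacity]), size_(0), capacity_(capacity) {
    // A fresh allocation is fully addressable (mid == end); move the
    // boundary to the start so that nothing is readable yet.
    annotate(data_ + capacity_, data_);
  }

  ~ToyIntBuffer() {
    // Make the whole buffer addressable again before freeing it.
    annotate(data_ + size_, data_ + capacity_);
    delete[] data_;
  }

  ToyIntBuffer(const ToyIntBuffer&) = delete;
  ToyIntBuffer& operator=(const ToyIntBuffer&) = delete;

  bool push_back(int value) {
    if (size_ == capacity_) return false;  // no growth in this sketch
    annotate(data_ + size_, data_ + size_ + 1);  // open one more slot
    data_[size_++] = value;
    return true;
  }

  void pop_back() {
    assert(size_ > 0);
    annotate(data_ + size_, data_ + size_ - 1);  // close the last slot
    --size_;
  }

  int operator[](std::size_t i) const {
    assert(i < size_);
    return data_[i];
  }

  const int* data() const { return data_; }
  std::size_t size() const { return size_; }

private:
  // Moves the boundary between in-use and off-limits memory.
  void annotate(const int* old_mid, const int* new_mid) const {
#ifdef TOY_HAS_ASAN
    __sanitizer_annotate_contiguous_container(data_, data_ + capacity_,
                                              old_mid, new_mid);
#else
    (void)old_mid;
    (void)new_mid;
#endif
  }

  int* data_;
  std::size_t size_;
  std::size_t capacity_;
};
```

When built with -fsanitize=address, reading buf.data()[buf.size()] from such a buffer should be flagged as a container-overflow, even though the address is still inside the heap allocation.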

However, std::string is a very different beast from std::vector, due to the Small String Optimization (SSO). In microsoft/STL#2196, we attempted an initial implementation that annotated the SSO buffer as well. That attempt surfaced quite a few interesting and unfortunate bugs, so we needed to disable it again in microsoft/STL#2990.

// compile with -fsanitize=address
#include <cassert>
#include <string>

char string_old_asan() {
  std::string s = "Hello, world! Let's try address sanitizer!";
  s.reserve(64); // s's heap allocation is now 64 chars wide
  assert(s.size() == 42); // but s still only contains 42 chars
  // before Visual Studio 2022 17.6 Preview 1, this doesn't crash, but reads uninitialized memory
  return s.data()[50]; 
}

However, as of Visual Studio 2022 17.6 Preview 1 (in microsoft/STL#3164), we fixed the bugs and re-enabled the tracking machinery in std::string to keep ASan’s knowledge up to date and correct, meaning that the code sample above suddenly went from silent undefined behavior to very loud undefined behavior! It works with any allocator, including custom allocators, but will check for slightly fewer errors at the boundaries. If you want your custom allocators to fully support ASan checking, you can check out the ASan container-overflow annotations.

One limitation of this checking, in order to avoid the bugs that plagued the original, is that we do not annotate std::string’s SSO buffer. This means that one can still access out-of-bounds memory inside the SSO buffer. We’ve left the door open to fixing this in the future, but we wanted to make sure that checking of heap-allocated buffers at least worked.
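If you want to see which of your strings fall into the unannotated SSO case, a rough heuristic is to test whether the character data lives inside the std::string object itself. The helper below, stored_inline, is a hypothetical name and not part of any library. It relies on pointer-to-integer conversions, which are implementation-defined, so treat it as a debugging aid under those assumptions rather than a guaranteed test.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Heuristic, for illustration only: returns true when the character data
// lives inside the std::string object itself (the SSO buffer) rather than
// in a separate heap allocation.
inline bool stored_inline(const std::string& s) {
  const auto obj_begin = reinterpret_cast<std::uintptr_t>(&s);
  const auto obj_end = obj_begin + sizeof(std::string);
  const auto chars = reinterpret_cast<std::uintptr_t>(s.data());
  return chars >= obj_begin && chars < obj_end;
}
```

For example, assert(stored_inline(std::string("hi"))) is expected to hold on the mainstream standard library implementations, while a 100-character string is expected to live on the heap, where the annotations described above apply.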

If you do have issues with container-overflow checking on std::string, you can disable it by passing -D_DISABLE_STRING_ANNOTATION to the compiler. If you find bugs in the feature, please report them to either Developer Community or the microsoft/STL GitHub. We can also be reached in the comments below, via the @VisualC Twitter account, or via ASanDev@microsoft.com.