{"id":31904,"date":"2023-03-06T16:00:52","date_gmt":"2023-03-06T16:00:52","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/cppblog\/?p=31904"},"modified":"2023-03-03T09:06:15","modified_gmt":"2023-03-03T09:06:15","slug":"stdstring-now-supports-address-sanitizer","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/cppblog\/stdstring-now-supports-address-sanitizer\/","title":{"rendered":"std::string now supports Address Sanitizer"},"content":{"rendered":"<p>When using the Microsoft C++ Standard Library in debug mode (<code>\/MTd<\/code> or <code>\/MDd<\/code>),\nthe library works hard to make sure programmers avoid many access violation bugs.\nEach container has a custom &#8220;wrapped&#8221; iterator, which, on every access, checks\nthat it is still valid, isn&#8217;t an end iterator, and when doing arithmetic,\nchecks that it&#8217;s still in-bound.<\/p>\n<p>However, as soon as you get out of the world of iterators into pointers,\nthese checks can no longer do anything:<\/p>\n<pre><code class=\"language-cxx\">int vector_iterators() {\r\n  std::vector&lt;int&gt; v{0, 1, 2, 3, 4, 5};\r\n  return v.begin()[6]; \/\/ error at runtime!\r\n}\r\n\r\nint vector_data() {\r\n  std::vector&lt;int&gt; v{0, 1, 2, 3, 4, 5};\r\n  return v.data()[6]; \/\/ no error reported! undefined behavior!\r\n}<\/code><\/pre>\n<p>Enter <a href=\"https:\/\/learn.microsoft.com\/cpp\/sanitizers\/asan\">Address Sanitizer<\/a> (ASan). 
Add the <code>-fsanitize=address<\/code> option to your build\nwith either <code>cl<\/code> or <code>clang<\/code>, and the compiler will insert checks to make certain\nthat accessed memory is in scope and has not been deallocated.<\/p>\n<pre><code class=\"language-cxx\">\/\/ compile with -fsanitize=address\r\nint check_stack() {\r\n  int v[] = {0, 1, 2, 3, 4, 5};\r\n  return v[6]; \/\/ ASan out-of-bounds access error reported!\r\n}\r\n\r\nint check_heap() {\r\n  int *heap_v = new int[6]{0, 1, 2, 3, 4, 5};\r\n  int result = heap_v[6]; \/\/ ASan out-of-bounds access error reported!\r\n  delete[] heap_v; \/\/ array new must be paired with array delete\r\n  return result;\r\n}\r\n\r\nint check_vector() {\r\n  std::vector&lt;int&gt; v{0, 1, 2, 3, 4, 5};\r\n  return v.data()[6]; \/\/ ASan container-overflow access error reported!\r\n}<\/code><\/pre>\n<p>Checking stack and raw heap memory has worked since we initially implemented\nthe Address Sanitizer feature. When using containers like <code>std::vector<\/code> or <code>std::string<\/code>,\nASan will also make certain you don&#8217;t access memory that&#8217;s outside the underlying\nallocation. 
However, because containers are library code, by default ASan will not\nprevent you from accessing memory that&#8217;s beyond the container&#8217;s\nsize (for example, in unused capacity), but still inside the bounds of the allocation.<\/p>\n<p>In <a href=\"https:\/\/github.com\/microsoft\/STL\/pull\/2071\">microsoft\/STL#2071<\/a>, the standard library team added support for\n<a href=\"https:\/\/learn.microsoft.com\/cpp\/sanitizers\/error-container-overflow\">ASan container-overflow annotations<\/a> to <code>std::vector<\/code>, meaning that our\nstandard library keeps the annotations for <code>std::vector<\/code>&#8217;s buffer up-to-date\nmanually, using ASan&#8217;s <a href=\"https:\/\/github.com\/llvm\/llvm-project\/blob\/9963f166fb5ea8b03b4455b265db3fe04fbf4cdd\/compiler-rt\/lib\/asan\/asan_poisoning.cpp#L411-L471\"><code>__sanitizer_annotate_contiguous_container<\/code><\/a> API.\nThis means that <code>check_vector<\/code> above correctly reports a container-overflow error,\nand we find more bugs in people&#8217;s code!<\/p>\n<p>However, <code>std::string<\/code> is a very different beast to <code>std::vector<\/code>, due to\nthe <a href=\"https:\/\/cpp-optimizations.netlify.app\/small_strings\/\">Small String Optimization<\/a> (SSO). In <a href=\"https:\/\/github.com\/microsoft\/STL\/pull\/2196\">microsoft\/STL#2196<\/a>, we attempted\nan initial implementation that also supported annotations of the SSO buffer.\nUnfortunately, quite a few interesting bugs came out of it,\nso we needed to disable it again in <a href=\"https:\/\/github.com\/microsoft\/STL\/pull\/2990\">microsoft\/STL#2990<\/a>.<\/p>\n<pre><code class=\"language-cxx\">\/\/ compile with -fsanitize=address\r\nchar string_old_asan() {\r\n  std::string s = \"Hello, world! 
Let's try address sanitizer!\";\r\n  s.reserve(64); \/\/ s's capacity is now at least 64 chars\r\n  assert(s.size() == 42); \/\/ but s still only contains 42 chars\r\n  \/\/ before Visual Studio 2022 17.6 Preview 1, this doesn't crash, but reads uninitialized memory\r\n  return s.data()[50];\r\n}<\/code><\/pre>\n<p>However, as of Visual Studio 2022 17.6 Preview 1 (in <a href=\"https:\/\/github.com\/microsoft\/stl\/pull\/3164\">microsoft\/STL#3164<\/a>),\nwe fixed the bugs and re-enabled the tracking machinery in <code>std::string<\/code> to keep\nASan&#8217;s knowledge up to date and correct, meaning that the code sample above\nsuddenly went from silent undefined behavior to very loud undefined behavior!\nIt works with any allocator, but with custom allocators it will check for slightly\nfewer errors at the boundaries. If you want your custom allocators to fully support\nASan checking, you can check out the <a href=\"https:\/\/learn.microsoft.com\/cpp\/sanitizers\/error-container-overflow\">ASan container-overflow annotations<\/a>.<\/p>\n<p>One limitation of this checking, in order to avoid the bugs that plagued the original implementation,\nis that we do not annotate <code>std::string<\/code>&#8217;s SSO buffer.\nThis means that out-of-bounds accesses that stay inside the SSO buffer still go undetected.\nWe&#8217;ve left the door open to fixing this in the future, but we wanted to\nmake sure that checking of heap-allocated buffers at least worked.<\/p>\n<p>If you do have issues with container-overflow checking on <code>std::string<\/code>, you can disable\nit by passing <code>-D_DISABLE_STRING_ANNOTATION<\/code> to the compiler. If you find bugs in the feature,\nplease report them to either <a href=\"https:\/\/developercommunity.visualstudio.com\/cpp\">Developer Community<\/a> or the <a href=\"https:\/\/github.com\/microsoft\/STL\/issues\/new\">microsoft\/STL<\/a>\nGitHub repository. 
We can also be reached in the comments below, via the <a href=\"https:\/\/twitter.com\/VisualC\">@VisualC<\/a> twitter account,\nor via <a href=\"mailto:ASanDev@microsoft.com\">ASanDev@microsoft.com<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The standard library now checks for more incorrect usage using the ASan &#8220;container overflow&#8221; feature in `std::string`.<\/p>\n","protected":false},"author":56668,"featured_media":35994,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1,230],"tags":[],"class_list":["post-31904","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cplusplus","category-new-feature"],"acf":[],"blog_post_summary":"<p>The standard library now checks for more incorrect usage using the ASan &#8220;container overflow&#8221; feature in `std::string`.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/31904","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/users\/56668"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/comments?post=31904"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/31904\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media\/35994"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media?parent=31904"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/categories?post=
31904"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/tags?post=31904"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}