{"id":28026,"date":"2021-05-10T19:05:14","date_gmt":"2021-05-10T19:05:14","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/cppblog\/?p=28026"},"modified":"2021-05-10T19:05:14","modified_gmt":"2021-05-10T19:05:14","slug":"finding-bugs-with-addresssanitizer-patterns-from-open-source-projects","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/cppblog\/finding-bugs-with-addresssanitizer-patterns-from-open-source-projects\/","title":{"rendered":"Finding Bugs with AddressSanitizer: Patterns from Open Source Projects"},"content":{"rendered":"<p>AddressSanitizer (ASan) was <a href=\"https:\/\/devblogs.microsoft.com\/cppblog\/address-sanitizer-for-msvc-now-generally-available\/\">officially released in Visual Studio 2019 version 16.9<\/a>. We recently used this feature to <a href=\"https:\/\/devblogs.microsoft.com\/cppblog\/finding-bugs-with-addresssanitizer-msvc-compiler\/\">find and fix a bug in the MSVC compiler itself<\/a>. To further validate the usefulness of our ASan implementation, we also used it on a collection of widely used open source projects where it found bugs in Boost, Azure IoT C SDK, and OpenSSL. In this article, we present our findings by describing the type of bugs that we found and how they presented themselves in these projects. We provide links to the GitHub commits where these bugs were fixed so you can get a helpful look at what code changes were involved. If you are unfamiliar with what ASan is and how to use it, you may want to take a look at the <a href=\"https:\/\/docs.microsoft.com\/en-us\/cpp\/sanitizers\/asan?view=msvc-160\">AddressSanitizer documentation<\/a> prior to delving into this article.<\/p>\n<p>&nbsp;<\/p>\n<h3><span style=\"font-size: 18pt;\">Boost and the Eager Iterator<\/span><\/h3>\n<p>An <em>eager iterator<\/em> is one that points to an element outside the bounds of a container and is then dereferenced. 
The following code sample shows an example of this buggy memory access pattern:<\/p>\n<pre class=\"prettyprint\">template &lt;typename Iter&gt;\r\nint ComputeSum(Iter b, Iter e)\r\n{\r\n    int sum = 0;\r\n\r\n    for (; b &lt;= e; ++b) {\r\n        \/\/ ERROR: will dereference the 'end' iterator\r\n        \/\/ due to the use of the '&lt;=' operator above.\r\n        sum += *b;\r\n    }\r\n\r\n    return sum;\r\n}<\/pre>\n<p>Sometimes, eager iterators can appear by mistake in loops that are more complex, such as in the <code>do_length<\/code> function from Boost&#8217;s UTF-8 conversion facet implementation, shown below:<\/p>\n<pre class=\"prettyprint\">int utf8_codecvt_facet::do_length(\r\n    std::mbstate_t &amp;,\r\n    const char * from,\r\n    const char * from_end, \r\n    std::size_t max_limit\r\n) const\r\n#if BOOST_WORKAROUND(__IBMCPP__, BOOST_TESTED_AT(600))\r\n        throw()\r\n#endif\r\n{ \r\n    int last_octet_count=0;\r\n    std::size_t char_count = 0;\r\n    const char* from_next = from;\r\n\r\n    while (from_next+last_octet_count &lt;= from_end &amp;&amp; char_count &lt;= max_limit) {\r\n        from_next += last_octet_count;\r\n        last_octet_count = (get_octet_count(*from_next));\r\n        ++char_count;\r\n    }\r\n    return static_cast&lt;int&gt;(from_next-from);\r\n}<\/pre>\n<p>Here, the less-or-equal operator is used to correctly set <code>from_next<\/code> to <code>from_end<\/code> when the latter points at a UTF-8 character boundary. However, this also causes a bug where the end iterator is dereferenced. 
Building this code with ASan and debugging it in Visual Studio results in an ASan break at the expected location:<\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-28029\" src=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-2.png\" alt=\"Screenshot of a debugging session in Visual Studio showing an AddressSanitizer global buffer overflow error in the 'do_length' function at line 'last_octet_count = (get_octet_count(*from_next));'\" width=\"700\" height=\"456\" srcset=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-2.png 700w, https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-2-300x195.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/p>\n<p>We let the Boost team know about this issue and they promptly <a href=\"https:\/\/github.com\/boostorg\/detail\/commit\/131208d8ccd82ef69afb9cf0bad1a314bd931d88\">committed a fix on GitHub<\/a>.<\/p>\n<p>&nbsp;<\/p>\n<h3><span style=\"font-size: 18pt;\">Azure IoT C SDK: An Array and Its Length Constant Disagree<\/span><\/h3>\n<p>An array and its length constant disagree when the constant that is meant to track the length of the array holds the wrong value. This can result in memory access bugs when the length constant is used in memory copy operations. 
The simple example below illustrates the problem:<\/p>\n<pre class=\"prettyprint\">#include &lt;cstring&gt;\r\n\r\nunsigned char GLOBAL_BUFFER[] = { 1,2,3,4,5 };\r\n\/\/ BUG: GLOBAL_BUFFER has only 5 elements.\r\nconstexpr size_t BUF_SIZE = 6;\r\n\r\nvoid CopyGlobalBuffer(unsigned char* dst)\r\n{\r\n    \/\/ ERROR: AddressSanitizer: global-buffer-overflow\r\n    std::memcpy(dst, GLOBAL_BUFFER, BUF_SIZE);\r\n}<\/pre>\n<p>We found an instance of this bug in the Azure IoT C SDK, where the length constant for a string did not match the actual length:<\/p>\n<pre class=\"prettyprint\">static const unsigned char* TWIN_REPORTED_PROPERTIES = \r\n    (const unsigned char*)\r\n    \"{ \\\"reportedStateProperty0\\\": \\\"reportedStateProperty0\\\", \"\r\n    \"\\\"reportedStateProperty1\\\": \\\"reportedStateProperty1\\\" }\";\r\n\r\nstatic int TWIN_REPORTED_PROPERTIES_LENGTH = 117;<\/pre>\n<p>The value of the <code>TWIN_REPORTED_PROPERTIES_LENGTH<\/code> constant is 117, while the <code>TWIN_REPORTED_PROPERTIES<\/code> string occupies only 107 bytes (106 characters plus the terminating null). Copying the string with <code>memcpy<\/code> therefore reads 10 bytes past its end, which is a global buffer overflow. 
Building this code with ASan and debugging with Visual Studio shows an error during a call to <code>memcpy<\/code>, in a deep internal function named <code>CONSTBUFFER_Create_Internal<\/code>:<\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-28032\" src=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-5.png\" alt=\"Screenshot of a debugging session in Visual Studio showing an AddressSanitizer error in the 'CONSTBUFFER_Create_Internal' function on line '(void)memcpy(temp, source, size);'\" width=\"700\" height=\"456\" srcset=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-5.png 700w, https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-5-300x195.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/p>\n<p>This didn\u2019t immediately tell us what the origin of the bug was, but thanks to the ASan integration within Visual Studio, it was possible to use the Call Stack window to walk up the stack and find the function that passed the incorrect size value:<\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-28033\" src=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-6.png\" alt=\"Screenshot of the Call Stack window from a debugging session in Visual Studio. 
The call stack contains the following functions: CONSTBUFFER_Create_Internal, real_CONSTBUFFER_Create, send_one_report_patch, twin_msgr_do_work_started_with_EXPIRED_in_progress_patches_success, RunTests, and main.\" width=\"698\" height=\"205\" srcset=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-6.png 698w, https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-6-300x88.png 300w\" sizes=\"(max-width: 698px) 100vw, 698px\" \/><\/p>\n<p>The culprit in this case was the <code>send_one_report_patch<\/code> function, which passed <code>TWIN_REPORTED_PROPERTIES<\/code> and <code>TWIN_REPORTED_PROPERTIES_LENGTH<\/code> to a function that indirectly calls <code>CONSTBUFFER_Create_Internal<\/code>:<\/p>\n<pre class=\"prettyprint\">static void send_one_report_patch(TWIN_MESSENGER_HANDLE handle, time_t current_time)\r\n{\r\n    const unsigned char* buffer = (unsigned char*)TWIN_REPORTED_PROPERTIES;\r\n    size_t size = TWIN_REPORTED_PROPERTIES_LENGTH;\r\n    CONSTBUFFER_HANDLE report = real_CONSTBUFFER_Create(buffer, size);\r\n\r\n    umock_c_reset_all_calls();\r\n    set_twin_messenger_report_state_async_expected_calls(report, current_time);\r\n    (void)twin_messenger_report_state_async(handle, report, \r\n        TEST_on_report_state_complete_callback, NULL);\r\n\r\n    real_CONSTBUFFER_DecRef(report);\r\n}<\/pre>\n<p>We fixed this issue by using the <code>sizeof<\/code> operator to set the length constant to a value that always reflects the actual size of the string. 
You can find our <a href=\"https:\/\/github.com\/Azure\/azure-iot-sdk-c\/commit\/79f6983b2ed86541bbdd943dbee59390efdd0ca9#diff-3c98bfc52b3211e281a71ec49c4ad26d5d0bb632f26795d33982d5e04abe2c31\">bug fix commit on GitHub<\/a>.<\/p>\n<p>&nbsp;<\/p>\n<h3><span style=\"font-size: 18pt;\">OpenSSL and the Shapeshifting Type<\/span><\/h3>\n<p>A shapeshifting type is born when a type\u2019s size varies depending on a preprocessor definition. If the type is then assumed to have a specific size, memory access bugs can occur. A simple example is shown below:<\/p>\n<pre class=\"prettyprint\">#include &lt;cstdint&gt;\r\n#include &lt;cstring&gt;\r\n#include &lt;array&gt;\r\n\r\n#ifdef BIGGER_INT\r\ntypedef int64_t MyInt;\r\n#else\r\ntypedef int32_t MyInt;\r\n#endif\r\n\r\nMyInt GLOBAL_BUFFER[] = { 1,2,3,4,5 };\r\n\r\nvoid SizeTypeExample()\r\n{\r\n    int localBuffer[std::size(GLOBAL_BUFFER)];\r\n\r\n    \/\/ ERROR: AddressSanitizer: stack-buffer-overflow\r\n    std::memcpy(localBuffer, GLOBAL_BUFFER, sizeof(GLOBAL_BUFFER));\r\n}<\/pre>\n<p>If <code>BIGGER_INT<\/code> is defined, the <code>memcpy<\/code> operation might trigger a stack buffer overflow due to the <code>localBuffer<\/code> variable assuming <code>MyInt<\/code> has a size identical to <code>int<\/code>. An instance of this bug was found in the <code>test_param_time_t<\/code> OpenSSL test:<\/p>\n<pre class=\"prettyprint\">static int test_param_time_t(int n)\r\n{\r\n    time_t in, out;\r\n    unsigned char buf[MAX_LEN], cmp[sizeof(size_t)];\r\n    const size_t len = raw_values[n].len &gt;= sizeof(size_t)\r\n                       ? 
sizeof(time_t) : raw_values[n].len;\r\n    OSSL_PARAM param = OSSL_PARAM_time_t(\"a\", NULL);\r\n\r\n    memset(buf, 0, sizeof(buf));\r\n    le_copy(buf, raw_values[n].value, sizeof(in));\r\n    memcpy(&amp;in, buf, sizeof(in));\r\n    param.data = &amp;out;\r\n    if (!TEST_true(OSSL_PARAM_set_time_t(&amp;param, in)))\r\n        return 0;\r\n    le_copy(cmp, &amp;out, sizeof(out));\r\n    if (!TEST_mem_eq(cmp, len, raw_values[n].value, len))\r\n        return 0;\r\n    in = 0;\r\n    if (!TEST_true(OSSL_PARAM_get_time_t(&amp;param, &amp;in)))\r\n        return 0;\r\n    le_copy(cmp, &amp;in, sizeof(in));\r\n    if (!TEST_mem_eq(cmp, sizeof(in), raw_values[n].value, sizeof(in)))\r\n        return 0;\r\n    param.data = &amp;out;\r\n    return test_param_type_extra(&amp;param, raw_values[n].value, sizeof(size_t));\r\n}<\/pre>\n<p>Here, <code>size_t<\/code> is assumed to have the same size as <code>time_t<\/code>, but this does not hold on every target architecture. When <code>le_copy<\/code> copies <code>out<\/code> to <code>cmp<\/code>, it writes <code>sizeof(time_t)<\/code> bytes, but the <code>cmp<\/code> buffer was declared with only <code>sizeof(size_t)<\/code> bytes, so the copy overflows the buffer whenever <code>time_t<\/code> is wider than <code>size_t<\/code>. 
When building the OpenSSL tests with ASan and debugging with Visual Studio, the debugger breaks with an ASan error inside <code>le_copy<\/code>:<\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-28037\" src=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-10.png\" alt=\"Screenshot of a debugging session in Visual Studio, showing an AddressSanitizer error in the 'le_copy' function on line 'memcpy(out, in, len)'.\" width=\"700\" height=\"445\" srcset=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-10.png 700w, https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-10-300x191.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/p>\n<p>Again, thanks to the ASan integration in VS, we were able to use the call stack window to walk up to the actual source of the bug: the <code>test_param_time_t<\/code> function:<\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-28038\" src=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-11.png\" alt=\"Screenshot of the Call Stack window from a Visual Studio debugging session. 
The call stack contains the following functions: le_copy, test_param_time_t, run_tests, and main.\" width=\"696\" height=\"182\" srcset=\"https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-11.png 696w, https:\/\/devblogs.microsoft.com\/cppblog\/wp-content\/uploads\/sites\/9\/2021\/05\/word-image-11-300x78.png 300w\" sizes=\"(max-width: 696px) 100vw, 696px\" \/><\/p>\n<p>We let the OpenSSL team know about this bug and a fix was <a href=\"https:\/\/github.com\/openssl\/openssl\/commit\/628d2d3a7f2318b6a6a1c36f9d8d12032c69a9dd\">committed on GitHub<\/a>.<\/p>\n<p>&nbsp;<\/p>\n<h3><span style=\"font-size: 18pt;\">Try AddressSanitizer Today!<\/span><\/h3>\n<p>In this article, we shared how we were able to use AddressSanitizer to find bugs in various open source projects. We hope this will motivate you to try out this feature on your own code base. Have you found eager iterators, shapeshifting types, or array \/ length constant disagreements in your projects? 
Let us know in the comments below, on Twitter (<a href=\"https:\/\/twitter.com\/visualc\">@VisualC<\/a>), or via email at <a href=\"mailto:visualcpp@microsoft.com\">visualcpp@microsoft.com<\/a>.<\/p>\n<p><em>This article contains code snippets from the following sources:<\/em><\/p>\n<p><em><a href=\"https:\/\/github.com\/boostorg\/detail\/blob\/b29edf18cb94b01076c2cca1d72b32dc684ee575\/include\/boost\/detail\/utf8_codecvt_facet.ipp\">utf8_codecvt_facet.ipp<\/a> file, <a href=\"https:\/\/github.com\/boostorg\/boost\">Boost C++ Libraries<\/a>, Copyright (c) 2001 Ronald Garcia and Andrew Lumsdaine, distributed under the <a href=\"http:\/\/www.boost.org\/LICENSE_1_0.txt\">Boost Software License, Version 1.0<\/a>.<\/em><\/p>\n<p><em><a href=\"https:\/\/github.com\/Azure\/azure-iot-sdk-c\">Azure IoT C SDKs and Libraries<\/a>, Copyright (c) Microsoft Corporation, distributed under the <a href=\"https:\/\/github.com\/Azure\/azure-iot-sdk-c\/blob\/116d971f17a64d79ca745b46d707c8210dbe3437\/LICENSE\">MIT License<\/a>.<\/em><\/p>\n<p><em><a href=\"https:\/\/github.com\/Azure\/azure-c-shared-utility\">Azure C Shared Utility<\/a>, Copyright (c) Microsoft Corporation, distributed under the <a href=\"https:\/\/github.com\/Azure\/azure-c-shared-utility\/blob\/c4e4d472679958b595a06c58849e3e2faf0074b7\/LICENSE\">MIT License<\/a>.<\/em><\/p>\n<p><em><a href=\"https:\/\/github.com\/openssl\/openssl\/blob\/8020d79b4033400d0ef659a361c05b6902944042\/test\/params_api_test.c\">params_api_test.c<\/a> file, <a href=\"https:\/\/github.com\/openssl\/openssl\">OpenSSL<\/a>, Copyright 2019-2021 The OpenSSL Project Authors, Copyright (c) 2019 Oracle and\/or its affiliates, distributed under the <a href=\"https:\/\/www.apache.org\/licenses\/LICENSE-2.0.txt\">Apache License 2.0<\/a>.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>AddressSanitizer (ASan) was officially released in Visual Studio 2019 version 16.9. 
We recently used this feature to find and fix a bug in the MSVC compiler itself. To further validate the usefulness of our ASan implementation, we also used it on a collection of widely used open source projects where it found bugs in Boost, [&hellip;]<\/p>\n","protected":false},"author":6966,"featured_media":35994,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1,239],"tags":[],"class_list":["post-28026","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cplusplus","category-diagnostics"],"acf":[],"blog_post_summary":"<p>AddressSanitizer (ASan) was officially released in Visual Studio 2019 version 16.9. We recently used this feature to find and fix a bug in the MSVC compiler itself. To further validate the usefulness of our ASan implementation, we also used it on a collection of widely used open source projects where it found bugs in Boost, 
[&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/28026","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/users\/6966"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/comments?post=28026"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/28026\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media\/35994"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media?parent=28026"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/categories?post=28026"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/tags?post=28026"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}