May 10th, 2021

Finding Bugs with AddressSanitizer: Patterns from Open Source Projects

Kevin Cadieux
Software Engineer

AddressSanitizer (ASan) was officially released in Visual Studio 2019 version 16.9. We recently used this feature to find and fix a bug in the MSVC compiler itself. To further validate the usefulness of our ASan implementation, we also used it on a collection of widely used open source projects where it found bugs in Boost, Azure IoT C SDK, and OpenSSL. In this article, we present our findings by describing the type of bugs that we found and how they presented themselves in these projects. We provide links to the GitHub commits where these bugs were fixed so you can get a helpful look at what code changes were involved. If you are unfamiliar with what ASan is and how to use it, you may want to take a look at the AddressSanitizer documentation prior to delving into this article.

 

Boost and the Eager Iterator

An eager iterator is one that points to an element outside the bounds of a container and is then dereferenced. The following code sample shows an example of this buggy memory access pattern:

template <typename Iter>
int ComputeSum(Iter b, Iter e)
{
    int sum = 0;

    for (; b <= e; ++b) {
        // ERROR: will dereference the 'end' iterator
        // due to the use of the '<=' operator above.
        sum += *b;
    }

    return sum;
}

Sometimes, eager iterators can appear by mistake in loops that are more complex, such as in the do_length function from Boost’s UTF-8 conversion facet implementation, shown below:

int utf8_codecvt_facet::do_length(
    std::mbstate_t &,
    const char * from,
    const char * from_end, 
    std::size_t max_limit
) const
#if BOOST_WORKAROUND(__IBMCPP__, BOOST_TESTED_AT(600))
        throw()
#endif
{ 
    int last_octet_count=0;
    std::size_t char_count = 0;
    const char* from_next = from;

    while (from_next+last_octet_count <= from_end && char_count <= max_limit) {
        from_next += last_octet_count;
        last_octet_count = (get_octet_count(*from_next));
        ++char_count;
    }
    return static_cast<int>(from_next-from);
}

Here, the less-or-equal operator is used to correctly set from_next to from_end when the latter points at a UTF-8 character boundary. However, this also causes a bug where the end iterator is dereferenced. Building this code with ASan and debugging it in Visual Studio results in an ASan break at the expected location:

Screenshot of a debugging session in Visual Studio showing an AddressSanitizer global buffer overflow error in the 'do_length' function at line 'last_octet_count = (get_octet_count(*from_next));'

We let the Boost team know about this issue and they promptly committed a fix on GitHub.

 

Azure IoT C SDK: An Array and Its Length Constant Disagree

A disagreement between an array and its length constant happens when a constant is used to keep track of the length of an array but has the incorrect length. This can result in memory access bugs when the length constant is used in memory copy operations. The simple example below illustrates the problem:

#include <cstring>

unsigned char GLOBAL_BUFFER[] = { 1,2,3,4,5 };
constexpr size_t BUF_SIZE = 6;

void CopyGlobalBuffer(unsigned char* dst)
{
    // ERROR: AddressSanitizer: global-buffer-overflow
    std::memcpy(dst, GLOBAL_BUFFER, BUF_SIZE);
}

We found an instance of this bug in the Azure IoT C SDK, where the length constant for a string did not match the actual length:

static const unsigned char* TWIN_REPORTED_PROPERTIES = 
    (const unsigned char*)
    "{ \"reportedStateProperty0\": \"reportedStateProperty0\", "
    "\"reportedStateProperty1\": \"reportedStateProperty1\" }";

static int TWIN_REPORTED_PROPERTIES_LENGTH = 117;

The value of the TWIN_REPORTED_PROPERTIES_LENGTH constant is 117 while the actual size of the TWIN_REPORTED_PROPERTIES string is 107, resulting in a global buffer overflow when copying the string with memcpy. Building this code with ASan and debugging with Visual Studio shows an error during a call to memcpy, in a deep internal function named CONSTBUFFER_Create_Internal:

Screenshot of a debugging session in Visual Studio showing an AddressSanitizer error in the 'CONSTBUFFER_Create_Internal' function on line '(void)memcpy(temp, source, size);'

This didn’t immediately tell us what the origin of the bug was, but thanks to the ASan integration within Visual Studio, it was possible to use the Call Stack window to walk up the stack and find the function that passed the incorrect size value:

Screenshot of the Call Stack window from a debugging session in Visual Studio. The call stack contains the following functions: CONSTBUFFER_Create_Internal, real_CONSTBUFFER_Create, send_one_report_patch, twin_msgr_do_work_started_with_EXPIRED_in_progress_patches_success, RunTests, and main.

The culprit in this case was the send_one_report_patch function, which passed TWIN_REPORTED_PROPERTIES and TWIN_REPORTED_PROPERTIES_LENGTH to a function that indirectly calls CONSTBUFFER_Create_Internal:

static void send_one_report_patch(TWIN_MESSENGER_HANDLE handle, time_t current_time)
{
    const unsigned char* buffer = (unsigned char*)TWIN_REPORTED_PROPERTIES;
    size_t size = TWIN_REPORTED_PROPERTIES_LENGTH;
    CONSTBUFFER_HANDLE report = real_CONSTBUFFER_Create(buffer, size);

    umock_c_reset_all_calls();
    set_twin_messenger_report_state_async_expected_calls(report, current_time);
    (void)twin_messenger_report_state_async(handle, report, 
        TEST_on_report_state_complete_callback, NULL);

    real_CONSTBUFFER_DecRef(report);
}

We fixed this issue by using the sizeof operator to set the length constant to a value that always reflects the actual size of the string. You can find our bug fix commit on GitHub.

 

OpenSSL and the Shapeshifting Type

A shapeshifting type is born when a type’s size varies depending on a preprocessor definition. If the type is then assumed to have a specific size, memory access bugs can occur. A simple example is shown below:

#include <cstdint>
#include <cstring>
#include <array>

#ifdef BIGGER_INT
typedef int64_t MyInt;
#else
typedef int32_t MyInt;
#endif

MyInt GLOBAL_BUFFER[] = { 1,2,3,4,5 };

void SizeTypeExample()
{
    int localBuffer[std::size(GLOBAL_BUFFER)];

    // ERROR: AddressSanitizer: stack-buffer-overflow
    std::memcpy(localBuffer, GLOBAL_BUFFER, sizeof(GLOBAL_BUFFER));
}

If BIGGER_INT is defined, the memcpy operation might trigger a stack buffer overflow due to the localBuffer variable assuming MyInt has a size identical to int. An instance of this bug was found in the test_param_time_t OpenSSL test:

static int test_param_time_t(int n)
{
    time_t in, out;
    unsigned char buf[MAX_LEN], cmp[sizeof(size_t)];
    const size_t len = raw_values[n].len >= sizeof(size_t)
                       ? sizeof(time_t) : raw_values[n].len;
    OSSL_PARAM param = OSSL_PARAM_time_t("a", NULL);

    memset(buf, 0, sizeof(buf));
    le_copy(buf, raw_values[n].value, sizeof(in));
    memcpy(&in, buf, sizeof(in));
    param.data = &out;
    if (!TEST_true(OSSL_PARAM_set_time_t(&param, in)))
        return 0;
    le_copy(cmp, &out, sizeof(out));
    if (!TEST_mem_eq(cmp, len, raw_values[n].value, len))
        return 0;
    in = 0;
    if (!TEST_true(OSSL_PARAM_get_time_t(&param, &in)))
        return 0;
    le_copy(cmp, &in, sizeof(in));
    if (!TEST_mem_eq(cmp, sizeof(in), raw_values[n].value, sizeof(in)))
        return 0;
    param.data = &out;
    return test_param_type_extra(&param, raw_values[n].value, sizeof(size_t));
}

Here, size_t is assumed to be the same type as time_t, but this is not always the case depending on the architecture being compiled for. When copying out to cmp using the le_copy function, the size of the copying operation is sizeof(time_t) but the cmp buffer was initialized with size size_t. When building the OpenSSL tests with ASan and debugging with Visual Studio, the debugger breaks with an ASan error inside le_copy:

Screenshot of a debugging session in Visual Studio, showing an AddressSanitizer error in the 'le_copy' function on line 'memcpy(out, in, len)'.

Again, thanks to the ASan integration in VS, we were able to use the call stack window to walk up to the actual source of the bug: the test_param_time_t function:

Screenshot of the Call Stack window from a Visual Studio debugging session. The call stack contains the following functions: le_copy, test_param_time_t, run_tests, and main.

We let the OpenSSL team know about this bug and a fix was committed on GitHub.

 

Try AddressSanitizer Today!

In this article, we shared how we were able to use AddressSanitizer to find bugs in various open source projects. We hope this will motivate you to try out this feature on your own code base. Have you found eager iterators, shapeshifting types, or array / length constant disagreements in your projects? Let us know in the comments below, on Twitter (@VisualC), or via email at visualcpp@microsoft.com.

This article contains code snippets from the following sources:

utf8_codecvt_facet.ipp file, Boost C++ Libraries, Copyright (c) 2001 Ronald Garcia and Andrew Lumsdaine, distributed under the Boost Software License, Version 1.0.

Azure IoT C SDKs and Libraries, Copyright (c) Microsoft Corporation, distributed under the MIT License.

Azure C Shared Utility, Copyright (c) Microsoft Corporation, distributed under the MIT License.

params_api_test.c file, OpenSSL, Copyright 2019-2021 The OpenSSL Project Authors, Copyright (c) 2019 Oracle and/or its affiliates, distributed under the Apache License 2.0.

Author

Kevin Cadieux
Software Engineer

Engineer working on MSVC.

3 comments

Discussion is closed. Login to edit/delete existing comments.

  • Edward Lambert (SI)

    Since I updated to 16.10 and trying this I now get iterator debug level mismatches, we use iterator debug level 0 in our debug builds or they are too slow

    clang_rt.asan_dbg-x86_64.lib(asan_malloc_win_moveable.cpp.obj) : error LNK2038: mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '2' doesn't match value '0'
    39> clang_rt.asan_cxx_dbg-x86_64.lib(ubsan_type_hash_win.cpp.obj) : error LNK2038: mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '2' doesn't match value '0
    39> LIBVCASAND.lib(vcasan.obj) : error LNK2038: mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '2' doesn't match value '0'...

    Read more
  • Allen, Santiago

    Hi,

    I was wondering, is there already support for changing the default halt_on_error strategy? On Linux, we did that with -fsanitize-recover=address compile option and setting ASAN_OPTIONS=halt_on_error=false

    We found the option convenient because testers could run versions of our software with ASAN enabled and then hand over the reports to devs for analysis. From what I have tried so far, all the errors become exceptions and halt execution, but I am new to this feature...

    Read more
  • Nikolay Baklicharov

    Really cool! It’s great to see that the ASAN integration is effective and working properly.