Improved Null Pointer Dereference Detection in Visual Studio 2022 version 17.0 Preview 4

Gabor

The C++ static analysis team is committed to making your C++ coding experience as safe as possible. We are adding richer code safety checks and addressing high impact customer feedback bugs posted on the C++ Developer Community page. Thank you for engaging with us and giving us great feedback on the past releases and early previews leading to this point. Below is the detailed overview of a new experimental code analysis check that can detect null pointer dereference errors, along with a comparison to an existing check that has the same purpose.

Overview

Internally, we have multiple analysis engines. This is opaque from the users’ point of view; warnings are surfaced identically regardless of the engine we used to implement them. One of our code analysis tools, has a number of checks to catch null pointer dereference errors. These include C6011, C6387, and C28196. While these warnings have historically been successful and prevented many errors, they do not work well with some aspects of modern C++. Moreover, the data flow framework they are written in has its limitations. EspXEngine was created to solve most of these problems. We already ship many analyses that are based on EspXEngine’s powerful path-sensitive data flow analysis including Concurrency Check and Use After Move Check. The success of these checks convinced us to port null pointer analysis to EspXEngine. We are excited to make the new version available to try which introduces many improvements compared to the old one. The rest of the blog post is an in-depth overview of some of the improvements and gives some hints how to use power-user features like annotations.

Path-sensitive analysis

Both analysis engines are capable of path-sensitive analysis. Let’s consider the example below to understand what this means:

void path_sensitive(int *p, bool cond) { 
    int state = 0; 

    // branch 1  
    if (p != nullptr) { 
        state = 1; 
    } 

    // branch 2 
    if (cond) { 
        state = 2; 
        p = nullptr; 
    } 

    // branch 3 
    if (state == 1) { 
        *p = 42; // Null dereference? 
    } 
} 

The code above has multiple branches. Some of those branches are correlated, but flow-sensitive analyses will not reason about these correlations. For instance, a flow-sensitive analysis might conclude that the code is unsafe due to a potential null-dereference, since p is set to nullptr in branch 2, and then dereferenced in branch 3. However, this would be a false positive because branch 3 cannot be reached if branch 2 has been taken. Path-sensitive analyses, on the other hand, do reason about these types of reachability conditions, and would therefore conclude that the code above is safe. As a result, path-sensitive analyses are more precise. But, this precision comes at the cost of analysis time and memory. Both engines have identical behavior on this code snippet.

Local analysis

Both engines are doing intraprocedural analysis. They cannot see across function boundaries and rely on types, type extensions, models and contracts to bridge the gap.

void local_analysis(int *p, int *q, bool cond) { 
    if (p == nullptr) 
        return; 
    q = nullptr; 
    std::swap(p, q); 
    *p = 42; // Null dereference 
} 

The code above has a bug. The pointer p is nullptr due to the call to swap. This bug is not found by the current check. However, EspXEngine models some common APIs. As a result, it can figure out the bug and report a warning to the user.

Unfortunately, when we call our own APIs, EspXEngine will not know the semantics of the called function. In those cases, we can use types or SAL annotations to describe the pre- and postconditions of our functions:

_Notnull_ int *get_my_ptr(); 
gsl::not_null<int *> get_my_ptr2(); 
void local_analysis(int *p) { 
    _Analysis_assume_(p != nullptr); 
    *p = 42; 
} 

In the code above, we use the _Notnull_ and _Analysis_assume_ SAL annotations to describe the constraints on the values of some pointers. This is supported by both engines. A more modern approach is to use rich types to express these contracts. This is only supported in EspXEngine. Moreover, it will flag code where a null pointer is stored into a gsl::not_null pointer:

void assign_to_gsl_notnull() { 
    int* p = nullptr; 
    auto q = gsl::make_not_null(p); // C26822 warning 
} 

While types are great to encode our expectations, SAL has the power to express a wider range of contracts. Consider the example below:

void postcondition_conditional(bool b, _When_(b == true, _Outptr_) int** p)  { 
    if (b == true) 
        *p = nullptr; // C26824 warning 
} 

This function has a complex postcondition. Whenever the first argument is true, the value at location *p must be not-null when the function exists. These contracts are understood by both engines (although the support in EspXEngine is more sophisticated) and many Windows APIs are annotated to describe their behavior. We would love to use a standard language facility, but the contracts proposal was not accepted for C++20 and we need a solution that work both for C and C++ APIs.

Some problems with our existing null pointer checks

I wanted to showcase some examples where the null pointer check based on EspXEngine has better behavior than the current one. First of all, there are some low easy to catch null pointer dereferences that are not found by the current checks:

void nullptr_constant_dereference() { 
    *(int*)nullptr = 5; // Previously, it was not found. 
} 

There are also cases where they were noisier:

struct Node { 
    int number; 
    Node* next; 
}; 

void add_number(Node*& head, Node*& tail, int data) { 
    if (head != nullptr) { 
        tail->next = (Node*)malloc(sizeof(Node)); 
        tail = tail->next; 
    } else { 
        head = (Node*)malloc(sizeof(Node)); 
        tail = head; 
    } 
    tail->number = data; // C6011 warning 
    tail->next = nullptr; 
} 

In the code above the current version will give a null pointer dereference warning on the line with the comment. Technically, this warning could be a true positive when malloc fails and returns a nullptr. This is a scenario that is irrelevant for many applications. EspXEngine has both low and high confidence warnings and will only emit a low confidence warning in this case. Most users are probably only interested in the high confidence warnings that are expected to have less noise and turn the low confidence warnings off.

Moreover, we decided to make EspXEngine stricter detecting various undefined behavior:

void method_null_dereference(Foo* p, Foo* q) { 
    if (p || q) 
        return; 

    p->method();            // C26822 warning 
    q->static_method(42);   // OK, not UB.  
} 

In the code above, contrary to EspXEngine, the current warning will not warn when we call a method on a null pointer. Strictly speaking, this code has undefined behavior, but many implementations will work fine when method does not dereference the this pointer.

Conclusion

The upcoming Visual Studio 2022 17.0 Preview 4 will feature new, experimental checks to find null pointer dereference errors. These checks are intended to be better versions of the current ones with improved precision and additional features. These new checks are doing in-depth analysis and are expected to increase the analysis time. They are off by default and can be enabled by using the CppCoreCheckExperimentalRules ruleset.

Try it out and let us know what you think:

The work that we do is heavily influenced by feedback we receive on the Developer Community so thank you again for your participation. Please continue to file feedback and let us know if there is a checker or rule that you would like to see added to C++ Core Check. Stay tuned for more C++ static analysis blogs. In the meanwhile, do not hesitate to reach out to us. We can be reached via the comments below or @VisualC on Twitter.

2 comments

Leave a comment