December 13th, 2022

High-confidence Lifetime Checks in Visual Studio version 17.5 Preview 2

Gabor Horvath
Software Engineer

The C++ team is committed to making your C++ coding experience as safe as possible. We are adding richer code safety checks and addressing high impact customer feedback bugs posted on the C++ Developer Community page. Thank you for engaging with us and giving us great feedback on the past releases and early previews leading to this point. Below is the detailed overview of the improvements we made to the lifetime analysis.

Overview

The C++ Core Guidelines’ Lifetime Profile aims to detect lifetime problems, like dangling pointers and references, in C++ code. For more information on the history and goals of the profile, check out Herb Sutter’s blog post about version 1.0. It has been quite a while since we last talked about lifetime analysis. The Lifetime rules have good defaults for common cases so that they don’t require any annotations for most code. Where annotations are needed, we designed the syntax to follow the syntax of the ISO C++ contracts proposals. While work on that syntax is still in progress, we have focused the implementation of the lifetime analysis on the parts that do not require annotations. Support for the non-default lifetime annotations will be added later. Lately, there has been an increased push in the C++ community to introduce lifetime-related safety features, which has led us to revisit the lifetime analysis in MSVC.

We spent the last couple of months looking into the results of using the lifetime analysis on real world code. This blog post summarizes our experience and the improvements we made along the way. The biggest change is the introduction of a new set of warnings. These warnings are the high-confidence versions of the existing warnings. Users who want less noise can enable only the high-confidence warnings, while users who want more rigorous checks at the cost of noise can enable both the old and the new warnings. As of 17.5, the high-confidence warnings are still experimental, but depending on the feedback we might include them in some of the recommended profiles in future versions.

High-confidence warnings

Lifetime analysis was originally designed to work well with code written in a specific style that closely follows some of the recommendations of the C++ Core Guidelines. These recommendations include avoiding return arguments, naked unions, and replacing pointer arithmetic with higher level abstractions like span. Consequently, analysis has not worked well with arbitrary code out of the box. The goal of these newer, high-confidence warnings is to be applicable for a wider range of code bases (at the cost of potentially missing more bugs).

Case study

Before diving into how the new warnings work, let’s look at a case study. We tested the analysis on multiple internal projects that did not follow the best practices the lifetime analysis expects. I want to share some data from one of them, a software component from Windows. The “Before” column represents the baseline before we started to implement any adaptations for legacy code; the “After” column represents the state of lifetime analysis as of 17.5 Preview 2.

Metric                                  Before    After
Compile-time cost                       +5.2%     +3.5%
Lifetime warnings (regular)             2777      2074
Lifetime warnings (high-confidence)     0         6
Assertion failures in the analysis      109       1

It takes ~50 minutes to compile and analyze this codebase. Turning on lifetime analysis increased the processing time by 5.2%. After the improvements we could reduce this overhead to 3.5%. If the compilation time cost of turning lifetime analysis on looks reasonable, that is great! One of the design goals of the analysis was to make it efficient enough to run regularly at build time, and we think this implementation is now efficient enough to turn on by default in most projects.

We also managed to fix many assertion failures. Most importantly, the newly introduced high-confidence warnings only emitted a manageable 6 warnings on this project, and all of them happened to uncover real problems! While on some other projects we still see some noise from the high-confidence warnings, the number of results is definitely much more manageable. So it is our recommendation that you give the high-confidence warnings a try! Don’t forget to report the bugs you encounter to help us further improve these checks.

Although our fixes focused primarily on reducing noise for the high-confidence warnings, they also reduced the number of lower-confidence warnings by about 25% (from 2777 to 2074)! Note that the high number of regular warnings is expected, as the test codebase does not follow the C++ Core Guidelines and does not have any lifetime annotations in place.

Adapting lifetime analysis to arbitrary coding styles

This section discusses the sources of noise from lifetime analysis on certain code bases and explains how we addressed them.

Adapting flow-sensitivity to complex control flow

Consider the following code snippet:

void f(bool b) {
    int* p = nullptr;
    int x;
    if (b) {
        p = &x;
    }
    // ...
    if (b) {
       *p = 42;  // Location A.
    }
}

Because the lifetime analysis is flow-sensitive, it warns at location A because nullptr is in the set of potential values of p across all branches. This warning is a false positive as the execution path where the value of p is nullptr and we later dereference p cannot be realized at runtime because of the correlated branches. We call such paths infeasible. Warnings from infeasible execution paths are a common source of noise. This problem can be mitigated using path-sensitive analysis that will reason about the branches and their correlations. Unfortunately, path-sensitive analysis is expensive and far from perfect. Consider the following slightly modified example:

bool cond();
void f(bool b) {
    int* p = nullptr;
    int x;
    if (cond()) {
        p = &x;
    }
    // ...
    if (cond()) {
       *p = 42;  // Location A.
    }
}

The conditions are now replaced with calls to a function that is potentially defined in another library. To check how the two branches are correlated, we would need to do inter-procedural analysis. Inter-procedural, path-sensitive analysis can be really expensive, for instance if the branch correlation depends on a really deep call stack, or outright impossible if the source code of cond is not available (e.g. from a proprietary library). Instead of path-sensitive analysis, we took a different approach: we emit high-confidence warnings only when the set of potential values for a pointer contains a single element. This usually means that there is only a single execution path between the creation of the value and the unsafe operation (dereference in this case). As a result, the high-confidence warnings will be inherently less noisy. However, this confidence comes at a cost: if cond() returns false on the first call and true on the second, the resulting nullptr dereference will not be caught by the high-confidence warning.
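The single-element rule can be illustrated with a toy model. This is only a sketch of the idea, not the MSVC implementation: the Target enum and both function names are hypothetical, and whether the regular warning also fires in the single-element case is an assumption made here for simplicity.

```cpp
#include <cassert>
#include <set>

// Hypothetical model of what a pointer may refer to at a program point.
enum class Target { Null, Invalid, LocalX };

// True if the set contains at least one value that is unsafe to dereference.
bool contains_unsafe(const std::set<Target>& points_to) {
    return points_to.count(Target::Null) > 0 ||
           points_to.count(Target::Invalid) > 0;
}

// Regular (noisier) rule: warn if any potential value is unsafe.
bool regular_warning(const std::set<Target>& points_to) {
    return contains_unsafe(points_to);
}

// High-confidence rule: warn only if the single potential value is unsafe.
bool high_confidence_warning(const std::set<Target>& points_to) {
    return points_to.size() == 1 && contains_unsafe(points_to);
}

// Location A in the snippets above: p may be nullptr or &x.
void location_a_example() {
    const std::set<Target> p_at_a{Target::Null, Target::LocalX};
    assert(regular_warning(p_at_a));           // flow-sensitive warning
    assert(!high_confidence_warning(p_at_a));  // two candidates: stay silent
}
```

In this model, a pointer that can only be nullptr when it is dereferenced triggers both rules, while the correlated-branch examples trigger only the regular one.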

Adapting to the lack of annotations

Let’s look at the code snippet below:

int* g(int* a, int* b);
void f() {
    int x;
    int* p = g(&x, nullptr);
    *p = 42; // Location B.
}

In the above code, we pass nullptr as the second input to function g. The naive lifetime analysis would assume that g could return either of its inputs, and so would warn for the case when nullptr is returned. But if g never returns the second argument, the code above is actually safe! Therefore, we upgraded the analysis to only emit a high-confidence warning if we made absolutely no assumptions about the called function. In the future, once g's behavior is annotated, the analysis will be able to know more precisely in which cases (if ever) g returns its second argument, and therefore be able to produce more high-confidence warnings.
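To make the safe case concrete, here is a minimal sketch with one possible definition of g. The body of g is an assumption for illustration (the snippet above only declares it), and f is changed to return x so the effect can be observed.

```cpp
#include <cassert>

// Hypothetical definition: g only ever returns its first argument.
int* g(int* a, int* b) {
    (void)b;  // the second argument is never returned
    return a;
}

int f() {
    int x = 0;
    int* p = g(&x, nullptr);
    assert(p == &x);  // holds for this particular g
    *p = 42;          // Location B: safe, because p points to x
    return x;
}
```

With this definition, f() returns 42 and never dereferences nullptr, which is why a warning based purely on the assumption that g may return any of its inputs would be noise here.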

Our analysis also makes certain assumptions for the role of the arguments. Let’s consider the following code snippet:

int* g(int** q);
void f() {
    int* p;
    g(&p);  // Location C.
}

The analysis will emit a warning at location C because we pass an uninitialized value (p) to a function. The analysis assumes that p is an in-out argument in this case, so g reading this value can result in undefined behavior. However, if p is only an out argument (g populates it before reading it), then the code is perfectly reasonable. On one hand, the C++ Core Guidelines strongly discourage output arguments. On the other hand, output arguments are still used extensively in production code. As a result, we decided to only emit a high-confidence warning when there is no doubt about the role of the argument. In this particular case, since it is unclear whether p is an out or in-out argument, there will be no high-confidence warning emitted.
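The sketch below shows the pure out-argument case, alongside a return-by-value alternative in the spirit of the Core Guidelines. The bodies of g and make_pointer, and the storage variable, are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Hypothetical backing storage for the example.
static int storage = 7;

// Hypothetical g that treats *q as a pure output: it writes *q before
// reading it, so an uninitialized p at the call site is harmless.
int* g(int** q) {
    *q = &storage;
    return *q;
}

// Alternative without an output argument: return the pointer instead.
int* make_pointer() {
    return &storage;
}

int f() {
    int* p;      // intentionally left uninitialized, as in the snippet above
    g(&p);       // Location C: g populates p before anything reads it
    assert(p == &storage);
    return *p;
}

int f_without_out_argument() {
    int* p = make_pointer();  // p is initialized at its declaration
    return *p;
}
```

Both functions return 7. The second form leaves no doubt about the role of p, so an analysis does not need to guess whether the argument is out or in-out.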

We also had to change how owner invalidation affects high-confidence warnings. Let’s look at a typical invalidation problem:

#include <vector>

int& f(std::vector<int>& v) {
    int& before_last = v.back();
    v.push_back(42);
    return before_last; // Location D.
}

Here, the before_last reference is potentially invalidated by the push_back operation. This is dangerous code, so lifetime analysis will warn at location D. While it is possible to teach the analysis that std::vector's push_back member is potentially invalidating, and to do the same for other owners found in the STL (which we have done for the lower-confidence lifetime warnings), this is not possible in the general case of user-written ownership types. One approach for a generic owner class is to treat all of its non-const methods as potentially invalidating, but this is not always accurate: for example, operator[] has a non-const overload that never invalidates references. Moreover, invalidating functions do not always cause lifetime problems. Consider the following modified version of the previous snippet:

#include <vector>

int& f(std::vector<int>& v) {
    v.reserve(v.size()+1);
    int& before_last = v.back();
    v.push_back(42);
    return before_last; // Location D.
}

The code is now safe, but we still get the lifetime analysis warnings. To avoid noisy warnings on this type of source code, we currently do not emit a high-confidence warning for scenarios involving invalidating functions. While this eliminates a lot of noise, high-confidence warnings will miss the entire class of lifetime problems caused by owner invalidation. We are planning targeted changes to emit high-confidence warnings in some scenarios that are really unlikely to be correct (e.g., when there are no std::vector::reserve calls in the function). In the future, we might consider providing finer control over which sources of noise should be filtered out.
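The safety of the reserve variant can be checked at runtime. The following sketch repeats f from above and adds a hypothetical helper, reference_survives_push_back, which verifies that the returned reference still refers to the element before the newly pushed one. It assumes the vector is non-empty, since v.back() requires that.

```cpp
#include <cassert>
#include <vector>

int& f(std::vector<int>& v) {
    // Guarantee capacity for one more element, so the push_back below
    // cannot reallocate and therefore cannot invalidate references.
    v.reserve(v.size() + 1);
    int& before_last = v.back();  // requires a non-empty vector
    v.push_back(42);
    return before_last;
}

// Hypothetical helper: checks that the reference returned by f still
// aliases the second-to-last element after the push_back.
bool reference_survives_push_back() {
    std::vector<int> v{1, 2, 3};
    int& r = f(v);
    assert(v.size() == 4);
    return &r == &v[v.size() - 2] && r == 3 && v.back() == 42;
}
```

Without the reserve call the same check would be meaningless, because push_back may reallocate and using the returned reference would then be undefined behavior.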

Bug fixes

We also spent a significant amount of time fixing bugs in the lifetime analysis. These changes should make both the high-confidence warnings and the older warnings better. This section highlights some of the fixes we made. In addition to the highlighted changes, we also fixed many crashes and assertion failures, and made some performance optimizations.

  • Lifetime analysis should no longer warn when deleting a null pointer
  • Lifetime analysis used to assume that pointer arithmetic produces an invalid pointer. This is no longer the case; we have a separate dedicated warning to diagnose pointer arithmetic.
  • No longer attempt to diagnose errors for union members. Unions were a source of false positives as our analysis did not have a good understanding of which member should be considered the active member. The C++ Core Guidelines recommend using abstractions like std::variant over naked unions.
  • Better modeling for heap allocations
  • Better modeling for classes with const fields
  • Tracking the destruction of temporary objects more precisely
  • Taught lifetime analysis that std::move does not move
  • No longer attempt to verify the correct use of shared_ptrs. We might end up adding support back in the future after some improvements.
  • Many other bug fixes, including this one reported on Developer Community.
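Two of the language facts behind these fixes can be demonstrated directly. The sketch below is illustrative only and the function names are hypothetical: std::move is just a cast to an rvalue reference, so the source object is untouched until something actually moves from it, and deleting a null pointer is a well-defined no-op.

```cpp
#include <cassert>
#include <string>
#include <utility>

// std::move only casts; no move constructor or move assignment runs here.
bool move_is_only_a_cast() {
    std::string s = "lifetime";
    std::string&& r = std::move(s);  // binds a reference, nothing is moved
    return &r == &s && s == "lifetime";
}

// Deleting a null pointer has no effect, so a warning here would be noise.
bool deleting_null_is_a_no_op() {
    int* p = nullptr;
    delete p;
    return p == nullptr;
}
```

Both functions return true, which is consistent with an analysis that does not treat the std::move call itself as ending the object's usable state and does not warn on delete of a null pointer.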

Conclusion

Visual Studio 2022 17.5 Preview 2 features many improvements to the lifetime analysis, including a new set of high-confidence warnings. Give these new checks a try and let us know what you think. The work that we do is heavily influenced by the feedback we receive on the Developer Community, so thank you again for your participation. Please continue to file feedback and let us know if there is a checker or rule that you would like to see added to C++ Core Check. Stay tuned for more C++ static analysis blogs. In the meantime, do not hesitate to reach out to us. We can be reached via the comments below or @VisualC on Twitter.

Author

Gabor Horvath
Software Engineer

Compiler enthusiast with academic background. Gabor started a Ph.D. in 2016. He is a contributor to research projects related to static analysis since 2012. He is a clang contributor, participated in Google Summer of Code twice as a student and many times as a mentor, interned for Apple, Microsoft and Google. He taught C++ and compiler construction to undergrads at Eotvos Lorand University. Currently, he is working at Microsoft's C++ Static Analysis team to improve MSVC's static analysis ...


8 comments


    • Gabor Horvath (Microsoft employee)

      Thanks for trying the checks out and reporting the issue. I was able to reproduce and fix this, starting 17.6, we should no longer have these false positives.

  • David Lowndes

    “Users who want less noise can enable only the high-confidence warnings, while users who want more rigorous checks at the cost of noise can enable both the old and the new warnings.”

    How do we do that?

    • Gabor Horvath (Microsoft employee)

      This blog post describes how to turn on lifetime warnings. These steps will turn on both the high-confidence and low-confidence warnings. Once the high-confidence warnings are no longer experimental, we will make some changes to the shipped rulesets to make it easier to enable them selectively. In the meantime, you can create a new ruleset file with the rules you want to use. We are looking into ways to make enabling/disabling sets of warnings easier in the future.

      • David Lowndes · Edited

        OK, what are the numbers of the specific high confidence checks – and their equivalent low confidence ones so that we know which to play with?

      • Roger B · Edited

        Looks like the following 4 are the new ones:

        Taken from C:\Program Files\Microsoft Visual Studio\2022\Preview\Team Tools\Static Analysis Tools\Rule Sets\CppCoreCheckLifetimeRules.ruleset

            
            C26846
            C26847
            C26848
            C26849
        
  • José Pedro Lopes · Edited

    Hi, this is awesome! Great work! Can you share what kinds of Lifetime warnings (high-confidence) are most often found which result in actual bugs, perhaps as a result of your testing? Thanks in advance.