New std::optional Checks in Visual Studio 2022 version 17.3 Preview 3

Gabor Horvath

The C++ static analysis team is committed to making your C++ coding experience as safe as possible. We are adding richer code safety checks and addressing high-impact customer feedback bugs posted on the C++ Developer Community page. Thank you for engaging with us and giving us great feedback on the past releases and early previews leading to this point.

Below is a detailed overview of some new experimental code analysis checks that can detect unwrapping of empty std::optionals. The experimental checks can be enabled by using the CppCoreCheckExperimentalRules ruleset. Note that the experimental checks are not part of the Microsoft All Rules ruleset. While these checks are marked experimental, they look promising in our internal, preliminary testing. Unless we get reports about crashes or an excessive number of false positives, we plan to move these checks to the NativeRecommendedRules ruleset (which is the default ruleset) in the next release.

Overview

std::optional was introduced in C++17 to represent a value that may or may not be present. It is often used as the return type for a function that may fail. We introduced two new checks, C26829 and C26830, to find unwrap operations on empty std::optionals. Unwrapping an empty optional is undefined behavior. Depending on the implementation, it can result in a crash or, worse, in reading uninitialized memory. In some cases, the latter is a vulnerability that an adversarial actor could exploit. Detecting this problem was one of the top-voted feature requests for the static analysis team on the C++ Developer Community, as dereferencing empty optionals has been a major source of real errors in many C++ projects.
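To make the problem concrete, here is a minimal sketch of the pattern these checks target. The helper parse_port and its validation rules are hypothetical and exist only for this illustration. Which of the two warnings the analyzer reports for port_unchecked is our expectation, not something the sketch verifies.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: returns the port number, or an empty optional
// when the text is not a whole number in the range [1, 65535].
std::optional<int> parse_port(std::string_view text)
{
    int port = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, port);
    if (ec != std::errc{} || ptr != last || port < 1 || port > 65535)
        return std::nullopt;
    return port;
}

// Unchecked unwrap: undefined behavior whenever parsing fails.
// This is the kind of code the new checks are meant to flag.
int port_unchecked(std::string_view text)
{
    return *parse_port(text);
}

// Checked unwrap: the empty case is handled explicitly.
int port_or_default(std::string_view text, int fallback)
{
    if (auto port = parse_port(text))
        return *port;
    return fallback;
}
```

Calling port_or_default("abc", 80) falls back to 80, while port_unchecked("abc") would dereference an empty optional.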

Modeling optionals

In order to warn when (potentially) empty optionals are unwrapped, the analyzer needs to precisely model the semantics of std::optional.

Basic assumptions

Usually, the use of std::optional is a stronger signal that a value may be absent than the use of a pointer type. Let us look at the following code snippet:

void f(int* p);
void g(std::optional<int>& p);

In many codebases, we cannot know whether nullptr is a valid argument to function f. The function might have a precondition that it does not accept null pointers, and the codebase might never pass a null value to f. A warning for a null pointer dereference in the body of function f would be considered a false positive by some developers. Usually, marking such pointers with gsl::not_null (void f(gsl::not_null<int*> p);) or replacing them with references (void f(int& p);) can make the code clearer.

In case of function g, however, the use of std::optional makes it explicit that it handles the lack of values gracefully. Therefore, while we tend to not warn on pointer parameters that don’t have null checks, we will warn on unwrapping std::optionals that might be empty. Unfortunately, there are some rare cases where this assumption would not hold. Let us look at the code snippet below:

std::optional<int> lookup(std::string_view key) {
    const static std::map myMap{std::pair{"Foo"sv, 1}, std::pair{"Bar"sv, 2}};
    auto it = myMap.find(key);
    return it == myMap.end() ? std::nullopt : std::optional{it->second};
}

While the function lookup might fail in the general case, a particular invocation of the function might have an argument that guarantees success (e.g., it might be lookup("Foo")). This guarantee is an invariant of the program that we currently cannot express using SAL annotations and cannot infer using function-local reasoning. The experimental versions of these checks might emit false positive warnings in those cases. We are actively looking into ways to mitigate this problem. Some of the options are improving existing annotations so that they can communicate this invariant, or trusting certain assertions. Until we settle on a solution, it is always possible either to suppress these warnings or to check that the optional has a value before unwrapping it, which makes the warning disappear.
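As a sketch of the second workaround, the call site below checks the result before unwrapping it. The lookup function is repeated from the snippet above so the example is self-contained. The wrapper foo_or_zero and its fallback value of 0 are assumptions made for this illustration.

```cpp
#include <map>
#include <optional>
#include <string_view>
#include <utility>

using namespace std::literals;

// The lookup function from the article, repeated for completeness.
std::optional<int> lookup(std::string_view key)
{
    const static std::map myMap{std::pair{"Foo"sv, 1}, std::pair{"Bar"sv, 2}};
    auto it = myMap.find(key);
    return it == myMap.end() ? std::nullopt : std::optional{it->second};
}

// Hypothetical call site: we "know" that "Foo" is present, but the
// explicit check keeps the unwrap guarded on every path.
int foo_or_zero()
{
    auto result = lookup("Foo");
    if (result)
        return *result;
    return 0; // Fallback chosen for this sketch only.
}
```

When a sensible default exists, lookup("Foo").value_or(0) expresses the same intent without an explicit unwrap.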

Our modeling also assumes that whenever an optional is passed to a function by non-const reference, the called function might reset the optional. This assumption helps us catch more problems at the cost of more false positives. As we gain more real-world experience with these checks we might revisit some of these assumptions/decisions in the future.
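The sketch below illustrates why this assumption is reasonable. The functions take and sum_twice are hypothetical. The point is only that a check made before the call says nothing about the state of the optional after the call.

```cpp
#include <optional>

// Hypothetical callee: takes the optional by non-const reference and
// consumes the value, leaving the caller's optional empty.
int take(std::optional<int>& slot)
{
    int value = slot.value_or(0);
    slot.reset();
    return value;
}

int sum_twice(std::optional<int>& slot)
{
    if (!slot)
        return 0;
    int first = take(slot); // May reset slot.
    // The earlier check no longer holds here, so check again
    // instead of writing *slot directly.
    int second = slot ? *slot : 0;
    return first + second;
}
```

Had sum_twice unwrapped slot directly after the call, it would have dereferenced an empty optional despite the check at the top of the function.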

Basic operations

This section describes the details of the modeling using a notation borrowed from our automatic regression tests. This notation helps us document our expectations regarding the semantics of the analyzed program and check whether the analyzer’s understanding matches our intuition. Program points that should be deduced as reachable are annotated with __espx_expect_reached(). On the other hand, program points that should be deduced as unreachable are annotated with __espx_expect_unreached(). Looking at the reachability of certain program points can help us understand how the analysis engine reasoned about the values in the program. We can also query some values directly using annotations like __espx_expect_always_true(cond). Our analysis tool will evaluate the expression cond and will report a failure when it cannot prove that the value always evaluates to true.

Our analysis engine understands that the default constructor of std::optional will create an empty optional. Moreover, it understands the basic ways to check if an optional is empty:

void default_ctor_creates_empty()
{
    std::optional<int> opt;
    if (opt)
        __espx_expect_unreached();
    else
        __espx_expect_reached();

    if (opt.has_value())
        __espx_expect_unreached();
    else
        __espx_expect_reached();

    int x = opt.value_or(5);
    __espx_expect_always_true(x == 5);
}

The test case above shows that the engine can discover that opt evaluates to false, so the true branch of the first if statement is never reached, and the false branch is always reached. The engine also understands that value_or will return its argument when it is invoked on an empty optional. Conversely, it also understands that value_or will return the contained value when the optional has one:

void value_ctor_creates_non_empty()
{
    std::optional<int> opt{2};
    __espx_expect_always_true((bool)opt);

    int x = opt.value_or(5);
    __espx_expect_always_true(x == 2);
}

Our analyzer also understands value types. It knows that the copy of an optional has a value if and only if the copied optional also had a value. Moreover, the contained value is a copy of the original:

void copied_non_empty_optional_is_not_empty()
{
    std::optional<int> opt{2};
    auto opt2 = opt;
    __espx_expect_always_true((bool)opt);
    __espx_expect_always_true((bool)opt2);

    __espx_expect_always_true(opt.value() == opt2.value());
}

The analyzer also understands that the value inside an optional always lives at the same address, and that two different optional objects live at different addresses:

void accessor_produces_stable_addresses()
{
    std::optional<int> opt{2};
    __espx_expect_always_true(&opt.value() == &opt.value());
    int* ptr = &opt.value();
    opt = std::optional<int>{2};
    __espx_expect_always_true(&opt.value() == ptr);
    std::optional<int> opt2{opt};
    __espx_expect_always_true(&opt.value() != &opt2.value());
}

Surprisingly, a moved-from optional that used to have a valid value is not empty. It holds the moved-from value:

void moved_from_optional_is_not_empty()
{
    std::optional<int> opt{2};
    auto opt2 = std::move(opt);
    __espx_expect_always_true((bool)opt);
    __espx_expect_always_true(*opt2 == 2);
}

This might be a potential source of confusion. While we currently will not warn about using the moved-from object in the original optional, we are looking into how we can teach our existing use-after-move check to find such errors by piggy-backing on the engine’s understanding of std::optional.
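The runtime behavior can be observed with a type whose moved-from state is less obvious than that of int. The sketch below uses std::string; the function names are made up for this example. After the move, the source optional still reports a value, but the string it holds is in a valid, unspecified state.

```cpp
#include <optional>
#include <string>
#include <utility>

// Returns true if both optionals report a value after the move.
bool source_engaged_after_move()
{
    std::optional<std::string> source{"a string long enough to live on the heap"};
    std::optional<std::string> target = std::move(source);
    // source still contains a std::string object; only its contents
    // were moved out.
    return source.has_value() && target.has_value();
}

// Returns true if resetting the moved-from optional makes it empty.
bool source_empty_after_reset()
{
    std::optional<std::string> source{"another reasonably long string value"};
    std::optional<std::string> target = std::move(source);
    source.reset(); // Make the emptiness explicit.
    return !source.has_value() && target.has_value();
}
```

Calling reset after the move is one way to make later has_value checks reflect the intent that the value is gone.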

Symbolic reasoning

Our analysis engine uses symbolic reasoning to model the emptiness of optionals. Whenever the engine learns new facts about these symbols, this knowledge is automatically and retroactively applied to the state of the objects. Consider the following example:

void constraints_correctly_applied(std::optional<int> optVal)
{
    bool b = (bool)optVal;                         // Program point: A.
    if (b)                                         // Program point: B.
    {
        __espx_expect_always_true((bool)optVal);   // Program point: C.
    }
}

In the code snippet above, we have no information about the emptiness of optVal at program point A. However, the analyzer knows that the value of the variable b is tied to the emptiness of optVal. We branch on b at program point B. In the true branch, we know that the value of b is true, so we also learn that optVal is not empty. As a result, (bool)optVal will evaluate to true at program point C. To summarize, we might learn new facts about the state of optVal from expressions that do not even refer to optVal syntactically. This is the power of symbolic reasoning.
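The same idea applies to a common early-return pattern. In the hypothetical function below, the condition tests only the local variable missing, yet on the path that reaches the unwrap, optVal must have a value. We would expect an engine that reasons symbolically to accept this code, although the sketch itself only demonstrates the runtime behavior.

```cpp
#include <optional>

// Hypothetical example: the guard is expressed through a separate variable.
int scaled_or_zero(std::optional<int> optVal, int factor)
{
    const bool missing = !optVal.has_value();
    if (missing)
        return 0;
    // The condition above never mentions optVal, but on this path
    // missing is false, so optVal is known to hold a value.
    return *optVal * factor;
}
```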

Modeling exceptions

The analyzer understands whether accessor methods like std::optional::value will or will not throw an exception based on the known state of the object. It can use this information to skip execution paths that cannot happen at runtime. This helps reduce the number of false positives and improves the performance of the analysis. The code snippet below demonstrates the behavior of the analysis.

void exception_modeling(std::optional<int> unknown)
{
    std::optional<int> nonEmpty{2};
    std::optional<int> empty{};

    try
    {
        unknown.value();
        __espx_expect_reached();
    }
    catch(...)
    {
        __espx_expect_reached();
    }

    try
    {
        nonEmpty.value();
        __espx_expect_reached();
    }
    catch(...)
    {
        __espx_expect_unreached();
    }

    try
    {
        empty.value();
        __espx_expect_unreached();
    }
    catch(...)
    {
        __espx_expect_reached();
    }
}
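For reference, the runtime behavior being modeled is that value() throws std::bad_optional_access on an empty optional, whereas operator* performs no check at all. The helper below is a hypothetical wrapper written for this illustration.

```cpp
#include <optional>

// Returns the contained value, or fallback when value() throws
// because the optional is empty.
int value_or_fallback(const std::optional<int>& opt, int fallback)
{
    try
    {
        return opt.value();
    }
    catch (const std::bad_optional_access&)
    {
        return fallback;
    }
}
```

In real code, opt.value_or(fallback) achieves the same result without exceptions. The try block is used here only to make the throwing path visible.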

Other considerations

Our analysis engine also understands nested optionals. There are many more modeled methods that we did not mention explicitly, including swap. Unfortunately, the current version of our modeling will not precisely model the semantics of free functions operating on std::optionals, like std::swap or the comparison operators. We have partial modeling in place for std::make_optional and std::in_place constructors. We plan to make the modeling more comprehensive in the future, but we believe the current modeling is sufficient to find most errors.
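With nested optionals, each layer has to be checked before it is unwrapped. The function below is a hypothetical example of a fully guarded double unwrap. It shows the language semantics only and makes no claim about which diagnostics the analyzer produces.

```cpp
#include <optional>

// Unwraps both layers only when both are present.
int unwrap_nested(const std::optional<std::optional<int>>& nested, int fallback)
{
    if (nested && *nested) // Outer layer first, then inner layer.
        return **nested;
    return fallback;
}
```

Note that an outer optional holding an empty inner optional is itself not empty, so checking only the outer layer is not enough.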

Emitting warnings

The analyzer will emit C26829 when an empty optional is unwrapped. On the other hand, it will emit C26830 when a potentially empty optional is unwrapped. The emitted warnings will also include a path that describes the execution that could trigger the problem. In the future, we plan to include key events in the emitted diagnostics that will highlight parts of the code that are important to understand the warning. The highlighted snippets might include the program points where the emptiness of the optional was checked and calls where the emptiness of the optional might have been changed.

void unwrap_empty()
{
    std::optional<int> o;
    *o = 5; // C26829 emitted
}

void unwrap_maybe_empty(std::optional<int> o)
{
    *o = 5; // C26830 emitted
}

In function unwrap_empty above, we will see a C26829. In this case the analyzer is confident that the optional was empty. This usually happens when we forget to initialize an optional or accidentally write a negated condition. In function unwrap_maybe_empty, however, we will see a C26830. In this case the engine is not sure whether the optional is empty, and the unwrap operation is not guarded.
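For contrast, here is a sketch of two ways to write the assignment so that no empty optional is ever unwrapped. The function names are hypothetical, and our expectation that neither function triggers a warning is an assumption based on the descriptions above.

```cpp
#include <optional>

// Guarded write: the unwrap happens only on the path where o has a value.
// Writing the negated condition, if (!o), by mistake would be the kind of
// error that leads to a definite unwrap of an empty optional.
void assign_if_present(std::optional<int>& o)
{
    if (o)
        *o = 5;
}

// Unconditional write: emplace constructs the value in place, so there
// is nothing to unwrap. Plain assignment (o = 5;) works as well.
void assign_always(std::optional<int>& o)
{
    o.emplace(5);
}
```

The first form preserves emptiness, while the second always leaves the optional holding a value. Which one is appropriate depends on the intent of the original code.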

Conclusion

The upcoming Visual Studio 2022 17.3 Preview 3 will feature new checks to detect hard-to-find misuses of std::optionals. These are experimental checks that need to be enabled explicitly by using the CppCoreCheckExperimentalRules ruleset or by adding C26829 and C26830 to your custom ruleset. C26829 is a high-confidence warning that should have very few false positives. C26830 is a medium-confidence check that should not be too noisy for most projects. Depending on the bugs reported and our experience with these checks in the coming weeks, either C26829 only or both of these warnings might be turned on by default in 17.4.

Try it out and let us know what you think:

The work that we do is heavily influenced by feedback we receive on the Developer Community, so thank you again for your participation. Please continue to file feedback and let us know if there is a checker or rule that you would like to see added to C++ Core Checks. Stay tuned for more C++ static analysis blogs. In the meantime, we would love to learn more about your experience with our static analysis tools. Comment below, or reach us via email at visualcpp@microsoft.com or via Twitter at @VisualC.