{"id":30706,"date":"2022-07-14T19:35:58","date_gmt":"2022-07-14T19:35:58","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/cppblog\/?p=30706"},"modified":"2022-07-14T19:35:58","modified_gmt":"2022-07-14T19:35:58","slug":"new-stdoptional-checks-in-visual-studio-2022-version-17-3-preview-3","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/cppblog\/new-stdoptional-checks-in-visual-studio-2022-version-17-3-preview-3\/","title":{"rendered":"New std::optional Checks in Visual Studio 2022 version 17.3 Preview 3"},"content":{"rendered":"<p>The C++ static analysis team is committed to making your C++ coding experience as safe as possible. We are adding richer code safety checks and addressing high-impact customer feedback bugs posted on the <a href=\"https:\/\/developercommunity.visualstudio.com\/search?space=62\">C++ Developer Community<\/a>\u202fpage. Thank you for engaging with us and giving us great feedback on the past releases and early previews leading to this point. Below is a detailed overview of some new experimental code analysis\u202fchecks that can detect unwrapping of empty <code>std::optional<\/code>s. The experimental checks can be enabled by using the <code>CppCoreCheckExperimentalRules<\/code> ruleset. Note that the experimental checks are not part of the <code>Microsoft All Rules<\/code> ruleset. While these checks are marked experimental, they look promising in our internal, preliminary testing. Unless we get reports about crashes or an excessive number of false positives, we plan to move these checks to the <code>NativeRecommendedRules<\/code> ruleset (which is the default ruleset) in the next release.<\/p>\n<h2>Overview<\/h2>\n<p><code>std::optional<\/code> <a href=\"https:\/\/en.cppreference.com\/w\/cpp\/utility\/optional\">was introduced in C++17<\/a> to represent a value that may or may not be present. 
It is often used as the return type for a function that may fail.\nWe introduced two new checks, <code>C26829<\/code> and <code>C26830<\/code>, to find unwrap operations of empty <code>std::optional<\/code>s. Unwrapping an empty optional is undefined behavior. It can result in a crash, or worse, reading uninitialized memory\ndepending on the implementation. In some cases, the latter is a vulnerability that an adversarial actor could exploit. The <a href=\"https:\/\/developercommunity.visualstudio.com\/t\/Detect-unguarded-dereferences-of-std::op\/1430703\">C++ Developer Community ask<\/a> was one of the top voted feature requests for the static analysis team as dereferencing empty optionals has been a major source of real errors in many C++ projects.<\/p>\n<h2>Modeling optionals<\/h2>\n<p>In order to warn when (potentially) empty optionals are unwrapped, the analyzer needs to precisely model the semantics of <code>std::optional<\/code>.<\/p>\n<h3>Basic assumptions<\/h3>\n<p>Usually, the use of <code>std::optional<\/code>s is a stronger signal about the presence of values compared to pointer types. Let us look at the following code snippet:<\/p>\n<pre><code class=\"language-cpp\">void f(int* p);\r\nvoid g(std::optional&lt;int&gt;&amp; p);<\/code><\/pre>\n<p>In many codebases, we cannot know whether <code>nullptr<\/code> is a valid argument to function <code>f<\/code>. The function might have a precondition that it does not accept null pointers and the codebase might never pass a null value to <code>f<\/code>. 
A warning for a null pointer dereference in the body of function <code>f<\/code> would be considered a false positive by some developers.\nUsually, marking such pointers with <code>gsl::not_null<\/code> (<code>void f(gsl::not_null&lt;int*&gt; p);<\/code>) or replacing them with references (<code>void f(int&amp; p);<\/code>) can make the code clearer.<\/p>\n<p>In the case of function <code>g<\/code>, however, the use of <code>std::optional<\/code> makes it explicit that it handles the lack of values gracefully. Therefore, while we tend to not warn on pointer parameters that don&#8217;t have null checks, we will warn on unwrapping <code>std::optional<\/code>s that might be empty.\nUnfortunately, there are some rare cases where this assumption would not hold. Let us look at the code snippet below:<\/p>\n<pre><code class=\"language-cpp\">std::optional&lt;int&gt; lookup(std::string_view key) {\r\n    const static std::map myMap{std::pair{\"Foo\"sv, 1}, std::pair{\"Bar\"sv, 2}};\r\n    auto it = myMap.find(key);\r\n    return it == myMap.end() ? std::nullopt : std::optional{it-&gt;second};\r\n}<\/code><\/pre>\n<p>While the function <code>lookup<\/code> might fail in the general case, a particular invocation of the function might have an argument that guarantees success (e.g., it might be <code>lookup(\"Foo\")<\/code>). This guarantee is an invariant of the program that we currently cannot express using <a href=\"https:\/\/docs.microsoft.com\/cpp\/c-runtime-library\/sal-annotations\">SAL annotations<\/a> and cannot infer using function-local reasoning. The experimental versions of these checks might emit false positive warnings in those cases. We are actively looking into ways to mitigate this problem. Some of the options are improving existing annotations to be able to communicate this invariant, or believing certain assertions. 
Until we settle on a solution, it is always possible either to suppress these warnings or to check that the optional has a value before unwrapping it, which makes the warning disappear.<\/p>\n<p>Our modeling also assumes that whenever an optional is passed to a function by non-const reference, the called function might reset the optional. This assumption helps us catch more problems at the cost of more false positives.\nAs we gain more real-world experience with these checks, we might revisit some of these assumptions\/decisions in the future.<\/p>\n<h3>Basic operations<\/h3>\n<p>This section describes the details of the modeling using a notation borrowed from our automatic regression tests. This notation helps us document our expectations regarding the semantics of the analyzed program and check whether the analyzer&#8217;s understanding matches our intuition.\nProgram points that should be deduced as reachable are annotated with <code>__espx_expect_reached()<\/code>. On the other hand, program points that should be deduced as unreachable are annotated with <code>__espx_expect_unreached()<\/code>.\nLooking at the reachability of certain program points can help us understand how the analysis engine reasoned about the values in the program. We can also query some values directly using annotations like <code>__espx_expect_always_true(cond)<\/code>. Our analysis tool will evaluate the expression <code>cond<\/code> and will report a failure when it cannot prove that the value always evaluates to true.<\/p>\n<p>Our analysis engine understands that the default constructor of <code>std::optional<\/code> will create an empty optional. 
Moreover, it understands the basic ways to check if an optional is empty:<\/p>\n<pre><code class=\"language-cpp\">void default_ctor_creates_empty()\r\n{\r\n    std::optional&lt;int&gt; opt;\r\n    if (opt)\r\n        __espx_expect_unreached();\r\n    else\r\n        __espx_expect_reached();\r\n\r\n    if (opt.has_value())\r\n        __espx_expect_unreached();\r\n    else\r\n        __espx_expect_reached();\r\n\r\n    int x = opt.value_or(5);\r\n    __espx_expect_always_true(x == 5);\r\n}<\/code><\/pre>\n<p>The test case above shows that the engine can discover that <code>opt<\/code> evaluates to false, so the true branch of the first if statement is never reached, and the false branch is always reached. The engine also understands that <code>value_or<\/code> will return its argument when it is invoked on an empty optional. Conversely, it also understands that <code>value_or<\/code> will return the internal value of an optional when it has a value:<\/p>\n<pre><code class=\"language-cpp\">void value_ctor_creates_non_empty()\r\n{\r\n    std::optional&lt;int&gt; opt{2};\r\n    __espx_expect_always_true((bool)opt);\r\n\r\n    int x = opt.value_or(5);\r\n    __espx_expect_always_true(x == 2);\r\n}<\/code><\/pre>\n<p>Our analyzer also understands value types. It knows that the copy of an optional has a value if and only if the copied optional also had a value. 
Moreover, the contained value is a copy of the original:<\/p>\n<pre><code class=\"language-cpp\">void copied_non_empty_optional_is_not_empty()\r\n{\r\n    std::optional&lt;int&gt; opt{2};\r\n    auto opt2 = opt;\r\n    __espx_expect_always_true((bool)opt);\r\n    __espx_expect_always_true((bool)opt2);\r\n\r\n    __espx_expect_always_true(opt.value() == opt2.value());\r\n}<\/code><\/pre>\n<p>The analyzer also understands that the value inside an optional is always at the same address and that two different optional objects live at different addresses:<\/p>\n<pre><code class=\"language-cpp\">void accessor_produces_stable_addresses()\r\n{\r\n    std::optional&lt;int&gt; opt{2};\r\n    __espx_expect_always_true(&amp;opt.value() == &amp;opt.value());\r\n    int* ptr = &amp;opt.value();\r\n    opt = std::optional&lt;int&gt;{2};\r\n    __espx_expect_always_true(&amp;opt.value() == ptr);\r\n    std::optional&lt;int&gt; opt2{opt};\r\n    __espx_expect_always_true(&amp;opt.value() != &amp;opt2.value());\r\n}<\/code><\/pre>\n<p>Surprisingly, a moved-from optional that used to have a valid value is not empty. It holds the moved-from value:<\/p>\n<pre><code class=\"language-cpp\">void moved_from_optional_is_not_empty()\r\n{\r\n    std::optional&lt;int&gt; opt{2};\r\n    auto opt2 = std::move(opt);\r\n    __espx_expect_always_true((bool)opt);\r\n    __espx_expect_always_true(*opt2 == 2);\r\n}<\/code><\/pre>\n<p>This might be a potential source of confusion. While we currently will not warn for using the moved-from object in the original optional, we are looking into how we can teach our existing <a href=\"https:\/\/devblogs.microsoft.com\/cppblog\/new-code-analysis-checks-in-visual-studio-2019-use-after-move-and-coroutine\/\">use-after-move check<\/a> to find such errors by piggy-backing on the engine&#8217;s understanding of <code>std::optional<\/code>.<\/p>\n<h3>Symbolic reasoning<\/h3>\n<p>Our analysis engine uses symbolic reasoning to model the emptiness of optionals. 
Whenever the engine learns new facts about these symbols, this knowledge is automatically and retroactively applied to the state of the objects. Consider the following example:<\/p>\n<pre><code class=\"language-cpp\">void constraints_correctly_applied(std::optional&lt;int&gt; optVal)\r\n{\r\n    bool b = (bool)optVal;                         \/\/ Program point: A.\r\n    if (b)                                         \/\/ Program point: B.\r\n    {\r\n       __espx_expect_always_true((bool)optVal);    \/\/ Program point: C.\r\n    }\r\n}<\/code><\/pre>\n<p>In the code snippet above, we have no information about the emptiness of <code>optVal<\/code> at program point <code>A<\/code>. However, the analyzer knows that the value of the variable <code>b<\/code> is inherently entangled with the emptiness of <code>optVal<\/code>. We branch on <code>b<\/code> at program point <code>B<\/code>. In the true branch, we know that the value of <code>b<\/code> is true, so we also learn that <code>optVal<\/code> is not empty. As a result, <code>(bool)optVal<\/code> will evaluate to true at program point <code>C<\/code>. To summarize, we might learn new facts about the state of <code>optVal<\/code> from expressions that do not even refer to <code>optVal<\/code> syntactically. This is the power of symbolic reasoning.<\/p>\n<h3>Modeling exceptions<\/h3>\n<p>The analyzer understands whether accessor methods like <code>std::optional::value<\/code> will or will not throw an exception based on the known state of the object. It can use this information to help the analysis skip certain execution paths that cannot happen at runtime. This helps reduce the number of false positives and improves the performance of the analysis. 
The code snippet below demonstrates the behavior of the analysis.<\/p>\n<pre><code class=\"language-cpp\">void exception_modeling(std::optional&lt;int&gt; unknown)\r\n{\r\n    std::optional&lt;int&gt; nonEmpty{2};\r\n    std::optional&lt;int&gt; empty{};\r\n\r\n    try\r\n    {\r\n        unknown.value();\r\n        __espx_expect_reached();\r\n    }\r\n    catch(...)\r\n    {\r\n        __espx_expect_reached();\r\n    }\r\n\r\n    try\r\n    {\r\n        nonEmpty.value();\r\n        __espx_expect_reached();\r\n    }\r\n    catch(...)\r\n    {\r\n        __espx_expect_unreached();\r\n    }\r\n\r\n    try\r\n    {\r\n        empty.value();\r\n        __espx_expect_unreached();\r\n    }\r\n    catch(...)\r\n    {\r\n        __espx_expect_reached();\r\n    }\r\n}<\/code><\/pre>\n<h3>Other considerations<\/h3>\n<p>Our analysis engine also understands nested optionals. There are many more modeled methods that we did not mention explicitly, including <code>swap<\/code>. Unfortunately, the current version of our modeling will not precisely model the semantics of free functions operating on <code>std::optional<\/code>s, like <code>std::swap<\/code> or the comparison operators. We have partial modeling in place for <code>std::make_optional<\/code> and <code>std::in_place<\/code> constructors. We plan to make the modeling more comprehensive in the future, but we feel like the current modeling should be sufficient to find most errors.<\/p>\n<h2>Emitting warnings<\/h2>\n<p>The analyzer will emit <code>C26829<\/code> when an empty optional is unwrapped. On the other hand, it will emit <code>C26830<\/code> when a <strong>potentially<\/strong> empty optional is unwrapped. The emitted warnings will also include a path that describes the execution that could trigger the problem. In the future, we plan to include key events in the emitted diagnostics that will highlight parts of the code that are important to understand the warning. 
The highlighted snippets might include the program points where the emptiness of the optional was checked and calls where the emptiness of the optional might have been changed.<\/p>\n<pre><code class=\"language-cpp\">void unwrap_empty()\r\n{\r\n  std::optional&lt;int&gt; o;\r\n  *o = 5; \/\/ C26829 emitted\r\n}\r\n\r\nvoid unwrap_maybe_empty(std::optional&lt;int&gt; o)\r\n{\r\n  *o = 5; \/\/ C26830 emitted\r\n}<\/code><\/pre>\n<p>In function <code>unwrap_empty<\/code> above, we will see a <code>C26829<\/code>. In this case the analyzer is confident that the optional was empty. This usually happens when we forget to initialize an optional or accidentally write a negated condition.\nIn function <code>unwrap_maybe_empty<\/code>, however, we will see a <code>C26830<\/code>. In this case the engine is not sure whether the optional is empty, and the unwrap operation is not guarded.<\/p>\n<h2>Conclusion<\/h2>\n<p>The upcoming Visual Studio 2022 17.3 Preview 3 will feature new checks to find hard-to-find misuses of <code>std::optional<\/code>s. These are experimental checks that need to be enabled explicitly by using the <code>CppCoreCheckExperimentalRules<\/code> ruleset or adding <code>C26829<\/code> and <code>C26830<\/code> to your custom ruleset. <code>C26829<\/code> is a high-confidence warning that should have very few false positives. <code>C26830<\/code> is a medium confidence check that should not be too noisy for most projects.\nDepending on the bugs reported and our experience with these checks in the coming weeks, either <code>C26829<\/code> only or both of these warnings might be turned on by default in 17.4.<\/p>\n<h2>Try it out and let us know what you think:<\/h2>\n<p>The work that we do is heavily influenced by feedback we receive on the\u202f<a href=\"https:\/\/developercommunity.visualstudio.com\/search?space=62\">Developer Community<\/a>\u202fso thank you again for your participation. 
Please continue to file feedback and let us know if there is a checker or rule that you would like to see added to C++ Core Checks. Stay tuned for more C++ static analysis blogs. In the meantime, we would love to learn more about your experience with our static analysis tools. Comment below, or reach us via email at <a href=\"mailto:visualcpp@microsoft.com\">visualcpp@microsoft.com<\/a> or via Twitter at\u202f<a href=\"https:\/\/twitter.com\/visualc\">@VisualC<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>New std::optional Checks in Visual Studio 2022<\/p>\n","protected":false},"author":58008,"featured_media":35994,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1,239],"tags":[],"class_list":["post-30706","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cplusplus","category-diagnostics"],"acf":[],"blog_post_summary":"<p>New std::optional Checks in Visual Studio 
2022<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/30706","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/users\/58008"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/comments?post=30706"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/posts\/30706\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media\/35994"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/media?parent=30706"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/categories?post=30706"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/cppblog\/wp-json\/wp\/v2\/tags?post=30706"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}