New safety rules in C++ Core Check

Avatar

Sunny

Rust and C++ are two popular systems programming languages. For years, the focus of C++ has been on performance. We are increasingly hearing calls from customers and security researchers that C++ should have stronger safety guarantees in the language. C++ often falls behind Rust when it comes to programming safety. Visual Studio 2019 version 16.7 contains four new rules in C++ Core Check to incorporate some safety features from Rust into C++.  

For more detailed information on C++ Core Check, please see the C++ Core Guidelines Checker Reference documentation. If you’re just getting started with native code analysis tools, take a look at our introductory quick start for Code Analysis for C/C++. 

Missing default label in switch statements 

Rust’s pattern matching constructs can be used similarly to the C++ switch statement. One way in which they differ, however, is that the Rust equivalent requires the programmer to cover all possible patterns being matched. This can be achieved either through writing an explicit handler for each pattern or appending a default handler for cases not explicitly covered.  

For example, the following Rust code would not compile if the default handler were missing. 

// i32 == 32-bit signed integer 
fn printDiceRoll(roll: i32) { 
    match roll { 
        1 => println!("one!"), 
        2 => println!("two!"), 
        3 => println!("three!"), 
        4 => println!("four!"), 
        5 => println!("five!"), 
        6 => println!("six!"), 
        _ => println!("what kind of dice are you using?") // default handler 
    } 
}

This is a neat little safety feature because it guards against this extremely easy to make, but not so easy to catch, programming error. 

Visual Studio does warn whenever all cases of an enum type are not covered in a C++ switch statement <C4061, C4062>. However, such a warning is not present for other types, such as integers, as in the Rust example above. 

This release introduces a new check to warn whenever switch statements over non-enum types (i.e., char, int, …) are missing a default label. You can find detailed documentation on this check hereTo enable this rule in Visual Studio, you will have to select the ruleset “C++ Core Check Style Rules”, “C++ Core Check Rules”, or “Microsoft All Rules” for your project and then run code analysis. 

Re-writing the Rust example from above in C++, we would get something like below. 

void printDiceRoll(int roll) { 
    switch (roll) { 
        case 1: 
            std::cout << "one\n"; 
            break; 
        case 2: 
            std::cout << "two\n"; 
            break; 
        case 3: 
            std::cout << "three\n"; 
            break; 
        case 4: 
            std::cout << "four\n"; 
            break; 
        case 5: 
            std::cout << "five\n"; 
            break; 
        case 6: 
            std::cout << "six\n"; 
            break; 
        default: 
            std::cout << "what kind of dice are you using?\n"; 
            break; 
    } 
}

Removing the default handler now results in a warning. 

void printDiceRoll(int roll) { 
    switch (roll) { // warning C26818: Switch statement does not cover all cases. Consider adding a 'default' label (es.79) 
        case 1: 
            std::cout << "one\n"; 
            break; 
        case 2: 
            std::cout << "two\n"; 
            break; 
        case 3:
            std::cout << "three\n"; 
            break; 
        case 4: 
            std::cout << "four\n"; 
            break; 
        case 5: 
            std::cout << "five\n"; 
            break; 
        case 6: 
            std::cout << "six\n"; 
            break; 
    } 
}

Unannotated fallthrough in switch statements 

Another restriction of Rust’s match statement is that it does not support the notion of fallthrough between cases. In C++, on the other hand, the following code is perfectly valid. 

enum class Food { 
    BANANA, ORANGE, PIZZA, CAKE, KALE, CELERY 
}; 

void takeFromFridge(Food food) { 
    switch (food) { 
        case Food::BANANA: 
        case Food::ORANGE: 
            peel(food);   // implicit fallthrough 
        case Food::PIZZA: 
        case Food::CAKE: 
            eat(food); 
            break; 
        case Food::KALE: 
        case Food::CELERY: 
            throwOut(food);         
            break; 
    } 
}

While this example is perfectly sound, implicit fallthrough between cases can very easily be a bug. Say, for instance, that the programmer of the above function had forgotten the break statement after the call to eat(food). The code would run, but the behavior would be completely incorrect. With a larger and more complex codebase, tracking this type of bug can be difficult. 

Fortunately, with C++17 comes the addition of the [[fallthrough]] annotation, whose purpose is to mark fallthrough between case labels, such as in the example above, so that maintainers of the code can be confident thfallthrough behavior is intended.  

With Visual Studio 2019 version 16.7, warning C26819 is raised whenever a non-empty switch case falls through into a following case without marking the fallthrough using the [[fallthrough]] annotation. Detailed documentation can be found here. This rule is enabled by default in Visual Studio when you run code analysis. 

void takeFromFridge(Food food) { 
    switch (food) { 
        case Food::BANANA: // empty case, fallthrough annotation not needed 
        case Food::ORANGE: 
            peel(food);    // warning C26819: Unannotated fallthrough between switch labels (es.78) 
        case Food::PIZZA:  // empty case, fallthrough annotation not needed 
        case Food::CAKE: 
            eat(food); 
            break; 
        case Food::KALE:  // empty case, fallthrough annotation not needed 
        case Food::CELERY: 
            throwOut(food);         
            break; 
    } 
}

To fix this warning, insert a [[fallthrough]] statement.

void takeFromFridge(Food food) { 
    switch (food) { 
        case Food::BANANA: 
        case Food::ORANGE: 
            peel(food); 
            [[fallthrough]]; // the fallthrough is intended 
        case Food::PIZZA: 
        case Food::CAKE: 
            eat(food); 
            break; 
        case Food::KALE: 
        case Food::CELERY: 
            throwOut(food);         
            break; 
    } 
}

Expensive range-for copy 

A major difference between Rust and C++ is that Rust is move-by-default rather than copy-by-default. 

Some Rust code:

struct MySequence {
    sequence: Vec<i32>
}

fn main() {
    let mut a = MySequence { sequence: vec![0, 1, 1, 2, 3, 5, 8, 13] };
    let mut _b = a;
    a.sequence.push(21); // error! `a` was moved into `b` and can no longer be used
}

This means that explicit copy semantics must be used in most cases whenever a copy is intended. 

#[derive(Clone)]
struct MySequence {
    sequence: Vec<i32>
}

fn main() {
    let mut a = MySequence { sequence: vec![0, 1, 1, 2, 3, 5, 8, 13] };
    let mut _b = a.clone();
    a.sequence.push(21); // much better
}

C++, on the other hand, is copy-by-default. This is not a problem in general but can sometimes be a source of bugs. One case where this commonly occurs is within range-for statements. Take for example the following piece of code. 

struct Person { 
    std::string first_name; 
    std::string last_name; 
    std::string email_address; 
}; 

void emailEveryoneInCompany(const std::vector<Person>& employees) { 
    Email email; 
    for (Person p: employees) { // copy of type `Person` occurs on each iteration 
        email.addRecipient(p.email_address); 
    } 
    email.addBody("Sorry for the spam"); 
    email.send(); 
}

In the above snippet, each element of the vector is copied into p on each iteration of the loop. This is not obvious and can be a significant source of inefficiency if the copy is expensive. To remedy this unnecessary copy, we added a new C++ Core Check rule, suggesting a way to remove the copy. 

void emailEveryoneInCompany(const std::vector<Person>& employees) { 
    Email email; 
    for (Person p: employees) { // Warning C26817: Potentially expensive copy of variable 'p' in range-for loop. Consider making it a const reference (es.71) 
        email.addRecipient(p.email_address); 
    } 
    email.addBody("Sorry for the spam"); 
    email.send(); 
}

By using the suggestion from the warning and changing the type of the variable p in the loop from a Person to a const Person&, the variable no longer receives an expensive copy of the data on each iteration. 

void emailEveryoneInCompany(const std::vector<Person>& employees) { 
    Email email; 
    for (const Person& p: employees) { // expensive copy no longer occurs 
        email.addRecipient(p.email_address); 
    } 
    email.addBody("Sorry for the spam"); 
    email.send(); 
}

To decide what constitutes an “expensive” copy, the following heuristic is used by the check: 

If the size of the type is greater than twice the platform-dependent pointer size and the type is not a smart pointer or one of gsl::span, gsl::string_span, or std::string_view, then the copy is considered expensive. This means that for small datatypes such as numeric scalars, the warning will not trigger. For larger types, such as the Person type in the example above, the copy is considered expensive and a warning will be raised. 

One final point to note is that the check will not fire if the variable is mutated in the loop body. 

struct Person { 
    std::string first_name; 
    std::string last_name; 
    int hourlyrate; // dollars per hour 
}; 

void giveEveryoneARaise(const std::vector<Person>& employees) { 
    for (Person p: employees) { 
        p.hourlyrate += 10; // `p` can no longer be marked `const Person&`, so the copy is unavoidable 
    } 
}

If instead the container was not const-qualified, then the copy could be avoided by changing the type Person to Person&. 

void giveEveryoneARaise() { 
    std::vector<Person> employees = getEmployees(); 
    for (Person& p: employees) { // no more expensive copying, but any subsequent mutation will change the container! 
        p.hourlyrate += 10; 
    } 
}

But this change comes with the introduction of new side-effects to the code. Therefore, the range-for copy warning only ever suggests marking the loop variable as const T&, and will not fire if the loop-variable cannot legally be marked const. 

Full documentation of the check can be found hereThis rule is enabled by default in Visual Studio when you run code analysis. 

Expensive copy with the auto keyword 

The last new check in this release concerns itself with expensive copies occurring with the use of the auto type. 

Consider the following Rust example in which type resolution occurs for a variable being assigned a reference. 

struct PasswordManager { 
    password: String 
} 

impl PasswordManager { 
    // member-access function returning an immutable reference to a member 
    fn getPassword(&self) -> &String { &self.password } 
    // Note: for the sake of an example dealing with expensive types, a &String is being returned. 
    // More realistically though, a string slice would be returned instead (similar to a string view in C++) 
} 

fn stealPassword(pm: &PasswordManager) { 
    let password = pm.getPassword(); // the type of `a` resolves to `&String`. No copy occurs. 
}

Because of Rust’s requirement that in the majority of cases copying must be explicit, the type of password in the example automatically resolves to an immutable reference when being assigned an immutable reference, and no expensive copying is performed. 

On the other hand, consider the equivalent C++ code. 

class PasswordManager { 
    std::string password; 
public: 
    const std::string& getPassword() const { return password; }  
}; 

void stealPassword(const PasswordManager& pm) {
    auto password = pm.getPassword(); // the type of `password` resolves to `std::string`. Copy occurs.
}

Here, the type of password resolves to std::string, even though the return type of getPassword() is a const-reference to a string. The resulting behavior is that the contents of PasswordManager::password get copied into the local variable password.

Compare this with a function returning a pointer: 

class PasswordManager {
    std::string password;
public:
    const std::string* getPassword() const { return &password; }
};

void stealPassword(const PasswordManager& pm) {
    auto password = pm.getPassword(); // the type of `password` resolves to `const std::string*`. No copy occurs.
}

This difference in behavior between assigning a reference and a pointer to a variable marked auto is non-obvious, resulting in potentially unwanted and unexpected copying. 

To guard against bugs arising from this behavior, the checker examines all instances of initialization from a reference to a variable marked auto. If the resulting copy is deemed expensive using the same heuristics as in the range-for check, the checker warns to mark the variable const auto& instead.

class PasswordManager {
    std::string password;
public:
    const std::string& getPassword() const { return password; }
};

void stealPassword(const PasswordManager& pm) {
    auto password = pm.getPassword();  // Warning C26820: Assigning by value when a const-reference would suffice, use const auto&amp; instead (p.9)
}

And as with the range-for check, this warning does not get raised whenever the variable cannot be legally marked const. 

std::string hashPassword(const PasswordManager& pm) {
    auto password = pm.getPassword(); // warning no longer gets raised because `password` is modified below
    password += "salt";
    return std::hash(password);
}

Another instance in which the warning does not get raised is whenever the reference is being derived from a temporary. In such cases, using const auto& would result in a dangling reference once the temporary gets destroyed. 

class PasswordManager {
    std::string password;
public:
    PasswordManager(const std::string& password): password(password) {}
    const std::string& getPassword() const { return password; }
};

void stealPassword() {
    const auto& password = PasswordManager("123").getPassword(); // using `const auto&` instead of just `auto`
    use_password(password); // `password` is now referencing invalid memory!
}

Full documentation of the check can be found hereThis rule is enabled by default in Visual Studio when you run code analysis.

Give us your feedback

Check out these newly added rules and the recently released GSL 3.0 library and let us know if they help you write safer C++. Stay tuned as we add more safety rules in future releases of Visual Studio.

Download Visual Studio 2019 version 16.7 today and give it a try. We would love to hear from you to help us prioritize and build the right features for you. We can be reached via the comments below, Developer Community, and Twitter (@VisualC). The best way to file a bug or suggest a feature is via Developer Community. 

6 comments

Comments are closed. Login to edit/delete your existing comments

  • Avatar
    Paulo Pinto

    Very nice improvements, everything that helps reducing the gap is welcomed.

    As for general feedback, the lifetime analyser still has too many false positives and we haven’t heard much more from it for quite sometime now.

    • Avatar
      Sunny ChatterjeeMicrosoft employee

      Paulo, thanks for your feedback. We are aware of the accuracy concerns in the lifetime analyzer and explicitly marked the lifetime rules as experimental for 16.7 release of Visual Studio. We plan to announce further improvements in the analyzer later this year. Please stay tuned and keep the feedback coming.

  • Avatar
    Mohammed El-Afifi

    In the very last example where const auto& is used to point to a temporary object inside a function(local parameter), shouldn’t const auto& prolong the temporary object lifetime till the enclosing function(stealPassword) scope exits, as per §8.5.3/5, [dcl.init.ref] in the standard? See https://stackoverflow.com/a/2784304/493119. Thus password should refer to a valid object throughout the body of the function stealPassword.

    • Avatar
      Sunny ChatterjeeMicrosoft employee

      Mohammed, it’s undefined behavior because PasswordManager(“123”) returns a temporary but the code isn’t binding that value to anything, only binding the return value of PasswordManager(“123”).getPassword(). So, in effect, the compiler is immediately throwing away the return value of PasswordManager(“123”) but then using a reference to its data member. Extending the lifetime of a temporary using `const auto &` or `auto &&` doesn’t recursively extend the lifetime of the parent’s value. Here is an example to illustrate my point:

      #include <iosteam>
      #include <string>
      
      struct Password
      {
          Password(const std::string& password) : password(password) {}
          std::string password;
          ~Password() {
              std::cout << "Password object is being destroyed." << std::endl;
          }
      };
      
      class PasswordManager {
          Password password_;
      public:
          PasswordManager(const std::string& password) : password_(password) {}
          const Password& getPassword() const { return password_; }
      };
      
      void use_password(const Password& password)
      {
          std::cout << "Using stole password: " << password.password << std::endl;
      }
      
      void stealPassword() {
          auto password = PasswordManager("123").getPassword(); // using `const auto&` instead of just `auto`
          use_password(password); // `password` is now referencing invalid memory!
      }
      
      int main()
      {
          std::cout << "Hello World!\n";
          stealPassword();
      }

      If you run this program, you get the following output:

      Hello World!
      Password object is being destroyed.
      Using stole password:

      Hope this helps!