September 11th, 2017

Two-phase name lookup support comes to MSVC


This post was written by Tanveer Gani, Stephan T. Lavavej, Andrew Marino, Gabriel Dos Reis, and Andrew Pardoe

“Two-phase name lookup” is an informal term that refers to a set of rules governing the resolution of names used in a template declaration. These rules were formalized more than two decades ago in an attempt to reconcile two opposing compilation models for templates: the inclusion model (what most developers know of templates today), and the separation model (the basis of the original design for templates). You can find the genesis of dependent names in the foundational paper Proposed Revisions to the Template Specification, firmly grounded in the One Definition Rule principle. If you’re interested in diving into the glorious details, you can find these rules in modern terms in section 17.6 (stable name [temp.res]) of the C++17 draft standard. In the last few months the MSVC compiler has gone from having no support for two-phase name lookup to being usable on most code. We’ll complete support for this feature in a future Visual Studio 2017 update.

You’ll need to use the /permissive- conformance switch to enable two-phase lookup in the MSVC compiler included with Visual Studio 2017 “15.3”. Two-phase name lookup drastically changes the meaning of some code so the feature is not enabled by default in the current version of MSVC.

This post examines exactly what two-phase name lookup entails, what’s currently implemented in MSVC, and how to make effective use of MSVC’s partial but substantial support for two-phase name lookup. We’ll also tell you how to opt-out of two-phase lookup, even if you want the rest of your code to strictly conform to the Standard. Lastly, we’ll explain a bit about why it took us so long to get here—these rules are at least 25 years old!

What is “two-phase name lookup”?

The original design of templates for C++ was meant to do exactly what the term “template” implied: a template would stamp out families of classes and functions. It allowed and encouraged, but did not require, early checking of non-dependent names. Consequently, identifiers didn’t need to be looked up during parsing of the template definition. Instead, compilers were allowed to delay name lookup until the template was instantiated. Similarly, the syntax of a template didn’t need to be validated until instantiation. Essentially, the meaning of a name used in a template was not determined until the template was instantiated.

In accordance with these original rules, previous versions of MSVC did very limited template parsing. In particular, function template bodies were not parsed at all until instantiation. The compiler recorded the body of a template as a stream of tokens that was replayed when needed during instantiation of a template where it might be a candidate.

Let’s consider what this means by looking at a piece of code. Links are provided to online compilers so you can play with the code as you read through this post.

#include <cstdio>

void func(void*) { std::puts("The call resolves to void*"); }

template<typename T> void g(T x)
{
    func(0);
}

void func(int) { std::puts("The call resolves to int"); }

int main() 
{
    g(3.14);
}

To which of these overloads does the call on line 7 resolve? The void* overload was already declared at the point the template was written on line 5. The function void func(int) didn’t exist when the template was written. Therefore, the call on line 14 to the function template void g(T x) on line 5 should resolve to function void func(void*) on line 3.

When compiled with a compiler that conforms to the standard, this program prints “The call resolves to void*”. You can see this behavior in GCC using the Rextester online compiler. With MSVC from Visual Studio 2015, which lacks support for two-phase name lookup, the program prints “The call resolves to int”.

Why did MSVC get this wrong? The mechanics we used to parse templates worked when templates were simple, but limited what the compiler could do when two-phase name lookup came into play. MSVC previously recorded the body of the template as a stream of tokens and stored that stream away to be replayed at instantiation time. The behavior of MSVC’s template substitution from a recorded token stream somewhat resembled the behavior of macro substitution in that limited analysis was done of a template’s body.

In this example, MSVC stored a token stream for the function template void g(T x). If the compiler had analyzed the function call at the point where it was encountered, only the declaration for void func(void*) would have been in the overload set. (Note that it is a valid match for the call func(0) because C++ allows 0 to represent a null pointer constant that can be converted to any pointer type.)

The function overload void func(int) would also be a match for the call func(0) except that it should not be in the overload set at the point the function template void g(T x) was evaluated. But MSVC didn’t evaluate the body of the template until the point of instantiation—after the declaration for void func(int) had been added to the overload set. At that point, the compiler picked the better match for an integer argument: int rather than void*.

You can see both of the compilers in action in this code sample on the online Compiler Explorer. GCC refuses to compile the code sample when line 3 is commented out, whereas MSVC happily matches a function that wasn’t even declared at the point the template was written. It would be recognized as illegal code if it were not a template, but our broken template substitution mechanics allowed the compiler to accept this code.
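For comparison, here is a minimal sketch of the same situation without a template. The names pick and call_from_ordinary_function are made up for illustration and do not appear in the samples above. An ordinary function body is bound against the declarations visible where it is written, so the later overload can never be chosen. Conforming compilers apply the same rule to non-dependent names inside a template.

```cpp
// Hypothetical names; a sketch of how ordinary (non-template) code binds names.
enum class Overload { VoidPtr, Int };

constexpr Overload pick(void*) { return Overload::VoidPtr; }

// Only pick(void*) has been declared at this point, so the call binds to it.
constexpr Overload call_from_ordinary_function() { return pick(0); }

// Declared too late to affect call_from_ordinary_function().
constexpr Overload pick(int) { return Overload::Int; }

static_assert(call_from_ordinary_function() == Overload::VoidPtr,
              "the later overload is not visible inside the earlier function");
```

Deleting the pick(void*) declaration makes the call ill-formed, which mirrors what GCC reports for the template version.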

The C++ standards committee realized that code written in templates should not be subtly affected by the surrounding context, while also upholding the ODR. They introduced the notion of dependent and non-dependent names in the rules for name-binding in templates because it would be surprising behavior to have the function written on line 10 change the meaning of the code above it.

The rules in the [temp.res] section of the standard list three kinds of names:

  1. The name of the template and names declared in the template
  2. Names that depend upon a template’s parameter
  3. Names from scopes that are visible inside the template’s definition

The first and third categories are non-dependent names. They’re bound at the point of the template’s definition and stay bound in every instantiation of that template. They are never looked up when a template is instantiated. (See §17.6 [temp.res]/10 and §17.6.3 [temp.nondep] in the Draft Standard for details.)

The second category consists of dependent names. Dependent names are not bound at the point of the template’s definition. Instead, these names are looked up when the template is instantiated. For function calls with a dependent function name, the name is bound to the set of functions that are visible at the point of the call in the template’s definition. Additional overloads from argument-dependent lookup are added at both the point of the template definition and the point where the template is instantiated. (See §17.6.2 [temp.dep], §17.6.4 [temp.dep.res], and §17.6.4.2 [temp.dep.candidate] in the Draft Standard for details.)

It’s important to note that overloads declared after the point of the template’s definition but before the point of the template’s instantiation are only considered if they are found through argument-dependent lookup. MSVC previously didn’t do argument-dependent lookup separately from ordinary, unqualified lookup so this change in behavior may be surprising.

Consider this code sample, which is also available on the Wandbox online compiler:

#include <cstdio> 

void func(long) { std::puts("func(long)"); }

template <typename T> void meow(T t) {
    func(t);
}

void func(int) { std::puts("func(int)"); }

namespace Kitty {
    struct Peppermint {};
    void func(Peppermint) { std::puts("Kitty::func(Kitty::Peppermint)"); }
}

int main() {
    meow(1729);
    Kitty::Peppermint pepper;
    meow(pepper);
}

The call meow(1729) resolves to the void func(long) overload, not the void func(int) overload, because the unqualified func(int) is declared after the definition of the template and not found through argument-dependent lookup. But void func(Peppermint) does participate in argument-dependent lookup, so it is added to the overload set for the call meow(pepper).

From the above examples, you can see that the two phases of “two-phase lookup” are the lookup for non-dependent names at the time of template definition and lookup for dependent names at the time of template instantiation.
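The two phases can be put side by side in one translation unit. The following is only a sketch: describe, first_phase, second_phase, and the shop namespace are hypothetical names chosen for illustration, and the functions are constexpr so that a conforming compiler checks the outcome at compile time.

```cpp
enum class Picked { Long, Int, Widget };

constexpr Picked describe(long) { return Picked::Long; }

// Non-dependent call: describe(1) does not involve T, so it is bound here,
// at the template's definition, where only describe(long) is visible.
template <typename T>
constexpr Picked first_phase(T) { return describe(1); }

// Dependent call: describe(t) is resolved at instantiation. Ordinary lookup
// still sees only describe(long); argument-dependent lookup may add more.
template <typename T>
constexpr Picked second_phase(T t) { return describe(t); }

// Declared after the templates. It is never found for int arguments because
// int has no associated namespace for argument-dependent lookup.
constexpr Picked describe(int) { return Picked::Int; }

namespace shop {
    struct Widget {};
    // Found through argument-dependent lookup when T is shop::Widget.
    constexpr Picked describe(Widget) { return Picked::Widget; }
}

static_assert(first_phase(1) == Picked::Long, "bound at definition");
static_assert(second_phase(1) == Picked::Long, "no ADL for int");
static_assert(second_phase(shop::Widget{}) == Picked::Widget, "found via ADL");
```

A compiler that defers all lookup to instantiation time would be expected to fail the first two assertions, because it would see describe(int) as well.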

MSVC behavior prior to Visual Studio 2017 “15.3”

Historically, when a template was encountered, the MSVC compiler took the following steps:

  • When parsing a class template, MSVC previously parsed only the template declaration, the class head, and the base class list. The template body was captured as a token stream. No function bodies, initializers, default arguments, or noexcept arguments were parsed. The class template was pseudo-instantiated on a “tentative” type to validate that the declarations in the class template were correct. Take for example this class template: template <typename T> class Derived : public Base<T> { ... }. The template declaration, template <typename T>, the class head, class Derived, and the base-class list, public Base<T>, are parsed but the template body, { ... }, is captured as a token stream.
  • When parsing a function template, MSVC previously parsed only the function signature. The function body was never parsed—it was captured as a token stream. Consequently, if the template body had syntax errors and the template was never instantiated the errors were never diagnosed.

An example of how this behavior caused incorrect parsing can be seen with how MSVC did not require the keywords template and typename everywhere the C++ Standard requires them. These keywords are needed in some positions to disambiguate how compilers should parse a dependent name during the first phase of lookup. For example, consider this line of code:

T::Foo<a || b>(c); 

Is this code a call to a function template with an argument of a || b? Or is this a logical-or expression with T::Foo < a as the left operand and b > (c) as the right operand?

A conforming compiler will parse Foo as a variable in the scope of T, meaning this code is an or operation between two comparisons. If you meant to use Foo as a function template, you must indicate that this is a template by adding the template keyword, e.g.,

T::template Foo<a || b>(c); 

Prior to Visual Studio 2017 “15.3”, MSVC allowed this code without the template keyword because it parsed templates in a very limited fashion. The code above would not have been parsed at all in the first phase. During the second phase there’s enough context to tell that T::Foo is a template rather than a variable so MSVC did not enforce use of the keyword.
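To make the disambiguation concrete, here is a small self-contained sketch. The type Flip, the constants a and b, and the function call_foo are hypothetical and exist only for this illustration; Foo is modeled as a static member function template so that the template keyword is required.

```cpp
// Hypothetical type for illustration: Foo is a static member function template.
struct Flip {
    template <bool Keep>
    static constexpr int Foo(int c) { return Keep ? c : -c; }
};

constexpr int a = 0;
constexpr int b = 3;

template <typename T>
constexpr int call_foo(int c) {
    // `template` tells the parser that `<` opens a template argument list.
    // Without it, a conforming compiler reads the expression as
    // (T::Foo < a) || (b > (c)).
    return T::template Foo<a || b>(c);
}

// a || b is true, so Foo<true> returns its argument unchanged.
static_assert(call_foo<Flip>(7) == 7, "Foo<true>(7) should return 7");
```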

This behavior can also be seen by eliminating the keyword typename before names in function template bodies, initializers, default arguments, and noexcept arguments. Consider this code:

template<typename T>
typename T::TYPE func(typename T::TYPE*)
{
    typename T::TYPE i;
}

If you remove the keyword typename in the function body on line 4, MSVC would previously still have compiled this code whereas a conforming compiler would reject the code. You need the typename keyword to indicate that the dependent name T::TYPE is a type. Because MSVC previously didn’t parse the body it didn’t require the keyword. You can see this example in the online Compiler Explorer. Compiling such code under the MSVC conformance mode (/permissive-) will result in errors, so as you move forward to MSVC versions 19.11 and beyond, make sure to look for places like this where the typename keyword is missing.

Similarly, in this code sample:

template<typename T>
typename T::template X<T>::TYPE func(typename T::TYPE)
{
    typename T::template X<T>::TYPE i;
}

MSVC previously only required the template keyword on line 2. A conforming compiler requires the template keyword on line 4 as well to indicate that T::X<T> is a template. Uncomment the keyword in this example on the Compiler Explorer to see the error in action. Again, keep this missing keyword in mind as you move your code forward.
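The two fragments above are not complete programs because no type supplies TYPE or X. A minimal sketch that fills in the gap might look like the following, where Traits and make_null are hypothetical names introduced only for this illustration:

```cpp
#include <type_traits>

// Hypothetical traits type supplying the members the template expects.
struct Traits {
    using TYPE = int;
    template <typename U>
    struct X { using TYPE = U*; };
};

template <typename T>
typename T::template X<T>::TYPE make_null() {
    typename T::TYPE count = 0;                      // typename: T::TYPE is a type
    typename T::template X<T>::TYPE ptr = nullptr;   // template: X is a member template
    (void)count;                                     // silence unused-variable warnings
    return ptr;
}

// Explicit instantiation forces the body to be compiled for Traits.
template Traits* make_null<Traits>();

static_assert(std::is_same<decltype(make_null<Traits>()), Traits*>::value,
              "X<Traits>::TYPE is Traits*");
```

Removing either keyword inside the function body should produce an error on a conforming compiler, which makes this a convenient file for checking how a given toolset and switch combination behaves.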

Two-phase name lookup in Visual Studio 2017 “15.3”

We introduced a “conformance mode” switch with Visual Studio 2017. In the v141 compiler toolset released with VS2017 you can use the /permissive- switch to turn on this conformance mode. (In the next major compiler revision, conformance mode will be on by default. At that point you’ll be able to use the /permissive switch, without the trailing -, to request the non-conforming mode, much like the -fpermissive switch in other compilers.) One of the big features missing when we introduced the /permissive- switch was two-phase name lookup, which has now been partially implemented in the compiler that ships with VS2017 “15.3”.

There are a few missing parts to our two-phase name lookup support—see the section “What’s coming next” below for details. But the MSVC compiler now parses correctly and enforces syntax rules strictly for:

  • Class templates
  • Bodies of function templates and member functions of class templates
  • initializers, including member initializers
  • default arguments
  • noexcept arguments

Additionally, the MSVC implementation of the STL is fully two-phase clean (validated by /permissive- in MSVC as well as Clang’s -fno-ms-compatibility -fno-delayed-template-parsing). We’ve recently gotten ATL to be two-phase clean; if you find any lingering bugs please be sure to let us know!

But what do you do for your legacy code that may rely on the old, incorrect MSVC behavior? You can still use /permissive- for the rest of the conformance improvements even if your code isn’t quite yet ready to have template bodies parsed and dependent names bound correctly. Just throw the /Zc:twoPhase- switch to turn off template parsing and dependent name binding. Using this switch will cause the MSVC compiler to use the old behavior with non-standard semantics, giving you a chance to fix your code to compile correctly with a conforming MSVC compiler.
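When a large code base is built with a mix of switches, it can help to confirm which behavior a given translation unit actually gets. One possible approach, sketched here with hypothetical names (lookup_probe, chosen, two_phase_enabled), reuses the overload trick from the first sample in this post. It assumes that the old behavior picks the later int overload, as described for that sample; it has not been checked against every MSVC version or switch combination.

```cpp
namespace lookup_probe {
    // Visible where the template is defined.
    constexpr bool chosen(void*) { return true; }

    // chosen(0) is a non-dependent call. With two-phase lookup it is bound
    // here, to chosen(void*).
    template <typename T>
    constexpr bool two_phase_enabled(T) { return chosen(0); }

    // Visible only by the time the template is instantiated. A compiler that
    // defers all lookup to instantiation would prefer this exact match.
    constexpr bool chosen(int) { return false; }
}

// Fails to compile if the later overload was picked.
static_assert(lookup_probe::two_phase_enabled(0),
              "this translation unit is not using two-phase name lookup");
```

Dropping the static_assert and testing the result in ordinary code instead lets the same probe report the mode without stopping the build.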

If you are using the Windows RedStone 2 (“Creators Update”) SDK with the /permissive- switch, you’ll need to temporarily disable two-phase name lookup by using the /Zc:twoPhase- switch until the Windows RedStone 3 (“Fall Creators Update”) SDK is available. This is because the Windows team has been working with the MSVC team to make the SDK headers work properly with two-phase name lookup. Their changes will not be available until the RedStone 3 Windows SDK is released, nor will the changes for two-phase name lookup be ported back to the RedStone 2 Windows SDK.

What’s coming next

MSVC’s support for two-phase name lookup is a work in progress. Here’s a list of what’s left to come in future updates to MSVC in Visual Studio 2017. Remember that you need to use the /permissive- switch with these examples to enable two-phase lookup.

  1. Undeclared identifiers in templates are not diagnosed. E.g.
    template<class T>
    void f()
    {
        i = 1; // Missing error: `i` not declared in this scope
    }
    

    MSVC does not emit an error that `i` is not declared and the code compiles successfully. Adding an instantiation of f causes the correct errors to be generated:

    template<class T>
    void f()
    {
        i = 1; // Error C2065 is reported once f<int> is instantiated
    }
    
    void instantiate()
    {
        f<int>();
    }
    
    C:\tmp> cl /c /permissive- /diagnostics:caret one.cpp
    Microsoft (R) C/C++ Optimizing Compiler Version 19.11.25618 for x64
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    one.cpp
    c:\tmp\one.cpp(4,5): error C2065: 'i': undeclared identifier
        i = 1;
        ^
    c:\tmp\one.cpp(9): note: see reference to function template instantiation 'void f<int>(void)' being compiled
        f<int>();
    
  2. The MSVC compiler with VS 2017 “15.3” will generate an error for missing template and typename keywords but will not suggest adding these keywords. Newer compiler builds give more informative diagnostics.
    template <class T>
    void f() {
       T::Foo<int>();
    }
    

    The MSVC compiler shipped with VS 2017 “15.3” gives this error:

    C:\tmp>cl /c /permissive- /diagnostics:caret two.cpp
    Microsoft (R) C/C++ Optimizing Compiler Version 19.11.25506 for x64
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    two.cpp
    two.cpp(3,16): error C2187: syntax error: ')' was unexpected here
       T::Foo<int>();
                   ^
    

    Builds of the compiler that will ship with future updates of VS 2017 give a more informative error:

    C:\tmp>cl /c /permissive- /diagnostics:caret two.cpp
    Microsoft (R) C/C++ Optimizing Compiler Version 19.11.25618 for x64
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    two.cpp
    two.cpp(3,7): error C7510: 'Foo': use of dependent template name must be prefixed with 'template'
       T::Foo<int>();
          ^
    two.cpp(3,4): error C2760: syntax error: unexpected token 'identifier', expected 'id-expression'
       T::Foo<int>();
       ^
    
  3. The compiler is not properly looking up functions during argument-dependent lookup. This can result in the wrong function being called at runtime.
    #include <cstdio>
    
    namespace N
    {
        struct X {};
        struct Y : X {};
        void f(X&) 
        { 
            std::puts("X&"); 
        }
    }
    
    template<typename T>
    void g()
    {
        N::Y y;
        f(y); // This is non-dependent but it is not found during argument-dependent lookup so it is left unbound.
    }
    
    void f(N::Y&)
    {
        std::puts("Y&");
    }
    
    int main()
    {
        g<int>();
    }
    

    The output from running the program above is Y& when it should be X&.

    C:\tmp>cl /permissive- /diagnostics:caret three.cpp
    Microsoft (R) C/C++ Optimizing Compiler Version 19.11.25506 for x64
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    three.cpp
    Microsoft (R) Incremental Linker Version 14.11.25506.0
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    /out:three.exe
    three.obj
    
    C:\tmp>three
    Y&
    
  4. Non-type dependent expressions involving local declarations aren’t analyzed correctly. The MSVC compiler currently parses the type as dependent causing an incorrect error.
    template<int> struct X 
    { 
        using TYPE = int; 
    };
    
    template<typename>
    void f()
    {
        constexpr int i = 0;
        X<i>::TYPE j;
    }
    

    A syntax error is issued because i is not recognized as a non-value-dependent expression, even though the expression on line 9 is not type-dependent. The compiler therefore treats X<i> as a dependent type and demands the typename keyword.

    C:\tmp>cl /c /permissive- /diagnostics:caret four.cpp
    Microsoft (R) C/C++ Optimizing Compiler Version 19.11.25618 for x64
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    four.cpp
    four.cpp(10,16): error C2760: syntax error: unexpected token 'identifier', expected ';'
        X<i>::TYPE j;
                   ^
    four.cpp(10,5): error C7510: 'TYPE': use of dependent type name must be prefixed with 'typename'
        X<i>::TYPE j;
        ^
    
  5. Neither redeclaration of template parameters nor redefinition of function template parameters as local names is reported as an error.
    template<class T>
    void f(int i)
    {
        double T = 0.0; // Missing error: Declaration of `T` shadows template parameter
        float i = 0;    // Missing error: Redefinition of `i` with a different type
    }
    
  6. The MSVC compiler misidentifies the current instantiation in some cases. Using the keyword typename is legal and helps the compiler to correctly identify the current instantiation.
    template<class T> struct A {
        typedef int TYPE;
        A::TYPE c1 = 0;    // Incorrectly fails to compile
        A<T>::TYPE c2 = 0; // Incorrectly fails to compile
    };
    

    Adding the keyword typename before each instance of A allows this code to compile:

    template<class T> 
    struct A 
    {
        typedef int TYPE;
        typename A::TYPE c1 = 0;
        typename A<T>::TYPE c2 = 0;
    };
    
  7. Undeclared default arguments are not diagnosed. This example demonstrates a case where the MSVC compiler is still doing one-phase lookup. It is using the declaration of SIZE found after the template declaration as if it were declared before the template.
    template<int N = SIZE> // Missing diagnostic: Use of undeclared identifier `SIZE`
    struct X
    {
        int a[N];
    };
    
    constexpr int SIZE = 42;
    
    X<> x;
    

All of the above issues are planned to be fixed in the next major update of MSVC in Visual Studio 2017.
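Until item 7 is fixed, code that needs to build on other compilers as well can avoid depending on the one-phase behavior by declaring the name before the template that uses it, which is what a conforming compiler requires. A minimal sketch, reusing the names from that item:

```cpp
constexpr int SIZE = 42;   // declared before the template that names it

template <int N = SIZE>
struct X {
    int a[N];
};

X<> x;   // uses the default argument, so x.a has SIZE elements

static_assert(sizeof(x.a) / sizeof(x.a[0]) == 42, "default argument is SIZE");
```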

Why did it take so long?

Other compilers have had two-phase name lookup implemented for quite some time. Why is MSVC just now getting it right?

Implementing two-phase name lookup required fundamental changes in MSVC’s architecture. The biggest change was to write a new recursive descent parser to replace the YACC-based parser that we’ve used for over 35 years.

We decided early on to follow an incremental path rather than rewriting the compiler from scratch. Evolving the aged MSVC code base into a more modern code base instead of “going dark” on a big rewrite allowed us to make huge changes without introducing subtle bugs and breaking changes when compiling your existing code. Our “compiler rejuvenation” work required carefully bridging the old code and the new code, making sure at every step that large test suites of existing code continued to compile exactly the same (except where we intentionally made a change to introduce conforming behavior). It took a bit longer to do the work in this fashion but that allowed us to deliver incremental value to developers. And we have been able to make major changes without unexpectedly breaking your existing code.

In closing

We’re excited to finally have support for two-phase name lookup in MSVC. We know that the compiler still won’t compile some template code correctly—if you find a case not mentioned in this post, please reach out to us so that we can fix the bug!

All of the code samples in this post now compile (or fail to compile, when appropriate) correctly according to the Standard. You’ll see this new behavior with Visual Studio 2017 “15.3”, or you can try it out right now using a daily build of the MSVC compiler.

Now is a good time to start using the /permissive- switch to move your code forward. When you run into template parsing errors, remember that adding the template and typename keywords that MSVC did not previously require (see above) might fix the error.

If you have any feedback or suggestions for us, let us know. We can be reached via the comments below or via email (visualcpp@microsoft.com), and you can provide feedback via Help > Report A Problem in the product or via Developer Community. You can also find us on Twitter (@VisualC) and Facebook (msftvisualcpp).
