Hi – Jonathan Caves again. Over the last couple of weeks I’ve seen some reports from users that the C++ compiler does not act the way they think it should. As it turns these reports weren’t real bugs, but the issues brought up are interesting enough to share with a wider audience.
The first was from a customer who reported that the compiler was calling the wrong function. The problem code can be reduced to the following:
class string {
public:
string(const char*);
};
void f(string, string, bool = false);
void f(string, bool = false);
void g()
{
f(“Hello”, “Goodbye”);
}
The user’s observation was that the compiler should call the first function:
void f(string, string, bool = false);
but in reality it was calling the second function:
void f(string, bool = false);
and they thought that this was a bug. At first glance it does appear that the user is correct – but looks can be deceiving. Just because the string class has a converting-constructor from a string-literal doesn’t mean that the compiler has to use it. For the first argument to the function call, the conversion is straight forward – both of the candidate functions expect a string and so the compiler will use the provided converting constructor to convert the string-literal to an instance of the string class. The second argument is not so straight forward. For the first function the compiler can again use the converting-constructor, but for the second function it can use the standard pointer-to-bool conversion to convert the string-literal (which the compiler will consider as type “const char*”) to bool. As this is a standard conversion, the C++ Standard considers this conversion to be cheaper than calling the converting-constructor (which is a user-defined conversion) and hence the second function is a better match than the first function and the compiler, correctly, calls that function.
Note: the real issue here is with the use of default-arguments. Without default-arguments the user would not be left to the mercy of the C++ conversion rules. If they wanted to call the three parameter version then they would need to provide three arguments; if they want to call the two parameter version then they need to provide two arguments. At first glance, default-arguments seem to be a great C++ language feature but I have seen them cause users no end of problems. I’ve even seen users doing stuff like the following:
SomeFunction(arg1, arg2 /*, arg3 = false, arg4 = true */);
They do this just so it is clear to readers of the code that default arguments are being used. If you are going to go this far then just get rid of the default arguments (and the comments). Believe me that it will make your life much easier and your code more maintainable!
The second issue was around code that compiled but when the user applied what they thought was a minor edit the code no longer compiled. The code could be reduced to the following:
namespace std {
template<typename T>
class list {
public:
size_t size() const;
};
}
class X : std::list<int> {
public:
size_t mf1() const { return list::size(); }
size_t mf2() const { return std::list::size(); }
};
The problem was that while mf1 compiles fine, mf2 generates the following error message:
a.cpp(12) : error C2955: ‘std::list’ : use of class template requires template argument list
a.cpp(3) : see declaration of ‘std::list’
The user’s question was why? Surely if the first function compiles then the second function should also compile because all the user has done is to make it clearer to the compiler what was going on. But the problem was that they have been too specific. It all comes down to something in C++ called the “injected-class-name”.
In C++, each class has a member that is added – injected – by the compiler and this member has the same name as the class. (Note: don’t confuse this with a constructor which only looks as if it has the same name as the class. In reality constructors have no-name or at least a name that cannot be written in C++ code.) This member is needed in order that there can be rules for defining a constructor outside of a class. Without this injected-member the C++ Standard would have to revert to hand-waving – something writers of Standards really hate. One further twist is that in the case that the class is a specialization of a class template, there are two versions of the injected-class-name: one is the name of the class without the template-arguments, list in the example above, and the other is the name of the class with the template-arguments, list<int> in the example above.
So in the first example when the compiler sees the identifier, list, it does normal name-lookup. It first looks up ‘list’ in the current class scope and finds nothing. It then looks up ‘list’ in the scope of the base class where it finds the injected-class-name and, eventually, works out that by ‘list’ the user means ‘std::list<int>’ which is a base-class of the current class. So the compiler treats the code as-if the user had written:
size_t mf1() const { return list<int>::size(); }
In the second example the compiler first sees ‘std’ which it looks-up and finds the namespace std. It then looks up ‘list’ within the namespace ‘std’ and finds the class-template – but this is not an injected-class-name – and there is no version in which the template-arguments are implied. So in this case the user needs to explicitly provide the template-arguments, hence the error mess age.
I think the lesson here is don’t try to help the compiler – let it resolve the code by itself. If you have got it wrong it will tell you. If it compiles and you want to double check the result, then you should debug the code (you already do step through all the code you write in the debugger – don’t you?). The worse offence in this category are people who add casts thinking that they will force the compiler to compile the code they way they want it to be complied. At best the cast is unnecessary (and hence makes the code more difficult to maintain) and worst it will lead to hard-to-detect runtime errors.
Thanks,
Jonathan
0 comments