This post is part of a regular series of posts where the C++ product team here at Microsoft and other guests answer questions we have received from customers. The questions can be about anything C++ related: MSVC toolset, the standard language and library, the C++ standards committee, isocpp.org, CppCon, etc. Today’s post is by Casey Carter.
C++17 adds several new “vocabulary types” – types intended to be used in the interfaces between components from different sources – to the standard library. MSVC has been shipping implementations of std::optional, std::any, and std::variant since the Visual Studio 2017 release, but we haven’t provided any guidelines on how and when these vocabulary types should be used. This article on std::any is the second of a series that examines each of the vocabulary types in turn.
Storing arbitrary user data
Say you’re creating a calendar component that you intend to distribute in a library for use by other programmers. You want your calendar to be usable for solving a wide array of problems, so you decide you need a mechanism to associate arbitrary client data with days/weeks/months/years. How do you best implement this extensibility design requirement?
A C programmer might add a void* to each appropriate data structure:
    struct day {
        // ...things...
        void* user_data;
    };
    struct month {
        std::vector<day> days;
        void* user_data;
    };
and suggest that clients hang whatever data they like from it. This solution has a few immediately apparent shortcomings:
- You can always cast a void* to a Foo* whether or not the object it points at is actually a Foo. The lack of type information for the associated data means that the library can’t provide even a basic level of type safety by guaranteeing that later accesses to stored data use the same type as was stored originally:
    some_day.user_data = new std::string{"Hello, World!"};
    // …much later
    Foo* some_foo = static_cast<Foo*>(some_day.user_data);
    some_foo->frobnicate(); // BOOM!
- void* doesn’t manage lifetime like a smart pointer would, so clients must manage the lifetime of the associated data manually. Mistakes result in memory leaks:
    // Cast back to the stored type first: deleting through a void* is undefined behavior
    delete static_cast<std::string*>(some_day.user_data);
    some_day.user_data = nullptr;
    some_month.days.clear(); // Oops: hopefully none of these days had
                             // non-null user_data
- The library cannot copy the object that a void* points at since it doesn’t know that object’s type. For example, if your library provides facilities to copy annotations from one week to another, clients must copy the associated data manually. As was the case with manual lifetime management, mistakes are likely to result in dangling pointers, double frees, or leaks:
    some_month.days[0] = some_month.days[1];
    if (some_month.days[1].user_data) {
        // I'm storing strings in user_data, and don't want them shared
        // between days. Copy manually (a void* must be cast before it can be dereferenced):
        std::string const& src = *static_cast<std::string const*>(some_month.days[1].user_data);
        some_month.days[0].user_data = new std::string(src);
    }
The C++ Standard Library provides us with at least one tool that can help: shared_ptr<void>. Replacing the void* with shared_ptr<void> solves the problem of lifetime management:
    struct day {
        // ...things...
        std::shared_ptr<void> user_data;
    };
    struct month {
        std::vector<day> days;
        std::shared_ptr<void> user_data;
    };
since shared_ptr squirrels away enough type info to know how to properly destroy the object it points at. A client could create a shared_ptr<Foo>, and the deleter would continue to work just fine after converting to shared_ptr<void> for storage in the calendar:
    some_day.user_data = std::make_shared<std::string>("Hello, world!");
    // ...much later...
    some_day = some_other_day; // the object at which some_day.user_data _was_
                               // pointing is freed automatically
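To make the "squirrels away enough type info" point concrete, here is a minimal sketch. The type tracked_note and the helper function are hypothetical names invented for illustration; the only library behavior relied on is that shared_ptr keeps the original deleter after conversion to shared_ptr<void>.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type that counts how many times it is destroyed.
struct tracked_note {
    static inline int destroyed = 0;
    ~tracked_note() { ++destroyed; }
};

// Stores a tracked_note behind shared_ptr<void>, releases it, and returns
// how many destructor calls that release triggered.
int destructions_after_type_erased_release() {
    const int before = tracked_note::destroyed;
    std::shared_ptr<void> user_data = std::make_shared<tracked_note>();
    user_data.reset();  // the control block still knows it owns a tracked_note
    return tracked_note::destroyed - before;
}
```

Calling destructions_after_type_erased_release() should report exactly one destructor call, even though the static type of the pointer was void at the time of release.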
This solution may help solve the copyability problem as well, if the client is happy to have multiple days/weeks/etc. hold copies of the same shared_ptr<void> – denoting a single object – rather than independent values. shared_ptr doesn’t help with the primary problem of type-safety, however. Just as with void*, shared_ptr<void> provides no help tracking the proper type for associated data. Using a shared_ptr instead of a void* also makes it impossible for clients to “hack the system” to avoid memory allocation by reinterpreting integral values as void* and storing them directly; using shared_ptr forces us to allocate memory even for tiny objects like int.
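The sharing behavior described above can be sketched as follows. This is an illustration under assumptions: the day struct is cut down to a single member, and the note text is made up. It shows that copying a day copies the pointer rather than the string, and that the client still has to supply the real type by hand.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Cut-down version of the calendar's day, for illustration only.
struct day {
    std::shared_ptr<void> user_data;
};

// Returns true if copying a day leaves both days pointing at one shared string.
bool copies_share_one_object() {
    day monday;
    monday.user_data = std::make_shared<std::string>("dentist");
    day tuesday = monday;  // copies the pointer, not the string

    // The client must remember the real type; nothing checks this cast.
    auto* note = static_cast<std::string*>(tuesday.user_data.get());
    *note += " (moved)";

    auto* original = static_cast<std::string*>(monday.user_data.get());
    return monday.user_data.use_count() == 2 && *original == "dentist (moved)";
}
```

Editing Tuesday's note changes Monday's as well, which is fine if shared annotations are what the client wants and surprising otherwise.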
Not just any solution will do
std::any is the smarter void*/shared_ptr<void>. You can initialize an any with a value of any copyable type:
    std::any a0;
    std::any a1 = 42;
    std::any a2 = month{"October"};
Like shared_ptr, any remembers how to destroy the contained value for you when the any object is destroyed. Unlike shared_ptr, any also remembers how to copy the contained value and does so when the any object is copied:
    std::any a3 = a0; // Copies the empty any from the previous snippet
    std::any a4 = a1; // Copies the "int"-containing any
    a4 = a0;          // copy assignment works, and properly destroys the old value
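One way to observe that copying an any really copies the contained value is to store a type that counts its own copies. The sketch below does that; counted_note and the helper function are hypothetical names used only for this illustration.

```cpp
#include <any>
#include <cassert>

// Hypothetical payload type that counts calls to its copy constructor.
struct counted_note {
    static inline int copies = 0;
    counted_note() = default;
    counted_note(const counted_note&) { ++copies; }
    counted_note(counted_note&&) noexcept = default;
};

// Returns how many times the payload was copied when the any was copied.
int copies_made_by_copying_any() {
    std::any original = counted_note{};  // the temporary is moved in, not copied
    const int before = counted_note::copies;
    std::any duplicate = original;       // any copies the contained counted_note
    return counted_note::copies - before;
}
```

Copying the any invokes the payload's copy constructor once, so original and duplicate end up holding independent values rather than sharing one object.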
Unlike shared_ptr, any knows what type it contains:
    assert(!a0.has_value());            // a0 is still empty
    assert(a1.type() == typeid(int));
    assert(a2.type() == typeid(month));
    assert(a4.type() == typeid(void));  // type() returns typeid(void) when empty
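As a small illustration of querying the contained type, the sketch below maps an any to a human-readable label. The function name describe and the labels are made up for this example; only has_value() and type() come from the library.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <typeinfo>

// Hypothetical helper: names the kind of value an any currently holds.
std::string describe(const std::any& value) {
    if (!value.has_value()) return "empty";
    if (value.type() == typeid(int)) return "int";
    if (value.type() == typeid(std::string)) return "string";
    return "something else";
}
```

Note that type() compares exact types: an any initialized from a string literal holds a const char*, not a std::string, so describe would label it "something else".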
and uses that knowledge to ensure that when you access the contained value – for example, by obtaining a reference with any_cast – you access it with the correct type:
    assert(std::any_cast<int&>(a1) == 42); // succeeds
    std::string str = std::any_cast<std::string&>(a1); // throws bad_any_cast since
                                                       // a1 holds int, not string
    assert(std::any_cast<month&>(a2).days.size() == 0);
    std::any_cast<month&>(a2).days.push_back(some_day);
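Returning to the calendar from the start of the article, here is a rough sketch of what the data structures might look like with std::any in place of void*. The cut-down day and month structs and both helper functions are assumptions made for illustration, not the layout of any real library.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <vector>

// Cut-down calendar types, for illustration only.
struct day {
    std::any user_data;
};
struct month {
    std::vector<day> days;
    std::any user_data;
};

// Copies one day over another, edits the copy, and returns the original's note.
std::string original_note_after_editing_copy() {
    month october;
    october.days.resize(2);
    october.days[1].user_data = std::string{"dentist"};

    october.days[0] = october.days[1];  // copies the contained string too
    std::any_cast<std::string&>(october.days[0].user_data) += " (moved)";

    return std::any_cast<std::string>(october.days[1].user_data);
}

// Returns true if reading the note with the wrong type is reported as an error.
bool wrong_type_is_rejected() {
    day monday;
    monday.user_data = std::string{"dentist"};
    try {
        (void)std::any_cast<int>(monday.user_data);
    } catch (const std::bad_any_cast&) {
        return true;
    }
    return false;
}
```

The library can now copy and destroy days without knowing what clients stored, and a client that asks for the wrong type gets a bad_any_cast instead of undefined behavior.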
If you want to avoid exceptions in a particular code sequence and you are uncertain what type an any contains, you can perform a combined type query and access with the pointer overload of any_cast:
    if (auto ptr = std::any_cast<int>(&a1)) {
        assert(*ptr == 42); // runs since a1 contains an int, and succeeds
    }
    if (auto ptr = std::any_cast<std::string>(&a1)) {
        assert(false); // never runs: any_cast returns nullptr since
                       // a1 doesn't contain a string
    }
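The pointer overload makes it easy to build small non-throwing accessors. The helper below is a hypothetical convenience function, not part of the standard library; it is one possible sketch of a "get the value or fall back" operation.

```cpp
#include <any>
#include <cassert>
#include <string>

// Hypothetical helper: returns the contained T, or fallback if the any is
// empty or holds some other type. It does not throw bad_any_cast.
template <class T>
T value_or(const std::any& source, T fallback) {
    if (const T* ptr = std::any_cast<T>(&source)) {
        return *ptr;
    }
    return fallback;
}
```

For example, with std::any a1 = 42, the call value_or<int>(a1, -1) yields the stored 42, while value_or<std::string>(a1, "none") yields the fallback because a1 does not hold a string.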
The C++ Standard encourages implementations to store small objects with non-throwing move constructors directly in the storage of the any object, avoiding the costs of dynamic allocation. This feature is best-effort: there’s no threshold below which any is portably guaranteed not to allocate. In practice, the Visual C++ implementation uses a larger any that avoids allocation for object types with non-throwing moves up to a handful of pointers in size, whereas libc++ and libstdc++ allocate for objects that are two or more pointers in size (see https://godbolt.org/z/RQd_w5).
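If you want to see what your own standard library does, one rough approach is to replace the global operator new with a counting version and measure how many allocations it takes to store a value. This is a sketch under assumptions: g_allocations, allocations_to_store, and big_payload are names invented here, the replacement affects every allocation in the program, and the numbers you get depend on the implementation and possibly on optimization settings.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Counts every call to the replaced global operator new.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) {
        return p;
    }
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical payload that is far larger than any plausible inline buffer.
struct big_payload {
    char bytes[4096];
};

// Returns how many allocations were made while storing value in an any.
template <class T>
std::size_t allocations_to_store(const T& value) {
    const std::size_t before = g_allocations;
    std::any holder = value;
    return g_allocations - before;
}
```

Comparing allocations_to_store(42) with allocations_to_store(big_payload{}) on your toolchain gives a quick, non-portable indication of where the small-object threshold sits.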
How to select a vocabulary type (aka “What if you know the type(s) to be stored?”)
If you have knowledge about the type(s) being stored – beyond the fact that the types being stored must be copyable – then std::any is probably not the proper tool: its flexibility has performance costs. If there is exactly one such type T, you should reach for std::optional. If the types to store will always be function objects with a particular signature – callbacks, for example – you want std::function. If you only need to store types from some set fixed at compile time, std::variant is a good choice; but let’s not get ahead of ourselves – that will be the next article.
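To put the four choices side by side, here is a sketch of a calendar entry that uses one vocabulary type per situation. The member names and stored values are hypothetical and chosen only to illustrate the selection rule above.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <variant>

// Hypothetical calendar entry with one member per situation.
struct day {
    std::optional<std::string> title;       // exactly one possible type
    std::function<void(int)> on_reminder;   // callbacks with a fixed signature
    std::variant<int, std::string> room;    // a set of types fixed at compile time
    std::any user_data;                     // anything copyable
};

// Fills in each member and reports whether every access sees what was stored.
bool every_member_round_trips() {
    day d;
    d.title = "Standup";
    int minutes_seen = 0;
    d.on_reminder = [&minutes_seen](int minutes) { minutes_seen = minutes; };
    d.room = std::string{"4B"};
    d.user_data = 3.5;

    d.on_reminder(15);
    return d.title.has_value() && *d.title == "Standup"
        && minutes_seen == 15
        && std::holds_alternative<std::string>(d.room)
        && std::any_cast<double>(d.user_data) == 3.5;
}
```

Only user_data needs a runtime type check on access; the other three members carry their type information in the declaration, which is why they are usually the better fit when that information is available.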
Conclusions
When you need to store an object of an arbitrary type, pull std::any out of your toolbox. Be aware that there are probably more appropriate tools available when you do know something about the type to be stored.
If you have any questions (Get it? “any” questions?), please feel free to post in the comments below. You can also send any comments and suggestions directly to the author via e-mail at cacarter@microsoft.com, or Twitter @CoderCasey. Thank you!
std::any seems quite restricted to be more widely usable:
1) There does not seem to be a way to pass an allocator or control the potential dynamic allocations in general.
2) It is not possible to define the size of the memory scratch pool (e.g., template parameter) to avoid depending on the implementation.
> Using a shared_ptr instead of a void* also makes it impossible for clients to “hack the system” to avoid memory allocation by reinterpreting integral values as void* and storing them directly; using shared_ptr forces us to allocate memory even for tiny objects like int.
This is absolutely incorrect, as shown below:
```
user_data = std::shared_ptr<void>(std::shared_ptr<void>(),
    reinterpret_cast<void*>(static_cast<std::intptr_t>(some_int)));
```
Waiting in hope for “std::variant : How, when, and why”.