The C++20 ranges library come with the ability to transform a view; that is, to provide a unary function that is applied to each element of a view, producing a new view.
std::array<int, 2> a = { 99, 42 }; // This range reports values that are one larger than the values // in an array. The array values are unchanged. auto r = a | std::ranges::views::transform([](int v) { return v + 1; }); // This creates a vector from the range std::vector v(r.begin(), r.end()); // The resulting vector is { 100, 43 }
But what if your code base isn’t ready to move to C++20 yet? For example, you might be a library that wants to support C++17 or even C++14.
You can take the original iterator and wrap it inside another iterator that overrides the *
operator so that it applies a transformation to the value before returning it. Note that our transforming iterator cannot support the ->
operator since there is no value object we can return a pointer to; our value is generated on demand. Fortunately, the standard permits us to omit ->
support, provided we define pointer
as void
.
template<typename Inner> struct Wrap { Wrap(Inner const& inner) : m_inner(inner) {} Inner m_inner; }; template<typename It, typename Transformer> class transform_iterator : Wrap<Transformer> { It m_it; public: transform_iterator(It const& it, Transformer const& transformer) : m_it(it), Wrap<Transformer>(transformer) {} // copy constructors and assignment operators defaulted using difference_type = typename std::iterator_traits<It>::difference_type; using value_type = typename std::invoke_result< Transformer, It>::type; using pointer = void; using reference = void; using iterator_category = std::input_iterator_tag; bool operator==(transform_iterator const& other) { return m_it == other.m_it; } bool operator!=(transform_iterator const& other) { return m_it != other.m_it; } auto operator*() const { return (*this)(*m_it); } auto operator++() { ++m_it; return *this; } auto operator++(int) { auto prev = *this; ++m_it; return prev; } }; // For C++14 (no CTAD) template<typename It, typename Transformer> auto make_transform_iterator( It const& it, Transformer const& transformer) { return transform_iterator<It, Transformer> (it, transformer); }
We can use this transforming iterator like this:
std::array<int, 2> a = { 99, 42 }; auto transformer = [](int v) { return v + 1; }; // This creates a vector from the array, applying // a transformation to each value std::vector v( transform_iterator(a.begin(), transformer), transform_iterator(a.end(), transformer)); // If C++14 (no CTAD) std::vector<int> v( transform_iterator(a.begin(), transformer), transform_iterator(a.end(), transformer)); // The resulting vector is { 100, 43 }
The transformation can even change the type:
std::array<int, 2> a = { 99, 42 }; auto transformer = [](int v) { return std::make_pair(v + 1, v); }; // This creates a map from the array std::map m( transform_iterator(a.begin(), transformer), transform_iterator(a.end(), transformer)); // If C++14 (no CTAD) std::map<int, int> m( transform_iterator(a.begin(), transformer), transform_iterator(a.end(), transformer)); // The resulting map is // m[100] = 99 // m[43] = 42
There are some non-obvious pieces of the above transform_
.
We want to accept not just lambdas as transformers, but any Callable. That means that we use std::invoke
to invoke the transformer on the wrapped iterator. This allows transformers to be function pointers, member function pointers, pointers to member variables, lambdas, or any other class with a public operator()
. For example:
struct S { int value; int ValuePlusOne() { return value + 1; }; }; int ValueMinusOne(S const& s) { return s.value - 1; } void example() { std::array<S, 2> a { 99, 42 }; // v1 = { 99, 42 }; - pointer to data member std::vector<int> v1( transform_iterator(a.begin(), &S::value), transform_iterator(a.end(), &S::value)); // v2 = { 100, 43 }; - pointer to member function std::vector<int> v2( transform_iterator(a.begin(), &S::ValuePlusOne), transform_iterator(a.end(), &S::ValuePlusOne)); // v3 = { 98, 41 }; - pointer to free function std::vector<int> v3( transform_iterator(a.begin(), &ValueMinusOne), transform_iterator(a.end(), &ValueMinusOne)); }
The transform_
derives from a wrapped Transformer
. This is a space optimization that takes advantage of empty base optimization (EBO): If the Transformer
is an empty class, then the Wrapped<Transformer>
will also be an empty class, and empty base classes are permitted to occupy zero bytes.¹ (Normally, objects cannot be of size zero.) This means that a transform_
that has an empty class as a transformer (such as a captureless lambda) is the same size as the original iterator.
We wrap the Transformer
inside a class because base classes must be classes, but the Transformer
might not be a class, as we noted above.
A transforming iterator is handy for populating a std::map
because all three of the major implementations of the C++ standard library optimize the two-iterator insert()
overload for the case where the items are inserted in increasing key order at the end of map.²
In the case where you have two versions of a function, one of which takes a range and another of which takes items one at a time, you can avoid the need for a transforming iterator by turning the problem around: Instead of producing a range of transformed iterators to pass to function, you produce an output iterator that calls the single-parameter version of the function.
std::vector<int> v; std::transform(a.begin(), a.end(), std::back_inserter(v), transformer); std::map<int, int> m; std::transform(a.begin(), a.end(), std::inserter(m, m.end()), transformer);
This is simpler but has its downsides:
- You lose CTAD, since the compiler cannot infer the template type parameters from the constructor.
- Repeated single-element function calls may be less efficient than a bulk operation.
For example, inserting a transformed range into a vector is linear in the number of elements inserted plus the number of elements after the insertion point. In other words, inserting n new elements in front of k existing elements is O(n + k) if you do it in a single call to insert
:
// Bulk insert after the first element // This takes O(a.size() + v.size() - 1) = // O(a.size() + v.size()) v.insert(v.begin() + 1, transform_iterator(a.begin(), transformer), transform_iterator(a.end(), transformer));
If you insert one element at a time, then you pay the k each time, which results in a running time of n O(1 + k) = O(n + nk), an extra cost of O(nk).
// One-at-a-time insert after the first element // This takes O(a.size() + a.size() * (v.size() - 1)) = // O(a.size() * v.size()) v.insert(v.begin() + 1, transform_iterator(a.begin(), transformer), transform_iterator(a.end(), transformer));
Bonus chatter: The Boost library comes with an implementation of transform_
, so you can use that one instead of the custom one here. But at least you got to see a number of techniques that are commonly seen in library code.
¹ There are other things that could prevent the empty base optimization, but they do not apply here.
² In other words, it’s as if the ranged insertion method were written as
template&typename Iterator> void insert(Iterator first, Iterator last) { for (; first != last; ++first) { insert(end(), *first); } }
Last time, we looked at other ways of doing efficient bulk insertions.
Shouldn’t the C++14 usage examples also use `make_transform_iterator`, given that you had to define it specifically?
I’m missing how *this is callable. Seems to me that something is missing from Wrapper here.
Yeah, there is no sign of the std::invoke in there…
I don't think your use of EBO works. EBO leaves `transform_iterator` unpadded whenever the base class `Wrap<Transformer>` is empty…but the latter always has member `Inner m_inner;`. For this reason, cppinsights (which I think uses GCC) gives the original iterator size 8 and the transformed one size 16 in your stateless lambda use-case.
To fix this, I think you'd need to have two different cases for `Wrap<Transformer>`: if the `Inner` admits inheritance, make it a base as well:<code>