The C++20 ranges library come with the ability to transform a view; that is, to provide a unary function that is applied to each element of a view, producing a new view.
std::array<int, 2> a = { 99, 42 };
// This range reports values that are one larger than the values
// in an array. The array values are unchanged.
auto r = a |
std::ranges::views::transform([](int v) { return v + 1; });
// This creates a vector from the range
std::vector v(r.begin(), r.end());
// The resulting vector is { 100, 43 }
But what if your code base isn’t ready to move to C++20 yet? For example, you might be a library that wants to support C++17 or even C++14.
You can take the original iterator and wrap it inside another iterator that overrides the * operator so that it applies a transformation to the value before returning it. Note that our transforming iterator cannot support the -> operator since there is no value object we can return a pointer to; our value is generated on demand. Fortunately, the standard permits us to omit -> support, provided we define pointer as void.
template<typename Inner>
struct Wrap
{
Wrap(Inner const& inner) :
m_inner(inner) {}
Inner m_inner;
};
template<typename It, typename Transformer>
class transform_iterator : Wrap<Transformer>
{
It m_it;
public:
transform_iterator(It const& it,
Transformer const& transformer) :
m_it(it),
Wrap<Transformer>(transformer) {}
// copy constructors and assignment operators defaulted
using difference_type = typename
std::iterator_traits<It>::difference_type;
using value_type = typename std::invoke_result<
Transformer, It>::type;
using pointer = void;
using reference = void;
using iterator_category = std::input_iterator_tag;
bool operator==(transform_iterator const& other)
{ return m_it == other.m_it; }
bool operator!=(transform_iterator const& other)
{ return m_it != other.m_it; }
auto operator*() const { return (*this)(*m_it); }
auto operator++() { ++m_it; return *this; }
auto operator++(int)
{ auto prev = *this; ++m_it; return prev; }
};
// For C++14 (no CTAD)
template<typename It, typename Transformer>
auto make_transform_iterator(
It const& it, Transformer const& transformer)
{
return transform_iterator<It, Transformer>
(it, transformer);
}
We can use this transforming iterator like this:
std::array<int, 2> a = { 99, 42 };
auto transformer = [](int v) { return v + 1; };
// This creates a vector from the array, applying
// a transformation to each value
std::vector v(
transform_iterator(a.begin(), transformer),
transform_iterator(a.end(), transformer));
// If C++14 (no CTAD)
std::vector<int> v(
transform_iterator(a.begin(), transformer),
transform_iterator(a.end(), transformer));
// The resulting vector is { 100, 43 }
The transformation can even change the type:
std::array<int, 2> a = { 99, 42 };
auto transformer = [](int v)
{ return std::make_pair(v + 1, v); };
// This creates a map from the array
std::map m(
transform_iterator(a.begin(), transformer),
transform_iterator(a.end(), transformer));
// If C++14 (no CTAD)
std::map<int, int> m(
transform_iterator(a.begin(), transformer),
transform_iterator(a.end(), transformer));
// The resulting map is
// m[100] = 99
// m[43] = 42
There are some non-obvious pieces of the above transform_.
We want to accept not just lambdas as transformers, but any Callable. That means that we use std::invoke to invoke the transformer on the wrapped iterator. This allows transformers to be function pointers, member function pointers, pointers to member variables, lambdas, or any other class with a public operator(). For example:
struct S
{
int value;
int ValuePlusOne() { return value + 1; };
};
int ValueMinusOne(S const& s)
{
return s.value - 1;
}
void example()
{
std::array<S, 2> a { 99, 42 };
// v1 = { 99, 42 }; - pointer to data member
std::vector<int> v1(
transform_iterator(a.begin(), &S::value),
transform_iterator(a.end(), &S::value));
// v2 = { 100, 43 }; - pointer to member function
std::vector<int> v2(
transform_iterator(a.begin(), &S::ValuePlusOne),
transform_iterator(a.end(), &S::ValuePlusOne));
// v3 = { 98, 41 }; - pointer to free function
std::vector<int> v3(
transform_iterator(a.begin(), &ValueMinusOne),
transform_iterator(a.end(), &ValueMinusOne));
}
The transform_ derives from a wrapped Transformer. This is a space optimization that takes advantage of empty base optimization (EBO): If the Transformer is an empty class, then the Wrapped<Transformer> will also be an empty class, and empty base classes are permitted to occupy zero bytes.¹ (Normally, objects cannot be of size zero.) This means that a transform_ that has an empty class as a transformer (such as a captureless lambda) is the same size as the original iterator.
We wrap the Transformer inside a class because base classes must be classes, but the Transformer might not be a class, as we noted above.
A transforming iterator is handy for populating a std::map because all three of the major implementations of the C++ standard library optimize the two-iterator insert() overload for the case where the items are inserted in increasing key order at the end of map.²
In the case where you have two versions of a function, one of which takes a range and another of which takes items one at a time, you can avoid the need for a transforming iterator by turning the problem around: Instead of producing a range of transformed iterators to pass to function, you produce an output iterator that calls the single-parameter version of the function.
std::vector<int> v;
std::transform(a.begin(), a.end(), std::back_inserter(v),
transformer);
std::map<int, int> m;
std::transform(a.begin(), a.end(), std::inserter(m, m.end()),
transformer);
This is simpler but has its downsides:
- You lose CTAD, since the compiler cannot infer the template type parameters from the constructor.
- Repeated single-element function calls may be less efficient than a bulk operation.
For example, inserting a transformed range into a vector is linear in the number of elements inserted plus the number of elements after the insertion point. In other words, inserting n new elements in front of k existing elements is O(n + k) if you do it in a single call to insert:
// Bulk insert after the first element
// This takes O(a.size() + v.size() - 1) =
// O(a.size() + v.size())
v.insert(v.begin() + 1,
transform_iterator(a.begin(), transformer),
transform_iterator(a.end(), transformer));
If you insert one element at a time, then you pay the k each time, which results in a running time of n O(1 + k) = O(n + nk), an extra cost of O(nk).
// One-at-a-time insert after the first element
// This takes O(a.size() + a.size() * (v.size() - 1)) =
// O(a.size() * v.size())
v.insert(v.begin() + 1,
transform_iterator(a.begin(), transformer),
transform_iterator(a.end(), transformer));
Bonus chatter: The Boost library comes with an implementation of transform_, so you can use that one instead of the custom one here. But at least you got to see a number of techniques that are commonly seen in library code.
¹ There are other things that could prevent the empty base optimization, but they do not apply here.
² In other words, it’s as if the ranged insertion method were written as
template&typename Iterator>
void insert(Iterator first, Iterator last)
{
for (; first != last; ++first) {
insert(end(), *first);
}
}
Last time, we looked at other ways of doing efficient bulk insertions.
Shouldn’t the C++14 usage examples also use `make_transform_iterator`, given that you had to define it specifically?
I’m missing how *this is callable. Seems to me that something is missing from Wrapper here.
Yeah, there is no sign of the std::invoke in there…
I don't think your use of EBO works. EBO leaves `transform_iterator` unpadded whenever the base class `Wrap<Transformer>` is empty…but the latter always has member `Inner m_inner;`. For this reason, cppinsights (which I think uses GCC) gives the original iterator size 8 and the transformed one size 16 in your stateless lambda use-case.
To fix this, I think you'd need to have two different cases for `Wrap<Transformer>`: if the `Inner` admits inheritance, make it a base as well:<code>