Another way of looking at C++ reverse iterators

Raymond

C++ has this thing called a reverse_iterator that takes a regular iterator and goes in the opposite direction. You might think that an iterator and its reverse point to the same thing, and it’s just the increment and decrement operators that are reversed, but that’s not true. An iterator and its reverse point to different things.

This reason for this comes down to a quirk of the C++ language: You are allowed to have a pointer “one past the end” of a collection, but you are not allowed to have a pointer “one before the beginning” of a collection.

When moving forward through a collection, you can use the “one past the end” pointer as the sentinel that means “you’re done.”

   
 

begin

ok

ok

ok

ok

ok

end

But you don’t have such a luxury when going backward:

   

oops

ok

ok

ok

ok

ok

begin
 

You want to create an end sentinel value immediately before the first element, but you can’t because the C++ language forbids it.

The standard library finesses the problem by making the reverse iterator “off by one”: The element referenced by the iterator is the one before the one it nominally points to.

   
 

end

ok

ok

ok

ok

ok

begin

This off-by-one behavior is a bit tricky to wrap your brain around, but here’s a way of looking at it that’s a bit less wacky: View the pointer as pointing, not at the center of an element, but at its start. When iterating forward, you look to the element in front of the pointer, and when iterating backward, you look to the element in back of the pointer. In both cases, you look in the direction of motion.

   
 
↑begin
↑ok
↑ok
↑ok
↑ok
↑ok
↑end
   
end↑
ok↑
ok↑
ok↑
ok↑
ok↑
begin↑
 

Now the off-by-one behavior is easier to see. A pointer when interpreted as a forward iterator looks forward one element, but when interpreted as a reverse iterator looks backward one element.

   
rev
end

fwd
begin
           
 
rev
ok

fwd
ok
         
   
rev
ok

fwd
ok
       
     
rev
ok

fwd
ok
     
       
rev
ok

fwd
ok
   
         
rev
ok

fwd
ok
 
           
rev
begin

fwd
end

7 comments

Comments are closed. Login to edit/delete your existing comments

  • Don Dumitru

    You are mashing two things together: The abstract concept of iterators, and the C++ implementation of iterators over an array (which uses a pointer).

    In the abstract concept of iterators, the “end” iterator is *always* a sentinel. You can never legally dereference an “end” iterator – that is undefined behavior. But you can compare an iterator to the “end” iterator, and if they are equal that signals, “stop going forward, the iterator is exhausted”.

    C++ uses simple pointers to implement iterators over arrays, because that is fast, small, and simple – an amazing combination. But as you note, it requires a little bit of finesse to use simple pointers for both the forward and reverse implementations.

    But you aren’t supposed to look at the implementation in C++. You are supposed to stick to the abstract model and not even think about the implementation, because code you write that uses iterators should be written so that it works for any iterator, not just for iterators over arrays.

    (Unless you are debugging and need to actually see what value will come back if somebody dereferences it, or if somebody uses the iterator value incorrectly.)

    • Raymond ChenMicrosoft employee

      The mental model works for abstract iterators. In fact, it’s codified in the standard under [reverse.iter.elem]: constexpr reference operator* const;. Effects: As if by Iterator tmp = current; return *--tmp;

    • Peter Cooper Jr.

      Looks like he just added a few dozen lines of Javascript into the post.

      • Tango Gu

        Yes, I guess Raymond spent more time on implementing animation through javascript/html table then the content writing.
        Refer to “function addTable(id, start0, reverse0, start1, reverse1) {” related section in page source.

  • Robert Southee

    I’ve always been curious as to why the C/C++ implementation can have the concept of a pointer “one past the end”. I thought there might be some edge case, i.e. the last address is not valid.

    • Gareth Martin

      This is because C/C++ traditionally use loops from 0 to <size. Loops over pointers (or iterators) inherited the same – ending at <ptr+size or <end().

      Because of this, there is an implicit assumption that you can have an index, pointer, or iterator that represents the value to stop iterating after you calculate it but before you use it because it’s just outside the bounds.

      The alternative is to use <=last_idx, <=ptr+last_idx or <=last_it() as your ending condition – but that breaks when your container is zero-size! You then either need to add an additional if() around all loops, or make last point to something invalid and less than begin. -1 for indexes would make sense, if indexes are signed, but pointers complicate matters… if begin() is nullptr due to the container being empty, what would last() be in order to be “one before” to allow for a <=lastptr test? With <end(), it can also be nullptr.