Inside STL: The vector

Raymond Chen

The C++ language comes with a standard library, and although implementations are welcome to implement the library types in whatever manner they choose, they are constraints imposed by the standard which often force one of a small number of possible implementations.

The std::vector is one of those types which is constrained to the point that there’s really only one viable implementation.

Internally, a std::vector basically looks like this:

template<typename T>
struct vector
{
    T* first;
    T* last;
    T* end;
};

The first is a pointer to the beginning of a single memory allocation that holds the vector contents.

The last is a pointer one past the end of the last valid vector element.

The end is a pointer one past the end of the last allocated memory for the vector.

The picture for this is as follows:

  first       last   end
         
T T T T T ? ?

The first points to the start of the allocation.

The memory between first and last is used to hold live elements of the vector. The size() of the vector is last - first.

The memory between last and end is not being used for anything. The capacity() of the vector is end - first.

If you actually dump the std::vector in the debugger, you’ll see that it’s actually more complicated than this due to the need to store an allocator. The allocator is usually an empty class (the default allocator certainly is), so I didn’t show it in the conceptual version above.

Since the allocator is usually an empty class, it is stored as a compressed pair with one of the other members so that it usually occupies no memory.

The Visual Studio debugger contains a visualizer to view the contents of a std::vector more conveniently, but if you need to dig out the three magic pointers yourself, here’s how you can do it with the Microsoft implementation of the standard library:

0:001> ?? v
class std::vector<T,std::allocator<T> >
   +0x000 _Mypair          : std::_Compressed_pair<std::allocator<T>,std::_Vector_val<std::_Simple_types<T> >,1>
0:000> ?? v._Mypair
class std::_Compressed_pair<std::allocator<T>,std::_Vector_val<std::_Simple_types<T> >,1>
   +0x000 _Myval2          : std::_Vector_val<std::_Simple_types<T> >
0:000> ?? v._Mypair._Myval2
class std::_Vector_val<std::_Simple_types<T> >
   +0x000 _Myfirst         : T* ⇐
   +0x008 _Mylast          : T* ⇐
   +0x010 _Myend           : T* ⇐

The language requirements pretty much require this layout for a vector because the data() method must return a pointer to contiguous memory, and the constraints on the capacity() require that the excess elements come immediately after.

In particular, the language forbids a “small vector optimization” analogous to the small string optimization (which we’ll see soon), because the standard requires that¹ moving a vector not invalidate iterators or references. The vector elements have to remain where they are. A small-vector optimization would put them in vector-internal storage, which would require that the element move from one vector to another on a vector move operation, which is not allowed.

¹ Assuming you’re using a well-behaved allocator.

6 comments

Discussion is closed. Login to edit/delete existing comments.

  • Yuri Khan 0

    Are there any constraints that preclude vector implementations storing begin, size and capacity, and calculating end() as begin + size?

    • Michael Taylor 0

      From a performance POV I think that would be a step backwards. When iterating the elements you start at first and then continue until `first == last`. When adding an element you first determine if `last == end`. Simple pointer comparison. If you stored the count instead then iteration may behave the same (depends I think) but adding an element would require `last == (first + size)`. I would suspect that `size` is rarely needed, such as when needing to resize, and probably almost never outside the type itself. But I’m just guessing here as it’s been years so I’ve had to implement vectors in languages that didn’t support them.

      Also, just guessing here again, `size` would need to be a 64-bit integral to allow arbitrarily large arrays. On x86 that isn’t an issue and would waste 4 bytes per vector. Pointers are more efficient and allow arbitrary sizing.

      • Jan Ringoš 0

        I’d argue the opposite: If item size isn’t power of 2, then doing `(last – first)` requires DIV instruction. While getting `last` from `first + size` needs MUL which is usually way faster. Then again, one DIV is surely faster than many MULs, and I don’t have any profiling numbers from any real world software on this.

      • Yuri Khan 0

        I’d expect a Sufficiently Smart Compiler™ to hoist the end() calculation out of the loop.

        Comparing last == end to determine if you need a reallocation could be equivalently replaced with size == capacity.

        Sizes have to be the same bitness as pointers anyway, because std::vector<char>::size() has to work. first/last/end takes 3 * sizeof(T*) = 24 bytes (on a 64-bit platform); begin/size/capacity takes sizeof(T*) + 2 * sizeof(size_t) = also 24 bytes. (Although at least once I’ve wanted (but not really felt like implementing) a vector-like container that is represented with a 64-bit begin and 32-bit size/capacity, so that I could save 8 bytes per vector at the expense of not being able to have huge element counts.)

        • Alexander Shalimov 0

          “I’d expect a Sufficiently Smart Compiler™ to hoist the end() calculation out of the loop” — I don’t think this is possible for non-const iterators, because you can call erase() inside the loop which changes end().

    • Raymond ChenMicrosoft employee 0

      An implementation is welcome to store those three pieces of information in any way it likes, as long as it can eventually produce that information when necessary. The point is that the standard requires a contiguous block of memory, with the initial portion used and the trailing portion unused.

Feedback usabilla icon