{"id":108524,"date":"2023-08-02T07:00:00","date_gmt":"2023-08-02T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=108524"},"modified":"2025-03-04T21:21:44","modified_gmt":"2025-03-05T05:21:44","slug":"20230802-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20230802-00\/?p=108524","title":{"rendered":"Inside STL: The vector"},"content":{"rendered":"<p>The C++ language comes with a standard library, and although implementations are welcome to implement the library types in whatever manner they choose, there are constraints imposed by the standard which often force one of a small number of possible implementations.<\/p>\n<p>The <code>std::vector<\/code> is one of those types which is constrained to the point that there&#8217;s really only one viable implementation.<\/p>\n<p>Internally, a <code>std::vector<\/code> basically looks like this:<\/p>\n<pre>template&lt;typename T&gt;\r\nstruct vector\r\n{\r\n    T* first;\r\n    T* last;\r\n    T* end;\r\n};\r\n<\/pre>\n<p>The <code>first<\/code> is a pointer to the beginning of a single memory allocation that holds the vector contents.<\/p>\n<p>The <code>last<\/code> is a pointer one past the end of the last valid vector element.<\/p>\n<p>The <code>end<\/code> is a pointer one past the end of the last allocated memory for the vector.<\/p>\n<p>The picture for this is as follows:<\/p>\n<table class=\"cp3\" style=\"border-collapse: collapse;\" title=\"Diagram of where the vector members point (see following text)\" border=\"0\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr style=\"position: relative;\">\n<td rowspan=\"3\">\u00a0<\/td>\n<p><!-- extra column to prevent arrow from getting clipped due to parent overflow-x --><\/p>\n<td style=\"padding: 0;\">first<\/td>\n<p><!-- remove padding added by CSS --><\/p>\n<td>&nbsp;<\/td>\n<td>&nbsp;<\/td>\n<td>&nbsp;<\/td>\n<td align=\"right\">last<\/td>\n<td>&nbsp;<\/td>\n<td align=\"right\">end<\/td>\n<\/tr>\n<tr style=\"position: relative;\">\n<td style=\"position: relative; left: -.5ex; padding: 0;\">\u2193<\/td>\n<td>&nbsp;<\/td>\n<td>&nbsp;<\/td>\n<td>&nbsp;<\/td>\n<td>&nbsp;<\/td>\n<td style=\"position: relative; left: -.5ex; padding: 0;\">\u2193<\/td>\n<td>&nbsp;<\/td>\n<td style=\"position: relative; left: -.5ex; padding: 0;\">\u2193<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center; width: 3em; border: solid 1px gray;\">T<\/td>\n<td style=\"text-align: center; width: 3em; border: solid 1px gray;\">T<\/td>\n<td style=\"text-align: center; width: 3em; border: solid 1px gray;\">T<\/td>\n<td style=\"text-align: center; width: 3em; border: solid 1px gray;\">T<\/td>\n<td style=\"text-align: center; width: 3em; border: solid 1px gray;\">T<\/td>\n<td style=\"text-align: center; width: 3em; border: solid 1px gray;\">?<\/td>\n<td style=\"text-align: center; width: 3em; border: solid 1px gray;\">?<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The <code>first<\/code> points to the start of the allocation.<\/p>\n<p>The memory between <code>first<\/code> and <code>last<\/code> is used to hold live elements of the vector. The <code>size()<\/code> of the vector is <code>last - first<\/code>.<\/p>\n<p>The memory between <code>last<\/code> and <code>end<\/code> is not being used for anything. The <code>capacity()<\/code> of the vector is <code>end - first<\/code>.<\/p>\n<p>If you actually dump the <code>std::vector<\/code> in the debugger, you&#8217;ll see that it&#8217;s actually more complicated than this due to the need to store an allocator. The allocator is usually an empty class (the default allocator certainly is), so I didn&#8217;t show it in the conceptual version above.<\/p>\n<p>Since the allocator is usually an empty class, it is stored as a compressed pair with one of the other members so that it usually occupies no memory.<\/p>\n<p>The Visual Studio debugger contains a visualizer to view the contents of a <code>std::vector<\/code> more conveniently, but if you need to dig out the three magic pointers yourself, here&#8217;s how you can do it with the Microsoft implementation of the standard library:<\/p>\n<pre>0:001&gt; ?? v\r\nclass std::vector&lt;T,std::allocator&lt;T&gt; &gt;\r\n   +0x000 _Mypair          : std::_Compressed_pair&lt;std::allocator&lt;T&gt;,std::_Vector_val&lt;std::_Simple_types&lt;T&gt; &gt;,1&gt;\r\n0:000&gt; ?? v._Mypair\r\nclass std::_Compressed_pair&lt;std::allocator&lt;T&gt;,std::_Vector_val&lt;std::_Simple_types&lt;T&gt; &gt;,1&gt;\r\n   +0x000 _Myval2          : std::_Vector_val&lt;std::_Simple_types&lt;T&gt; &gt;\r\n0:000&gt; ?? v._Mypair._Myval2\r\nclass std::_Vector_val&lt;std::_Simple_types&lt;T&gt; &gt;\r\n   +0x000 _Myfirst         : T* \u21d0\r\n   +0x008 _Mylast          : T* \u21d0\r\n   +0x010 _Myend           : T* \u21d0\r\n<\/pre>\n<p>The language requirements pretty much require this layout for a vector because the <code>data()<\/code> method must return a pointer to contiguous memory, and the constraints on the <code>capacity()<\/code> require that the excess elements come immediately after.<\/p>\n<p>In particular, the language forbids a &#8220;small vector optimization&#8221; analogous to the small string optimization (which we&#8217;ll see soon), because the standard requires that\u00b9 moving a vector not invalidate iterators or references. The vector elements have to remain where they are. A small-vector optimization would put them in vector-internal storage, which would require that the element move from one vector to another on a vector move operation, which is not allowed.<\/p>\n<p>\u00b9 Assuming you&#8217;re using a well-behaved allocator.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A contiguous memory block, reallocated as necessary.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-108524","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>A contiguous memory block, reallocated as necessary.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/108524","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=108524"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/108524\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=108524"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=108524"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=108524"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}