{"id":104012,"date":"2020-07-28T07:00:00","date_gmt":"2020-07-28T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=104012"},"modified":"2020-07-28T08:40:38","modified_gmt":"2020-07-28T15:40:38","slug":"20200728-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20200728-00\/?p=104012","title":{"rendered":"A look back at memory models in 16-bit MS-DOS"},"content":{"rendered":"<p>In MS-DOS and 16-bit Windows programming, you had to deal with memory models. This terms does not refer to processor architecture memory models (how processors interact with memory) but rather how a program internally organizes itself. The operating system itself doesn&#8217;t know anything about application memory models; it&#8217;s just a convenient way of talking about how a program deals with different types of code and data.<\/p>\n<p>The terms for the memory models came from the C compiler, since this informed the compiler what type of code to generate. The four basic models fit into a nice table:<\/p>\n<table class=\"cp3\" style=\"border-collapse: collapse; text-align: center;\" border=\"1\" cellspacing=\"0\" cellpadding=\"3\">\n<tbody>\n<tr>\n<th colspan=\"2\" rowspan=\"2\">\u00a0<\/th>\n<th colspan=\"2\">Data pointer size<\/th>\n<\/tr>\n<tr>\n<th>Near<\/th>\n<th>Far<\/th>\n<\/tr>\n<tr>\n<th style=\"vertical-align: middle; width: 2em;\" rowspan=\"2\"><span style=\"writing-mode: vertical-lr; transform: rotate(180deg);\"> Code pointer size<\/span><\/th>\n<th>Near<\/th>\n<td>Small<\/td>\n<td>Compact<\/td>\n<\/tr>\n<tr>\n<th>Far<\/th>\n<td>Medium<\/td>\n<td>Large<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The 8086 used segmented memory, which means that a pointer consists of two parts: A segment and an offset. A <i>far pointer<\/i> consists of both the segment and the offset. A <i>near pointer<\/i> consists of only the offset, with the segment implied.<\/p>\n<p>Once you had more than 64KB of code or more than 64KB of data, you had to switch to far code pointers or far data pointers (respectively) in order to refer to everything you needed.<\/p>\n<p>Most of my programs were <i>Compact<\/i>, meaning that there wasn&#8217;t a lot of code, but there was a lot of data. That&#8217;s because the programs I wrote tended to do a lot of data processing.<\/p>\n<p>The <i>Medium<\/i> model was useful for programs that had a lot of code but not a lot of data. User interface code often fell into this category, because you had to write a lot of code to manage a dialog box, but the result of all that work was just a text string (from an edit box) and some flags (from some check boxes). Many computer games also fell into this category, because you had a lot of game logic, but not a lot of game state.<\/p>\n<p>MS-DOS had an additional memory model known as <i>Tiny<\/i>, in which the code and data were all combined into a single 64KB segment. This was the memory model required by programs that ended with the <code>.COM<\/code> extension, and it existed for backward compatibility with CP\/M. CP\/M ran on the 8080 processor which supported a maximum of 64KB of memory.<\/p>\n<p>Far pointers could access any memory in the 1MB address space of the 8086, but each object was still limited to 64KB because pointer arithmetic was performed only on the offset portion of the pointer. <i>Huge<\/i> pointers could refer to memory blocks larger than 64KB by <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20171113-00\/?p=97386\"> adjusting the segment whenever the offset overflowed<\/a>.\u00b9 Pointer arithmetic with huge pointers was computationally expensive, so you didn&#8217;t use them much.\u00b2<\/p>\n<p>You weren&#8217;t limited to the above memory models. You could make up your own, known as <i>mixed model programming<\/i>. For example, you could say that most of your program is <i>Small<\/i> memory model, but there&#8217;s one place where you need to access memory outside the default data segment, so you declared an explicit far pointer for that purpose. Similarly, you could define an explicitly far function to move it out of the default code segment.<\/p>\n<p>The memory model specified the default behavior, so if you called, say, <code>memcpy<\/code>, you got a version whose pointer size matched your memory model. If you had a <i>Small<\/i> or <i>Medium<\/i> model program and wanted to copy memory that was outside your default data segment, you could call the <code>_fmemcpy<\/code> function, which was the same as <code>memcpy<\/code> except that it took far pointers. (If you used the <i>Compact<\/i> or <i>Large<\/i> memory model, then <code>memcpy<\/code> and <code>_fmemcpy<\/code> were identical.)<\/p>\n<p>One of my former colleagues is back in school and was talking with his (younger) advisor. Somehow, the topic turned to the 8086 processor. My colleague and his friend explained segmented addressing, near and far pointers, how pointer equality and comparison behaved,\u00b3 the mysterious <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20120206-00\/?p=8373\"> A20 gate<\/a>, and how the real mode boot sector worked. &#8220;The look of horror on his face at how segment registers used to work was priceless.&#8221;<\/p>\n<p>I quickly corrected my colleague. &#8220;&#8216;Used to work&#8217;? They still work that way!&#8221;<\/p>\n<p><b>Bonus chatter<\/b>: You can see the remnants of 16-bit memory models in the macros <code>NEAR<\/code> and <code>FAR<\/code> defined by <code>windows.h<\/code>. They don&#8217;t mean anything any more, but they remain for <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20120913-00\/?p=6613\"> source code backward compatibility<\/a>.<\/p>\n<p>\u00b9 In MS-DOS, huge pointers operated by putting the upper 16 bits of the 20-bit address in the segment and putting the remaining 4 bits in the offset. The offset of a huge pointer in MS-DOS was always less than 16.<\/p>\n<p>\u00b2 In 16-bit Windows, there was a system function called <code>hmemcpy<\/code>, which copied memory blocks that could be larger than 64KB. And now you know what the <code>h<\/code> prefix stood for.<\/p>\n<p>\u00b3 Segmented memory is a great source of counterexamples. For example, in a segmented memory model, you could have a pointer <var>p<\/var> to a buffer of size <var>N<\/var>, and another pointer <var>q<\/var> could satisfy the inequalities<\/p>\n<pre>p &lt;= q &amp;&amp; q &lt;= p + N\r\n<\/pre>\n<p>despite not pointing into the buffer.<\/p>\n<p>This is one of the reasons why the C and C++ programming languages do not specify the result of comparisons between pointers that do not point to elements of the same array.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Ways of dealing with the fractured address space.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[2],"class_list":["post-104012","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-history"],"acf":[],"blog_post_summary":"<p>Ways of dealing with the fractured address space.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/104012","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=104012"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/104012\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=104012"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=104012"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=104012"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}