{"id":103547,"date":"2020-03-09T07:00:00","date_gmt":"2020-03-09T14:00:00","guid":{"rendered":"http:\/\/devblogs.microsoft.com\/oldnewthing\/?p=103547"},"modified":"2020-07-03T08:12:57","modified_gmt":"2020-07-03T15:12:57","slug":"20200309-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20200309-00\/?p=103547","title":{"rendered":"Why does MS-DOS put an <CODE>int 20h<\/CODE> at byte 0 of the COM file program segment?"},"content":{"rendered":"<p>The MS-DOS <code>.com<\/code> file format is very simple: It is just a memory dump of the 16-bit address space starting at offset <code>0100h<\/code>, and continuing for the size of the program.<\/p>\n<p>The memory below <code>0100h<\/code> also had a specific format, known as the <i>Program Segment Prefix<\/i>. There&#8217;s a lot of stuff in there, but the parts that are interesting for today&#8217;s discussion are the following:<\/p>\n<ul>\n<li>At offset <code>0000h<\/code> is an <code>int 20h<\/code> instruction.<\/li>\n<li>At offset <code>0005h<\/code> is a <code>jmp<\/code> instruction.<\/li>\n<li>At offset <code>005Ch<\/code> is a file control block that contains the first command line argument, parsed as if it were a file name.<\/li>\n<li>At offset <code>006Ch<\/code> is a file control block that contains the second command line argument, parsed as if it were a file name.<\/li>\n<li>At offset <code>0080h<\/code> is the command line.<\/li>\n<\/ul>\n<p>The <code>int 20h<\/code> is the &#8220;exit program&#8221; system call. One theory is that it is placed at offset <code>0000h<\/code> so that if execution runs off the end of the code segment, the instruction pointer will wrap back around to zero, and then the program will terminate.<\/p>\n<p>An interesting theory, but unlikely. 
The odds of execution running harmlessly off the end of the code segment are slim to none.<\/p>\n<p>These specific bytes are significant because they line up exactly with how CP\/M organized its zero page. Keeping these important addresses the same made it easier to port CP\/M programs to MS-DOS.<\/p>\n<p>And CP\/M put the &#8220;exit program&#8221; system call at offset <code>0000h<\/code> because it started each program with <code>0000h<\/code> on the stack. If the program executed a <code>ret<\/code> instruction, it would return back to zero, and exit the program. Just like if you do a <code>return<\/code> from <code>main<\/code>.<\/p>\n<p>And although <code>int 21h<\/code> was the primary system call for MS-DOS, it supported the CP\/M system call address: <code>call 0005h<\/code>. To further ease the porting effort from CP\/M to MS-DOS, MS-DOS chose system call function codes to match the CP\/M function codes.<\/p>\n<p>In other words, the <code>int 20h<\/code> is at offset <code>0000h<\/code> for backward compatibility with CP\/M.<\/p>\n<p><b>Bonus chatter<\/b>: The CP\/M history also calls out how unlikely it is for execution to run off the end of the segment and wrap around. In order for that to happen, it would have to somehow execute through the operating system itself, because CP\/M put the operating system at the highest available address. (Also, the highest available address may not be <code>0xFFFF<\/code> because the system could very well have less than 64KB of memory.)<\/p>\n<p><b>Follow-up<\/b>: Commenter <a href=\"https:\/\/twitter.com\/_jimnelson_\"> Jim Nelson<\/a> points out that <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20200309-00\/?p=103547#comment-136371\"> this jump instruction deserves an entire article by itself<\/a>, and fortunately he also provided a link <a href=\"http:\/\/www.os2museum.com\/wp\/who-needs-the-address-wraparound-anyway\/\"> to that article<\/a>. 
It&#8217;s a wild tale of deception, lies, and <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20120206-00\/?p=8373\"> the A20 line<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In case you end up there, but how?<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[2],"class_list":["post-103547","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-history"],"acf":[],"blog_post_summary":"<p>In case you end up there, but how?<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/103547","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=103547"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/103547\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=103547"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=103547"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=103547"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]
}}