{"id":104038,"date":"2020-08-03T07:00:00","date_gmt":"2020-08-03T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=104038"},"modified":"2020-08-03T07:26:00","modified_gmt":"2020-08-03T14:26:00","slug":"20200803-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20200803-00\/?p=104038","title":{"rendered":"Peeking inside the implementation of AnsiUpper and AnsiLower in Windows 1.0"},"content":{"rendered":"<p>Windows 1.0 had functions called <code>AnsiUpper<\/code> and <code>AnsiLower<\/code>. You passed these functions a pointer to a string, and they converted the string in place to uppercase or lowercase, respectively. If the segment portion of the pointer was zero, then the offset was treated as a character code, and the function returned the converted version of that character code in the low byte of the return value.<\/p>\n<p>The single-character version could anachronistically be wrapped like this:\u00b9<\/p>\n<pre>inline char AnsiUpperChar(char c)\r\n{\r\n return reinterpret_cast&lt;char&gt;(\r\n    AnsiUpper(reinterpret_cast&lt;LPSTR&gt;(\r\n      static_cast&lt;unsigned char&gt;(c))));\r\n}\r\n<\/pre>\n<p>This is an anachronism because in 1983, there was no <code>reinterpret_cast<\/code>, no <code>static_cast<\/code>, no inline functions, and no C++.<\/p>\n<p>It was more likely to be a macro.<\/p>\n<pre>#define AnsiUpperChar(c) ((char)AnsiUpper((LPSTR)(unsigned char)(c)))\r\n<\/pre>\n<p>The implementation of these functions is entirely in assembly language.<\/p>\n<pre>; Entry: pointer on stack\r\n; Exit:  If single character, AL = converted character\r\n;        If string, DX:AX = original pointer\r\n\r\nAnsiUpper proc far\r\n        mov bx, sp          ; custom stack frame\r\n        push di             ; save registers\r\n        push si\r\n        les di, ss:[bx+4]   ; es:di -&gt; string\r\n        mov cx, es          ; cx:ax -&gt; string\r\n        mov ax, di\r\n        call UpperChar      ; 
uppercase the character in AL\r\n        jcxz aup90          ; Exit if CX = 0\r\n        call UpperString    ; uppercase the string in ES:DI\r\n        mov dx, es          ; return the original pointer\r\n        mov ax, ss:[bx+4]\r\naup90:  pop si\r\n        pop di\r\n        ret 4\r\nAnsiUpper endp\r\n\r\n; Entry: AL = character\r\n; Exit:  AL = uppercase version of character\r\n; Modifies: No other registers\r\n\r\nUpperChar proc near\r\n        cmp al, 0x61        ; Q: Less than 'a'?\r\n        jb uch90            ; Y: Nothing to do\r\n        cmp al, 0x7a        ; Q: At most 'z'?\r\n        jbe uch80           ; Y: Convert to uppercase\r\n        cmp al, 0xe0        ; Q: Less than '\u00e0'?\r\n        jb uch90            ; Y: Nothing to do\r\n        cmp al, 0xfe        ; Q: More than '\u00fe'?\r\n        ja uch90            ; Y: Nothing to do\r\nuch80:  sub al, 0x20        ; Convert lowercase to uppercase\r\nuch90:  ret\r\nUpperChar endp\r\n\r\n; Entry: ES:DI -&gt; string to convert to uppercase\r\n; Exit: String has been converted to uppercase in place\r\n; Modifies: SI, DI, AL\r\n\r\nUpperString proc near\r\n        cld                 ; Ensure we walk forward\r\n        mov si, di          ; ES:SI and ES:DI both -&gt; string\r\nust10:  lodsb es:[si]       ; Load character and advance SI\r\n        call UpperChar      ; Convert to uppercase\r\n        stosb               ; Save result and advance DI\r\n        or al, al           ; Q: End of string?\r\n        jnz ust10           ; N: Keep converting\r\n        ret\r\nUpperString endp\r\n<\/pre>\n<p>The <code>AnsiLower<\/code> function is entirely analogous, so I won&#8217;t bother writing it out.<\/p>\n<p>The <code>AnsiUpper<\/code> function doesn&#8217;t use the usual <code>BP<\/code> stack frame. To save code space, it uses <code>BX<\/code> as the stack frame pointer. That way, it doesn&#8217;t need to do all the usual frame setup and teardown stuff. 
This code does not call out to other code segments, so we won&#8217;t trigger any <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20120622-00\/?p=7303\"> segment-not-present thunks<\/a> that would require <a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20110316-00\/?p=11203\"> stack patching<\/a>, so the lack of a proper <code>BP<\/code> frame is not going to cause a problem.<\/p>\n<p>The structure of the <code>AnsiUpper<\/code> function is rather odd. It first assumes that you&#8217;re calling it with a single character and converts the offset from lowercase to uppercase. Only after the conversion does it check whether you actually called it that way. If so, then it jumps to the exit with the converted character. Otherwise, it throws away all the work it did and starts over by converting the pointed-to string.<\/p>\n<p>Why does it structure the code this way? Because it saves an instruction. Instead of<\/p>\n<pre>    if condition goto branch2\r\n    do_branch1\r\n    goto end\r\nbranch2:\r\n    do_branch2\r\nend:\r\n<\/pre>\n<p>you speculatively front-load one of the branches and discard it if it turns out to be the wrong branch.<\/p>\n<pre>    do_branch2\r\n    if condition goto end\r\n    do_branch1\r\nend:\r\n<\/pre>\n<p>This removes the <code>goto end<\/code> from the instruction stream, saving two bytes.<\/p>\n<p>Of course, this trick requires that <code>do_branch2<\/code> has no side effects, or at least that the side effects can be rolled back if the speculation turns out to have been unwarranted.<\/p>\n<p>The <code>UpperChar<\/code> function has a custom register-based calling convention. This is common in hand-written assembly language, allowing you to tailor the calling convention to the usage pattern.<\/p>\n<p>You may have noticed that the <code>UpperChar<\/code> function doesn&#8217;t consult any code page tables to figure out which characters are uppercase and which are lowercase. 
It just hard-codes the special knowledge of code page 1252, which was the ANSI code page that Windows 1.0 used.<\/p>\n<p>In <a href=\"https:\/\/en.wikipedia.org\/wiki\/Windows-1252#Character_set\"> the layout of code page 1252<\/a>, the letters are in two blocks: One from A to Z, and another from \u00c0 to \u00de. Furthermore, the uppercase and lowercase versions are exactly 32 slots apart, so adding 32 gets you from uppercase to lowercase, and subtracting 32 gets you from lowercase to uppercase.<\/p>\n<p>Okay, back to <code>AnsiUpper<\/code>. If it turns out that we have a string, then the work is done by the <code>UpperString<\/code> function. This function takes advantage of the special <code>LODSB<\/code> and <code>STOSB<\/code> instructions to load a single byte from the string and to write a single byte to the string. These are single-byte instructions that replace two larger instructions (load a byte and increment the index register), so they are handy when trying to squeeze every code byte out of your program.<\/p>\n<p>You may have spotted some quirks in this conversion code.<\/p>\n<p>The <code>UpperChar<\/code> function treats U+00D7 \u00d7 as the uppercase version of U+00F7 \u00f7. If you ask for the lowercase version of the multiplication symbol, you get the division symbol, and conversely when converting from lowercase to uppercase.<\/p>\n<p>Another quirk is that the code doesn&#8217;t try to capitalize \u00df to SS. It just leaves it as \u00df. There is no uppercase \u1e9e in code page 1252.<\/p>\n<p>Believe it or not, there was a point to this exercise beyond just digging up ancient code designed under very different constraints and marveling how it worked. 
We&#8217;ll put this function into context next time.<\/p>\n<p>\u00b9 You might be tempted to use this:<\/p>\n<pre>inline char AnsiUpperChar(char c)\r\n{\r\n return reinterpret_cast&lt;char&gt;(\r\n    AnsiUpper(reinterpret_cast&lt;LPSTR&gt;(c)));\r\n}\r\n<\/pre>\n<p>but that doesn&#8217;t work because <code>char<\/code> is probably a signed type, so the <code>char<\/code> will be sign-extended, which means that a character in the <code>0x80<\/code> to <code>0xFF<\/code> range will produce a pointer of the form <code>0xFFFF:0xFFxx<\/code>. Since this does not have zero in the high word, it will be treated as a pointer and corrupt random memory.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Some retro-reverse-engineering and the weird micro-optimizations.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[2],"class_list":["post-104038","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-history"],"acf":[],"blog_post_summary":"<p>Some retro-reverse-engineering and the weird 
micro-optimizations.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/104038","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=104038"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/104038\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=104038"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=104038"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=104038"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}