{"id":11773,"date":"2011-01-12T07:00:00","date_gmt":"2011-01-12T07:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2011\/01\/12\/my-what-strange-nops-you-have\/"},"modified":"2011-01-12T07:00:00","modified_gmt":"2011-01-12T07:00:00","slug":"my-what-strange-nops-you-have","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20110112-00\/?p=11773","title":{"rendered":"My, what strange NOPs you have!"},"content":{"rendered":"<p>\nWhile cleaning up my office, I ran across some old documents\nwhich reminded me that there are a lot of weird NOP instructions\nin Windows&nbsp;95.\n<\/p>\n<p>\nCertain early versions of the 80386 processor\n(manufactured prior to 1987) are known as\n<i>B1 stepping<\/i> chips.\nThese early versions of the 80386 had some obscure bugs that\naffected Windows.\nFor example,\nif the instruction following\na string operation (such as <code>movs<\/code>)\nuses opposite-sized addresses from that in the string instruction\n(for example, if you performed a <code>movs es:[edi], ds:[esi]<\/code>\nfollowed by a <code>mov ax, [bx]<\/code>)\nor if the following instruction accessed an opposite-sized stack\n(for example, if you performed a <code>movs es:[edi], ds:[esi]<\/code>\non a 16-bit stack, and the next instruction was a <code>push<\/code>),\nthen the <code>movs<\/code> instruction would not operate correctly.\nThere were quite a few of these tiny little\n&#8220;if all the stars line up exactly right&#8221; chip bugs.\n<\/p>\n<p>\nMost of the chip bugs only affected mixed 32-bit and 16-bit operations,\nso if you were running pure 16-bit code or pure 32-bit code,\nyou were unlikely to encounter any of them.\nAnd since Windows&nbsp;3.1 did very little mixed-bitness programming\n(user-mode code was all-16-bit and kernel-mode code was all-32-bit),\nthese defects didn&#8217;t really affect Windows&nbsp;3.1.\n<\/p>\n<p>\nWindows&nbsp;95, on the other hand, contained a lot of mixed-bitness\ncode 
since it was the transitional operating system that brought\npeople using Windows out of the 16-bit world into the 32-bit world.\nAs a result, code sequences that tripped over these little chip\nbugs turned up not infrequently.\n<\/p>\n<p>\nAn executive decision had to be made whether to continue supporting\nthese old chips or whether to abandon them.\nA preliminary market analysis of potential customers showed that\nthere were enough computers running old 80386 chips to be worth\nmaking the extra effort to support them.\n<\/p>\n<p>\nEverybody who wrote assembly language code was alerted to the various\ncode sequences that would cause problems on a B1 stepping, so that\nthey wouldn&#8217;t generate those code sequences themselves, and so they\ncould be on the lookout for existing code that might have problems.\nTo supplement the manual scan,\nI wrote a program that studied all the Windows&nbsp;95 binaries\ntrying to find these troublesome code sequences.\nWhen it brought one to my attention, I studied the offending code,\nand if I agreed with the program&#8217;s assessment, I notified the developer\nwho was responsible for the component in question.\n<\/p>\n<p>\nIn nearly all cases, the troublesome code sequences could be fixed\nby judicious insertion of <code>NOP<\/code> statements.\nIf the problem was caused by\n&#8220;instruction of type&nbsp;X followed by instruction of type&nbsp;Y&#8221;,\nthen you can just insert a <code>NOP<\/code> between the two instructions\nto &#8220;break up the party&#8221; and sidestep the problem.\nSometimes, the standard <code>NOP<\/code> would end up classified\nas an instruction of type&nbsp;Y,\nso you had to insert a <i>special kind of <code>NOP<\/code><\/i>,\none that was not of type&nbsp;Y.\n<\/p>\n<p>\nFor example, here&#8217;s one code sequence from a function\nwhich does color format conversion:\n<\/p>\n<pre>\n        push    si          ; borrow si temporarily\n        ; build second 4 pixels\n        movzx   si, bl\n       
 mov     ax, redTable[si]\n        movzx   si, cl\n        or      ax, blueTable[si]\n        movzx   si, dl\n        or      ax, greenTable[si]\n        shl     eax, 16     ; move pixels to high word\n        ; build first 4 pixels\n        movzx   si, bh\n        mov     ax, redTable[si]\n        movzx   si, ch\n        or      ax, blueTable[si]\n        movzx   si, dh\n        or      ax, greenTable[si]\n        pop     si\n        stosd   es:[edi]    ; store 8 pixels\n        <u>db      67h, 90h    ; 32-bit NOP fixes stos (B1 stepping)<\/u>\n        dec     wXE\n<\/pre>\n<p>\nNote that we couldn&#8217;t use just any old <code>NOP<\/code>;\nwe had to use a <code>NOP<\/code> with a 32-bit address override prefix.\nThat&#8217;s right, this isn&#8217;t just a regular <code>NOP<\/code>;\nthis is a <i>32-bit <code>NOP<\/code><\/i>.\n<\/p>\n<p>\nFrom a B1 stepping-readiness standpoint,\nthe folks who wrote in C had a little of the good news\/bad news thing going.\nThe good news is that the compiler did the code generation and you\ndidn&#8217;t need to worry about it.\nThe bad news is that\nyou also were dependent on the compiler writers to have\ntaught their code generator how to avoid these B1 stepping pitfalls,\nand some of them were quite subtle.\n(For example, there was one bug that manifested itself in incorrect\ninstruction decoding if\na conditional branch instruction had just the right sequence\nof taken\/not-taken history, and the branch instruction was followed\nimmediately by a selector load, <i>and<\/i> one of the first two\ninstructions at the destination of the branch was itself a jump, call,\nor return.\nThe easy workaround: Insert a <code>NOP<\/code> between the branch\nand the selector load.)\n<\/p>\n<p>\nOn the other hand, some quirks of the B1 stepping were easy to sidestep.\nFor example,\nthe B1 stepping did not support virtual memory in the first 64KB of memory.\nFine, don&#8217;t use virtual memory there.\nIf virtual memory was enabled,\nif a 
certain race condition was encountered inside the hardware\nprefetch,\nand if you executed a floating point coprocessor instruction\nthat accessed memory at an address in the range 0x800000F8 through\n0x800000FF,\nthen the CPU would end up reading from addresses 0x000000F8\nthrough 0x000000FF instead.\nThis one was easy to work around:\nNever allocate valid memory at 0x80000xxx.\nAnother reason for the\n<a HREF=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2003\/10\/08\/55239.aspx\">\nno man&#8217;s land<\/a> in the address space near the 2GB boundary.\n<\/p>\n<p>\nI happened to have an old computer with a B1 stepping in my office.\nIt ran slowly, but it did run.\nI think the test team &#8220;re-appropriated&#8221; the computer for their labs\nso they could verify that Windows&nbsp;95 still ran correctly on a\ncomputer with a B1 stepping CPU.\n<\/p>\n<p>\nLate in the product cycle (after Final Beta),\nupper management reversed their earlier decision and decided not to\nsupport the B1 chip after all.\nMaybe the testers were finding too many bugs where other subtle\nB1 stepping bugs were being triggered.\nMaybe the cost of having to keep an eye on all the source code\n(and training\/retraining all the developers to be aware of B1 issues)\nexceeded the benefit of supporting a shrinking customer base.\nFor whatever reason, B1 stepping support was pulled,\nand customers with one of these older chips got an error message\nwhen they tried to install Windows&nbsp;95.\nAnd just to make it easier for the product support people to recognize\nthis failure,\nthe error code for the error message was\n<a HREF=\"http:\/\/support.microsoft.com\/kb\/119118\">\nError&nbsp;B1<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>While cleaning up my office, I ran across some old documents which reminded me that there are a lot of weird NOP instructions in Windows&nbsp;95. Certain early versions of the 80386 processor (manufactured prior to 1987) are known as B1 stepping chips. 
These early versions of the 80386 had some obscure bugs that affected Windows. [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[2],"class_list":["post-11773","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-history"],"acf":[],"blog_post_summary":"<p>While cleaning up my office, I ran across some old documents which reminded me that there are a lot of weird NOP instructions in Windows&nbsp;95. Certain early versions of the 80386 processor (manufactured prior to 1987) are known as B1 stepping chips. These early versions of the 80386 had some obscure bugs that affected Windows. [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/11773","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=11773"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/11773\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=11773"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=11773"},{"taxonomy":"post_tag","embeddable":true,"href":"http
s:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=11773"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}