{"id":93163,"date":"2016-03-16T07:00:00","date_gmt":"2016-03-16T21:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/?p=93163"},"modified":"2019-03-13T10:31:33","modified_gmt":"2019-03-13T17:31:33","slug":"20160316-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20160316-00\/?p=93163","title":{"rendered":"Randomly-generated passwords still have to be legal strings"},"content":{"rendered":"<p>If you need to generate a password for programmatic use, then you don&#8217;t have to worry about generating characters that are difficult or impossible to type on a keyboard. Go ahead and mix Cyrillic with Vietnamese and throw in some Linear B while you&#8217;re at it. There is no keyboard that can type all of these characters, but it doesn&#8217;t matter because nobody will be typing it. <\/p>\n<p>However, you should make sure that your password is a legal string. <\/p>\n<blockquote CLASS=\"q\">\n<p>We generate our password from a cryptographically secure random number generator. Basically, we take 256 random bits and treat them as sixteen 16-bit values. (If one of the 16-bit values is zero, then we ask for 16 more bits.) <\/p>\n<p>We found that sometimes (no predictable pattern), we have interoperability problems between systems. The password produced by one system is not recognized by the other. <\/p>\n<\/blockquote>\n<p>After much investigation, the problem was traced back to the fact that taking a bunch of non-null 16-bit values and declaring them to be a Unicode (UTF-16LE) string does not always result in a valid Unicode string. <\/p>\n<p>UTF-16 has the concept of <i>surrogate pairs<\/i>, which encode characters outside the BMP as a pair of 16-bit values. The first entry in the pair is a <i>high surrogate<\/i> in the range <code>0xD800<\/code>&ndash;<code>0xDBFF<\/code>, and the second is a <i>low surrogate<\/i> in the range <code>0xDC00<\/code>&ndash;<code>0xDFFF<\/code>. 
<a HREF=\"https:\/\/en.wikipedia.org\/wiki\/UTF-16#U.2B10000_to_U.2B10FFFF\">Together, they encode a character in a supplementary plane<\/a>. <\/p>\n<p>If your randomly-generated string contains a value in the range <code>0xD800<\/code>&ndash;<code>0xDFFF<\/code>, then unless you are very lucky, it will not be part of a valid surrogate pair. The string is therefore not well-formed, and various parts of the system might decide to reject it with <code>ERROR_INVALID_PARAMETER<\/code>, or they might &#8220;fix&#8221; the problem by changing the illegal values to <code>U+FFFD<\/code>, the <a HREF=\"https:\/\/en.wikipedia.org\/wiki\/Specials_(Unicode_block)#Replacement_character\">Unicode Replacement Character<\/a>, which is used for an unknown or unrepresentable character. For example, if the protocol specifies that the password is transmitted in UTF-8, then the presence of an unpaired surrogate causes the conversion from UTF-16 to UTF-8 to fail, and consequently, the password fails to replicate to the other machine. <\/p>\n<p>If you want to generate a random password, make sure your algorithm produces legal character sequences. A simple solution is to generate the desired amount of entropy, then hex-encode it. Yes, it isn&#8217;t very space-efficient (each character carries only four bits of entropy), but it gets the job done. (Assuming you don&#8217;t have to meet password complexity rules.) 
<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Well-formed strings according to the encoding.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-93163","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Well-formed strings according to the encoding.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/93163","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=93163"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/93163\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=93163"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=93163"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=93163"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}
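The hex-encoding suggestion at the end of the post can be sketched roughly as follows. This is a minimal illustration in Python, not the code used by the systems the post describes. The names `generate_hex_password` and `has_unpaired_surrogate` are hypothetical helpers introduced here. The second function shows how one might check whether a sequence of raw 16-bit values would be ill-formed UTF-16, using the surrogate ranges quoted in the post.

```python
import secrets


def generate_hex_password(entropy_bits: int = 256) -> str:
    """Return a password of ASCII hex digits carrying `entropy_bits` of randomness.

    Hypothetical helper: every character is in 0-9 or a-f, so the result is
    well-formed in UTF-8, UTF-16, and any other common encoding.
    """
    if entropy_bits <= 0 or entropy_bits % 8 != 0:
        raise ValueError('entropy_bits must be a positive multiple of 8')
    # secrets.token_hex(n) draws n bytes from the OS CSPRNG and hex-encodes them.
    return secrets.token_hex(entropy_bits // 8)


def has_unpaired_surrogate(code_units) -> bool:
    """Report whether a sequence of 16-bit code units is ill-formed UTF-16.

    Hypothetical helper: a high surrogate (0xD800-0xDBFF) must be followed
    immediately by a low surrogate (0xDC00-0xDFFF); anything else is unpaired.
    """
    units = list(code_units)
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: valid only if the next unit is a low surrogate.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return True
        if 0xDC00 <= unit <= 0xDFFF:
            # Low surrogate with no preceding high surrogate.
            return True
        i += 1
    return False


if __name__ == '__main__':
    password = generate_hex_password(256)
    print(len(password), 'hex characters')
    # A lone high surrogate, as random 16-bit values can easily produce.
    print(has_unpaired_surrogate([0x0041, 0xD800]))
```

With 256 bits of entropy the password is 64 characters long, four times the length of the sixteen 16-bit values in the original scheme, but it survives any conversion between UTF-16 and UTF-8 unchanged.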