{"id":39833,"date":"2004-04-13T07:00:00","date_gmt":"2004-04-13T07:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2004\/04\/13\/unicode-collation-is-hard\/"},"modified":"2004-04-13T07:00:00","modified_gmt":"2004-04-13T07:00:00","slug":"unicode-collation-is-hard","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20040413-00\/?p=39833","title":{"rendered":"Unicode collation is hard"},"content":{"rendered":"<p>  The principle of &#8220;garbage in, garbage out&#8221; applies to Unicode collation.  If you hand it a meaningless string and ask to compare it to another  meaningless string, you get meaningless results.  <\/p>\n<p>  I am not a Unicode expert; I just play one on the web.  A real Unicode expert is Michael Kaplan,  whose  <a href=\"http:\/\/groups.google.com\/groups?&amp;selm=ePxLtVAnDHA.2244%40TK2MSFTNGP12.phx.gbl\">  explanation of how comparing invalid Unicode strings produces  nonsensical results<\/a>  I strongly recommend to those who attempt to generate  random test strings in Unicode.  <\/p>\n","protected":false},"excerpt":{"rendered":"<p>The principle of &#8220;garbage in, garbage out&#8221; applies to Unicode collation. If you hand it a meaningless string and ask to compare it to another meaningless string, you get meaningless results. I am not a Unicode expert; I just play one on the web. A real Unicode expert is Michael Kaplan, whose explanation of how [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-39833","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>The principle of &#8220;garbage in, garbage out&#8221; applies to Unicode collation. 
If you hand it a meaningless string and ask to compare it to another meaningless string, you get meaningless results. I am not a Unicode expert; I just play one on the web. A real Unicode expert is Michael Kaplan, whose explanation of how [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/39833","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=39833"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/39833\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=39833"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=39833"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=39833"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}