{"id":108763,"date":"2023-09-13T07:00:00","date_gmt":"2023-09-13T14:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=108763"},"modified":"2023-09-13T09:53:41","modified_gmt":"2023-09-13T16:53:41","slug":"20230913-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20230913-00\/?p=108763","title":{"rendered":"How do I perform a case-insensitive comparison of two strings in the Deseret script?"},"content":{"rendered":"<p>A customer reported that they were having difficulty sorting strings in the Deseret script in a case-insensitive manner. The <code>CompareStringEx<\/code> function, when called with the <code>LOCALE_<wbr \/>NAME_<wbr \/>INVARIANT<\/code> locale and the <code>NORM_<wbr \/>IGNORE\u00adCASE<\/code> flag, reported the strings as unequal even if they differed only in case.<\/p>\n<pre>\/\/ U+10412 (DESERET CAPITAL LETTER BEE)\r\nwchar_t Bee[] = L\"\\xD801\\xDC12\";\r\n \r\n\/\/ U+1043A (DESERET SMALL LETTER BEE)\r\nwchar_t bee[] = L\"\\xD801\\xDC3A\";\r\n \r\nauto result = CompareStringEx(LOCALE_NAME_INVARIANT,\r\nNORM_IGNORECASE, Bee, -1, bee, -1, NULL, NULL, 0);\r\n<\/pre>\n<p>The customer suspected that they were using the wrong locale, but they couldn&#8217;t find a Deseret locale. 
Is there one?<\/p>\n<p>No, Windows does not have a Deseret locale, and that means that there is no custom sorting information for the Deseret script, which means that the Windows locale system doesn&#8217;t know that U+10412 and U+1043A are case variants.<\/p>\n<p>The <a href=\"https:\/\/cldr.unicode.org\/\"> Unicode Common Locale Data Repository<\/a> (CLDR) does have an entry for Deseret, known as <a href=\"https:\/\/github.com\/unicode-org\/cldr\/blob\/main\/seed\/main\/en_Dsrt.xml\">en_Dsrt<\/a>, but it is a seed locale, meaning that it contains only minimal data and consequently is of limited\/questionable quality.<\/p>\n<p>The locale team was curious about how the customer is using Deseret script, because the next step will vary depending on the use case.<\/p>\n<p>If the customer has a real-world corpus of text in Deseret script that they are processing, then they should share some details with the Windows locale team so that they can better understand how Deseret script is being used by customers, which may affect prioritization of future work. In the meantime, the customer can use the <a href=\"http:\/\/site.icu-project.org\/\"> International Components for Unicode<\/a> (ICU) library (<a href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20210527-00\/?p=105255\">now included with Windows<\/a>) to collate strings encoded in Deseret script. For example, <a href=\"https:\/\/unicode-org.github.io\/icu\/userguide\/collation\/api.html#compare\"> the <code>ucol_strcoll<\/code> function<\/a> understands the case mapping rules for Deseret script.<\/p>\n<p>If the customer merely noticed this discrepancy while running automated testing over the entire Unicode character set, then adding support in their program for text in the Deseret script may very well be unnecessary, since the use case was artificial. If they have no organic use of Deseret script among their install base, it could be a significant architectural change to their program for no real-world benefit. 
In that case, they should just mark those tests as exceptions.<\/p>\n<p>The team speculated that maybe the customer was doing genealogy or processing historical documents. (They ruled out the possibility that the customer was just tinkering around as a hobby, since those types of customers typically wouldn&#8217;t bother their customer liaison about it, <a title=\"How do I change among the three levels of play in Space Cadet Pinball?\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20140513-00\/?p=1003\"> though sometimes they do it anyway<\/a>.) My quick reading of the Wikipedia page for the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Deseret_alphabet\"> Deseret alphabet<\/a> suggests that the script underwent reform during its brief lifetime. If that reform included changes to collation (I don&#8217;t know whether it did), then the customer will also have to take into account <i>when<\/i> the text was generated in order to collate it correctly.<\/p>\n<p>We never heard back from the customer.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It sort of depends on why you&#8217;re comparing them.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-108763","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>It sort of depends on why you&#8217;re comparing 
them.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/108763","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=108763"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/108763\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=108763"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=108763"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=108763"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}