{"id":110848,"date":"2025-02-07T07:00:00","date_gmt":"2025-02-07T15:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=110848"},"modified":"2025-02-07T08:21:26","modified_gmt":"2025-02-07T16:21:26","slug":"20250207-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20250207-00\/?p=110848","title":{"rendered":"Using alternate locales to get more interesting case mapping than the C"},"content":{"rendered":"<p>Last time, <a title=\"The default C locale is not a very interesting one\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20250206-00\/?p=110846\"> we saw that the default C locale is not a very interesting one<\/a>. So how do you get a locale that does something better?<\/p>\n<p>One way to get functions like <code>_strlwr<\/code> and <code>_wcslwr<\/code> to follow a specific locale is to set that other locale as the current C runtime locale.<\/p>\n<pre>\/\/ Set the C runtime locale for character\r\n\/\/ classification (which includes case mapping)\r\n\/\/ to the user's default locale\r\n_wsetlocale(LC_CTYPE, L\"\");\r\n\r\n\/\/ Now you can convert to lowercase in a locale-aware manner\r\nwchar_t example[] = L\"\\x00C0\" L\"BC\"; \/\/ \u00c0BC\r\n_wcslwr_s(example); \/\/ Result: probably \u00e0bc\r\n<\/pre>\n<p>It is convenient that an empty string is interpreted by <code>_wsetlocale()<\/code> to mean &#8220;the user&#8217;s default locale&#8221;, as determined by <code>Get\u00adUser\u00adDefault\u00adLocale\u00adName<\/code>.\u00b9<\/p>\n<p>A major problem with this approach is that it is <a title=\"Don't use global state to manage a local problem\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20081211-00\/?p=19873\"> using global state to solve a local problem<\/a>. 
The C runtime locale is a process-wide setting, so you changed the locale not just for your call to <code>_wcslwr_s<\/code>, but for everybody else&#8217;s call to <code>_wcslwr_s<\/code> as well.<\/p>\n<p>Better would be to leave the global locale alone and just say &#8220;For this call to <code>_wcslwr<\/code>, use the user&#8217;s default locale.&#8221;<\/p>\n<pre>\/\/ Create a locale that represents the user's default locale\r\nauto l = _wcreate_locale(LC_CTYPE, L\"\");\r\n\r\n\/\/ Convert to lowercase according to that locale\r\nwchar_t example[] = L\"\\x00C0\" L\"BC\"; \/\/ \u00c0BC\r\n_wcslwr_s_l(example, l); \/\/ Result: probably \u00e0bc\r\n\r\n\/\/ Free the locale object when it is no longer needed\r\n_free_locale(l);\r\n<\/pre>\n<p>Even if you go to all this trouble, you are still failing to handle the case where <a title=\"A popular but wrong way to convert a string to uppercase or lowercase\" href=\"https:\/\/devblogs.microsoft.com\/oldnewthing\/20241007-00\/?p=110345\"> changing the case of a string changes its length<\/a>. For that, you have to go to <code>LCMapStringEx<\/code> or the corresponding ICU function <code>u_strToLower<\/code> or <code>u_strToUpper<\/code>.<\/p>\n<pre>wchar_t example[] = L\"\\x00C0\" L\"BC\"; \/\/ \u00c0BC\r\n\r\n\/\/ Error checking elided for expository purposes\r\nwchar_t lowercase[256];\r\nLCMapStringEx(LOCALE_NAME_USER_DEFAULT,\r\n    LCMAP_LOWERCASE, example, ARRAYSIZE(example),\r\n    lowercase, ARRAYSIZE(lowercase),\r\n    nullptr, nullptr, 0);\r\n\/\/ Result: probably \u00e0bc\r\n<\/pre>\n<p>Here&#8217;s a dirty little secret: When you call <code>_wcslwr<\/code> and the locale is not the C locale, then the Visual C++ runtime just calls <code>LCMapStringEx<\/code>. 
So you&#8217;re doing the same thing at the end of the day, just with the ability to accommodate strings that change length during a change of case.<\/p>\n<p><b>Bonus chatter<\/b>: <a href=\"https:\/\/doxygen.reactos.org\/d2\/d20\/wcslwr_8c_source.html\"> Not all implementations of <code>wcslwr<\/code><\/a> <a href=\"https:\/\/doxygen.reactos.org\/d3\/d42\/ctype_8c_source.html#l00901\"> or <code>towlower<\/code><\/a> are high quality.<\/p>\n<p>\u00b9 The user default locale may not be the best locale for your thread because the caller may have called a function like <code>Set\u00adThread\u00adLocale<\/code> or <code>Set\u00adThread\u00adPreferred\u00adUILanguages<\/code> to change the thread&#8217;s preferred locale to something other than the user&#8217;s default. You need to call a function like <code>Get\u00adThread\u00adPreferred\u00adUILanguages<\/code> to see those thread custom locales and pick the one (probably the first one) to use for case mapping.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Looking for something better.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-110848","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Looking for something 
better.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/110848","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=110848"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/110848\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=110848"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=110848"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=110848"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}