{"id":110846,"date":"2025-02-06T07:00:00","date_gmt":"2025-02-06T15:00:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/oldnewthing\/?p=110846"},"modified":"2025-02-06T09:40:43","modified_gmt":"2025-02-06T17:40:43","slug":"20250206-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20250206-00\/?p=110846","title":{"rendered":"The default C locale is not a very interesting one"},"content":{"rendered":"<p>Although the C and C++ languages provide facilities for localization, the default locale is the so-called &#8220;C&#8221; locale, which barely understands anything.<\/p>\n<p>In the &#8220;C&#8221; locale, the uppercase characters are &#8220;<tt>A<\/tt>&#8221; through &#8220;<tt>Z<\/tt>&#8221;; the lowercase characters are &#8220;<tt>a<\/tt>&#8221; through &#8220;<tt>z<\/tt>&#8221;; the decimal separator is &#8220;<tt>.<\/tt>&#8221;; and there is no thousands separator.<\/p>\n<p>The &#8220;C&#8221; locale is designed to be minimal. But this also means that unless you&#8217;ve made a special effort to change your process&#8217;s locale to something else, functions like <code>towupper<\/code> and <code>_wcslwr<\/code> produce only extremely rudimentary results. All they know is the characters in the 7-bit ASCII set. They don&#8217;t even know that the uppercase version of <tt>\u00e4<\/tt> is <tt>\u00c4<\/tt>.<\/p>\n<p>Support for any locales beyond the &#8220;C&#8221; locale is implementation-defined, and the standard considers it a quality-of-implementation issue. 
Microsoft&#8217;s Visual C++ compiler uses BCP 47 for locale names, like <tt>sr-Cyrl-BA<\/tt> for &#8220;Serbian, Cyrillic script, as used in Bosnia and Herzegovina.&#8221; The GNU C library (glibc) appears to use <a href=\"https:\/\/www.gnu.org\/software\/libc\/manual\/html_node\/Locale-Names.html\">a custom format<\/a>, such as <tt>de_<wbr \/>AT.<wbr \/>iso885915@<wbr \/>euro<\/tt> for &#8220;German, as used in Austria, using the ISO-8859-15 character set and the Euro as the currency.&#8221;<\/p>\n<p>This means that if you just dive in and call <code>towlower<\/code> without doing any locale preparation, the only mapping you get is from characters U+0041 (LATIN CAPITAL LETTER A) through U+005A (LATIN CAPITAL LETTER Z) to U+0061 (LATIN SMALL LETTER A) through U+007A (LATIN SMALL LETTER Z).<\/p>\n<p>The Microsoft Visual C++ compiler standard library comes with bonus functions like <code>_strlwr<\/code> and <code>_wcslwr<\/code> for converting strings to lowercase. By default, these follow the current C runtime locale, so again, if you don&#8217;t do any locale preparation, you&#8217;re going to get the na\u00efve case mapping.<\/p>\n<pre>wchar_t example[] = L\"\\x00C0\" L\"BC\"; \/\/ \u00c0BC\r\n_wcslwr_s(example); \/\/ Result: \u00c0bc\r\n<\/pre>\n<p>Next time, we&#8217;ll look at how to get <code>_wcslwr<\/code> to operate on more interesting locales than the C locale.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It barely understands anything.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-110846","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>It barely understands 
anything.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/110846","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=110846"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/110846\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=110846"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=110846"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=110846"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}