{"id":71,"date":"2021-04-12T16:19:28","date_gmt":"2021-04-12T23:19:28","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/i18n\/?p=71"},"modified":"2021-04-12T16:19:48","modified_gmt":"2021-04-12T23:19:48","slug":"culture-data-shouldnt-be-considered-stable","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/i18n\/culture-data-shouldnt-be-considered-stable\/","title":{"rendered":"Culture data shouldn&#8217;t be considered stable"},"content":{"rendered":"<p>I thought I&#8217;d start off with a topic I&#8217;ve discussed <a href=\"https:\/\/docs.microsoft.com\/en-us\/archive\/blogs\/shawnste\/culture-data-shouldnt-be-considered-stable-except-for-invariant\">before<\/a> on my old blog.\u00a0 It comes up every once in a while, so it doesn&#8217;t hurt to have a reminder and update.<\/p>\n<h2><span style=\"font-size: 24pt;\">&#8220;Culture data shouldn&#8217;t be considered stable&#8221;<\/span><\/h2>\n<p>Computers like to manipulate data, but eventually that data needs to be presented in a form that a human person can easily understand.\u00a0 Culture, aka Locale, Region, or Language information, is required so that programs can present data in a human readable form.\u00a0 Things like 3\/12\/21 vs 12-3-2021, or March vs M\u00e4rz vs marzo.\u00a0 1,21 compared to 1.21, etc.\u00a0 Values often need to be displayed in a form that actual humans expect for their culture.<\/p>\n<p>It is fairly easy for a developer to learn and understand that users in some regions have different expectations.\u00a0 Preference of m\/d\/y or d\/m\/y date formats is probably one of the first localization things most developers discover.\u00a0 What is less obvious is that the preference may not be the same next year or week.\u00a0 Or, worse, that there are less obvious cultural variations, and those might vary as well.<\/p>\n<p>Some people might learn about a cultural change, such as moving Daylight Savings Time, or the revaluation of a currency.\u00a0 Those might be 
newsworthy enough to notice, particularly if it&#8217;s a culture of interest to the developer.\u00a0 These events can make it appear that these changes are rare, or even historical, events.\u00a0 It is easy to miss the impact of cultural data changes on modern applications, particularly those with a world-wide audience.<\/p>\n<h2><span style=\"font-size: 18pt;\">Cultural Expectations<\/span><\/h2>\n<p>Locale\/Culture data represents a cultural, regional, admin and\/or user preference for cultural expectations.\u00a0 Applications should NOT make any assumptions that rely on this data being stable.\u00a0 This premise holds regardless of the operating system or platform the application is using.\u00a0 .NET relies on the OS data.\u00a0 Windows typically relies on &#8220;NLS&#8221; (National Language Support) data that has been collected over years.\u00a0 Some applications use ICU (International Components for Unicode).\u00a0 Most other OSes base their support on a version of ICU tailored for their business needs.<\/p>\n<p>Right off the bat, that paragraph should make it obvious that all of these components and applications are getting data from different sources, and so it follows pretty easily that the information one piece of software provides may differ from what another platform or application provides.\u00a0 What is less obvious is that this data can change over time.<\/p>\n<h2><span style=\"font-size: 24pt;\">Reasons Cultural Data Changes<\/span><\/h2>\n<p>There are many reasons that culture data can change.\u00a0 Here are a few:<\/p>\n<ul>\n<li>The most obvious reason is that there is a bug in the data that was corrected.\u00a0 (Believe it or not, platforms make mistakes ;-))\u00a0 In this case our users (and yours too) want culturally correct data, so we have to fix the bug even if it breaks existing applications.<\/li>\n<li>Another reason is that cultural preferences can change.\u00a0 There are lots of ways this can happen, and it does happen:\n<ul>\n<li>Global awareness, cross 
cultural exchange, the changing role of computers and so forth can all affect a cultural preference.<\/li>\n<li>International treaties, trade, etc. can change values.\u00a0 The adoption of the Euro changed many countries&#8217; currency symbols to \u20ac.<\/li>\n<li>National or regional regulations can impact these values too.<\/li>\n<li>Preferred spelling of words can change over time.<\/li>\n<li>Preferred date formats, etc. can shift to attempt to address ambiguity or to better fit their neighbors.<\/li>\n<li>For many folks it is hard to imagine this stuff changing.\u00a0 One of the most obvious examples that some people may have encountered is changing preferences for Daylight Saving Time.<\/li>\n<\/ul>\n<\/li>\n<li>Multiple preferences may exist for a culture.\u00a0 The preferred choice can then change over time.<\/li>\n<li>Data may be subject to periodic changes that make users feel like it is stable in the moment.\u00a0 However, that data may have always been expected to shift.\n<ul>\n<li>Early developers used a convention of 2-digit year forms.\u00a0 With the year 2000, &#8220;Y2K&#8221; made it obvious that 4-digit years were needed in many cases.\u00a0 Twenty years later people are shifting back to the 2-digit abbreviations.<\/li>\n<li>The Japanese calendar adds eras, typically a generational event, which makes the current era seem stable in the moment.\u00a0 But then shifts like the addition of the Reiwa era remind users and developers that the perceived stability would one day change.<\/li>\n<\/ul>\n<\/li>\n<li>Users could have overridden some values, like date or time formats.\u00a0 Some platforms allow requesting locale data without these user overrides.\u00a0 However, we recommend that applications respect user preferences, as those indicate what the user wants.\u00a0 Apps shouldn&#8217;t be second-guessing the user&#8217;s cultural needs.<\/li>\n<li>Users or administrators could have created a replacement culture, replacing common default values for a culture 
with company-specific, region-specific, or other variations of the standard data.\n<ul>\n<li>Some cultures may have preferences that vary depending on the setting.\u00a0 A business might have a more formal form than an Internet Caf\u00e9.<\/li>\n<li>An enterprise may require a specific date format or time format for the entire organization.<\/li>\n<li>One obvious case is a 12-hour or 24-hour clock preference in locales where either can be used.<\/li>\n<\/ul>\n<\/li>\n<li>Machines could have differing versions of the same custom culture, or a culture could be custom on one machine and a Windows-only culture on another.<\/li>\n<li>Data could originate on different machines, devices, platforms or architectures.\u00a0 Even when they use the same source for cultural information, those systems could be on different revisions.<\/li>\n<\/ul>\n<h2><span style=\"font-size: 24pt;\">Pitfalls of Changing Culture Information<\/span><\/h2>\n<p>This topic probably wouldn&#8217;t be as interesting if there weren&#8217;t some serious traps that apps can fall into when they don&#8217;t consider the shifting nature of culture data.<\/p>\n<p>For example, a common operation is to format a date as a string using a particular date format.\u00a0 The app may later try to parse that string to recover the original date.\u00a0 What happens if the machine changed, if the framework version changed (newer data), if the platform changed, if a custom culture was modified, or even if the user just changed their preference from M\/D\/Y to D\/M\/Y?<\/p>\n<p>Apps that persist data in a human form and try to recover that in a machine format later are at risk.\u00a0 The form that is useful for a person is typically more ambiguous than something a computer would try to consume.<\/p>\n<h2><span style=\"font-size: 24pt;\">Avoiding the Trap of Changing Information<\/span><\/h2>\n<p>There are some patterns that developers can use to avoid difficulty with changing cultural preferences.<\/p>\n<h2><span 
style=\"font-size: 18pt;\">Remember the Audience<\/span><\/h2>\n<p>Machines need unambiguous representations of data.\u00a0 Humans need data formatted in the manner they are accustomed to.<\/p>\n<h3><span style=\"font-size: 14pt;\">Techniques for Machines<\/span><\/h3>\n<p>Oftentimes machine data is stored in a well-defined binary form.\u00a0 Other times it&#8217;s exchanged through formats such as XML or JSON.\u00a0 The key point is to ensure that the data is stored in a well-defined and consistent format.\u00a0 Oftentimes this is a standards-based format, like ISO 8601 date formats.<\/p>\n<p>Particular care should be taken when creating new protocols and storage mechanisms.\u00a0 It can be unfortunate to allow a dependency on a linguistic locale and then find that the behavior shifts over time.<\/p>\n<p>Machine-compatible data should be processed in a non-linguistic manner for consistency.\u00a0 For string formatting in .NET and Windows, an &#8220;Invariant&#8221; Culture (Locale) is available.\u00a0 The expectation is that CultureInfo.InvariantCulture provides stable formatting over time.\u00a0 (Even InvariantCulture, though, can shift for comparisons, i.e. collation.)<\/p>\n<p>I prefer formats that are explicit and can be handled through something like a simple sscanf() call rather than a more complex parser, though code that calls sscanf() still needs to be sure to use the &#8220;C&#8221; locale to avoid problems like variations of decimal separators.<\/p>\n<p>Whether snapping to an existing standard or creating your own data type, the key point for machine-readable data is to ensure that the format is explicitly defined and consistent.<\/p>\n<h3><span style=\"font-size: 14pt;\">Making Humans Happy<\/span><\/h3>\n<p>Of course you can&#8217;t have both &#8220;correct&#8221; display for the current user and perfect round-tripping if the culture data changes. 
The earlier machine techniques help prevent corruption of data, but may not be great for humans.<\/p>\n<p>The key point for human presentation is to recognize when the application&#8217;s context has moved on from data storage to presentation.\u00a0 The moment of finally presenting data to the user is the appropriate time to use culture-specific behavior to satisfy the expectations of the user.<\/p>\n<p>After making the data pretty and formatting it for human presentation, the application should then recognize that the human-formatted data is no longer appropriate for machine processing and interchange.\u00a0 If a bunch of subsidiaries are collecting data and sending it up to their headquarters, they should likely transmit the machine-readable data rather than the human-formatted reports, particularly if those subsidiaries are in different locales with different expectations.<\/p>\n<h2><span style=\"font-size: 24pt;\">Conclusion<\/span><\/h2>\n<p>Now we&#8217;re back to the beginning:\u00a0 &#8220;Culture data shouldn&#8217;t be considered stable.&#8221;\u00a0 By keeping the context in mind, we prevent problems and errors when exchanging and interpreting computerized data.\u00a0 Go ahead and show real people the pretty data that they expect, but make sure the machines on the back end have the orderly versions they need.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Realizing that people&#8217;s preferences change and that those preferences impact data output and consumed by computers.  By remembering the context of an operation we can avoid corruption and confusion of our data.  
Understanding when data is intended for human readability or consumption by another machine avoids problems with data confusion.<\/p>\n","protected":false},"author":17042,"featured_media":6,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[5,4],"tags":[9,8,11],"class_list":["post-71","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cultures-locales","category-data","tag-culture-locale","tag-data","tag-tag"],"acf":[],"blog_post_summary":"<p>Realizing that people&#8217;s preferences change and that those preferences impact data output and consumed by computers.  By remembering the context of an operation we can avoid corruption and confusion of our data.  Understanding when data is intended for human readability or consumption by another machine avoids problems with data confusion.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/i18n\/wp-json\/wp\/v2\/posts\/71","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/i18n\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/i18n\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/i18n\/wp-json\/wp\/v2\/users\/17042"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/i18n\/wp-json\/wp\/v2\/comments?post=71"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/i18n\/wp-json\/wp\/v2\/posts\/71\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/i18n\/wp-json\/wp\/v2\/media\/6"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/i18n\/wp-json\/wp\/v2\/media?parent=71"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/i18n\/wp-json\/wp\/v2\/categories?post=71"},{"taxonomy":"post_tag","embeddable":true,"h
ref":"https:\/\/devblogs.microsoft.com\/i18n\/wp-json\/wp\/v2\/tags?post=71"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}