{"id":7703,"date":"2012-05-04T07:00:00","date_gmt":"2012-05-04T07:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2012\/05\/04\/how-does-the-multibytetowidechar-function-treat-invalid-characters\/"},"modified":"2012-05-04T07:00:00","modified_gmt":"2012-05-04T07:00:00","slug":"how-does-the-multibytetowidechar-function-treat-invalid-characters","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20120504-00\/?p=7703","title":{"rendered":"How does the MultiByteToWideChar function treat invalid characters?"},"content":{"rendered":"<p>\nThe <code>MB_ERR_INVALID_CHARS<\/code> flag\ncontrols how the\n<code>Multi&shy;Byte&shy;To&shy;Wide&shy;Char<\/code>\nfunction treats invalid characters.\nSome people claim that the following sentences in the documentation\nare contradictory:\n<\/p>\n<ul>\n<li>&#8220;Starting with Windows Vista, the function does not drop\n    illegal code points if the application does not set the flag.&#8221;<\/p>\n<li>&#8220;Windows XP: If this flag is not set,\n    the function silently drops illegal code points.&#8221;<\/p>\n<li>&#8220;The function fails if\n    <code>MB_ERR_INVALID_CHARS<\/code> is set\n    and an invalid character is encountered in the source string.&#8221;\n<\/ul>\n<p>\nActually, the three sentences are talking about different cases.\nThe first two talk about what happens if you omit the flag;\nthe third talks about what happens if you include the flag.\n<\/p>\n<p>\nSince people seem to like tables, here&#8217;s a description of\nthe <code>MB_ERR_INVALID_CHARS<\/code> flag\nin tabular form:\n<\/p>\n<table BORDER=\"1\" STYLE=\"border-collapse: collapse;border: solid .75pt black\">\n<tr>\n<th><code>MB_ERR_INVALID_CHARS<\/code> set?<\/th>\n<th>Operating system<\/th>\n<th>Treatment of invalid character<\/th>\n<\/tr>\n<tr>\n<td>Yes<\/td>\n<td>Any<\/td>\n<td>Function fails<\/td>\n<\/tr>\n<tr>\n<td ROWSPAN=\"2\">No<\/td>\n<td>XP and earlier<\/td>\n<td>Character is 
dropped<\/td>\n<\/tr>\n<tr>\n<td>Vista and later<\/td>\n<td>Character is not dropped<\/td>\n<\/tr>\n<\/table>\n<p>\nHere&#8217;s a sample program that illustrates the possibilities:\n<\/p>\n<pre>\n#include &lt;windows.h&gt;\n#include &lt;ole2.h&gt;\n#include &lt;windowsx.h&gt;\n#include &lt;commctrl.h&gt;\n#include &lt;strsafe.h&gt;\n#include &lt;uxtheme.h&gt;\n#include &lt;stdio.h&gt;\nvoid MB2WCTest(DWORD flags)\n{\n WCHAR szOut[256];\n int cch = MultiByteToWideChar(CP_UTF8, flags,\n                               \"\\xC0\\x41\\x42\", 3, szOut, 256);\n printf(\"Called with flags %d\\n\", flags);\n printf(\"Return value is %d\\n\", cch);\n for (int i = 0; i &lt; cch; i++) {\n  printf(\"Value[%d] = %d\\n\", i, szOut[i]);\n }\n printf(\"-----\\n\");\n}\nint __cdecl main(int argc, char **argv)\n{\n MB2WCTest(0);\n MB2WCTest(MB_ERR_INVALID_CHARS);\n return 0;\n}\n<\/pre>\n<p>\nIf you run this on Windows&nbsp;XP, you get\n<\/p>\n<pre>\nCalled with flags 0\nReturn value is 2\nValue[0] = 65\nValue[1] = 66\n-----\nCalled with flags 8\nReturn value is 0\n-----\n<\/pre>\n<p>\nThis demonstrates that passing the\n<code>MB_ERR_INVALID_CHARS<\/code> flag\ncauses the function to fail,\nand omitting it causes\nthe invalid character \\xC0 to be dropped.\n<\/p>\n<p>\nIf you run this on Windows&nbsp;Vista, you get\n<\/p>\n<pre>\nCalled with flags 0\nReturn value is 3\nValue[0] = 65533\nValue[1] = 65\nValue[2] = 66\n-----\nCalled with flags 8\nReturn value is 0\n-----\n<\/pre>\n<p>\nThis demonstrates again that passing the\n<code>MB_ERR_INVALID_CHARS<\/code> flag\ncauses the function to fail,\nbut this time, if you omit the flag,\nthe invalid character \\xC0 is converted to U+FFFD,\nwhich is\n<a HREF=\"http:\/\/en.wikipedia.org\/wiki\/Specials (Unicode block)#Replacement_character\">\nREPLACEMENT CHARACTER<\/a>.\n(Note that it does not appear to be documented precisely\n<i>what<\/i> happens to invalid characters, aside from the fact\nthat they are not dropped.\nPerhaps code pages other than 
<code>CP_UTF8<\/code> convert\nthem to some other default character.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The MB_ERR_INVALID_CHARS flag controls how the Multi&shy;Byte&shy;To&shy;Wide&shy;Char function treats invalid characters. Some people claim that the following sentences in the documentation are contradictory: &#8220;Starting with Windows Vista, the function does not drop illegal code points if the application does not set the flag.&#8221; &#8220;Windows XP: If this flag is not set, the function silently drops [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-7703","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>The MB_ERR_INVALID_CHARS flag controls how the Multi&shy;Byte&shy;To&shy;Wide&shy;Char function treats invalid characters. 
Some people claim that the following sentences in the documentation are contradictory: &#8220;Starting with Windows Vista, the function does not drop illegal code points if the application does not set the flag.&#8221; &#8220;Windows XP: If this flag is not set, the function silently drops [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/7703","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=7703"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/7703\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=7703"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=7703"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=7703"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}