{"id":5683,"date":"2004-05-18T09:17:00","date_gmt":"2004-05-18T09:17:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/buckh\/2004\/05\/18\/converting-a-text-file-from-one-encoding-to-another\/"},"modified":"2004-05-18T09:17:00","modified_gmt":"2004-05-18T09:17:00","slug":"converting-a-text-file-from-one-encoding-to-another","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/buckh\/converting-a-text-file-from-one-encoding-to-another\/","title":{"rendered":"Converting a text file from one encoding to another"},"content":{"rendered":"<p><P>The .NET framework handles file encodings very nicely.&nbsp; Not too long ago, I needed to convert files from one encoding to another for a library that didn&#8217;t handle the encoding of the original file.&nbsp; Since someone on an internal alias asked about doing this a couple of weeks ago, I thought it would be useful to post it here.<\/P>\n<P>The .NET runtime&nbsp;uses <A href=\"http:\/\/www.unicode.org\/\">Unicode<\/A>&nbsp;as the encoding for all strings.&nbsp; The <FONT face=\"Courier New\"><A href=\"http:\/\/msdn.microsoft.com\/library\/default.asp?url=\/library\/en-us\/cpref\/html\/frlrfSystemIOStreamReaderClassTopic.asp\">StreamReader<\/A><\/FONT> and <FONT face=\"Courier New\"><A href=\"http:\/\/msdn.microsoft.com\/library\/default.asp?url=\/library\/en-us\/cpref\/html\/frlrfsystemiostreamwriterclasstopic.asp\">StreamWriter<\/A><\/FONT> classes in <A href=\"http:\/\/msdn.microsoft.com\/library\/default.asp?url=\/library\/en-us\/cpref\/html\/frlrfsystemio.asp\">System.IO<\/A> take an <A href=\"http:\/\/msdn.microsoft.com\/library\/default.asp?url=\/library\/en-us\/cpref\/html\/frlrfsystemtextencodingclasstopic.asp\">Encoding<\/A> as a parameter.&nbsp; So, to convert from one encoding to another, we just need to specify the original encoding and&nbsp;read&nbsp;the file contents into a string followed by writing out the string in the desired encoding.<\/P>\n<P>The <FONT face=\"Courier 
New\"><A href=\"http:\/\/msdn.microsoft.com\/library\/default.asp?url=\/library\/en-us\/cpref\/html\/frlrfsystemiopathclasstopic.asp\">Path<\/A><\/FONT> class, also in System.IO,&nbsp;provides us with an easy way to create temporary files in the Windows temporary directory.&nbsp; We can write the results to a temporary file so that if anything goes wrong, the destination file is not overwritten.&nbsp; Also, this approach allows the conversion to work when the source and destination are the same file.<\/P>\n<P><FONT face=\"Courier New\">StreamReader<\/FONT> allows us to read the source file in blocks, so there is no size limit on the files that we can convert.<\/P>\n<P>The <FONT face=\"Courier New\">Main()<\/FONT> method below is just a trivial wrapper to call the <FONT face=\"Courier New\">ConvertFileEncoding()<\/FONT><FONT face=\"Times New Roman\"> method, since the code wasn&#8217;t originally a standalone app.<\/FONT><\/P>\n<P><FONT face=\"Courier New\">\/\/ Example: convert test.cs test-conv.cs ascii utf-8<\/FONT><\/P><PRE>using System;\nusing System.IO;\nusing System.Text;<\/p>\n<p>public class Convert\n{\n    public static void Main(String[] args)\n    {\n        \/\/ Print a simple usage statement if the number of arguments is incorrect.\n        if (args.Length != 4)\n        {\n            Console.WriteLine(&quot;Usage: {0} inputFile outputFile inputEncoding outputEncoding&quot;,\n                              Path.GetFileName(Environment.GetCommandLineArgs()[0]));\n            Environment.Exit(1);\n        }<\/p>\n<p>        ConvertFileEncoding(args[0], args[1], Encoding.GetEncoding(args[2]),\n                            Encoding.GetEncoding(args[3]));\n    }<\/p>\n<p>    \/\/\/ &lt;summary&gt;\n    \/\/\/ Converts a file from one encoding to another.\n    \/\/\/ &lt;\/summary&gt;\n    \/\/\/ &lt;param name=&quot;sourcePath&quot;&gt;the file to convert&lt;\/param&gt;\n    \/\/\/ &lt;param name=&quot;destPath&quot;&gt;the destination for the converted 
file&lt;\/param&gt;\n    \/\/\/ &lt;param name=&quot;sourceEncoding&quot;&gt;the original file encoding&lt;\/param&gt;\n    \/\/\/ &lt;param name=&quot;destEncoding&quot;&gt;the encoding to which the contents should be converted&lt;\/param&gt;\n    public static void ConvertFileEncoding(String sourcePath, String destPath,\n                                           Encoding sourceEncoding, Encoding destEncoding)\n    {\n        \/\/ If the destination&#8217;s parent doesn&#8217;t exist, create it.\n        String parent = Path.GetDirectoryName(Path.GetFullPath(destPath));\n        if (!Directory.Exists(parent))\n        {\n            Directory.CreateDirectory(parent);\n        }<\/p>\n<p>        \/\/ If the source and destination encodings are the same, just copy the file.\n        \/\/ Equals() compares the encodings by value rather than by reference.\n        if (sourceEncoding.Equals(destEncoding))\n        {\n            File.Copy(sourcePath, destPath, true);\n            return;\n        }<\/p>\n<p>        \/\/ Convert the file.\n        String tempName = null;\n        try\n        {\n            tempName = Path.GetTempFileName();\n            using (StreamReader sr = new StreamReader(sourcePath, sourceEncoding, false))\n            {\n                using (StreamWriter sw = new StreamWriter(tempName, false, destEncoding))\n                {\n                    int charsRead;\n                    char[] buffer = new char[128 * 1024];\n                    while ((charsRead = sr.ReadBlock(buffer, 0, buffer.Length)) &gt; 0)\n                    {\n                        sw.Write(buffer, 0, charsRead);\n                    }\n                }\n            }\n            File.Delete(destPath);\n            File.Move(tempName, destPath);\n        }\n        finally\n        {\n            \/\/ Remove the temporary file if it was created and still exists.\n            if (tempName != null)\n            {\n                File.Delete(tempName);\n            }\n        }\n    }\n}\n<\/PRE><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The .NET framework handles file encodings very nicely.&nbsp; Not too long ago, I needed to convert files from one encoding to another for a library 
that didn&#8217;t handle the encoding of the original file.&nbsp; Since someone on an internal alias asked about doing this a couple of weeks ago, I thought it would be useful [&hellip;]<\/p>\n","protected":false},"author":94,"featured_media":10268,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[2],"class_list":["post-5683","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized","tag-c"],"acf":[],"blog_post_summary":"<p>The .NET framework handles file encodings very nicely.&nbsp; Not too long ago, I needed to convert files from one encoding to another for a library that didn&#8217;t handle the encoding of the original file.&nbsp; Since someone on an internal alias asked about doing this a couple of weeks ago, I thought it would be useful [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/buckh\/wp-json\/wp\/v2\/posts\/5683","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/buckh\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/buckh\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/buckh\/wp-json\/wp\/v2\/users\/94"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/buckh\/wp-json\/wp\/v2\/comments?post=5683"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/buckh\/wp-json\/wp\/v2\/posts\/5683\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/buckh\/wp-json\/wp\/v2\/media\/10268"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/buckh\/wp-json\/wp\/v2\/media?parent=5683"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/buckh\/wp-json\/wp\/v2\/categories?post=5683"},{"taxonomy":"post_tag","embeddable":
true,"href":"https:\/\/devblogs.microsoft.com\/buckh\/wp-json\/wp\/v2\/tags?post=5683"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}