Converting a text file from one encoding to another

Buck

The .NET framework handles file encodings very nicely. Not too long ago, I needed to convert files from one encoding to another for a library that didn’t handle the encoding of the original file. Since someone on an internal alias asked about doing this a couple of weeks ago, I thought it would be useful to post the code here.


The .NET runtime uses Unicode (UTF-16) as the encoding for all strings. The constructors of the StreamReader and StreamWriter classes in System.IO accept an Encoding as a parameter. So, to convert from one encoding to another, we just need to read the file contents into a string using the original encoding and then write the string out in the desired encoding.
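
As a minimal sketch of that idea (not the original code from this post), the following reads the whole file into a string and writes it back out. The class name SimpleEncodingConverter and its parameter names are made up for illustration, and this version assumes the file fits comfortably in memory.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper for illustration; the name is not from the original post.
public static class SimpleEncodingConverter
{
    // Decodes the source file with sourceEncoding into a .NET string, then
    // encodes that string with destEncoding when writing the destination.
    public static void Convert(string sourcePath, string destPath,
                               Encoding sourceEncoding, Encoding destEncoding)
    {
        string contents;
        using (StreamReader reader = new StreamReader(sourcePath, sourceEncoding))
        {
            contents = reader.ReadToEnd();
        }

        // The 'false' argument means overwrite rather than append.
        using (StreamWriter writer = new StreamWriter(destPath, false, destEncoding))
        {
            writer.Write(contents);
        }
    }
}
```

Characters that the destination encoding cannot represent are not preserved. With Encoding.ASCII, for example, they are written as '?' by default.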


The Path class, also in System.IO, provides us with an easy way to create temporary files in the Windows temporary directory (Path.GetTempFileName() does this). We write the results to a temporary file first so that if anything goes wrong, the destination file is not overwritten. This also allows the conversion to work when the source and destination are the same file.
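
The temporary-file pattern on its own might look something like the sketch below. SafeFileWriter and WriteViaTempFile are hypothetical names, and using File.Copy for the final step is one choice among several, picked here because it overwrites an existing destination and works when the temporary directory is on a different volume.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the original post.
public static class SafeFileWriter
{
    // Runs writeAction against a temporary file and replaces destPath only
    // after writeAction has finished without throwing.
    public static void WriteViaTempFile(string destPath, Action<string> writeAction)
    {
        // Creates a uniquely named, zero-byte file in the temp directory.
        string tempPath = Path.GetTempFileName();
        try
        {
            writeAction(tempPath);

            // Reached only if the write succeeded; 'true' allows overwriting.
            File.Copy(tempPath, destPath, true);
        }
        finally
        {
            // Clean up the temporary file whether or not the write succeeded.
            File.Delete(tempPath);
        }
    }
}
```

Note that the final copy is not atomic, so this protects against failures during the conversion itself rather than against a failure in the middle of the copy.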


StreamReader allows us to read the source file in blocks so that we don’t have any size limitations on the file that we need to convert.
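
Reading in blocks can be sketched as a simple loop over a fixed-size character buffer. BlockCopier and CopyInBlocks are hypothetical names, and the 4096-character buffer size is an arbitrary choice for the example.

```csharp
using System.IO;

// Hypothetical helper for illustration; not part of the original post.
public static class BlockCopier
{
    // Copies all characters from reader to writer through a fixed-size buffer,
    // so memory use does not grow with the size of the input.
    public static void CopyInBlocks(TextReader reader, TextWriter writer)
    {
        char[] buffer = new char[4096]; // arbitrary block size
        int charsRead;

        // Read returns the number of characters read, or 0 at end of input.
        while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Write only the portion of the buffer that was filled.
            writer.Write(buffer, 0, charsRead);
        }
    }
}
```

Because StreamReader and StreamWriter derive from TextReader and TextWriter, the same loop works when the reader and writer are opened on files with different encodings.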


The Main() method below is just a trivial wrapper to call ConvertFileEncoding(), since the code wasn’t originally a standalone app.


// Example: convert test.cs test-conv.cs ascii utf-8
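
Putting the pieces together, a standalone program along these lines might look like the sketch below. This is a reconstruction under assumptions rather than the original listing: the class name Converter, the argument order (taken from the example comment above), the use of Encoding.GetEncoding() to resolve the encoding names, and the error handling are all guesses.

```csharp
using System;
using System.IO;
using System.Text;

// A sketch only; names and details other than Main() and
// ConvertFileEncoding() are assumptions, not the original code.
public static class Converter
{
    public static int Main(string[] args)
    {
        if (args.Length != 4)
        {
            Console.Error.WriteLine(
                "Usage: convert <source> <dest> <sourceEncoding> <destEncoding>");
            return 1;
        }

        try
        {
            // GetEncoding throws if a name is not recognized.
            ConvertFileEncoding(args[0], args[1],
                                Encoding.GetEncoding(args[2]),
                                Encoding.GetEncoding(args[3]));
            return 0;
        }
        catch (Exception e)
        {
            Console.Error.WriteLine(e.Message);
            return 1;
        }
    }

    public static void ConvertFileEncoding(string sourcePath, string destPath,
                                           Encoding sourceEncoding,
                                           Encoding destEncoding)
    {
        // Write to a temporary file so a failure leaves destPath untouched.
        string tempPath = Path.GetTempFileName();
        try
        {
            using (StreamReader reader = new StreamReader(sourcePath, sourceEncoding))
            using (StreamWriter writer = new StreamWriter(tempPath, false, destEncoding))
            {
                // Copy in blocks so the file size is not limited by memory.
                char[] buffer = new char[4096];
                int charsRead;
                while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
                {
                    writer.Write(buffer, 0, charsRead);
                }
            }

            // Both files are closed by now, so the source and destination
            // may be the same path.
            File.Copy(tempPath, destPath, true);
        }
        finally
        {
            File.Delete(tempPath);
        }
    }
}
```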

Buck Hodges
