Hey, Scripting Guy! How Can I Remove Unwanted Tabs From a Text File?

Hey, Scripting Guy! Question

Hey, Scripting Guy! I have about 5,000 text files that have unwanted tabs at the end of each line. How can I write a script that removes these unwanted tabs?

— TT

Hey, Scripting Guy! Answer

Note. If this looks – and sounds – a little different than the typical Hey, Scripting Guy! column, well, there’s a good reason for that: that’s because it is a little different than the typical Hey, Scripting Guy! column. The original column for October 8, 2007 had to be taken down because it referred to a third-party company (flattering references, but references nonetheless). We were given the option of “reworking” the article, but there was really no way to do that without rewriting the entire column. Therefore, we decided to just post the code and the explanation of the code and call it good. We hope you enjoy this column as much as we’ve enjoyed bringing it to you!

We start this script off the same way we start off many of our text file scripts: by defining a pair of constants (ForReading and ForWriting) that will be used when we go to open our text file. After the constants are defined, we create an instance of the Scripting.FileSystemObject, then use the OpenTextFile method to open the file C:\Scripts\Test.txt for reading:
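A sketch of what that opening section might look like. The constant values 1 and 2 are the ones OpenTextFile uses for reading and writing; the variable names objFSO and objFile are assumptions, since the text doesn't name them:

```vbscript
' I/O mode values understood by OpenTextFile
Const ForReading = 1
Const ForWriting = 2

' objFSO and objFile are assumed names, not taken from the text
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\Scripts\Test.txt", ForReading)
```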

Once the file is open, we set up a Do Until loop that runs until the file’s AtEndOfStream property is True. (Which is just a fancy way of saying that we’ll keep looping until we’ve read the entire file.) Inside that loop, we use the ReadLine method to read the next line in the file, storing the value in a variable named strLine:
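The loop skeleton would look something like this. It is a fragment rather than a full script: objFile is an assumed name for the file opened for reading in the previous step.

```vbscript
' Assumes objFile is the TextStream already opened for reading
Do Until objFile.AtEndOfStream
    ' Each pass reads the next unread line of the file
    strLine = objFile.ReadLine

    ' ... the tab check and cleanup described below go here ...
Loop
```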

That brings us to this line of code:
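Going by the description that follows, the line in question is most likely the If statement below. The name strLine comes from the text; the sample value and the WScript.Echo call are only there so the snippet runs on its own under cscript:

```vbscript
' Sample value standing in for a line read from the file
strLine = "cat" & Chr(9)

' Right(strLine, 1) returns the last character; Chr(9) is the tab character
If Right(strLine, 1) = Chr(9) Then
    WScript.Echo "This line ends with a tab."
End If
```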

Although TT said that all the lines in his text files end in an unwanted tab, we decided to use this line of code to verify that a given line really does end with a tab character. How do we do that? By using the Right function to take a peek at the very last character in the string. If this character has an ASCII value of 9 (that is, if it equals Chr(9)), that can mean only one thing: it’s a tab character. And that means that it has to go.

OK, so then how do we remove that unwanted tab from the end of the string? Well, to begin with, we use the Len function to determine the total number of characters in the string, storing that value in a variable named intLength. We then use the Left function to grab all the characters except the very last one. (That’s what the construction intLength - 1 is for.) In other words, suppose our string was this: cat. The length of the string (total characters) is 3; the length minus 1 is, um, hold on a second … 2. Therefore, if we start at the beginning of the string and take the first two characters, that means our new string is equal to this: ca.

Just in case there was any confusion as to what we were doing.
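Here is the cat example as a small runnable snippet. The Len and Left calls follow the description; the WScript.Echo line is added purely for illustration:

```vbscript
strLine = "cat"

intLength = Len(strLine)                  ' 3 characters in total
strLine = Left(strLine, intLength - 1)    ' keep the first 2 characters

WScript.Echo strLine                      ' ca
```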

From there we use this line of code to add the modified string value (plus a carriage return-linefeed pair) to a variable named strContents:
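A minimal sketch of that accumulation step, assuming the built-in vbCrLf constant is used for the carriage return-linefeed. The two sample lines are made up so the snippet can run by itself:

```vbscript
strContents = ""

' First pass through the loop
strLine = "first line"
strContents = strContents & strLine & vbCrLf

' Second pass: the new line is tacked onto the end of what we already have
strLine = "second line"
strContents = strContents & strLine & vbCrLf

WScript.Echo strContents
```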

And then we loop around and repeat this process with the next line in the text file, removing the unwanted tab and then tacking that modified line onto the end of strContents. And so on and so on and so on.

By the time we exit the loop we will have constructed a brand-new version of our file in memory, a version where all the unwanted tabs have been removed. In order to write this new version back to the actual file itself (C:\Scripts\Test.txt) we first need to close the file, then use the OpenTextFile method to reopen it, this time for writing:
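Sketched as a fragment, the close-and-reopen step might look like this. It assumes objFSO and objFile (hypothetical names) are the objects created when the file was first opened, and that ForWriting is the constant defined at the start of the script:

```vbscript
' Release the read handle before asking for write access to the same file
objFile.Close

' Reopening with ForWriting (value 2) discards the file's existing contents
Set objFile = objFSO.OpenTextFile("C:\Scripts\Test.txt", ForWriting)
```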

Once the file is open we use the Write method to write the new version of the file to Test.txt. As soon as that’s done we call the Close method to close the file a final time and then we call it a day.
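Assembled from the steps described above, the whole script would look roughly like this. Treat it as a reconstruction rather than the original listing: the object names objFSO and objFile are assumptions, and so is the use of vbCrLf for the line ending.

```vbscript
Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\Scripts\Test.txt", ForReading)

strContents = ""

Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine

    ' Only touch lines whose last character is a tab
    If Right(strLine, 1) = Chr(9) Then
        intLength = Len(strLine)
        strLine = Left(strLine, intLength - 1)
    End If

    strContents = strContents & strLine & vbCrLf
Loop

objFile.Close

' Reopen the same file for writing and replace its contents
Set objFile = objFSO.OpenTextFile("C:\Scripts\Test.txt", ForWriting)
objFile.Write strContents
objFile.Close
```

Because the script overwrites Test.txt in place, it is probably worth trying it on a copy of the file first.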

Good question: why didn’t we just use the Replace function to replace all the tabs? Well, we could have, except for one thing: it’s possible that there are other tabs in the file, tabs that shouldn’t be removed. The Replace function would remove all tabs; the approach we showed you only removes tabs found at the end of a line.
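To see the difference, here is a small comparison on a made-up line that has one tab between two fields and one unwanted tab at the end. The sample string is hypothetical:

```vbscript
strLine = "Name" & Chr(9) & "Value" & Chr(9)

' Replace strips every tab, including the one separating the two fields
WScript.Echo Replace(strLine, Chr(9), "")        ' NameValue

' Left/Len drops only the final character, so the interior tab survives
WScript.Echo Left(strLine, Len(strLine) - 1)
```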

As long as we’re on the subject, it’s also possible that some lines could have more than one unwanted tab tacked onto the end. Is it possible to remove multiple tabs at the end of a line? You bet it is. Here’s a modified Do Loop that features a second Do Until loop; in that second loop, we check the last character in the string and, if it’s a tab, we remove it. We then check the last character in our modified string; if that happens to be a tab we remove it as well. This continues until we’ve removed all the tab characters found at the end of the line.

Here’s the modified code:
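One way that modified loop could be written is sketched below. The exact condition on the inner Do Until loop is an assumption; the text only says that it keeps checking the last character until no trailing tab is left. As before, objFile is an assumed name for the file already opened for reading:

```vbscript
Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine

    ' Inner loop: keep trimming while the last character is still a tab.
    ' On an empty string Right returns "", so the loop stops on its own.
    Do Until Right(strLine, 1) <> Chr(9)
        intLength = Len(strLine)
        strLine = Left(strLine, intLength - 1)
    Loop

    strContents = strContents & strLine & vbCrLf
Loop
```

Putting the test in the inner loop's condition means a line with no trailing tab skips the inner loop entirely, so no separate If statement is needed.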

