How Can I Pick Out and Save Specific Lines in a Text File?



Hey, Scripting Guy! I’d like to be able to read through a text file, select the lines that begin with a particular word (like Failure), and then save only those lines back to the same text file. Is there any way to do that?

— AC


Hey, AC. For simplicity’s sake, we’re assuming you have a text file that looks similar to this:

Success – Operation succeeded 10/1/2004.
Success – Operation succeeded 10/2/2004.
Failure – Operation failed 10/3/2004.
Success – Operation succeeded 10/4/2004.
Failure – Operation failed 10/5/2004.
Success – Operation succeeded 10/6/2004.
Failure – Operation failed 10/7/2004.
Failure – Operation failed 10/8/2004.

You’d like to have a script read through the file, toss out all the lines that begin with Success, and then save the file, so that it holds only information about the operations that failed. In other words, you want the revised file to look like this:

Failure – Operation failed 10/3/2004.
Failure – Operation failed 10/5/2004.
Failure – Operation failed 10/7/2004.
Failure – Operation failed 10/8/2004.

Can you do this with a script? Of course you can:

Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile _
    ("test.log", ForReading)

Do Until objTextFile.AtEndOfStream
    strLine = objTextFile.ReadLine
    If Left(strLine, 7) = "Failure" Then
        strNewText = strNewText & strLine & vbCrLf
    End If
Loop

objTextFile.Close
Set objTextFile = objFSO.OpenTextFile _
    ("test.log", ForWriting)
objTextFile.Write(strNewText)
objTextFile.Close

This script looks a tad bit complicated, but that’s because there’s no way to directly edit a text file using a script. Instead, we have to open the text file, read in the current contents, and then close the file. We then do our “editing” in memory, re-open the text file, replace the current contents with our new data, and then close the file again. And that’s exactly what this script does.

After defining a couple of constants (ForReading and ForWriting, constants we’ll need to open the text file), the script opens the file test.log for reading. We then create a Do loop that runs until we reach the end of the text file; in other words, until we are at the end of the text stream.

And what happens inside that loop? We begin by using the ReadLine method to read the current line of the text file; that line of text gets stored in the variable strLine. We then check to see if the first seven characters of the line happen to be Failure; that’s what the command If Left(strLine, 7) = "Failure" does. If the first seven characters are anything but Failure, then we simply loop back around and read the next line in the file.

But what if the first seven characters are Failure? In that case, we have another variable – strNewText – that we use to store the data we want to save. The line of code strNewText = strNewText & strLine & vbCrLf just takes whatever happens to be in strNewText at the moment, appends the value of strLine, and then adds a carriage return-linefeed (vbCrLf) to the end. This builds up a new dataset in memory; by the time we finish reading the entire file, strNewText is equal to this:

Failure – Operation failed 10/3/2004.
Failure – Operation failed 10/5/2004.
Failure – Operation failed 10/7/2004.
Failure – Operation failed 10/8/2004.

In other words, all we’ve done is read the file and keep a list of all the lines beginning with the word Failure; any lines beginning with anything else have been ignored.

Now that we have our new dataset, we need to close the text file and then reopen it, this time using the constant ForWriting. (Yes, we know, but that’s the way the FileSystemObject works: you have to open it for reading, then close it and reopen it for writing.) With the file open we use the Write method to replace the existing contents of test.log with the value of our variable strNewText. We then close the file, which saves the change we just made. The net effect? Test.log now contains only a list of operations that failed:

Failure – Operation failed 10/3/2004.
Failure – Operation failed 10/5/2004.
Failure – Operation failed 10/7/2004.
Failure – Operation failed 10/8/2004.

A tiny bit cumbersome, but it works just fine.
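
If you need to do this for more than one file or more than one word, the same read-filter-rewrite steps can be wrapped in a subroutine. What follows is only a sketch: the name KeepLinesStartingWith and its two parameters are our own invention for illustration, not something built into VBScript or the FileSystemObject. The one real change is that Len(strPrefix) stands in for the hard-coded 7:

```vbscript
Const ForReading = 1
Const ForWriting = 2

' Hypothetical helper: keeps only the lines of strPath that begin with strPrefix.
Sub KeepLinesStartingWith(strPath, strPrefix)
    Dim objFSO, objTextFile, strLine, strNewText
    strNewText = ""

    Set objFSO = CreateObject("Scripting.FileSystemObject")

    ' Pass 1: read the file and collect the matching lines in memory.
    Set objTextFile = objFSO.OpenTextFile(strPath, ForReading)
    Do Until objTextFile.AtEndOfStream
        strLine = objTextFile.ReadLine
        ' Len(strPrefix) replaces the hard-coded 7 used for "Failure".
        If Left(strLine, Len(strPrefix)) = strPrefix Then
            strNewText = strNewText & strLine & vbCrLf
        End If
    Loop
    objTextFile.Close

    ' Pass 2: reopen the same file for writing and replace its contents.
    Set objTextFile = objFSO.OpenTextFile(strPath, ForWriting)
    objTextFile.Write strNewText
    objTextFile.Close
End Sub

KeepLinesStartingWith "test.log", "Failure"
```

The last line does the same job as the original script; change the two arguments to filter a different file or keep a different word.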

Of course, often the word you are looking for isn’t found at the beginning of the line, but is instead found somewhere in the middle of the line. For example, your log file might look like this:

10/1/2004     Success – Operation succeeded.
10/2/2004     Success – Operation succeeded.
10/3/2004     Failure – Operation failed.
10/4/2004     Success – Operation succeeded.
10/5/2004     Failure – Operation failed.
10/6/2004     Success – Operation succeeded.
10/7/2004     Failure – Operation failed.
10/8/2004     Failure – Operation failed.

If that’s the case, then checking the value of the first seven characters in each line won’t do you much good; you need to check to see if the word Failure appears anywhere within the line. But that’s all right; you can just use the VBScript function InStr. You pass InStr two parameters: the string to search (the variable strLine) and the item to search for (the word Failure). InStr will respond by telling you the character position at which the word Failure begins. For example, in this line the word Failure begins at character 15, so InStr returns a 15:

10/5/2004     Failure – Operation failed.
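
If you’d like to see this for yourself, here’s a small sketch you can run with cscript, using the sample line above. The last call goes a step beyond what this column covers: it assumes you want to ignore case, and relies on InStr’s optional start-position and comparison arguments.

```vbscript
strLine = "10/5/2004     Failure – Operation failed."

' InStr returns the 1-based position where the search term begins.
WScript.Echo InStr(strLine, "Failure")                      ' 15

' By default the comparison is case-sensitive, so lowercase "failure" is not found.
WScript.Echo InStr(strLine, "failure")                      ' 0

' With a start position and vbTextCompare, the search ignores case.
WScript.Echo InStr(1, strLine, "failure", vbTextCompare)    ' 15
```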

If the search term can’t be found in the string, then InStr returns a 0. Therefore, we just use InStr to search each line for the word Failure. If InStr returns a value greater than 0, that means the word was found, and we add the line to the variable strNewText. Here’s a revised Do loop that searches for the value Failure anywhere within the line:

Do Until objTextFile.AtEndOfStream
    strLine = objTextFile.ReadLine
    intFailure = InStr(strLine, "Failure")
    If intFailure > 0 Then
        strNewText = strNewText & strLine & vbCrLf
    End If
Loop

Could we have used this approach earlier? Sure. But when we know for sure that the term we’re looking for will be found at the beginning of the line, we think Left is a better choice (and the Right function is best if we know for sure that the term in question appears at the end of the line). That’s because it’s at least theoretically possible to “fool” the InStr function. For example, InStr will peg this line as marking a failure, even though it actually represents a successful operation:

Success – Failure troubleshooter successfully loaded 10/1/2004.

A somewhat unlikely occurrence, sure. But why take any chances?
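
To make the difference concrete, here’s a minimal sketch that runs both tests against that troublesome line. The messages being echoed are ours, made up for illustration:

```vbscript
strLine = "Success – Failure troubleshooter successfully loaded 10/1/2004."

' InStr finds the word in the middle of the line, so this test passes.
If InStr(strLine, "Failure") > 0 Then
    WScript.Echo "InStr: flagged as a failure"
Else
    WScript.Echo "InStr: not a failure"
End If

' Left looks only at the first 7 characters ("Success"), so this test does not.
If Left(strLine, 7) = "Failure" Then
    WScript.Echo "Left: flagged as a failure"
Else
    WScript.Echo "Left: not a failure"
End If
```

Run against this line, the InStr test reports a failure and the Left test does not, which is why we prefer Left whenever the word always appears at the start of the line.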

P.S. We know we’ll get a bunch of letters from people saying, “What about regular expressions?” Yes, regular expressions are very powerful, but they’re also a bit too complex to explain in this column. We’ll take up regular expressions sometime soon, but somewhere other than the Hey, Scripting Guy! column.

