How Can I Insert a Page Break in a Text File After Each Line Where the Only Character is the Number 1?
Hey, Scripting Guy! How can I open a text file and insert a page break after each line where the only character is the number 1?
Hey, JR. OK, to begin with, there’s no doubt that the Scripting Guy who writes this column doesn’t keep up with the world of fashion; for example, he considers “getting dressed up” to mean going out without a baseball hat, something he does only for a wedding, a funeral, or a good old-fashioned drenching. And there’s no doubt that the Scripting Guy who writes this column has some … interesting … ideas when it comes to fashion; the Scripting Editor, who has had occasion to listen to some of these rants, can vouch for that. (Or maybe not. Usually when he starts rambling on about why men shouldn’t wear sandals she remembers that she left a pot of water boiling on the stove and leaves.)
Still, picture this if you will. (Or if you dare). A brisk summer morning in Seattle; the Scripting Guy who writes this column is sitting in the stands waiting for the Scripting Son’s baseball game to start. (And because you’re all dying to know this, yes, the Scripting Son pitched three innings of relief and picked up the win in that game.) In walks a guy wearing:
A fleece jacket.
Blue jean cutoff shorts. (And cut off a little too short, if you ask us.)
A cowboy hat.
So is there anything wrong with that? Admittedly, there probably isn’t anything wrong with that. But maybe there should be.
Oh, well; to each his own, right? Besides, what does that have to do with inserting page breaks in a text file? Darned if we know. But at least the following script does have something to do with inserting page breaks in a text file:
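Const wdReplaceAll = 2

' Start Word and make it visible on screen
Set objWord = CreateObject("Word.Application")
objWord.Visible = True

' Open the text file and get a reference to the Selection object
Set objDoc = objWord.Documents.Open("C:\Scripts\Test.txt")
Set objSelection = objWord.Selection

' Find each line containing only the number 1 and add a page break after it
objSelection.Find.Text = "^p1^p"
objSelection.Find.Forward = True
objSelection.Find.Replacement.Text = "^p1^p^m"

' The Replace argument is the 11th parameter, hence the ten commas
objSelection.Find.Execute ,,,,,,,,,,wdReplaceAll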
First things first. To begin with, you might have noticed that this script uses Microsoft Word to open the text file and to insert the page breaks. Why? Well, as far as we know, the only way to insert page breaks into a text file is to use an application like Word; the FileSystemObject can’t insert page breaks into a text file, and the text file wouldn’t know what to do with those page breaks anyway. (Try finding the Insert Page Break command in Notepad. That’s going to be a little difficult, to say the least.) Because of that we need to open the text file in Word, insert the page breaks, and then save that file as a Word document. (That’s a step we left out in our sample script, but we have at least one article that tells how to save a file as a Word document.) Admittedly we no longer have a text file at that point, but we don’t know any way to work around that.
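If you’d like a head start on that missing save step, here’s a minimal sketch of what it might look like. Note that the output path C:\Scripts\Test.doc is a hypothetical name we made up for illustration, and that the sketch assumes the value 0 (Word’s wdFormatDocument constant) gives you the Word document format you want; it’s meant as a starting point, not the final word.

```vbscript
Const wdFormatDocument = 0   ' Word's WdSaveFormat value for the Word document format

Set objWord = CreateObject("Word.Application")
objWord.Visible = True
Set objDoc = objWord.Documents.Open("C:\Scripts\Test.txt")

' ... the search-and-replace code goes here ...

' Save under a new name (hypothetical path) as a Word document, then close Word
objDoc.SaveAs "C:\Scripts\Test.doc", wdFormatDocument
objWord.Quit
```

Saving under a new name leaves the original text file untouched, which is handy if you need to run the script again.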
All right. Having done first things first, let’s now do second things second and explain how the script actually works. For starters, we define a constant named wdReplaceAll and set the value to 2; we’ll explain what that’s for in just a second. We then create an instance of the Word.Application object and set the Visible property to True; that gives us a running instance of Microsoft Word that we can see on screen. After that we use these two lines of code to open the file C:\Scripts\Test.txt and create an instance of Word’s Selection object:
Set objDoc = objWord.Documents.Open("C:\Scripts\Test.txt")
Set objSelection = objWord.Selection
Why do we need an instance of Word’s Selection object? That’s easy: in order to do a search-and-replace operation (which is what we’re about to do) we need to use the Find object. And Find just happens to be a child object of the Selection object.
Still with us? Good. As it turns out, JR has a text file that looks something like this:
Here’s some text.
1
Here’s some more text.
1
Here’s even more text.
And more text.
1
And here’s the last bit of text.
Notice the lines that have nothing but the number 1 on them? Our task is to insert a page break after each of those lines.
And just how do we propose to do that? We’re glad you asked that question. What we’re going to do is search for all the number 1s that happen to be on a line all by themselves. How can we tell if the number 1 is on a line by itself? That’s also easy: if we have a paragraph return followed by the number 1 followed by another paragraph return then we must have a line where the number 1 is sitting there all by its lonesome. (If that doesn’t make sense do this: create a new Word document, press ENTER, type the number 1, then press ENTER again. Voila!) The following line of code lets us specify search Text that consists of a paragraph return (using Word’s ^p syntax) followed by the number 1 followed by another paragraph return:
objSelection.Find.Text = "^p1^p"
All we’ve done here is define the text we want to search for. Once that’s done we then set the Forward property to True; that tells the script to begin the search at the current selection point and then move forward through the document. Because the selection point is currently at the very beginning of the document (the default location when you create an instance of the Selection object) this ensures that our script searches the entire document from start to finish.
After that we define our replacement text:
objSelection.Find.Replacement.Text = "^p1^p^m"
As you no doubt recall, we’re searching for the following bit of text: ^p1^p. We’re now stating that we want to replace that bit of text with this: ^p1^p^m. And yes, you’re absolutely right: the ^m is Word’s syntax for a manual page break. The net effect here is that we’re adding a page break after each line in the file where the number 1 appears by itself. And that’s good; after all, that’s exactly what we want to do.
All that’s left now is to call the Execute method and actually perform the search-and-replace:
objSelection.Find.Execute ,,,,,,,,,,wdReplaceAll
We won’t discuss all the empty parameters required for this operation; for details see our Office Space article on finding and replacing text in a Word document. For now just note that all the commas are required; that’s to ensure that the constant wdReplaceAll is passed in the correct position. In addition, the constant wdReplaceAll is required to ensure that all instances of the target text get replaced; without specifying this value the script will replace the first instance of the target text and then call it good.
And that would not be good.
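If counting commas isn’t your idea of a good time, here’s a sketch of an alternative: pass every argument to Execute explicitly rather than leaving the slots empty. The values shown for the parameters the original script skips (the False flags and wdFindStop for Wrap) are our assumptions about reasonable settings, not something the original script specifies.

```vbscript
Const wdFindStop = 0     ' stop when the search reaches the end of the document
Const wdReplaceAll = 2   ' replace every match, not just the first one

Set objWord = CreateObject("Word.Application")
objWord.Visible = True
Set objDoc = objWord.Documents.Open("C:\Scripts\Test.txt")
Set objSelection = objWord.Selection

' Arguments, in order: FindText, MatchCase, MatchWholeWord, MatchWildcards,
' MatchSoundsLike, MatchAllWordForms, Forward, Wrap, Format, ReplaceWith, Replace
objSelection.Find.Execute "^p1^p", False, False, False, False, False, _
    True, wdFindStop, False, "^p1^p^m", wdReplaceAll
```

Spelling everything out takes more typing, but it makes it obvious which position each value occupies, and it means the script doesn’t depend on whatever Find settings happen to be in effect.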
That should do it, JR. Give this baby a try and see what happens.
As for the Scripting Guy who writes this column, he’s busy rethinking some of his opinions on fashion. For example, after seeing a guy wearing short shorts and cowboy boots he’s beginning to think that maybe guys wearing sandals isn’t such a bad thing after all.
That said, however, he still firmly and resolutely believes that – what’s that? You just remembered that you left a pot of water boiling on the stove? Well, OK; you better go take care of that. We’ll hold this thought until you get back.