Hey, Scripting Guy! How can I search for formatted text in a Word document and then apply HTML tags to that text?
— ES
Hey, ES. You know, it’s funny that you should ask this question. Just last night at the dinner table, the Scripting Guy who writes this column finally had The Talk with the Scripting Son. No sooner had they started eating than the Scripting Son blurted out, “Listen, Dad, I’m 17 years old now, and there are some things I need to know before I become an adult.”
Needless to say, those are the words every father dreads to hear. But duty calls, right? “OK, son,” said the Scripting Dad. “What is it you need to know?”
“I need to know this, Dad: how can I search for formatted text in a Word document and then apply HTML tags to that text?”
Like we said, no parent (or at least no father) really looks forward to having to explain the facts of Microsoft Word scripting life to their child; to be honest, it’s embarrassing to have to talk about things like that with your kids. Nevertheless, being a parent means more than just going to watch your son play baseball. (Although the Scripting Guy who writes this column wishes someone had told him that before he became a parent.) As awkward as it might have been, the Scripting Guy who writes this column sat down and patiently explained how the Scripting Son could search for formatted text in a Word document and then apply HTML tags to that text. (And he tried his best not to sound too preachy about this, although he did note that Microsoft Word search scripts should only be written by someone who truly loves system administration scripting. Admittedly, casual scripting and recreational scripting sound fun, but, more often than not, they simply lead to heartache. The Scripting Dad’s advice? Get married first and then write a script that can search for formatted text in a Word document and then apply HTML tags to that text.)
For those of you who have yet to have The Talk with your own children, here’s a transcript of what the Scripting Dad told his son, beginning with the script itself:
Set objWord = CreateObject("Word.Application")
objWord.Visible = True
Set objDoc = objWord.Documents.Open("C:\Scripts\Test.doc")
Set objSelection = objWord.Selection

objSelection.Find.Forward = True
objSelection.Find.Format = True
objSelection.Find.Font.Bold = True

Do While True
    objSelection.Find.Execute
    If objSelection.Find.Found Then
        objSelection.Text = "<b>" & objSelection.Text & "</b>"
    Else
        Exit Do
    End If
Loop
“OK, son,” the Scripting Dad continued. “Let’s see if we can figure out how the miracle of searching for text in a Microsoft Word document occurs. As you can see, we start out by creating an instance of the Word.Application object and setting the Visible property to True; that gives us a running instance of Microsoft Word that we can see onscreen. After using the Open method to open the file C:\Scripts\Test.doc, we then use this line of code to create an instance of Word’s Selection object (which, by default, also positions the cursor at the beginning of the document):
Set objSelection = objWord.Selection
“What’s that? Are we done yet? No, not quite, but we’re getting there. Next we need to define three key properties of the Find object (a child object of the Selection object). In particular, we need to:
• Set the Forward property to True. This tells the script that we want the search to move forward through the document; because we’re starting our search at the very beginning of the document (we know that, because that’s where the cursor is positioned) that means that we’ll end up searching the entire file.
• Set the Format property to True. This tells the script that we want to search for formatting rather than a specific string value.
• Set the Font.Bold property to True. This tells the script – hey, that’s right: this tells the script that we want to search for boldface text. Say, if you don’t mind my asking, who told you about the Font.Bold property? And is there any chance you could give me a few pointers about Font.Bold?
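For anyone who likes to be extra careful, here is a slightly more defensive way to configure the same search. Treat it as a sketch rather than part of the script above: it adds three settings the original does not use (the ClearFormatting method, an empty Text property, and the Wrap property set to 0, which is the value of Word's wdFindStop constant). VBScript cannot see Word's named constants, so the value is declared by hand.

```vbscript
' Value of Word's wdFindStop constant; VBScript cannot see Word's named constants.
Const wdFindStop = 0

Set objWord = CreateObject("Word.Application")
objWord.Visible = True
Set objDoc = objWord.Documents.Open("C:\Scripts\Test.doc")
Set objSelection = objWord.Selection

With objSelection.Find
    .ClearFormatting        ' discard formatting criteria left over from an earlier search
    .Text = ""              ' no search string: match on formatting alone
    .Forward = True         ' search from the cursor toward the end of the document
    .Format = True          ' look for formatting rather than a string value
    .Font.Bold = True       ' the formatting we want: boldface
    .Wrap = wdFindStop      ' stop at the end of the document instead of wrapping around
End With
```

The extra settings mainly matter when a script runs more than one search in the same Word session, because Find settings can carry over from one search to the next.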
“After we configure these property values we set up a Do While loop designed to run forever. Or, at any rate, designed to run as long as True is equal to True. That’s why we can’t say that the loop actually will run forever; U.S. politics being what they are these days, it’s just a matter of time before True is no longer considered equal to True. But that’s a lesson for another day.
“Once inside the loop, we call the Execute method; this tells the script to start searching the document, and to keep searching until one of two things happens: it either finds an instance of the target text (something in boldface), or it reaches the end of the document.
“Still with me? Good. Now, let’s suppose that some boldface text is found; in that case, the Find object’s Found property will be set to True. That’s something we can test for using this line of code:
If objSelection.Find.Found Then
“And assuming that we have found some boldface text, that means we need to put the HTML <b> tag at the beginning of that value and the </b> at the end of the value. That’s what we do with this line of code:
objSelection.Text = "<b>" & objSelection.Text & "</b>"
“As you can see, there’s really nothing mysterious about this: in technical terms, we’re simply replacing the value of the Selection object’s Text property (which will be the boldface text; when executing a search any found item is automatically selected) with the current value of the selected text plus our HTML tags. What does that mean? That means that a phrase that starts out looking like this:
Boldface text
“Will end up looking like this:
<b>Boldface text</b>
“We then loop around, call the Execute method, and search for the next instance of the target text. What happens if there isn’t a next instance of the target text (that is, no more boldface text)? In that case the Found property will not be equal to True and, in turn, we call the Exit Do statement to exit the not-so-endless loop.
“And that, Scripting Son, explains how birds and bees manage to search a Word document for formatted text and then put HTML tags around any such text the search uncovers.”
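One footnote to The Talk: the loop can be written more compactly. The Execute method returns True when it finds a match, so that return value can drive the loop directly, with no need for Do While True, the Found property, or Exit Do. What follows is a sketch of that variation, not the script the Scripting Dad actually recited; it assumes the same test file, C:\Scripts\Test.doc.

```vbscript
Set objWord = CreateObject("Word.Application")
objWord.Visible = True
Set objDoc = objWord.Documents.Open("C:\Scripts\Test.doc")
Set objSelection = objWord.Selection

objSelection.Find.Forward = True
objSelection.Find.Format = True
objSelection.Find.Font.Bold = True

' Execute returns True for each match, so the loop ends on the first miss.
Do While objSelection.Find.Execute()
    ' The match is selected automatically; wrap it in HTML bold tags.
    objSelection.Text = "<b>" & objSelection.Text & "</b>"
Loop
```

Which version to use is largely a matter of taste: the longer loop spells out each step, while this one keeps the exit condition in a single place.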
That should do it, ES. Incidentally, the Scripting Guy who writes this column apologizes for not directly answering your question, but instead giving you a transcript of something he told the Scripting Son. The truth is, the Scripting Guy who writes this column was a little embarrassed about answering this question, particularly in front of a large audience. Yes, we know: scripts that search for formatted text in a Word document and then apply HTML tags to that text are a normal (some would argue a beautiful) part of life, and there’s really nothing to be embarrassed about. Be that as it may, there are some subjects that the Scripting Guy who writes this column simply has problems discussing in this column.
Well, we mean besides system administration scripting.