May 14th, 2009

Hey, Scripting Guy! How Can I Use Windows PowerShell to Look for and Replace a Word in a Microsoft Word Document?

Hey, Scripting Guy! Question

Hey, Scripting Guy! The week of April 17, 2009, you did a weeklong series of “Hey, Scripting Guy!” articles on regular expressions. I tried to change one of your scripts so that it uses Windows PowerShell to open an Office Word document and look for a particular word. Guess what? It did not work. Instead, it filled my screen with gibberish, the speakers started beeping “dah, dah, beep, beep, beeep,” and the computer became unresponsive. Do I have to get a new version of Windows PowerShell? Do I have a virus? What’s going on?

– KL

Hey, Scripting Guy! Answer

Hi KL,

You know, diagnosing computer problems via e-mail is like doing brain surgery over the Internet. It might sound like a good idea at the time, but it is really tricky to do well, and the unintended consequences could be rather severe. However, I will venture a guess. Either you accidentally clicked on the song “Down Under” by Men at Work (one of my favorite songs by the way), or you tried to open a Word document by using the Get-Content cmdlet. This last observation is not as Kreskinesque as you might think, because you already said this is what you did. The thing is that the Get-Content cmdlet is great at opening and reading text files, but it does not do as well with other file formats. The beeping occurs when the binary data is displayed as text: the speaker beeps every time the output hits the bell character (hexadecimal 07). Here’s the gibberish created by trying to read a .doc file:

Image of the gibberish created by trying to read a .doc file
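
If you want to check for bell characters without making the speaker beep, one option is to read the raw bytes and count them instead of displaying them. The following is only a sketch: the file name bell-sample.bin and its bytes are made up for illustration (a stand-in for a real .doc file), and it uses the .NET ReadAllBytes method rather than Get-Content.

```powershell
# Build a small stand-in binary file; the bytes are invented for illustration.
$samplePath = Join-Path ([System.IO.Path]::GetTempPath()) 'bell-sample.bin'
[byte[]]$sampleBytes = 0xD0, 0xCF, 0x07, 0x41, 0x07, 0x00, 0x42, 0x07
[System.IO.File]::WriteAllBytes($samplePath, $sampleBytes)

# Read the raw bytes; nothing is converted to text or written to the console.
$bytes = [System.IO.File]::ReadAllBytes($samplePath)

# Count the bytes whose value is 7, the bell character.
$bellCount = @($bytes | Where-Object { $_ -eq 7 }).Count

"Bell characters found: $bellCount"

# Clean up the stand-in file.
Remove-Item -Path $samplePath
```

With the sample bytes above, the count is 3. Pointing $samplePath at a real binary file would presumably show why the console had so much to beep about.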

 

This week we are looking at how to migrate VBScript to Windows PowerShell. You should definitely check out the VBScript-to-Windows PowerShell Conversion Guide. This is included as Appendix C in the Microsoft Press book, Microsoft Windows PowerShell Step by Step. It is also in the Windows PowerShell Graphical Help File. Clearly, we are proud of that thing. You may also want to check out our Windows PowerShell Scripting Hub where you will find links to the Windows PowerShell Owner’s Manual (very popular!) and other resources that will help you to convert VBScript to Windows PowerShell. One additional book that would be useful is the Microsoft Press book, Windows PowerShell Scripting Guide. This book is useful if you are working with WMI, or if you are trying to go beyond simple line-by-line translations of one script to another.

There was a good “Hey, Scripting Guy!” article written on August 8, 2006, called “How can I replace text in a Microsoft Word document?” Of course back in 2006, all scripts on the Script Center were being written in VBScript. Let’s consider translating it to Windows PowerShell. The original VBScript is seen here:

ReplaceWordinWord.vbs

Const wdReplaceAll = 2

' Start Word and make it visible.
Set objWord = CreateObject("Word.Application")
objWord.Visible = True

' Open the document and get the Selection object.
Set objDoc = objWord.Documents.Open("C:\Scripts\Test.doc")
Set objSelection = objWord.Selection

' Text to look for, and how to look for it.
objSelection.Find.Text = "<computername>"
objSelection.Find.Forward = TRUE
objSelection.Find.MatchWholeWord = TRUE

objSelection.Find.Replacement.Text = "atl-ws-01"

' Skip the first ten positions; the eleventh is Replace.
objSelection.Find.Execute ,,,,,,,,,,wdReplaceAll

The ReplaceWordInWord.ps1 script resembles the ReplaceWordinWord.vbs script. The one change we made in the Windows PowerShell version is to use variables for the eleven parameters that the script passes to the Execute method. This was a recommended technique for VBScript as well. It is virtually impossible for most IT pros to examine a line such as the following one and quickly tell which position the wdReplaceAll constant occupies. Most students, I found while teaching, had problems even counting the empty positions in front of wdReplaceAll. It is too easy to lose your place when all the little commas begin to run together, as shown here:

objSelection.Find.Execute ,,,,,,,,,,wdReplaceAll

In the ReplaceWordInWord.ps1 script, we replace that section of the code with the following code, which is much easier to read:

$objSelection.Find.Execute($FindText,$MatchCase,
  $MatchWholeWord,$MatchWildcards,$MatchSoundsLike,
  $MatchAllWordForms,$Forward,$Wrap,$Format,
  $ReplaceText,$ReplaceAll)

If this is still too jumbled, you could spread it out so that each parameter is on its own line. We tried to group the parameters so that related ones are on the same line.
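
If you would rather not count commas by hand, you can let Windows PowerShell do it. This is a small sketch that assumes the parameter order documented for the Execute method (see Table 1); the variable names $executeParameters and $skipped are made up for the example.

```powershell
# The first eleven Execute parameters, in the order they are documented.
$executeParameters = 'FindText', 'MatchCase', 'MatchWholeWord', 'MatchWildcards',
    'MatchSoundsLike', 'MatchAllWordForms', 'Forward', 'Wrap', 'Format',
    'ReplaceWith', 'Replace'

# The zero-based index of Replace equals the number of positions skipped before it.
$skipped = [array]::IndexOf($executeParameters, 'Replace')

"Positions skipped before Replace: $skipped"
"Replace is argument number: $($skipped + 1)"
```

This reports 10 skipped positions, which makes Replace argument number 11. That is why the VBScript version needs a run of commas before wdReplaceAll.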

The ReplaceWordInWord.ps1 script begins by creating an instance of the Word.Application object. Then it sets the Visible property to $True and opens the Word document. As soon as the document is open, the script retrieves the Selection object. The Word.Application object and the Selection object were both discussed on Wednesday. This is shown here:

$objWord = New-Object -ComObject word.application
$objWord.Visible = $True
$objDoc = $objWord.Documents.Open("C:\fso\test.doc")
$objSelection = $objWord.Selection

The test.doc document, as shown here, has some misspelled words:

Image of misspelled words in the test.doc document

 

The text that the script looks for is stored in the $FindText variable. The replacement text is stored in the $ReplaceText variable. This is shown here:

$FindText = "mispelled"
$ReplaceText = "spelled incorrectly"

The remaining section of code sets the replacement options:

$ReplaceAll = 2
$FindContinue = 1
$MatchCase = $False
$MatchWholeWord = $True
$MatchWildcards = $False
$MatchSoundsLike = $False
$MatchAllWordForms = $False
$Forward = $True
$Wrap = $FindContinue
$Format = $False
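
The numbers 2 and 1 in the listing above are values from the Word WdReplace and WdFindWrap enumerations. If bare numbers bother you, one possible approach is to keep the values in hash tables so that the names travel with the script. This is just a sketch; the hash table names $WdReplace and $WdFindWrap are our own invention and are not something Word provides to Windows PowerShell.

```powershell
# Values of the Word WdReplace enumeration.
$WdReplace = @{ wdReplaceNone = 0; wdReplaceOne = 1; wdReplaceAll = 2 }

# Values of the Word WdFindWrap enumeration.
$WdFindWrap = @{ wdFindStop = 0; wdFindContinue = 1; wdFindAsk = 2 }

# Look up the values by name instead of typing the numbers.
$ReplaceAll = $WdReplace.wdReplaceAll
$Wrap = $WdFindWrap.wdFindContinue

"ReplaceAll = $ReplaceAll; Wrap = $Wrap"
```

The variables end up with the same values as before (2 and 1), so the call to the Execute method does not have to change.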

The parameters of the Execute method from the Find object are documented in Table 1.

Table 1  Parameters of the Execute method of the Find object

| Name | Required/Optional | Data type | Description |
| --- | --- | --- | --- |
| FindText | Optional | Variant | The text to be searched for. Use an empty string ("") to search for formatting only. You can search for special characters by specifying appropriate character codes. For example, "^p" corresponds to a paragraph mark and "^t" corresponds to a tab character. |
| MatchCase | Optional | Variant | True to specify that the find text be case-sensitive. Corresponds to the Match case check box in the Find and Replace dialog box (Edit menu). |
| MatchWholeWord | Optional | Variant | True to have the find operation locate only whole words, not text that is part of a larger word. Corresponds to the Find whole words only check box in the Find and Replace dialog box. |
| MatchWildcards | Optional | Variant | True to have the find text be a special search operator. Corresponds to the Use wildcard characters check box in the Find and Replace dialog box. |
| MatchSoundsLike | Optional | Variant | True to have the find operation locate words that sound similar to the find text. Corresponds to the Sounds like check box in the Find and Replace dialog box. |
| MatchAllWordForms | Optional | Variant | True to have the find operation locate all forms of the find text (for example, "sit" locates "sitting" and "sat"). Corresponds to the Find all word forms check box in the Find and Replace dialog box. |
| Forward | Optional | Variant | True to search forward (toward the end of the document). |
| Wrap | Optional | Variant | Controls what happens if the search begins at a point other than the beginning of the document and the end of the document is reached (or vice versa if Forward is set to False). This argument also controls what happens if there is a selection or range and the search text is not found in the selection or range. Can be one of the WdFindWrap constants. |
| Format | Optional | Variant | True to have the find operation locate formatting in addition to, or instead of, the find text. |
| ReplaceWith | Optional | Variant | The replacement text. To delete the text specified by the Find argument, use an empty string (""). You specify special characters and advanced search criteria just as you do for the Find argument. To specify a graphic object or other nontext item as the replacement, move the item to the Clipboard and specify "^c" for ReplaceWith. |
| Replace | Optional | Variant | Specifies how many replacements are to be made: one, all, or none. Can be any WdReplace constant. |
| MatchKashida | Optional | Variant | True if find operations match text with matching kashidas in an Arabic-language document. This argument may not be available to you, depending on the language support (U.S. English, for example) that you have selected or installed. |
| MatchDiacritics | Optional | Variant | True if find operations match text with matching diacritics in a right-to-left language document. This argument may not be available to you, depending on the language support (U.S. English, for example) that you have selected or installed. |
| MatchAlefHamza | Optional | Variant | True if find operations match text with matching alef hamzas in an Arabic-language document. This argument may not be available to you, depending on the language support (U.S. English, for example) that you have selected or installed. |
| MatchControl | Optional | Variant | True if find operations match text with matching bidirectional control characters in a right-to-left language document. This argument may not be available to you, depending on the language support (U.S. English, for example) that you have selected or installed. |
| MatchPrefix | Optional | Variant | True to match words that begin with the search string. Corresponds to the Match prefix check box in the Find and Replace dialog box. |
| MatchSuffix | Optional | Variant | True to match words ending with the search string. Corresponds to the Match suffix check box in the Find and Replace dialog box. |
| MatchPhrase | Optional | Variant | True ignores all white space and control characters between words. |
| IgnoreSpace | Optional | Variant | True ignores all white space between words. Corresponds to the Ignore white-space characters check box in the Find and Replace dialog box. |
| IgnorePunct | Optional | Variant | True ignores all punctuation characters between words. Corresponds to the Ignore punctuation check box in the Find and Replace dialog box. |

The completed ReplaceWordInWord.ps1 script is seen here.

ReplaceWordInWord.ps1

$objWord = New-Object -ComObject word.application
$objWord.Visible = $True
$objDoc = $objWord.Documents.Open("C:\fso\test.doc")
$objSelection = $objWord.Selection

$FindText = "mispelled"
$ReplaceText = "spelled incorrectly"

$ReplaceAll = 2
$FindContinue = 1
$MatchCase = $False
$MatchWholeWord = $True
$MatchWildcards = $False
$MatchSoundsLike = $False
$MatchAllWordForms = $False
$Forward = $True
$Wrap = $FindContinue
$Format = $False

$objSelection.Find.Execute($FindText,$MatchCase,
  $MatchWholeWord,$MatchWildcards,$MatchSoundsLike,
  $MatchAllWordForms,$Forward,$Wrap,$Format,
  $ReplaceText,$ReplaceAll)

As soon as the script is run, the corrected Word document is displayed:

Image of the corrected Word document that is displayed

 

Well, KL, we have got you up and running. If we were to modify the script, we would put the code into a function so that we could more easily search for and replace multiple words. We would definitely move the hard-coded values, such as the path of the document and the search/replace words, into parameters. It is getting late, and I have to make my presentation tomorrow at Tech·Ed. It is in Room 152 at 2:45 P.M. If you are at Tech·Ed, stop by. It will be an awesome talk. Hope to see you then. Make sure that you join us for Quick-Hits Friday, when we will again open up the mail bag and grab those questions that can be answered in just a few paragraphs. See ya!
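
For anyone who wants a head start on the function mentioned above, here is a rough sketch of one way it might look. It is not a finished tool. The function name Update-WordText and the parameter names Path and Replacements are hypothetical, and the sketch assumes that Microsoft Word is installed on the computer. It reuses the same Execute call as ReplaceWordInWord.ps1 and loops through a hash table of find/replace pairs.

```powershell
# Hypothetical helper; the function and parameter names are illustrative only.
function Update-WordText {
    param(
        [string]$Path,             # full path of the Word document
        [hashtable]$Replacements   # key = text to find, value = replacement text
    )

    # Same replacement options as ReplaceWordInWord.ps1.
    $ReplaceAll = 2                # wdReplaceAll
    $FindContinue = 1              # wdFindContinue
    $MatchCase = $False
    $MatchWholeWord = $True
    $MatchWildcards = $False
    $MatchSoundsLike = $False
    $MatchAllWordForms = $False
    $Forward = $True
    $Wrap = $FindContinue
    $Format = $False

    # Requires Microsoft Word; New-Object fails if Word is not installed.
    $objWord = New-Object -ComObject word.application
    $objWord.Visible = $True
    $objDoc = $objWord.Documents.Open($Path)
    $objSelection = $objWord.Selection

    # Run one replace-all pass for each find/replace pair.
    foreach ($FindText in $Replacements.Keys) {
        $ReplaceText = $Replacements[$FindText]

        # Out-Null hides the True/False value that Execute returns.
        $objSelection.Find.Execute($FindText,$MatchCase,
          $MatchWholeWord,$MatchWildcards,$MatchSoundsLike,
          $MatchAllWordForms,$Forward,$Wrap,$Format,
          $ReplaceText,$ReplaceAll) | Out-Null
    }
}

# Example call (commented out because it needs Word and an existing document):
# Update-WordText -Path "C:\fso\test.doc" -Replacements @{ "mispelled" = "spelled incorrectly" }
```

Like the original script, the sketch leaves Word open and visible so that you can inspect the result. It does not save or close the document.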

 

Ed Wilson and Craig Liebendorfer, Scripting Guys
