November 10th, 2008

Hey, Scripting Guy! How Can I Check Spelling and Grammar in Microsoft Office Word?

Hey, Scripting Guy! Question

Hey, Scripting Guy! I loved your last “Hey, Scripting Guy!” article about formatting paragraphs in a Microsoft Office Word document. What I need to do is a little bit more mundane, but for me much more important. I have a number of Word documents that are stored in a folder. Unfortunately, the person who wrote those documents doesn’t know how to spell, how to write in a grammatically correct fashion, or how to use the spelling and grammar checker in Word. There are literally hundreds of these functionally illiterate documents and I have to clean them up. Is there some way I can automate this process?

– JA

Hey, Scripting Guy! Answer

Hi JA,

You will pardon me if I am a bit distracted today. You see, this is the first time in five years that I have been in the United States during the fall. Generally, I have been working in Europe, and I am really relishing some of the things I missed about the United States in the fall: the Major League Baseball World Series, little ones coming around dressed in funny costumes bumming candy (also known as Halloween, by the way), and other things. Anyway, you were saying? Oh, yeah, something about Word. Here in Charlotte, North Carolina, in the United States (not all Microsofties live in Redmond, Washington) the leaves are turning colors. The weather is cool. From the window in my office I can see... sorry, got distracted.

Let’s just start a new paragraph and be done with it. So let’s look at your requirements. According to your e-mail message, these are the things you want to script:

• Find a bunch of .doc files in a folder.

• Check the .doc files for spelling.

• Check the .doc files for grammar.

• Save the changes.

We can most certainly do this. Have you looked at the Script Center Script Repository? We have a section that focuses specifically on Microsoft Office products. Keep your eye on this section, as the Scripting Editor and I are going to be making some nice changes here and adding many new scripts. We also have the “Hey, Scripting Guy!” archive that is sorted by category. One of those categories is Microsoft Office. These are all really good resources, but they are all in VBScript, so I want to answer this question by using Windows PowerShell. Without further ado, here is your script, which I named CheckSpellingAndPrint.ps1. Well, let’s add one bit of ado. How about we go ahead and print out the file when we are done with the changes? It will be easy to do and will make the script a bit more useful.

CheckSpellingAndPrint.ps1

$word = New-Object -comobject word.application
$word.visible = $true
$path = "c:\fso\*"
$files = Get-ChildItem -Path $path -Include *.doc
foreach($file in $files)
{
 $file.fullname
 $doc = $word.documents.open($file.fullname)
 $doc.checkSpelling()
 $doc.checkGrammar()
 $doc.save()
 $doc.printOut()
 $doc.close()
}
$word.quit()

We begin this script in exactly the same way that we have started other scripts that intend to automate Word: We create an instance of the Word application object. This is the main object that is used to automate Word. To create an instance of the Word application object, we use the program ID word.application and feed it to the New-Object cmdlet with the -comobject parameter. This is seen here:

$word = New-Object -comobject word.application

We now want to see the Word document as it is opened, because of the requirement to check the grammar. During testing, I found that at times it made more sense to see more than just the grammar dialog box. If you want the Word document to remain hidden, you only need to set the value of the visible property to $false. I will set it to $true in this script, as seen here:

$word.visible = $true

We now need to provide the path that we wish to search for the Word document files. To do this, we are using the $path variable, and we give it the path to the folder. This is seen here:

$path = "c:\fso\*"

Note. When specifying the path to search for the Word documents, make sure you include a trailing *, which will tell the Windows PowerShell Get-ChildItem cmdlet to look for all the items in the folder.

As you can see in this image, there is a mixed bag of files in the c:\fso folder:

Image of the files in the c:\fso folder

 

Now we need to use the Get-ChildItem cmdlet to retrieve all the .doc files in the folder. To do this, we use the -path parameter to supply the folder path we will search, and we use the -include parameter to select only the .doc files. If you were looking for .docx files, you could say -Include *.docx. If you wanted both .doc and .docx files, you could simply use *.doc* in your include parameter (this pattern also matches any other extension that begins with .doc, such as .docm). We are storing the collection of fileinfo objects in the variable $files. If you had a very large number of files, you would want to pipeline the results instead of storing them in a variable. This line of code is shown here:

$files = Get-ChildItem -Path $path -Include *.doc
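The filtering and pipelining ideas just described can be sketched as follows. This is only a minimal sketch and not part of the original script: the function name Get-WordFile and its Folder parameter are hypothetical, and it assumes you want exactly the .doc and .docx files. Because -Include accepts more than one pattern, the two extensions can be listed explicitly, which is tighter than *.doc*.

```powershell
# Hypothetical helper (not in the original script): returns the Word files in a folder.
function Get-WordFile {
    param(
        [string]$Folder = 'c:\fso'   # assumed default, matching the folder used in the article
    )
    # Append the trailing * so that Get-ChildItem looks at the items inside the folder.
    $path = Join-Path -Path $Folder -ChildPath '*'
    # -Include takes a list of patterns, so .doc and .docx are named explicitly.
    Get-ChildItem -Path $path -Include *.doc, *.docx
}

# Assumed usage: pipeline the files one at a time instead of storing them in $files first.
# Get-WordFile -Folder 'c:\fso' | ForEach-Object { $_.FullName }
```

With the pipeline form, each file is handed to the next command as it is found, so nothing has to wait for the whole folder listing to be collected.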

We now need to walk through a collection. Just as in VBScript, when you hear the phrase “walk through a collection,” you should think of a for each construction. In Windows PowerShell, that is the foreach statement. This command is shown here:

foreach($file in $files)

We now use a set of curly brackets to set off the code that will run once for each file in the collection of files. The first thing we do is print out the fullname property of the fileinfo object. This property gives us not only the name of the file, but also the folder that contains the file. We will use this path to the file in the open method of the documents collection in the next line of code, but we are not there yet:

$file.fullName

Now we are there and it is time to open the document. To do this we are using the open method of the documents collection, which returns a document object that we store in $doc. This is seen here:

$doc = $word.documents.open($file.fullname)

To check the spelling and grammar, we use the checkspelling and the checkgrammar methods from the document object. The call to checkspelling is seen here:

$doc.checkSpelling()

The cool thing about checking the spelling and making the document visible is that the Word document can be seen, and you are presented with the spell checker’s dialog box. This is seen here:

Image of the spell checker’s dialog box

 

When we check the grammar, it is really important to be able to see the context because context can be everything when editing. This is shown here:

Image of the context being presented

 

At other times, the grammatical offense can mean the revocation of one’s poetic license. In these cases, hiding the Word document might make sense. For these cases, the text does not need to be seen, and changes can be made directly in the dialog box. This is seen here:

Image of Word document hidden

 

$doc.checkGrammar()
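With hundreds of documents to get through, it may help to skip the ones that Word has not flagged at all. The following is a minimal sketch under that assumption and is not part of the original script: the function name Test-NeedsProofing is hypothetical. It relies on the SpellingErrors and GrammaticalErrors collections of the document object, each of which has a Count property.

```powershell
# Hypothetical helper (not in the original script): reports whether Word has
# flagged any spelling or grammar problems in a document.
function Test-NeedsProofing {
    param($Document)   # a Word document object, such as $doc in the script
    # Each collection holds the ranges Word has flagged; a Count of 0 in both
    # means there is nothing for either checker to report.
    ($Document.SpellingErrors.Count -gt 0) -or ($Document.GrammaticalErrors.Count -gt 0)
}

# Assumed usage inside the foreach loop:
# if (Test-NeedsProofing $doc) { $doc.checkSpelling(); $doc.checkGrammar(); $doc.save() }
```

Reading these collections makes Word proof the whole document, so on long documents the test itself can take a moment.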

We now want to save, print, and close the document. Again, we use methods from the document object: save, printout, and close. This is seen here:

$doc.save()
$doc.printOut()
$doc.close()

JA, you did not ask about printing the document. If you do not want to print out the file, all you need to do is comment out the $doc.printOut() method call by putting a pound sign (#) in front of it. When we are done looping through all the documents we found in the folder, we want to exit the Word application. To do this we use the quit method, as seen here:

$word.quit()
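If you would rather switch printing on and off without editing the script, the same method calls can be wrapped in a function. What follows is a minimal sketch, not the script from this article: the function name Invoke-WordProofing and its parameters are hypothetical, and it assumes Windows PowerShell 2.0 or later, because it uses try and finally to make sure Word is closed even when a document fails to open.

```powershell
# Hypothetical wrapper (not in the original script) around the same method calls.
function Invoke-WordProofing {
    param(
        $Word,            # Word application object from New-Object -comobject word.application
        $Files,           # file objects returned by Get-ChildItem
        [switch]$Print    # assumed switch: print only when it is supplied
    )
    try {
        foreach ($file in $Files) {
            $file.FullName                              # show which file is being processed
            $doc = $Word.Documents.Open($file.FullName)
            $doc.CheckSpelling()
            $doc.CheckGrammar()
            $doc.Save()
            if ($Print) { $doc.PrintOut() }             # skipped unless -Print was given
            $doc.Close()
        }
    }
    finally {
        # Runs whether or not the loop finished, so Word is not left running.
        $Word.Quit()
    }
}

# Assumed usage:
# $word = New-Object -comobject word.application
# $word.visible = $true
# Invoke-WordProofing -Word $word -Files (Get-ChildItem -Path "c:\fso\*" -Include *.doc) -Print
```

Leaving off -Print gives the same result as commenting out the printOut line.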

Well, JA, that is it. I think I am going to go outside and do something really fun that I have not done in years. I am going to get my scripting rake and rake a big pile of leaves, and then I’ll take a scripting run and jump into that pile of leaves. It is the simple pleasures in life that I enjoy, be it a finely crafted Word automation model or a nice big pile of leaves. See you tomorrow.

Ed Wilson and Craig Liebendorfer, Scripting Guys
