{"id":63273,"date":"2008-01-16T21:52:00","date_gmt":"2008-01-16T21:52:00","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/heyscriptingguy\/2008\/01\/16\/hey-scripting-guy-how-can-i-extract-specific-information-from-a-word-document-and-then-use-that-information-to-rename-the-document\/"},"modified":"2008-01-16T21:52:00","modified_gmt":"2008-01-16T21:52:00","slug":"hey-scripting-guy-how-can-i-extract-specific-information-from-a-word-document-and-then-use-that-information-to-rename-the-document","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/scripting\/hey-scripting-guy-how-can-i-extract-specific-information-from-a-word-document-and-then-use-that-information-to-rename-the-document\/","title":{"rendered":"Hey, Scripting Guy! How Can I Extract Specific Information From a Word Document and Then Use That Information to Rename the Document?"},"content":{"rendered":"<p><img decoding=\"async\" class=\"nearGraphic\" title=\"Hey, Scripting Guy! Question\" height=\"34\" alt=\"Hey, Scripting Guy! Question\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/q-for-powertip.jpg\" width=\"34\" align=\"left\" border=\"0\" \/> <\/p>\n<p>Hey, Scripting Guy! I have a bunch of Word documents in a folder. I need to open each document, read in some information from the first two lines, then use that information to rename the file. However, I\u2019m having some trouble getting this to work. Can you help me?<\/p>\n<p>&#8212; TG<\/p>\n<p><img decoding=\"async\" height=\"5\" alt=\"Spacer\" src=\"https:\/\/devblogs.microsoft.com\/scripting\/wp-content\/uploads\/sites\/29\/2019\/05\/spacer.gif\" width=\"5\" border=\"0\" \/><img decoding=\"async\" class=\"nearGraphic\" title=\"Hey, Scripting Guy! Answer\" height=\"34\" alt=\"Hey, Scripting Guy! 
Answer\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/a-for-powertip.jpg\" width=\"34\" align=\"left\" border=\"0\" \/><a href=\"http:\/\/go.microsoft.com\/fwlink\/?linkid=68779&amp;clcid=0x409\"><img decoding=\"async\" class=\"farGraphic\" title=\"Script Center\" height=\"288\" alt=\"Script Center\" src=\"http:\/\/img.microsoft.com\/library\/media\/1033\/technet\/images\/scriptcenter\/ad.jpg\" width=\"120\" align=\"right\" border=\"0\" \/><\/a> <\/p>\n<p>Hey, TG. As devoted readers of <i>Hey, Scripting Guy!<\/i> know, the Scripting Guy who writes this column is usually all business: he never lets <i>anything<\/i> distract him from his job of answering questions about system administration scripting. Today, however, we have to make an exception: the following story was just too weird to pass up.<\/p>\n<p>Last week a 66-year-old man died in his sleep. His roommates \u2013 in what must have been some sort of tribute to their deceased comrade \u2013 decided to try and cash the man\u2019s most-recent Social Security check. When they did, however, they were told that the person whose name was on the check had to be present before the check could be cashed. The two men responded by returning home, dressing the deceased the best they could, then plopping the body into a desk chair and wheeling him back to the check cashing store. They left the dead man on the sidewalk, then went back inside the store and tried to cash the check.<\/p>\n<p>To make a long \u2013 and weird \u2013 story short, a crowd gathered, a policeman came by to see what the hubbub was all about, and the two men were arrested. The pair claimed they had no idea that their friend was dead. \u201cHe looked like that every morning,\u201d said one man. \u201cI didn\u2019t know he was dead. He had $500 in his pocket. I had $200. Why would I rob the guy?\u201d<\/p>\n<p>In other words, their friend looked like that \u2013 dead \u2013 every morning. Hard to believe? 
Maybe. But, then again, none of you have ever seen the Scripting Guy who writes this column first thing in the morning. <\/p>\n<p>Like we said, that was just too weird to pass up. But now it\u2019s time to regain our focus. TG is looking for a script that can open a Microsoft Word document (actually a whole slew of Word documents), grab some information from the first two lines, then use that information to rename the file. How can he do that? Why, like this, of course:<\/p>\n<pre class=\"codeSample\">Set objWord = CreateObject(\"Word.Application\")\nobjWord.Visible = True\n\nSet objDoc = objWord.Documents.Open(\"C:\\Scripts\\Test.doc\")\n\nstrText = objDoc.Paragraphs(1).Range.Text\narrText = Split(strText, vbTab)\nintIndex = Ubound(arrText)\nstrUserName = arrText(intIndex)\n\narrUserName = Split(strUserName, \" \")\nintLength = Len(arrUserName(1))\nstrName = Left(arrUserName(1), intlength - 1)\n\nstrUserName = strName &amp; \", \" &amp; arrUserName(0)\n\nstrText = objDoc.Paragraphs(2).Range.Text\narrText = Split(strText, vbTab)\nintIndex = Ubound(arrText)\n\nstrDate = arrText(intIndex)\nstrDate = Replace(strDate, \"\/\", \"\")\n\nintLength = Len(strDate)\nstrDate = Left(strDate, intlength - 1)\n\nstrFileName = \"C:\\Scripts\\\" &amp;  strUserName &amp; \" \" &amp; strDate &amp; \".doc\"\n\nobjWord.Quit\n\nWscript.Sleep 5000\n\nSet objFSO = CreateObject(\"Scripting.FileSystemObject\")\nobjFSO.MoveFile \"C:\\Scripts\\Test.doc\", strFileName\n<\/pre>\n<p>To tell you the truth, this turned out to be a tiny bit more complicated than we initially thought; that\u2019s due to the way paragraphs are formatted in Word. Of course, the fact that this script is a tiny bit complicated could also be due to the fact that \u2013 more often than not \u2013 the Scripting Guy who writes this column has no real idea what he\u2019s doing, and today\u2019s column was no exception. 
But everyone should be used to <i>that<\/i> by now.<\/p>\n<p>The script starts out in simple enough fashion, creating an instance of the <b>Word.Application<\/b> object and then setting the <b>Visible<\/b> property to True; that gives us a running instance of Word that we can see on screen. We then use this line of code to open the document in question (C:\\Scripts\\Test.doc):<\/p>\n<pre class=\"codeSample\">Set objDoc = objWord.Documents.Open(\"C:\\Scripts\\Test.doc\")\n<\/pre>\n<p>As TG noted in his email, this document (and all the other Word documents in this folder) start out in the same fashion; the first two lines in each document look something like this:<\/p>\n<pre class=\"codeSample\">Person Name        Ken Myer\nEncounterDate      01\/01\/08\n<\/pre>\n<p>Is that good news? You bet it is. That means that we can grab the user name simply by reading in the <b>Text<\/b> property of the first paragraph in the document. In fact, that means we can grab the user name simply by executing the following line of code:<\/p>\n<pre class=\"codeSample\">strText = objDoc.Paragraphs(1).Range.Text\n<\/pre>\n<p>That\u2019s also good \u2026 sort of. The one problem here is that we don\u2019t just have the user name, we have this value:<\/p>\n<pre class=\"codeSample\">Person Name        Ken Myer\n<\/pre>\n<p>Talk about too much information, eh? On top of that, when we go to rename the file we want the user name to look like this:<\/p>\n<pre class=\"codeSample\">Myer, Ken\n<\/pre>\n<p>So what does all that mean? That means that we still have some work to do.<\/p>\n<p>According to TG, there are two tab characters separating <i>Person Name<\/i> and <i>Ken Myer<\/i> (the person\u2019s name). 
With that in mind, we can use the <b>Split<\/b> function to turn this string value into an array, splitting the string on the Tab character (represented by the VBScript constant vbTab):<\/p>\n<pre class=\"codeSample\">arrText = Split(strText, vbTab)\n<\/pre>\n<p>That\u2019s going to give us an array that looks like this, with the asterisk representing the empty item left between the two Tab characters (Split discards the tabs themselves):<\/p>\n<pre class=\"codeSample\">Person Name\n*\nKen Myer\n<\/pre>\n<p>As you can see, the user name is the last item in the array. How can we actually <i>retrieve<\/i> that value? Well, for starters, we can use this line of code, and the <b>UBound<\/b> function, to determine the index number of the last item in the array:<\/p>\n<pre class=\"codeSample\">intIndex = Ubound(arrText)\n<\/pre>\n<p>And then we can use <i>this<\/i> line of code to grab the user name and store it in a variable named strUserName:<\/p>\n<pre class=\"codeSample\">strUserName = arrText(intIndex)\n<\/pre>\n<p>Now we\u2019re getting somewhere. Next we have to figure out a way to reformat this name (that is, turn <i>Ken Myer<\/i> into <i>Myer, Ken<\/i>). To do that, we start by creating yet another array, this one splitting strUserName on the blank space between <i>Ken<\/i> and <i>Myer<\/i>:<\/p>\n<pre class=\"codeSample\">arrUserName = Split(strUserName, \" \")\n<\/pre>\n<p>That creates a new array, with item 0 equal to <i>Ken<\/i> and item 1 equal to <i>Myer<\/i>.<\/p>\n<p>This, by the way, is where we ran into a problem. (Not as big a problem as the two guys trying to cash a dead man\u2019s Social Security check, but a problem nonetheless.) Because the user\u2019s last name comes at the end of a line it has an end-of-paragraph mark appended to it; that means that array item 1 is actually equal to <i>this<\/i>, with the asterisk representing the end-of-paragraph mark:<\/p>\n<pre class=\"codeSample\">Myer*\n<\/pre>\n<p>If we hope to get output that\u2019s actually readable we need to get rid of that last character. 
(Trust us on this one; we learned that the hard way.) One easy way to get rid of that character is to first use the <b>Len<\/b> function to determine the total number of characters in the string:<\/p>\n<pre class=\"codeSample\">intLength = Len(arrUserName(1))\n<\/pre>\n<p>Once we\u2019ve done that we can then use the <b>Left<\/b> function to grab all the characters in the string <i>except<\/i> the last one (the length of the string minus 1). That\u2019s what we do here:<\/p>\n<pre class=\"codeSample\">strName = Left(arrUserName(1), intlength - 1)\n<\/pre>\n<p>Now, at long last, we can reformat our name using this line of code, storing the value in the variable strUserName:<\/p>\n<pre class=\"codeSample\">strUserName = strName &amp; \", \" &amp; arrUserName(0)\n<\/pre>\n<p>That takes care of line 1 in the Word document. For line 2, we use a similar approach, this time grabbing the Text of the second paragraph in the file:<\/p>\n<pre class=\"codeSample\">strText = objDoc.Paragraphs(2).Range.Text\n<\/pre>\n<p>After isolating the date portion of the string we then use this line of code to replace all the \/\u2019s in the date with, well, nothing:<\/p>\n<pre class=\"codeSample\">strDate = Replace(strDate, \"\/\", \"\")\n<\/pre>\n<table class=\"dataTable\" id=\"ERG\" cellSpacing=\"0\" cellPadding=\"0\">\n<thead><\/thead>\n<tbody>\n<tr class=\"record\" vAlign=\"top\">\n<td class=\"\">\n<p><b>Note<\/b>. Why do we do that? That\u2019s right: because you can\u2019t use the \/ in a file name.<\/p>\n<p>And neither can we.<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"dataTableBottomMargin\"><\/div>\n<p>Next we use these two lines of code to remove the end-of-paragraph mark from the date:<\/p>\n<pre class=\"codeSample\">intLength = Len(strDate)\nstrDate = Left(strDate, intlength - 1)\n<\/pre>\n<p>Got all that? Good. Let\u2019s take a moment to recap where we are. 
We now have a variable (strUserName) equal to this:<\/p>\n<pre class=\"codeSample\">Myer, Ken\n<\/pre>\n<p>We also have a second variable (strDate) equal to this:<\/p>\n<pre class=\"codeSample\">010108\n<\/pre>\n<p>Now that we have those two items, we can go ahead and construct a new file path using this line of code:<\/p>\n<pre class=\"codeSample\">strFileName = \"C:\\Scripts\\\" &amp;  strUserName &amp; \" \" &amp; strDate &amp; \".doc\"\n<\/pre>\n<p>That\u2019s going to result in strFileName being equal to this:<\/p>\n<pre class=\"codeSample\">C:\\Scripts\\Myer, Ken 010108.doc\n<\/pre>\n<p>Which is just exactly what we <i>wanted<\/i> it to be equal to.<\/p>\n<p>At this point we no longer have any need for the Word document; consequently, we call the <b>Quit<\/b> method to terminate Microsoft Word:<\/p>\n<pre class=\"codeSample\">objWord.Quit\n<\/pre>\n<p>And then we use this line of code to pause the script for 5 seconds:<\/p>\n<pre class=\"codeSample\">Wscript.Sleep 5000\n<\/pre>\n<p>Why do we pause the script? Well, if we close Word and then immediately try to rename the file we\u2019re likely to get an \u201cAccess Denied\u201d error; that\u2019s going to happen if Word hasn\u2019t fully terminated, and thus still has a lock on the file. Pausing for 5 seconds gives Word enough time to close before we rename the file.<\/p>\n<p>Speaking of which, these two lines of code rename the file:<\/p>\n<pre class=\"codeSample\">Set objFSO = CreateObject(\"Scripting.FileSystemObject\")\nobjFSO.MoveFile \"C:\\Scripts\\Test.doc\", strFileName\n<\/pre>\n<p>As you can see, <i>this<\/i> part of the script isn\u2019t the least bit complicated. 
All we do here is create an instance of the <b>Scripting.FileSystemObject<\/b> object, then use the <b>MoveFile<\/b> method to rename the file.<\/p>\n<table class=\"dataTable\" id=\"EDAAC\" cellSpacing=\"0\" cellPadding=\"0\">\n<thead><\/thead>\n<tbody>\n<tr class=\"record\" vAlign=\"top\">\n<td class=\"\">\n<p class=\"lastInCell\"><b>Note<\/b>. OK, maybe it\u2019s not <i>complicated<\/i>, but it is a little confusing. For some reason, the FileSystemObject doesn\u2019t have a Rename method; instead it requires you to \u201cmove\u201d the file from one path to another. If those paths are in the same folder, however, that effectively causes the file to be renamed. (Give it a try and you\u2019ll see what we mean.)<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"dataTableBottomMargin\"><\/div>\n<p>By the way, we should also note that this script is designed to run only on the local computer. Could it be redesigned to work against the files on a remote computer? Yes, but it can be a little tricky to use Word with files on a remote machine. But what the heck: if this is something you\u2019d like us to write about, well, just let us know. We\u2019ll see what we can do. <\/p>\n<p>This is actually a very nice little script, except for one thing: what TG <i>really<\/i> wanted to do was be able to run this script against all the files in a particular folder. Hey, no problem. Now that you have a basic understanding of what the script does, and how it does it, we can show you the do-this-for-all-the-files-in-a-folder version, and without any additional explanation. 
Enjoy!<\/p>\n<p>Oh, right; here\u2019s the script:<\/p>\n<pre class=\"codeSample\">Set objFSO = CreateObject(\"Scripting.FileSystemObject\")\nSet objFolder = objFSO.GetFolder(\"C:\\Temp\")\n\nSet objWord = CreateObject(\"Word.Application\")\nobjWord.Visible = True\n\nFor Each objFile in objFolder.Files\n    Set objDoc = objWord.Documents.Open(objFile.Path)\n\n    strText = objDoc.Paragraphs(1).Range.Text\n    arrText = Split(strText, vbTab)\n    intIndex = Ubound(arrText)\n    strUserName = arrText(intIndex)\n\n    arrUserName = Split(strUserName, \" \")\n    intLength = Len(arrUserName(1))\n    strName = Left(arrUserName(1), intlength - 1)\n\n    strUserName = strName &amp; \", \" &amp; arrUserName(0)\n\n    strText = objDoc.Paragraphs(2).Range.Text\n    arrText = Split(strText, vbTab)\n    intIndex = Ubound(arrText)\n\n    strDate = arrText(intIndex)\n    strDate = Replace(strDate, \"\/\", \"\")\n\n    intLength = Len(strDate)\n    strDate = Left(strDate, intlength - 1)\n\n    strFileName = \"C:\\Temp\\\" &amp;  strUserName &amp; \" \" &amp; strDate &amp; \".doc\"\n\n    objDoc.Close\n    Wscript.Sleep 2000\n\n    Set objFSO = CreateObject(\"Scripting.FileSystemObject\")\n    objFSO.MoveFile objFile.Path, strFileName\nNext\n\nobjWord.Quit\n<\/pre>\n<p>That should do it, TG. By the way, in the interest of full disclosure the Scripting Guys should confess that we once tried to pull a scam very similar to the one tried by the two would-be check cashers. When Scripting Guy Peter Costantini died last year the surviving Scripting Guys put him in a chair and wheeled him down to the Benefits office, hoping to collect his last paycheck and his unused vacation time. 
Unfortunately, we failed to take into account the fact that we were dealing with Microsoft: they took one look at Peter and immediately promoted him to Program Manager!<\/p>\n<table class=\"dataTable\" id=\"E1AAC\" cellSpacing=\"0\" cellPadding=\"0\">\n<thead><\/thead>\n<tbody>\n<tr class=\"record\" vAlign=\"top\">\n<td class=\"\">\n<p class=\"lastInCell\"><b>Note<\/b>. OK, as it turns out, Peter didn\u2019t really die last year; he\u2019s as alive and well as ever. In fact, now that we think about it, the whole time we were wheeling him down to the Benefits office he kept insisting that he was still alive. Apparently we\u2019ve just gotten so used to ignoring everything Peter says that we didn\u2019t pay any attention to him. Sorry, Peter!<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n","protected":false},"excerpt":{"rendered":"<p>Hey, Scripting Guy! I have a bunch of Word documents in a folder. I need to open each document, read in some information from the first two lines, then use that information to rename the file. However, I\u2019m having some trouble getting this to work. Can you help me? &#8212; TG Hey, TG. As devoted [&hellip;]<\/p>\n","protected":false},"author":595,"featured_media":87096,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[84,49,3,5],"class_list":["post-63273","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-scripting","tag-microsoft-word","tag-office","tag-scripting-guy","tag-vbscript"],"acf":[],"blog_post_summary":"<p>Hey, Scripting Guy! I have a bunch of Word documents in a folder. I need to open each document, read in some information from the first two lines, then use that information to rename the file. However, I\u2019m having some trouble getting this to work. Can you help me? &#8212; TG Hey, TG. 
As devoted [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/63273","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/users\/595"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/comments?post=63273"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/63273\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media\/87096"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media?parent=63273"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/categories?post=63273"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/tags?post=63273"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}