February 28th, 2008

Hey, Scripting Guy! How Can I Move Files Based on a Portion of the File Name?

Hey, Scripting Guy! I have tens of thousands of files that all have a file name that includes a number bounded by two underscore characters; for example, P_19_L00.jpg. I need to figure out which number is embedded in a file name, then move that file to a folder named after that number. In this case, that means moving P_19_L00.jpg to the folder D:\19. I have been searching the Internet for weeks and can’t come up with anything. Please help.

— MGD

Hey, MGD. You know, this is kind of an unusual day for the Scripting Guy who writes this column. Why? Well, usually when someone fouls things up it’s the Scripting Guy who writes this column. (Anyone else out there remember Event 8 from the 2007 Winter Scripting Games?) As it turns out, we sort of fouled things up with Event 5 in the Beginners Division of this year’s Winter Scripting Games. (Which, we might add, continue to run until March 3rd. Still plenty of time to enter.) Event 5 was an admittedly-tough task, especially for beginners, and it resulted in a lot of people getting zeros for the event; it also resulted in a lot of people being less-than-thrilled with the score they received. But here’s the weird part: that event does not belong to the Scripting Guy who writes this column; instead, it belongs to Scripting Guy Jean Ross. For once someone other than the Scripting Guy who writes this column has done something to cause problems!

Note. Is that because it’s hard to do something that causes problems unless you actually do something? We’d better not say ….

At any rate, remembering all the sympathy and support he’s gotten every time he screws up, the Scripting Guy who writes this column would like to say this about Event 5 in the Beginners Division: if you feel you were wronged then send Scripting Guy Jean Ross an email, right away. Heck, send her 100 emails: let her know that she screwed up. Whatever you do, don’t let her get away with this!

Not that we’re being vindictive or anything. We just feel it’s important that Jean gets what she deserves.

Um, by which we mean that you get what you deserve. Your welfare, and your score in the 2008 Winter Scripting Games, is our only concern. Really.

Note. Although it is kind of nice to see Scripting Guy Jean Ross (who, until now, truly was practically perfect in every way) take a little heat for once.

Anyway, drop Jean a line and let her know how you feel. What if you didn’t even compete in Event 5? Well, so what? After all, you don’t have to own a toxic waste dump in order to be concerned about pollution, right?

Like we said, this is a highly unusual day for the Scripting Guy who writes this column. With that in mind, he decided that the best thing he could probably do is just try and go about his normal routine, to act as though nothing has changed. And what does his normal routine typically consist of? Well, fortunately for MGD, his normal routine typically involves writing scripts that can extract a portion of a file name and then move the file based on that piece of the file name:

Set objRegEx = CreateObject("VBScript.RegExp")

objRegEx.Global = True   
objRegEx.Pattern = "_\d{1,}_"

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFSO.GetFolder("C:\Images")

Set colFiles = objFolder.Files

For Each objFile in colFiles
    strSearchString = objFile.Name
    Set colMatches = objRegEx.Execute(strSearchString)  

    ' Only move files whose names contain the target text; otherwise
    ' strFolderName would still hold the previous file's folder.
    If colMatches.Count > 0 Then
        For Each strMatch in colMatches
            strFolderName = strMatch.Value
            strFolderName = Replace(strFolderName, "_", "")
            strFolderName = "D:\" & strFolderName & "\"
            If Not objFSO.FolderExists(strFolderName) Then
                Set objNewFolder = objFSO.CreateFolder(strFolderName)
            End If
        Next

        objFSO.MoveFile objFile.Path, strFolderName
    End If
Next

As you can see, our script starts out by creating an instance of the VBScript.RegExp object, the object that enables us to use regular expressions within a VBScript script. Do we even need to use a regular expression in this script? Well, maybe we don’t need to, but it definitely makes life much easier. Why? Well, MGD has file names similar to these:

C:\Images\P_19_L00.jpg
C:\Images\P_19_A01.jpg
C:\Images\P_7658_T00.jpg
C:\Images\P_7658_W04.jpg
C:\Images\P_8291517_NI4.jpg

As you can see, the number of digits in the file name can – and does – vary. If the number of digits was always the same we wouldn’t need a regular expression; we could just use the Mid function and grab the middle 2, or 3, or 4 characters. But that works only if the number of digits remains the same in each file name. In MGD’s case, the number of digits varies from file name to file name. The Mid function can’t deal with that; a regular expression can.

That’s why we decided to use a regular expression.

After creating an instance of the RegExp object we configure two properties of this object. To begin with, we set the Global property to True; that simply tells the script to search for all instances of the target text. To be honest, that’s not really important in this case; after all, no file name will ever have more than one instance of the target text anyway. However, more often than not you will want to find all instances of the target text. Therefore, we thought we’d take the time to show you how to do that.
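If you’d like to see what the Global property actually changes, here’s a minimal sketch. The file name P_19_L00_42_X.jpg is made up for illustration (MGD’s files never contain two numbers like this); the pattern is the same one the script uses.

```vbscript
' Sketch: compare Global = False and Global = True on a hypothetical
' file name that contains two underscore-bounded numbers.
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.Pattern = "_\d{1,}_"

strName = "P_19_L00_42_X.jpg"   ' hypothetical name, not one of MGD's files

' With Global = False the search stops after the first match.
objRegEx.Global = False
WScript.Echo "Global = False: " & objRegEx.Execute(strName).Count & " match(es)"

' With Global = True every non-overlapping match is returned.
objRegEx.Global = True
WScript.Echo "Global = True: " & objRegEx.Execute(strName).Count & " match(es)"
```

With Global set to False we’d expect a single match (_19_); with Global set to True we’d expect two (_19_ and _42_). Keep in mind that the script in this column reassigns strFolderName for each match, so if a name ever did contain two numbers the last one would decide the destination folder.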

In addition to configuring the Global property we also assign a value to the Pattern property:

objRegEx.Pattern = "_\d{1,}_"

The Pattern property is the spot where we define the target text (that is, the text we are looking for). Using standard regular expression syntax, this line of code tells the script to look for an underscore character (_) followed by 1 or more digits (\d{1,}) followed by another underscore (_). That Pattern will find the _19_ in P_19_L00.jpg. However, it won’t find the 19A in P_19A_L00.jpg. Why not? That’s right: the Pattern tells the script that the only characters that can appear between the two underscores are the digits 0 through 9. P_19A_L00.jpg fails to make the cut because of the letter A between the two underscores.

That actually makes sense, doesn’t it? Wow, this is an unusual day!
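If you want to try the Pattern before pointing it at real files, here’s a small sketch that runs it against a few sample names. The array of names is just for illustration: two of MGD’s examples plus the P_19A_L00.jpg counterexample discussed above.

```vbscript
' Sketch: check the pattern against a handful of sample file names.
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.Global = True
objRegEx.Pattern = "_\d{1,}_"

' Sample names; P_19A_L00.jpg should fail because of the letter A.
arrNames = Array("P_19_L00.jpg", "P_8291517_NI4.jpg", "P_19A_L00.jpg")

For Each strName In arrNames
    Set colMatches = objRegEx.Execute(strName)
    If colMatches.Count > 0 Then
        ' Report the first piece of matching text, underscores included.
        WScript.Echo strName & " -> " & colMatches(0).Value
    Else
        WScript.Echo strName & " -> no match"
    End If
Next
```

The first two names should report _19_ and _8291517_, respectively; the third should report no match.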

At this point we’re ready to start moving files. To that end, the first thing we do is create an instance of the Scripting.FileSystemObject, then use this line of code to bind to the folder C:\Images:

Set objFolder = objFSO.GetFolder("C:\Images")

Once we’ve made a connection to the folder we can use the following line of code to retrieve a collection of all the files in that folder:

Set colFiles = objFolder.Files

That’s a good point: because we’re using the FileSystemObject (instead of WMI) that does pretty much restrict us to running this script on the local computer, doesn’t it? There are two reasons we decided to go this route. For one thing, that’s what MGD needs; he doesn’t need to run this script against a remote computer. For another, running this script against a remote machine introduces a number of potential problems, including the difficulty in creating folders on a remote computer (although we do have one solution to that problem). Could we perform this task against a remote machine? Yes, albeit with a few limitations. If you have an interest in that let us know and we’ll see if we can address that issue in a future column.

For the present column, however, our next step is to set up a For Each loop to walk us through the collection of files found in C:\Images. Inside that loop, we grab the value of the file’s Name property and store it in a variable named strSearchString:

strSearchString = objFile.Name

Once we’ve done that we then call the Execute method to search the file name for the target text (numbers sandwiched between two underscore characters):

Set colMatches = objRegEx.Execute(strSearchString)

If the target text can be found in the file name then all instances of that target text will be stored in a collection we named colMatches. To get at that information all we have to do is set up a second For Each loop, this one designed to walk through all the items in the colMatches collection:

For Each strMatch in colMatches

So what are we going to do inside this second loop? Well, for starters, we’re going to retrieve the matching text by assigning the match’s Value property to a variable named strFolderName:

strFolderName = strMatch.Value

That’s going to make strFolderName equal to, say, this:

_19_

Good observation: if it wasn’t for those pesky underscore characters that would be the folder name we wanted, wouldn’t it? OK, then why don’t we just get rid of those underscore characters:

strFolderName = Replace(strFolderName, "_", "")

All we’re doing here is using the Replace function to replace something (any underscore characters) with absolutely nothing.

Note. Did Microsoft use the Replace function when they added the Scripting Editor to the Scripting Guys team? As a matter of fact, they – well, never mind. We’d better not get into that, either.

The net result of all this is that strFolderName will now be equal to 19. That’s better, but it’s still not what we need in order to move files; to do that we need a complete path, like D:\19\. That’s what this line of code is for:

strFolderName = "D:\" & strFolderName & "\"
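For what it’s worth, there’s another way to get the same path without calling Replace. This is an alternative sketch, not the approach used in this column: it puts parentheses around the digits in the Pattern and reads the captured text from the match’s SubMatches collection. The sample file name is hard-coded for illustration.

```vbscript
' Alternative sketch: capture only the digits with a group, then read
' them from SubMatches instead of stripping underscores with Replace.
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.Global = True
objRegEx.Pattern = "_(\d{1,})_"   ' parentheses capture just the digits

Set colMatches = objRegEx.Execute("P_19_L00.jpg")

If colMatches.Count > 0 Then
    ' SubMatches(0) holds the text captured by the first group.
    strFolderName = "D:\" & colMatches(0).SubMatches(0) & "\"
    WScript.Echo strFolderName
End If
```

For P_19_L00.jpg this should echo D:\19\, the same path the Replace approach produces.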

As soon as we have a complete folder path we can then use this line of code to determine whether or not the specified folder already exists:

If Not objFSO.FolderExists(strFolderName) Then

If D:\19 does exist, well, that’s great; that means we can simply continue on with the rest of the script. If D:\19 doesn’t exist, well, then we use the following line of code to create a new folder by that name:

Set objNewFolder = objFSO.CreateFolder(strFolderName)

And once we’ve done all that we can finally move the file P_19_L00.jpg from the folder C:\Images to the folder D:\19; that’s what we’re doing here:

objFSO.MoveFile objFile.Path, strFolderName

From there we simply repeat the process with the next file in the collection.

And that’s all we have to do. By the time the script finishes running each file in the folder C:\Images (or at least each file that has the target text somewhere in its file name) will have been moved to the appropriate folder on drive D. Not bad, eh?
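With tens of thousands of files at stake, you might want a preview before moving anything. The following is a hedged sketch of a “dry run” variation: it uses the same Pattern and folders as the script above, but it only echoes what would happen and never creates folders or moves files. The counter variables (intPlanned and intSkipped) are our own additions, and unlike the main script this sketch uses only the first match in each name.

```vbscript
' Dry-run sketch: list the planned moves without touching any files.
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.Global = True
objRegEx.Pattern = "_\d{1,}_"

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFSO.GetFolder("C:\Images")

intPlanned = 0   ' files that contain the target text
intSkipped = 0   ' files that do not

For Each objFile In objFolder.Files
    Set colMatches = objRegEx.Execute(objFile.Name)
    If colMatches.Count > 0 Then
        ' Build the destination path from the first match only.
        strFolderName = "D:\" & Replace(colMatches(0).Value, "_", "") & "\"
        WScript.Echo objFile.Path & " -> " & strFolderName
        intPlanned = intPlanned + 1
    Else
        intSkipped = intSkipped + 1
    End If
Next

WScript.Echo intPlanned & " file(s) would be moved; " & intSkipped & " skipped."
```

You’ll probably want to run a script like this under CScript (for example, cscript dryrun.vbs, where dryrun.vbs is whatever name you saved it under) so that each line goes to the command window rather than to a separate message box.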

That should do it, MGD; give the script a try and see what happens. As for the Scripting Guys, the team scoreboard now looks a little like this:

Jean Ross’ Major Mistakes

1. Event 5 in the 2008 Winter Scripting Games

The Scripting Guy Who Writes This Column’s Major Mistakes

1. Event 8 in the 2007 Winter Scripting Games
2. The Anti-Spyware Thing
3. Speaking favorably of the Search Engine That Shall Not Be Named
4. Making fun of Microsoft Bob in a Script Center article

Etc., etc., etc. (The entire list goes on for about 20 pages or so.)

Hey, Jean: looks like we’re even now, eh?
