How Can I Identify the Last File in a Sequential List of Files?

ScriptingGuy1

Hey, Scripting Guy! Question

Hey, Scripting Guy! I have a folder which contains thousands of files, all numbered sequentially. How can I tell which file is the last file in the list?

— AC

Hey, Scripting Guy! Answer

Hey, AC. You know, one of the fun things about being a Scripting Guy is that we get to spend a lot of time shattering myths. For example, when we first started at Microsoft there was a myth that system administrators had no interest in (or aptitude for) scripting and automation. Shattered! Likewise it was a well-accepted “fact” that WMI was way too difficult for script writers to learn. Shattered! Today we tackle yet another well-entrenched myth: if you want to do something right then you should work hard while doing it.

Need we say it? Shattered!

Note. Technically, the Scripting Guys have nothing against hard work. As long as we don’t have to do any of it, well, what difference does it make to us?

As you noted, AC, you have a folder that contains thousands of files, each having a name similar to this:

200500001.pdf
200500002.pdf
200500003.pdf
200500004.pdf
…
200509999.pdf

In the example above, file 200509999.pdf is the last file in the sequential list. The question is, how do you know that? How can you write a script that can identify the last file in a sequential list of files?

As you might expect, the Scripting Guy who writes this column took one look at this question and immediately came up with a solution. Of course, it was a complicated and convoluted solution, and our hero knew it would take some real effort to create the script and then write the column that explained how the script worked. But was the prospect of a little hard work enough to scare off the Scripting Guy who writes this column? You better believe it was.

In fact, instead of rolling up his sleeves and getting down to work, the Scripting Guy who writes this column found all sorts of other things to do, things which he hadn’t thought about in months but which suddenly became high-priority, must-do items. But here’s the moral of the story: being lazy turned out to be a good thing. Had he been the earnest, hard-working type, the Scripting Guy who writes this column would have come up with a script that would have been far more complicated than it needed to be. By contrast, shirking his responsibilities for a while gave him time to realize that this problem could be solved in a far easier fashion, and without resorting to arrays, bubble sorts, and several eyes of newt, all of which were required in his initial solution. Being lazy turned into a win-win situation: the Scripting Guy who writes this column didn’t have to work very hard, and you ended up with a script that’s short, sweet, and to the point.

In other words, a script that looks like this:

strComputer = "."

Set objWMIService = GetObject("winmgmts:\\" & strComputer & "\root\cimv2")

Set colFileList = objWMIService.ExecQuery _
    ("ASSOCIATORS OF {Win32_Directory.Name='C:\PDFs'} Where " _
        & "ResultClass = CIM_DataFile")

For Each objFile In colFileList
    strFile = objFile.FileName
Next

Wscript.Echo strFile

So how did our hero almost turn such a simple script into a nightmarish morass of code? Well, the problem was that he thought about it too hard. He knew that, in order to determine which file comes last in the list he would need to sort the files by file name. He also knew that once the files were sorted he had to be able to identify the very last file in the alphabetical list. Determined to make a mountain out of a molehill, he immediately envisioned a solution involving arrays, disconnected recordsets, and an army of highly-trained St. Bernards, none of which were actually needed.

So why weren’t they needed? Two reasons. First, there’s no need to sort the files by file name; if you use WMI to retrieve the files they will, by default, already come sorted by file name. Had he acted on his first impulse, the Scripting Guy who writes this column would have written some fancy-schmancy code whose sole purpose was to sort a collection that was already sorted. That seemed a tad bit silly, even for a Scripting Guy.

Second, it’s true that we need to identify the last file in the collection. And, admittedly, we could do that by storing all the files in an array, using the Ubound function to identify the last item in that array, and then echoing back the value of that last item. But, again, that’s way more work than we need to put in. We already have a collection of pre-sorted files; instead of adding all those files to an array, why not just quickly rifle through the collection? As soon as we run out of files we’ll know which one is the last file in the list: that’ll be the file we just finished looking at.

And so that’s exactly what we do. This far-simpler script begins by connecting to the WMI service on the local computer (although it could just as easily perform this same task on a remote computer). We then use this crazy-looking Associators of query to return a collection of all the files in the folder C:\PDFs:

Set colFileList = objWMIService.ExecQuery _
    ("ASSOCIATORS OF {Win32_Directory.Name='C:\PDFs'} Where " _
        & "ResultClass = CIM_DataFile")

Note that we’re assuming that all the files in this folder have been named using the sequential-numbering scheme. If there are other files in this folder you’ll need to modify your For Each loop to discard files that aren’t part of the target collection.
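
Here’s a minimal sketch of what that kind of filtering might look like. It assumes the files you care about have all-digit names and a .pdf extension; IsSequenceFile is a hypothetical helper made up for illustration, not something built into VBScript or WMI. It expects the two values that the CIM_DataFile properties FileName (the name without the extension) and Extension (the extension without the dot) would supply:

```vbscript
' Hypothetical helper (not part of the original script): returns True when a
' file looks like it belongs to the numbered PDF sequence.
' Assumption: sequence files have all-digit base names and a "pdf" extension.
Function IsSequenceFile(strBaseName, strExtension)
    Dim i

    IsSequenceFile = False

    ' Reject anything that is not a PDF (comparison ignores case).
    If LCase(strExtension) <> "pdf" Then Exit Function

    ' Reject empty names.
    If Len(strBaseName) = 0 Then Exit Function

    ' Reject names that contain anything other than the digits 0-9.
    For i = 1 To Len(strBaseName)
        If InStr("0123456789", Mid(strBaseName, i, 1)) = 0 Then Exit Function
    Next

    IsSequenceFile = True
End Function
```

Inside the For Each loop you would then assign a value to strFile only when IsSequenceFile(objFile.FileName, objFile.Extension) returns True; everything else in the folder simply gets skipped.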

Speaking of For Each loops, that’s our next step: we set up a For Each loop to walk through the collection of files, files – we might add – that are already sorted for us by file name (first 200500001.pdf, then 200500002.pdf, etc.). Notice that, inside the loop, we only do one thing: we assign the FileName of the current file to a variable named strFile:

strFile = objFile.FileName

Why do we do that? Well, let’s say we have three files in the collection. The first time through the loop strFile will be assigned the name of the first file in the collection (200500001.pdf). The second time through the loop, strFile gets assigned the name of the second file in the collection (200500002.pdf). And – oh, you’re way ahead of us, aren’t you? Yes, the third and final time through the loop strFile gets assigned the name of the third file in the collection. At that point we exit the loop, with strFile equal to 200500003.pdf. (One caveat: the FileName property holds the name without the extension, so the value actually stored is 200500003; tack on "." & objFile.Extension if you want the full name.)

Believe it or not that is a big deal. We’re looking for the name of the last file in the sequence, right? Well, guess what: we just found it. The last file we looked at is also the last file in the sequence; to get our answer all we have to do is echo back the value of strFile:

Wscript.Echo strFile

It’s not fancy, but it does the trick, and – best of all – it does it with a minimum amount of effort. And it’s reasonably fast, too: we tried it with 5,000 files in a folder and the script completed its work in less than 15 seconds. That was good enough for us.
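
If you would rather not depend on the order in which the files come back, one small variation is to keep track of the alphabetically latest name seen so far. The sketch below is only an illustration under that assumption: LastNameIn is a hypothetical helper, and it works on a plain array of base names (what FileName returns, without the extension) so you can try it without WMI. It uses the built-in StrComp function, which returns 1 when its first argument sorts after its second:

```vbscript
' Hypothetical helper (not part of the original script): returns the name that
' sorts last alphabetically, no matter what order the names arrive in.
Function LastNameIn(arrNames)
    Dim strName, strLast

    strLast = ""

    For Each strName In arrNames
        ' StrComp returns 1 when strName sorts after strLast.
        If StrComp(strName, strLast, vbTextCompare) > 0 Then
            strLast = strName
        End If
    Next

    LastNameIn = strLast
End Function
```

In the WMI version of the script the same idea amounts to setting strFile to an empty string before the loop and then replacing the assignment with a test: If StrComp(objFile.FileName, strFile, vbTextCompare) > 0 Then strFile = objFile.FileName. It’s one extra comparison per file, which hardly counts as hard work.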

Of course, we should point out that this script works only because your naming system includes leading zeroes in the file names. Unfortunately, this script will not work if you have file names similar to this:

20056.pdf
20057.pdf
20058.pdf
20059.pdf
200510.pdf

Why not? Because these names are going to be sorted alphabetically and not numerically, which means they’ll be sorted like this:

200510.pdf
20056.pdf
20057.pdf
20058.pdf
20059.pdf

Unfortunately, the script will identify file 20059.pdf as the last file in the sequence, which is definitely not the case. But we shouldn’t blame the script; type dir c:\pdfs and the files will be sorted in the same fashion: alphabetic rather than numeric.

So is it possible to deal with file names that don’t include leading zeroes? Yes, although that would require quite a bit of effort, and you already know how the Scripting Guys feel about hard work and effort. But if that’s something you really need to know, drop us a line and we’ll see what we can do. We’ll start resting up, just in case.
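
For the curious, here is one hedged sketch of how the no-leading-zeroes case might be approached: compare the names as numbers rather than as text. This is not a tested, official solution. LastNumericName is a hypothetical helper, and it assumes every name passed in is the all-digit base name (the way FileName returns it, without the extension). It works on a plain array so it can be tried without WMI:

```vbscript
' Hypothetical helper (not part of the original script): returns the name with
' the largest numeric value, so 200510 beats 20059 even without leading zeroes.
' Assumption: names are all-digit base names, with no extension attached.
Function LastNumericName(arrNames)
    Dim strName, strLast, dblValue, dblLast

    strLast = ""
    dblLast = -1

    For Each strName In arrNames
        ' Skip anything that cannot be read as a number.
        If IsNumeric(strName) Then
            dblValue = CDbl(strName)
            If dblValue > dblLast Then
                dblLast = dblValue
                strLast = strName
            End If
        End If
    Next

    LastNumericName = strLast
End Function
```

Given the five names shown earlier, this picks 200510, whereas a plain text comparison settles on 20059. In the WMI script the equivalent change would be to convert objFile.FileName with CDbl inside the loop and keep the name only when its value is larger than the largest value seen so far.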
