Hey, Scripting Guy! We have a number of files that were manually created by various people in a common folder. Unfortunately, they did not follow a specific naming convention as well as they should, and as a result they do not sort very well. Some people preceded the file names with the word “File,” and others appended their name to the file. In the middle are numbers that correspond to purchase order numbers. I would like to get rid of the word “File” and the person’s name from the files and be left with only the purchase order numbers for the file names. There are hundreds of files in this folder, and it will take an entire weekend to clean this junk up. With football season starting up in the United States, I would much rather watch football and consume junk food than spend the weekend cooped up in an office cleaning up after clueless users.
— JPC
Hello JPC,
Microsoft Scripting Guy Ed Wilson here. You are right. Football season in the United States is a special time of the year, in part because it often accompanies the transition from hot summer months to more temperate fall weather. Or maybe it is just the marching bands and their halftime shows that make things interesting. In Charlotte, North Carolina, we have a professional football team as well as several college and university teams we can watch. It makes things interesting on the weekend if you are trying to consume as much football and junk food as possible. Speaking of interesting, a similar question was asked a few years ago by RE, who wanted to know how to delete specified characters from the beginning and end of file names. In that article, a pretty cool VBScript was created. It is seen here.
DeleteCharactersFromBeginningAndEndOfFileNames.vbs
strComputer = "."
Set objWMIService = GetObject("winmgmts:\\" & strComputer & "\root\cimv2")
Set colFiles = objWMIService.ExecQuery _
    ("ASSOCIATORS OF {Win32_Directory.Name='C:\Test'} Where " _
        & "ResultClass = CIM_DataFile")
For Each objFile In colFiles
    strPath = objFile.Drive & objFile.Path
    strExtension = objFile.Extension
    strFileName = objFile.FileName
    If Left(strFileName, 5) = "File " Then
        intLength = Len(strFileName)
        strFileName = Right(strFileName, intLength - 5)
    End If
    If Right(strFileName, 7) = " George" Then
        intLength = Len(strFileName)
        strFileName = Left(strFileName, intLength - 7)
    End If
    strNewName = strPath & strFileName & "." & strExtension
    errResult = objFile.Rename(strNewName)
Next
JPC, a Windows PowerShell script could be produced that would use the Associators Of query from WMI and could mimic rather closely the syntax of the VBScript. But this script points out the danger of blindly translating VBScript code into Windows PowerShell code because in Windows PowerShell there is a much better way of performing this task than using the rather complicated Associators Of WMI query. For more information about translating Associators Of WMI queries from VBScript to Windows PowerShell (and why you might need to do such a thing), see the How Do I Migrate My VBScript Queries to Windows PowerShell? article.
I actually wrote two Windows PowerShell scripts that will delete words from the beginning and end of the file names that are stored in the folder. The folder containing the target files is seen here:
Both scripts use the replace operator to replace the unwanted strings with an empty string. The first script is called DeleteCharactersFromBeginningAndEndOfFileName.ps1 and is shown here.
DeleteCharactersFromBeginningAndEndOfFileName.ps1
Get-ChildItem -path c:\fso -Filter *.txt |
    ForEach-Object {
        $name = $_.name -replace "file ",""
        $name = $name -replace " george",""
        Rename-Item -Path $_.fullname -NewName $name
    }
Get-ChildItem -path c:\fso -Filter *.txt
In the DeleteCharactersFromBeginningAndEndOfFileName.ps1 script, the first step is to retrieve all of the text files. Because text files have a .txt extension, the -Filter parameter is used to limit the results that are returned by the Get-ChildItem cmdlet. The -Path parameter tells the Get-ChildItem cmdlet which folder to search for the text files. The results of this command are piped to the ForEach-Object cmdlet. This is seen here:
Get-ChildItem -path c:\fso -Filter *.txt |
The ForEach-Object cmdlet allows Windows PowerShell to work with each file as it comes across the pipeline. The first step is to replace the word “file” with an empty string. The replace operator does pattern matching on a string. The Name property from the System.IO.FileInfo class returns a string, and therefore the ToString method is not required to convert the name to a string. The replace operator takes two values on its right side: the first is the pattern to be matched, and the second is the value to replace the matched pattern with. This is illustrated here:
PS C:\> $a = Get-Item C:\fso\a.txt
PS C:\> $a.Name
a.txt
PS C:\> $a.Name -replace "a","b"
b.txt
PS C:\>
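Two details of the replace operator are easy to miss: the pattern is a regular expression, and matching ignores case by default. The following is a minimal sketch of both points; the sample file name is hypothetical and is not one of the files in the folder.

```powershell
# Hypothetical sample name, used only for illustration.
$sample = "File 123 George.txt"

# -replace ignores case by default, so "file " matches "File ".
$insensitive = $sample -replace "file ", ""

# -creplace is the case-sensitive variant; "file " does not match "File ".
$sensitive = $sample -creplace "file ", ""

# The pattern is a regular expression, so a literal period is escaped as "\."
# and "$" anchors the match to the end of the string.
$escaped = $sample -replace '\.txt$', ''

$insensitive   # 123 George.txt
$sensitive     # File 123 George.txt
$escaped       # File 123 George
```

This is why the script can use lowercase "file " and " george" even though the actual file names may be capitalized.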
After the word “file” has been replaced, the resulting string is stored in the variable $name. This is seen here:
$name = $_.name -replace "file ",""
This line of code removes the word “file” and the space that follows it. The pattern is not anchored, so it would match anywhere in the name, but in this folder the word is found only at the beginning of the file names. Now you need to remove the word “George,” which occurs at the end of some of the names. To do this, use the string stored in the $name variable and the replace operator to clean up the end of the file names. This is seen here:
$name = $name -replace " george",""
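The two patterns used in the script match wherever the text occurs in the name. If a folder might contain names where “file ” or “ george” also shows up in the middle, the patterns can be anchored. The following is a sketch only; the function name Remove-PrefixAndSuffix and the sample names are hypothetical.

```powershell
# Hypothetical helper: strips "file " only at the start of the name and
# " george" only when it sits immediately before the extension.
function Remove-PrefixAndSuffix {
    param([string]$Name)

    # "^" anchors the match to the start of the string.
    $Name = $Name -replace '^file ', ''

    # (?=\.[^.]+$) is a lookahead: the extension must follow, but it is kept.
    # Single quotes keep "$" literal so PowerShell does not treat it as a variable.
    $Name -replace ' george(?=\.[^.]+$)', ''
}

Remove-PrefixAndSuffix "File 123 George.txt"     # 123.txt
Remove-PrefixAndSuffix "File 77 profile 9.txt"   # 77 profile 9.txt
```

With the unanchored pattern, the second name would also lose the “file ” inside “profile 9,” which is probably not what is wanted.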
After the second replace operation has completed, the new file name is used in a call to the Rename-Item cmdlet. When using the Rename-Item cmdlet, the -Path parameter must give the complete path to the file. The -NewName parameter is the new name for the file. You cannot use the Rename-Item cmdlet to rename and move a file to a new location. To move a file, you must use the Move-Item cmdlet. The renaming of the file is seen here:
Rename-Item -Path $_.fullname -NewName $name
After all of the files have been renamed, the Get-ChildItem cmdlet is used to produce a listing of the folder to ensure the file names have been changed. This is seen here:
Get-ChildItem -path c:\fso -Filter *.txt
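Before renaming hundreds of files, it can be worth previewing the changes. The Rename-Item cmdlet supports the -WhatIf parameter, which reports what would happen without doing it. The following sketch works in a scratch folder under the temp directory with made-up sample files, so nothing real is touched; the folder name and the sample names are assumptions for illustration.

```powershell
# Scratch folder with hypothetical sample files.
$folder = Join-Path ([System.IO.Path]::GetTempPath()) ("fso-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $folder | Out-Null
"File 123.txt", "456 George.txt", "789.txt" |
    ForEach-Object { New-Item -ItemType File -Path (Join-Path $folder $_) | Out-Null }

Get-ChildItem -Path $folder -Filter *.txt |
    ForEach-Object {
        $name = $_.Name -replace "file ", "" -replace " george", ""
        # Skip files whose name would not change.
        if ($name -ne $_.Name) {
            # -WhatIf reports the rename without performing it.
            Rename-Item -Path $_.FullName -NewName $name -WhatIf
        }
    }

# The folder still holds the original names because of -WhatIf.
$remaining = Get-ChildItem -Path $folder -Filter *.txt | ForEach-Object { $_.Name }
$remaining

# Clean up the scratch folder.
Remove-Item -Path $folder -Recurse
```

When the preview looks right, removing -WhatIf performs the renames.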
The renamed files are seen in this image:
The second script is a little more efficient and a bit shorter, but it is also harder to read. The change involves using one replace operation instead of two. For this scenario, the results of the two scripts are exactly the same, but the two replace operations are not equivalent. The second form is more general because it removes every character that is not a digit from the file name. To do this, it uses the regular expression pattern \D, which matches any character that is not a decimal digit.
For more information on using regular expressions from within Windows PowerShell, you can refer to this collection of Hey Scripting Guy! articles. Pay particular attention to the ones from the week of April 13, 2009, because that was Regular Expression Week on the TechNet Script Center.
The use of the \D regular expression pattern is shown here:
PS C:\> $filz = "file 123 ed.txt","fyle 145556 bob.txt"
PS C:\> $filz | foreach-object { $_ -replace "\D","" }
123
145556
PS C:\>
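Because \D also removes the period and the letters of the extension, the script appends ".txt" to the result. The modified line of code from the RegexRemoveCharactersFromFileName.ps1 script is seen here: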
$name = ($_.name -replace "\D","") + ".txt"
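The line above hard-codes the .txt extension, which is fine for this folder. If the folder held other file types, one possible variation is to strip non-digits from the base name only and reattach the original extension. This is a sketch, not part of the original script; the function name Get-DigitsOnlyName is hypothetical.

```powershell
# Hypothetical helper: keeps only the digits of the base name and
# reattaches whatever extension the file already has.
function Get-DigitsOnlyName {
    param([string]$FileName)

    $base = [System.IO.Path]::GetFileNameWithoutExtension($FileName)
    $ext  = [System.IO.Path]::GetExtension($FileName)   # includes the leading period
    ($base -replace '\D', '') + $ext
}

Get-DigitsOnlyName "file 123 ed.txt"       # 123.txt
Get-DigitsOnlyName "fyle 145556 bob.csv"   # 145556.csv
```

Inside the ForEach-Object block, the same idea could be written with the BaseName and Extension properties of each file object.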
The complete RegexRemoveCharactersFromFileName.ps1 script is seen here:
RegexRemoveCharactersFromFileName.ps1
Get-ChildItem -path c:\fso -Filter *.txt |
    ForEach-Object {
        $name = ($_.name -replace "\D","") + ".txt"
        Rename-Item -Path $_.fullname -NewName $name
    }
Get-ChildItem -path c:\fso -Filter *.txt
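Stripping everything except digits has one side effect worth checking before running the script on a large folder: two different files can end up with the same new name, and a name with no digits collapses to just ".txt". The following sketch builds the rename plan from a list of made-up names and flags both cases. It assumes Windows PowerShell 3.0 or later for the [pscustomobject] syntax, and the sample names are hypothetical.

```powershell
# Hypothetical sample names; two of them collapse to the same digits.
$names = "file 123 ed.txt", "123 bob.txt", "fyle 145556 bob.txt", "notes.txt"

# Build the rename plan without touching any files.
$plan = $names | ForEach-Object {
    [pscustomobject]@{
        OldName = $_
        NewName = ($_ -replace '\D', '') + ".txt"
    }
}

# New names produced by more than one file would collide on rename.
$collisions = $plan | Group-Object -Property NewName |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { $_.Name }

# Names with no digits at all collapse to just ".txt".
$empty = $plan | Where-Object { $_.NewName -eq ".txt" } |
    ForEach-Object { $_.OldName }

$collisions   # 123.txt
$empty        # notes.txt
```

If either list comes back non-empty, those files need to be handled by hand or with a more specific pattern.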
Well, JPC, that is about all there is to renaming files that are named after a specific pattern. If you want to know exactly what we will be covering tomorrow, follow us on Twitter or Facebook. If you have any questions, send e-mail to us at scripter@microsoft.com or post them on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson and Craig Liebendorfer, Scripting Guys