August 14th, 2008

Hey, Scripting Guy! How Can I Find Files’ Metadata?

Hey, Scripting Guy! Question

Hey Scripting Guy! I have a folder of media files, documents, etc., and I would like to see the metadata that is associated with each file. Can I do this with Windows PowerShell?

— EJ

Hey, Scripting Guy! Answer

Hi EJ,

I suspect that you are more interested in “data about data.” Various applications store additional descriptive information that is used in different ways. Which metadata is stored depends in part on the type of file in question. For instance, a picture file could store information such as the kind of camera that was used to take the picture, the focal length of the lens, and so on. A music file might store information such as the genre of the song and the bit rate of the recording. For document files, we may be interested in who the owner of the document is, who the authors are, and how much time was spent editing the file. The metadata that is available varies with the file type. For example, the rating field is available for music and picture files, but not for documents. The camera model is applicable only to a picture file, and does not make sense for a Word document.

The entire discussion of metadata is one of those “it depends” questions that we in IT face so often. Keep in mind that not all metadata is automatically populated, even if it is germane to the file type. For instance, no program is yet smart enough to prepopulate the rating accurately. However, the camera wizard in Windows Vista is smart enough to populate my photographs with the correct model of the camera I used to take the pictures. So once again, the presence of metadata is one of those “it depends” things.

Not only can you retrieve metadata with Windows PowerShell, but you can do it in a rather simple fashion. In our script today, we will use the Shell.Application COM object that we have used in various VBScript scripts in the past. Here is what we use to display the metadata (we’ll call this the “DisplayMetaData.ps1” script):

param($folder = "C:\test") #end param

Function funLine($strIN) 
{
 $strLine = "=" * $strIn.length
 Write-Host -ForegroundColor Yellow "`n$strIN"
 Write-Host -ForegroundColor Cyan $strLine
} #end funline

Function funMetaData()
{
foreach($sFolder in $folder)
 {
  $a = 0
  $objShell = New-Object -ComObject Shell.Application
  $objFolder = $objShell.namespace($sFolder)

  foreach ($strFileName in $objFolder.items())
   { FunLine( "$($strFileName.name)")
     for ($a ; $a  -le 266; $a++)
      { 
        if($objFolder.getDetailsOf($strFileName, $a))
          {
            $hash += @{ `
                  $($objFolder.getDetailsOf($objFolder.items, $a))  =`
                  $($objFolder.getDetailsOf($strFileName, $a)) 
                  } #end hash
           $hash
           $hash.clear()
          } #end if
      } #end for 
    $a=0
   } #end foreach
 } #end foreach
} #end funMetadata 

# *** Entry Point ***

funMetaData

The first thing we do in the DisplayMetaData.ps1 script is use the param statement. This statement must be the first noncommented line in the script. It can appear on line 12 if you have 11 commented lines ahead of it or if you have 11 blank lines. If you want to print out something such as “This is my cool new script,” it must appear after the param statement. This requirement is similar to the Option Explicit requirement in VBScript (which also must be on the first noncommented line in the script…hmmm, come to think of it, this requirement is exactly the same as the requirements for Option Explicit.)
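As a small illustration of how a param statement with a default value behaves, here is a minimal sketch. It uses a script block as a stand-in for a script file; the $demo variable and the "C:\data" path are made up for this example and are not part of DisplayMetaData.ps1.

```powershell
# Hypothetical stand-in for a script file: a script block with its own param statement.
$demo = {
    param($folder = "C:\test")   # default value, used when no -folder argument is supplied
    "Searching: $folder"
}

& $demo                      # no argument, so the default "C:\test" is used
& $demo -folder "C:\data"    # the supplied value replaces the default
```

The first call reports C:\test and the second reports C:\data, which is the same behavior you get when you run the script with or without the -folder parameter.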

The next thing we do is create a function named funLine. The logic for the name is simple: it’s a function, and it creates a line. I used a very similar function in VBScript. The code is easy, but it is great for separating long runs of output. To create a function in Windows PowerShell, we use the Function keyword, give the function a name, and use parentheses to contain any input variables. In this instance, we accept a single input variable named $strIN (it is a string and it is coming into the function, hence the name). We use a series of equal signs for the line separator, and we create this string by multiplying the equal sign by the length of the string contained in the $strIN variable. We then use two Write-Host commands. The first one prints a blank line followed by the value of the $strIN variable in yellow, and the second prints our line separator in cyan. (There’s a reason they don’t call me the Design Guy.) This section of code is seen here:

Function funLine($strIN) 
{
 $strLine = "=" * $strIn.length
 Write-Host -ForegroundColor Yellow "`n$strIN"
 Write-Host -ForegroundColor Cyan $strLine
} #end funline
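If the string multiplication looks odd, this short sketch shows it on its own, without the colors. The file name is a made-up value used only for illustration.

```powershell
$title     = "vacation.jpg"          # hypothetical file name, 12 characters long
$underline = "=" * $title.Length     # repeats "=" once for each character in $title

$title
$underline
```

Because the separator is built from the length of the text, the underline always matches the width of the file name printed above it.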

We now come to the main function in the script. After using the Function keyword to create our new function, we come to the foreach statement. This is the same as the For Each...Next statement from VBScript, except that we do not need the Next keyword, and we use parentheses and curly brackets. Because we say foreach($sFolder in $folder), we can call our script with multiple folders and have it search more than one folder for file metadata. To do this, we would use the following syntax from the command line:

PowerShell:> .\DisplayMetaData.ps1 -folder "C:\test","C:\test\sub"

The variable $a is used to set the starting position for obtaining the file metadata. There could be up to 267 different metadata fields, and they are numbered starting at 0, which is why the loop runs from 0 through 266. Next we create the Shell.Application object by using the New-Object cmdlet. We then use the Namespace method to connect to the current folder in the loop. If only one folder was specified, we simply connect to that folder. One of the nice things about Windows PowerShell is that we do not need to detect how many folders were specified from the command line and then choose the appropriate method. We can simply use foreach and allow Windows PowerShell to handle the details.
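To see how foreach handles one value and several values in the same way, consider this minimal sketch. Show-FolderArgument is a hypothetical helper written for this illustration; it only echoes the folder names and does not touch the file system.

```powershell
# Hypothetical helper, not part of DisplayMetaData.ps1.
function Show-FolderArgument($folder)
{
    # foreach runs once for a single string and once per element for an array
    foreach ($sFolder in $folder)
    {
        "Would search: $sFolder"
    }
}

Show-FolderArgument "C:\test"                  # one folder, one line of output
Show-FolderArgument "C:\test","C:\test\sub"    # two folders, two lines of output
```

No test for "is this an array?" is needed; the same loop covers both cases.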

Next we call the funLine function to print out and underline the file name from the folder. We then use a for loop (like For...Next in VBScript, only there is no Next part to the loop), and we call the getDetailsOf method to obtain the metadata from the file.
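If you are curious which metadata fields a folder exposes, a sketch along the following lines may help. It assumes that passing $null as the first argument to GetDetailsOf returns the column heading instead of a file's value, which is how the method is commonly used. Get-MetadataFieldName is a hypothetical helper name, and the default path is simply the one used elsewhere in this article.

```powershell
# Hypothetical helper, not part of DisplayMetaData.ps1: list the metadata
# field names a folder exposes, together with their index numbers.
function Get-MetadataFieldName($path = "C:\test")
{
    $objShell  = New-Object -ComObject Shell.Application
    $objFolder = $objShell.NameSpace($path)

    for ($i = 0; $i -le 266; $i++)
    {
        # Assumption: $null in place of a file asks for the column heading itself.
        $name = $objFolder.GetDetailsOf($null, $i)
        if ($name) { "{0,3}  {1}" -f $i, $name }   # skip indexes with no heading
    }
}

# Example call (Windows only, and the folder must exist):
# Get-MetadataFieldName "C:\test"
```

The list you get back depends on the version of Windows, so treat the index numbers as something to look up and not as fixed values.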

When we have the metadata, we store it in a hash table. To create the hash table, we use the @{key=value} syntax. The key must be unique because it is used to look up the associated value. The hash table in Windows PowerShell is very similar to the Dictionary object from VBScript in that they both store a collection of keys and associated values. In our hash table, the key is the name of the metadata field (which we get by calling getDetailsOf a second time to retrieve the column heading), and the value is the metadata for the file. After we add an entry to the hash table, we print it out by simply listing the name of the hash table: $hash. We then call the clear() method of the hash table object, and go on to the next field. After all the fields for a file have been checked, we reset $a to 0 and go on to the next file in the folder. This section of the script is seen here:

Function funMetaData()
{
foreach($sFolder in $folder)
 {
  $a = 0
  $objShell = New-Object -ComObject Shell.Application
  $objFolder = $objShell.namespace($sFolder)

  foreach ($strFileName in $objFolder.items())
   { funLine( "$($strFileName.name)")
     for ($a ; $a  -le 266; $a++)
      { 
        if($objFolder.getDetailsOf($strFileName, $a))
          {
            $hash += @{ `
                  $($objFolder.getDetailsOf($objFolder.items, $a))  =`
                  $($objFolder.getDetailsOf($strFileName, $a)) 
                  } #end hash
           $hash
           $hash.clear()
          } #end if
      } #end for 
    $a=0
   } #end foreach
 } #end foreach
} #end funMetadata
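The hash table operations used in the function can be tried by themselves. The following is a minimal sketch with made-up field names and values; no files or COM objects are involved.

```powershell
$hash = @{}                               # start with an empty hash table
$hash += @{ "Name"   = "vacation.jpg" }   # += merges the entries of another hash table
$hash += @{ "Rating" = "4 Stars" }        # hypothetical field and value

$hash["Name"]     # look up a value by its key
$hash.Count       # two entries at this point

$hash.Clear()     # remove every entry, as the script does after printing
$hash.Count       # back to zero
```

Because keys must be unique, using += with a key that is already present raises an error. The script avoids this by clearing the hash table after each entry is printed.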

The last thing to do is call the funMetaData function. Super simple. Just type the name of the function. This line of code is seen here:

funMetaData

EJ, when the script is run, you will see something like the graphic that follows (depending on the kinds of files in the folder and the amount of metadata stored for each file):

Screen Output File Metadata graphic

 

Ed Wilson and Craig Liebendorfer, Scripting Guys
