April 21st, 2009

Hey, Scripting Guy! Windows PowerShell and Pipelining

Hey, Scripting Guy! Question

Hey, Scripting Guy! I have seen you refer to this term pipeline many times since you started writing Windows PowerShell articles. What is up with that? Why don’t you just store things in a variable and then walk through the contents of the variable when you are working with Windows PowerShell? This is the way that we did it in VBScript, and it has worked fine for almost a decade. Can you explain what is so great about a pipeline (other than for economically transporting large amounts of liquid materials over great distances)?

– YH

Hey, Scripting Guy! Answer

Hi YH,

Ed here. And I’m bummed you stole my smart aleck remark. One of my hobbies is woodworking. I enjoy making sawdust out in my woodworking shop. It is completely relaxing, and when I am finished with a piece of furniture, there is a real sense of pride and accomplishment. The skills I use in my shop are completely different from the kind of skills used to write a script or to write a book. It is a good way to relax. I have been playing around with wood since I was a kid, and therefore I have many years of experience in making things from wood. When I first started out, my dad showed me how to take two skinny boards and make one large board. You join the straightest side of each board, evenly spread glue on the freshly joined boards, and carefully put them in a set of clamps and tighten the clamps. You must do this both quickly and carefully, because glue starts to harden in a few minutes (depending of course on the kind of glue you are using) and glue is slippery, meaning that it is easy for the boards to become misaligned. Both facts are a challenge for a 10-year-old boy. As I grew older my interest in woodworking continued, and I began watching a guy named Norm Abram on The New Yankee Workshop. (Norm: If you are reading this, send e-mail to scripter@microsoft.com, and I will send you an autographed copy of the Microsoft Press Windows PowerShell Script Guide book.) Norm is my hero. I even wear a flannel shirt when I am making sawdust out in my shop. He introduced me to a tool called a plate joiner that is used to put biscuits in a board. The main advantage of biscuits is that they help you align two boards when you are gluing them together.

YH, you asked about how to work with the pipeline, and this article is “Hey, Scripting Guy!” and not “Hey, Woodworking Guy!” So let’s start scripting.

This week we will be looking at the basics of Windows PowerShell. Windows PowerShell is installed by default on Windows 7 and Windows Server 2008 R2. It is an optional installation on Windows Server 2008 and a download for Windows Vista, Windows XP, and Windows Server 2003. The Windows PowerShell Scripting Hub is a good place to get started with Windows PowerShell.

In the SearchTextFileForSpecificWord.vbs script, we create an instance of the Scripting.FileSystemObject, open the file, and store the resulting TextStream object in the file variable. We then use the Do Until…Loop statement to work our way through the TextStream object. Inside the loop we read one line at a time from the TextStream. Each time we have a line, we use the InStr function to see whether a specific word is found in it. If it is, we display the line on the screen. The SearchTextFileForSpecificWord.vbs script is seen here:

' Path to the text file and the word to search for
filepath = "C:\fso\testFile.txt"
word = "text"
Set fso = CreateObject("Scripting.FileSystemObject")
Set file = fso.OpenTextFile(filepath)

' Read one line at a time and echo each line that contains the word
Do Until file.AtEndOfStream
    line = file.ReadLine
    If InStr(line, word) Then
        WScript.Echo line
    End If
Loop

This technique of using the ReadLine method is very efficient and is the recommended way to work with large files. The other way of reading content from a text file in VBScript is the ReadAll method. The problem with the ReadAll method is that it stores the entire contents of the text file in memory. This is not a problem if the file is small, but a very large file would consume a very large amount of memory. In addition to the memory consumption issue, if you plan on working with the file one line at a time, which is one of the main reasons for reading a text file, you now have to contrive artificial methods to work your way through the file. When we use the ReadLine method from the TextStream object, the process is similar to pipelining in Windows PowerShell: each line is handled as it arrives, and the whole file is never held in memory at once.

With Windows PowerShell, we do not have to write a script to do the same thing the SearchTextFileForSpecificWord.vbs script does. We can, in fact, perform the operation in just three lines of code:

PS C:\> $filepath = "C:\fso\TestFile.txt"
PS C:\> $word = "text"
PS C:\> Get-Content -Path $filepath | ForEach-Object {if($_ -match $word){$_}}

When we run the commands, we are given the output shown here:

Image of scriptlike commands being typed directly into the Windows PowerShell console

Before we get too far, let’s examine the TestFile.txt file. This will give us a better idea of what we are working with. The file is seen here:

Image of the TestFile.txt file

The first two lines that were typed into the Windows PowerShell console assign string values to variables. This serves the same purpose as the first two lines of the SearchTextFileForSpecificWord.vbs script. The last line we typed in the Windows PowerShell console is actually two separate commands joined by the pipe character (|). The first one reads the contents of the text file. This does the same work as creating an instance of the Scripting.FileSystemObject, opening the text file, using the Do Until…Loop construction, and calling the ReadLine method. Here is the Get-Content command:

Get-Content -Path $filepath

The results of the Get-Content cmdlet are pipelined to the ForEach-Object cmdlet. The ForEach-Object cmdlet enables us to work inside the pipeline to examine individual lines as they come across the pipe. The variable $_ is an automatic variable that is created when we are working with a pipeline. It represents the item that is currently on the pipeline. In VBScript you use the If…Then…End If construction. In Windows PowerShell, we use an If(…){…} construction. The two serve the same purpose: decision making. In VBScript the condition that is evaluated goes between the If and the Then keywords. In Windows PowerShell, the condition that is evaluated goes between two parentheses. In VBScript the action that is taken when a condition is matched goes between the Then and the End If keywords. In Windows PowerShell, the action that is taken when the condition is matched goes between a pair of braces.

In VBScript we use the InStr function to look inside the sentence to see whether a match could be found. In Windows PowerShell, we use the -match operator. In VBScript we use the WScript.Echo method to display the matching sentence on the screen, and in Windows PowerShell we only need to call the $_ variable and it is automatically displayed.
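To see how this compares with the store-it-in-a-variable habit from VBScript, here is a minimal sketch that runs both approaches against a small sample file. The file name PipelineDemo.txt and its three lines are made up for illustration, and Where-Object is our substitution for the ForEach-Object and If combination used earlier; it simply passes along the items for which the script block is true.

```powershell
# Hypothetical sample file, created only for this sketch.
$filepath = Join-Path ([System.IO.Path]::GetTempPath()) 'PipelineDemo.txt'
Set-Content -Path $filepath -Value 'first line of text', 'nothing to see here', 'more text on line three'
$word = 'text'

# Variable approach: read the whole file first, then walk the array.
$lines = Get-Content -Path $filepath
$fromVariable = foreach ($line in $lines) {
    if ($line -match $word) { $line }
}

# Pipeline approach: each line is tested as Get-Content emits it.
$fromPipeline = Get-Content -Path $filepath | Where-Object { $_ -match $word }

# Display both results; they contain the same two lines.
$fromVariable
$fromPipeline

# Clean up the sample file.
Remove-Item -Path $filepath
```

Both variables end up holding the same two lines. The difference is that the first approach keeps the whole file in $lines before any testing starts, whereas the pipeline hands each line to Where-Object as it is read.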

Of course we do not have to use the Get-Content cmdlet if we do not want to, because Windows PowerShell has a cmdlet called Select-String that will look inside a text file and retrieve the matching lines of text. Our three lines of code, seen earlier, could therefore be shortened to this one-line command:

PS C:\> Select-String -Path C:\fso\TestFile.txt -Pattern "text"

The results of this command are shown here:

Image of the Select-String cmdlet reading a file and searching content at the same time
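Select-String does not return plain strings; it returns match objects that carry the line number and the line itself along with other details. The following sketch, which assumes a made-up sample file named SelectStringDemo.txt, pulls out just those two properties:

```powershell
# Hypothetical sample file, created only for this sketch.
$filepath = Join-Path ([System.IO.Path]::GetTempPath()) 'SelectStringDemo.txt'
Set-Content -Path $filepath -Value 'first line of text', 'nothing to see here', 'more text on line three'

# Each result is a match object; LineNumber and Line are two of its properties.
$found = Select-String -Path $filepath -Pattern 'text'
$found | ForEach-Object { '{0}: {1}' -f $_.LineNumber, $_.Line }
# 1: first line of text
# 3: more text on line three

# Clean up the sample file.
Remove-Item -Path $filepath
```

Because the results are objects, they can be pipelined to other cmdlets just like the output of Get-Content.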

 

One of the things we really like to do with Windows PowerShell is to use the formatting cmdlets. There are three formatting cmdlets that are especially helpful. They are listed here, from the one I use least to the one I use most:

Format-Wide

Format-Table

Format-List

Let’s consider using the Format-Wide cmdlet. Format-Wide is useful when you want to display a single property across multiple columns. For example, you might want a list of the names of all processes that are currently running on the server. Such a command would resemble the following:

PS C:\> Get-Process | Format-Wide -Property name -AutoSize

The first thing we do is use the Get-Process cmdlet to return all the processes that are running on the computer. We pipeline the process objects to the Format-Wide cmdlet. We use the -Property parameter to select the name of each process, and we use the -AutoSize parameter to tell Format-Wide to use as many columns as possible in the Windows PowerShell console without truncating any of the process names. We can see the results of this command here:

Image of the Format-Wide cmdlet displaying a single property

If we were interested in between two and four properties from the processes, we could use the Format-Table cmdlet. The command might resemble the following:

PS C:\> Get-Process | Format-Table -Property Name, Path, Id -AutoSize

We use the Get-Process cmdlet and pipeline the processes to the Format-Table cmdlet. We select three properties from the process objects: Name, Path, and Id. The Format-Table cmdlet has an -AutoSize parameter just as the Format-Wide cmdlet does. This helps arrange the columns in such a way that we do not waste space inside the console. As shown in the following image, because of the length of some paths to process executables, the -AutoSize parameter had no effect in this example. As a best practice I always include the parameter when I am unsure of what the output will actually resemble.

Image of using the Format-Table cmdlet to make a table

The format cmdlet I use the most is the Format-List cmdlet. This is because it is the best way to display lots of information. It is also a good way to see what kind of data might be returned by a particular command. Armed with this information, I then determine whether I want to focus on a more select group of properties and perhaps output the data as a table or just leave it in a list. When I use the Format-List cmdlet, I usually use the wildcard * to select all the properties from the objects. Here is an example of obtaining all the property information from all our processes:

PS C:\> Get-Process | Format-List -Property *

We will be unable to show you a picture of all the data that is returned by this command. This is because there is too much information to fit on a single screen. A small sampling of the information is shown here:

Image of the process information displayed by the Get-Process cmdlet
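The workflow described above, looking at everything first and then narrowing down, might go something like this sketch. The choice of the current Windows PowerShell process (through the automatic $PID variable) and of the Name, Id, and Handles properties is only an example; pick whatever properties matter to you.

```powershell
# Step 1: inspect every property of one process (the current PowerShell host)
# rather than all processes, so the list stays manageable.
$current = Get-Process -Id $PID
$current | Format-List -Property *

# Step 2: after spotting the interesting properties, narrow to a table.
Get-Process | Format-Table -Property Name, Id, Handles -AutoSize
```

Limiting the first step to a single process keeps the list short enough to read while you decide which properties are worth keeping.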

 

There is so much information that all the properties and their values for a single process will not fit on a single screen. When you work with the Format-List cmdlet, if you want to look through all the data, you can pipeline the information to more. This works in the same manner as it did in the old command shell. If we use shortcut names (also known as aliases), we have a very compact command at our disposal. As shown here, gps is an alias for the Get-Process cmdlet, and fl is an alias for the Format-List cmdlet. Because -Property is the first positional parameter of the Format-List cmdlet, we can leave the parameter name out and type only its value. We then pipeline the results to more, which causes the information to be displayed one page at a time. This is shown here:

PS C:\> gps | fl * | more
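If you are ever unsure what an alias stands for, the Get-Alias cmdlet will tell you. This short sketch looks up the two aliases used above; the table formatting is just our choice for display:

```powershell
# Resolve each alias to the cmdlet it stands for.
$aliases = Get-Alias -Name gps, fl
$aliases | Format-Table -Property Name, Definition -AutoSize

# The fully spelled-out equivalent of: gps | fl * | more
# Get-Process | Format-List -Property * | more
```

Aliases are handy at the console, but spelling out the cmdlet names makes a saved script easier for someone else to read.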

This concludes our look today at pipelining. As you have undoubtedly seen, the ability to pipeline the results of one command into another command provides us with powerful alternatives to storing everything in a variable. These techniques are not only useful when you work interactively at the command line; they are also best practices when you write scripts. We will continue our discussion of Windows PowerShell basics tomorrow. Until then, take care.

 

Ed Wilson and Craig Liebendorfer, Scripting Guys
