December 31st, 2010

Write PowerShell Functions That Accept Pipelined Input

Summary: Learn how to write Windows PowerShell functions that accept pipelined input.

 

Hey, Scripting Guy! I really like the way that some Windows PowerShell cmdlets enable me to pipeline things to them. I do not understand why some Windows PowerShell cmdlets do not allow me to pipeline stuff to them, or perhaps I am doing it wrong. Anyway, my question is as follows: How can I write a function that will accept input from the pipeline? I think it would add a lot of capability to my functions, and enable them to behave more the way that Windows PowerShell, in general, tends to behave.

— ES

 

Hello ES, Microsoft Scripting Guy Ed Wilson here. One of the weird things about living in Charlotte, North Carolina, is that the winters are generally very mild. Weird, some of my Canadian friends say, weird. After having lived in the Deep South for several years, I do, in fact, miss winter. I imagine I would get tired of shoveling snow, scraping ice, and playing “bumper cars” on the freeway if I had to do it year after year, but as an occasional respite from the cloudy, cold, drizzling rain that is the seasonal norm around here, I think I would like to take my chances. In fact, the time that the Scripting Wife and I spent in Quebec, Canada, a few years ago, when I was teaching Windows PowerShell workshops to Microsoft Premier Customers, was our favorite winter. The following image is a picture I took of Quebec from an ice-breaking ferryboat as it crossed the St. Lawrence River.

Of course, one’s perspective on winter depends a great deal upon where one is used to residing. My friends in Sydney, Australia are telling me on Facebook that it is 81 degrees Fahrenheit today (27 degrees Celsius).

One of the most useful things I ever did, Windows PowerShell-wise, was my conversion module. I use it almost every day to convert temperatures from Fahrenheit to Celsius and vice versa. Of course, it also converts distance and volume, in addition to other things. I developed it during the first six Weekend Scripter articles. If you only want to see the final product, check out version six of the module.

ES, there are several ways that you can provide input values to a function. In today’s post, I will examine four ways. Two of the methods do not use the pipeline, and two of the methods do use the pipeline. As a quick review, using the pipeline in Windows PowerShell is a technique where the output of one command is streamed directly into the input of another command. A nice example of this would be to read a text file that contains a listing of directories (as seen in the following figure) and pipeline the results of reading the text file to the Get-ChildItem Windows PowerShell cmdlet (Get-ChildItem has the alias of dir).

The command to perform this pipeline operation appears here.

PS C:\> Get-Content C:\fso\Folders.txt | Get-ChildItem
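To try the same pattern without a C:\fso folder, the following minimal sketch builds its own throwaway folders first. The PipelineDemo folder, the FolderA and FolderB names, and the variable names are all assumptions made for illustration; only Get-Content piped to Get-ChildItem comes from the command above.

```powershell
# Build a throwaway root folder with two subfolders, each holding one file.
# All folder and file names here are illustrative, not from the article.
$root = Join-Path ([System.IO.Path]::GetTempPath()) 'PipelineDemo'
$dirs = 'FolderA', 'FolderB' | ForEach-Object { Join-Path $root $_ }
foreach ($dir in $dirs)
{
 New-Item -ItemType Directory -Path $dir -Force | Out-Null
 $fileName = (Split-Path $dir -Leaf) + '.txt'
 Set-Content -Path (Join-Path $dir $fileName) -Value 'demo'
}

# Write the folder paths to a text file, one path per line.
$listFile = Join-Path $root 'Folders.txt'
Set-Content -Path $listFile -Value $dirs

# Each line read by Get-Content travels down the pipeline to Get-ChildItem.
$items = Get-Content $listFile | Get-ChildItem
$names = $items | ForEach-Object { $_.Name }
$names

# Clean up the throwaway folder.
Remove-Item -Path $root -Recurse -Force
```

Each line of the text file becomes one pipeline object, so Get-ChildItem lists the contents of each folder in turn.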

 

If I want to configure my function to accept pipelined input, I can use a Process block, which runs once for each object that arrives from the pipeline. This has the advantage of simplicity, and it makes the script really easy to read. This is seen here.

Function add-oneD

{

 Process { $_ + 1 }

} #end function add-oneD

 

The add-oneD function does not define any input parameters. Instead, it uses the $_ automatic variable to accept the current value that is on the pipeline. To call this function, I can pipeline a group of numbers directly to the add-oneD function. This is seen here.

1..5000000 | add-oneD
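To see what the Process block is doing, it can help to put it next to the optional Begin and End blocks and pipe only a few numbers. The following is a minimal sketch; the add-oneWithCount name and the $count variable are made up for this illustration and are not part of the functions in this post.

```powershell
# Hypothetical variation on add-oneD that also counts the items it receives.
Function add-oneWithCount
{
 Begin   { $count = 0 }                 # runs once, before the first item arrives
 Process { $count++; $_ + 1 }           # runs once for each pipeline item
 End     { "Processed $count items" }   # runs once, after the last item
}

$result = 1..3 | add-oneWithCount
$result
# Output:
# 2
# 3
# 4
# Processed 3 items
```

A function such as add-oneD that contains only a Process block simply skips the Begin and End steps.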

 

To check the performance of the function, I use the Measure-Command Windows PowerShell cmdlet. This is seen here, along with the associated output.

Measure-command -Expression { 1..5000000 | add-oneD }

Days              : 0

Hours             : 0

Minutes           : 0

Seconds           : 35

Milliseconds      : 17

Ticks             : 350172660

TotalDays         : 0.000405292430555556

TotalHours        : 0.00972701833333333

TotalMinutes      : 0.5836211

TotalSeconds      : 35.017266

TotalMilliseconds : 35017.266

 

It took around 35 seconds on my computer to add one to each of five million numbers, which is not too bad. But perhaps I can make it perform a little better. The second way to configure the function to accept direct pipeline input is to use the $input automatic variable. When anything is piped to a function that has no Process block, the piped objects are collected in the $input variable, and the body of the function runs after all of them have arrived. As seen here, the add-oneC function uses the foreach statement to iterate through $input. The add-oneC function appears here.

Function add-oneC

{

 foreach($a in $input)

  { $a + 1 }

} #end function add-oneC

 

To call the add-oneC function, I use the same kind of command that I used when calling the add-oneD function. I pipeline the numbers directly to the function. This command appears here.

1..5000000 | add-oneC
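The two pipeline functions receive their input at different moments, and a small experiment makes this visible. The sketch below is only an illustration: the Send-Numbers function and the shared $log list are assumptions added for the demonstration, and the add-oneD and add-oneC functions are modified here to record what they receive.

```powershell
# Shared log so the order of events is visible (illustrative only).
$log = New-Object System.Collections.ArrayList

# Hypothetical upstream function that records each number as it sends it.
Function Send-Numbers
{
 foreach ($n in 1..2) { [void]$log.Add("sent $n"); $n }
}

# add-oneD with logging added: the Process block sees items one at a time.
Function add-oneD
{
 Process { [void]$log.Add("D got $_"); $_ + 1 }
}

# add-oneC with logging added: $input is read after all items have arrived.
Function add-oneC
{
 foreach ($a in $input) { [void]$log.Add("C got $a"); $a + 1 }
}

Send-Numbers | add-oneD | Out-Null
$streamed = $log -join ', '
$log.Clear()

Send-Numbers | add-oneC | Out-Null
$collected = $log -join ', '

$streamed    # sent 1, D got 1, sent 2, D got 2
$collected   # sent 1, sent 2, C got 1, C got 2
```

The Process block handles each number as soon as it is sent, whereas the $input version starts work only after the upstream command has finished. With a large input, the $input approach therefore holds the whole stream before it produces any output.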

 

To measure the performance of the new function, I use the Measure-Command Windows PowerShell cmdlet as seen here.

Measure-command -Expression { 1..5000000 | add-oneC }

 

The performance increase is remarkable. As seen here, the new function takes only a little more than fifteen seconds to process the five million numbers, which is less than half the time that add-oneD needed.

Days              : 0

Hours             : 0

Minutes           : 0

Seconds           : 15

Milliseconds      : 455

Ticks             : 154554599

TotalDays         : 0.000178882637731481

TotalHours        : 0.00429318330555556

TotalMinutes      : 0.257590998333333

TotalSeconds      : 15.4554599

TotalMilliseconds : 15455.4599

 

In a quest for performance, suppose I create the array of numbers, and pass the array directly to the function. Then inside the function I pipeline the items. Will this be faster? Here is the add-oneB function.

Function add-oneB

{

 Param ($a)

 $a | foreach-object { $_ + 1 }

} #end function add-oneB

 

To call this function, I first have to create an array with five million numbers in it, and then pass that array to the function. Here is the code that does that.

$a = 1..5000000

add-oneB -a $a

 

When I call Measure-Command to check the performance, the results are disappointing. Here is the command, and the results.

$a = 1..5000000

Measure-command -Expression { add-oneB -a $a }

Days              : 0

Hours             : 0

Minutes           : 5

Seconds           : 20

Milliseconds      : 372

Ticks             : 3203725668

TotalDays         : 0.00370801581944444

TotalHours        : 0.0889923796666667

TotalMinutes      : 5.33954278

TotalSeconds      : 320.3725668

TotalMilliseconds : 320372.5668

 

Three hundred and twenty seconds. Dude (or Dudette), that is more than five minutes, and roughly twenty times slower than add-oneC. Clearly this is not a very good approach to the problem.

I decided to try one more approach: creating an array and passing it to the function, but this time iterating over it with the foreach statement instead of the pipeline. Here is the add-oneA function.

Function add-oneA

{

 Param ($a)

 foreach($b in $a)

  { $b + 1 }

} #end function add-oneA

 

Again, I have to create the array with five million numbers in it, and pass it to the function. This is seen here.

$a = 1..5000000

 

add-oneA -a $a

 

To check the performance of the new function, I again call on the Measure-Command Windows PowerShell cmdlet as seen here.

$a = 1..5000000

 

Measure-command -Expression { add-oneA -a $a }

 

The results this time are impressive. As seen here, the function processed five million numbers in a little less than eight seconds.

Days              : 0

Hours             : 0

Minutes           : 0

Seconds           : 7

Milliseconds      : 964

Ticks             : 79640050

TotalDays         : 9.21759837962963E-05

TotalHours        : 0.00221222361111111

TotalMinutes      : 0.132733416666667

TotalSeconds      : 7.964005

TotalMilliseconds : 7964.005

 

ES, your design decision, therefore, boils down to how you anticipate people using your function. If you think they will pipeline information directly to the function, you may want to consider using the $input variable, as seen in the add-oneC function. If storing the data in an array and then passing it to the function will work for you, using the foreach statement inside the function (as opposed to pipelining the array) would seem to be the better approach. However, in all things you should test, because your performance may vary, as may the performance with different types of data. You should also be very careful when making decisions based on mere milliseconds of difference, because the Measure-Command Windows PowerShell cmdlet really is not that precise. That is why I decided to use 5,000,000 numbers: the commands take significant time, which makes the differences more appreciable.
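One way to act on the advice to test is to time each approach several times and compare averages instead of trusting a single run. The following is a rough sketch under stated assumptions: the Get-AverageSeconds helper, its parameter names, the three-run default, and the 100,000-number array are all choices made for this illustration, and the timings will differ from machine to machine.

```powershell
# The two fastest functions from this post, repeated so the block runs on its own.
Function add-oneA
{
 Param ($a)
 foreach ($b in $a) { $b + 1 }
}

Function add-oneC
{
 foreach ($a in $input) { $a + 1 }
}

# Hypothetical helper: run a script block several times and
# return the average elapsed time in seconds.
Function Get-AverageSeconds
{
 Param ([scriptblock]$ScriptBlock, [int]$Runs = 3)
 $total = 0.0
 foreach ($i in 1..$Runs)
 {
  $total += (Measure-Command -Expression $ScriptBlock).TotalSeconds
 }
 $total / $Runs
}

# A smaller array than in the post, to keep the test quick.
$numbers = 1..100000
$secondsA = Get-AverageSeconds -ScriptBlock { add-oneA -a $numbers }
$secondsC = Get-AverageSeconds -ScriptBlock { $numbers | add-oneC }
"add-oneA: {0:N3} s   add-oneC: {1:N3} s" -f $secondsA, $secondsC
```

Averaging over a few runs smooths out some of the noise, but small differences between the two numbers should still be treated with caution.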

For more information about performance testing of changes to Windows PowerShell scripts, refer to the “How can I test the efficacy of my script modifications” Hey, Scripting Guy! blog post.

ES, that is all there is to using the Windows PowerShell pipeline. This concludes script design week. Join me tomorrow as I talk about how to work with music files.

I invite you to follow me on Twitter or Facebook. If you have any questions, send email to me at scripter@microsoft.com or post them on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

 

Ed Wilson, Microsoft Scripting Guy 
