Pipelined Expressions

Doctor Scripto

Summary: Windows PowerShell MVP Jeff Wouters shares an excerpt from his chapter in the book PowerShell Deep Dives.

Microsoft Scripting Guy, Ed Wilson, is here. This week we will not have our usual PowerTip. Instead we have excerpts from seven books from Manning Press. In addition, each blog will have a special code for 50% off the book being excerpted that day. Remember that the code is valid only for the day the excerpt is posted. The coupon code is also valid for a second book from the Manning collection.

Today, the excerpt is from PowerShell Deep Dives, edited by Jeffery Hicks, Richard Siddaway, Oisin Grehan, and Aleksandar Nikolic.


Whether you just started using Windows PowerShell, or you are at a more advanced level, there are two things you should always look at while writing a script: performance and execution time. With the introduction of Windows PowerShell 3.0, there are a lot of new modules and cmdlets available to you. What a lot of people don’t realize is that Microsoft also improves and expands already existing modules and their cmdlets. This is especially the case in Windows PowerShell 3.0. In this blog, based on Chapter 10 in PowerShell Deep Dives, author Jeff Wouters discusses one of the most powerful features of Windows PowerShell—the ability to utilize the pipeline.

And now, here’s Jeff…

Finding objects, filtering them down to the ones you want, and performing an action on them can be done very easily by using pipelined expressions, which I refer to as "the pipeline." Every step is one pipe in the pipeline. In general, the fewer pipes you use, the shorter the execution time and the fewer resources you use. I'll illustrate this later.

My problem is that I tend to put everything in a one-liner. Although one-liners are fairly easy to write and understand, there are best practices to follow if you want the best performance. If you don't follow them, your script may still work, but it will run slower and use more resources than it needs to. Especially in Windows PowerShell 3.0, which introduces lots of new modules, cmdlets, parameters, methods, and member enumeration, it becomes more important to use parameters and the pipeline in the most efficient way.

When writing scripts, I always keep my goal in mind: to complete the task at hand in the most efficient way. By combining parameters, you can avoid long commands where objects are piped from one cmdlet to another. This results in better performance and shorter execution times for your scripts, and often in less code as well.

Requirements

To use pipelined expressions, you need the ability to execute Windows PowerShell code. There are a few ways to accomplish this:

  • At a Windows PowerShell prompt
  • Through a scripting editor that supports Windows PowerShell and allows for code execution to test your code, including the ability to view the output of your code
  • By executing Windows PowerShell scripts manually

To measure the execution time for each command, I’ve used the Measure-Command cmdlet, like so:

PS D:\> Measure-Command {Get-WmiObject -Class win32_bios -Property manufacturer | Where-Object {$_.Manufacturer -eq "Hewlett-Packard"}}

 

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 131
Ticks             : 1315776
TotalDays         : 1,52288888888889E-06
TotalHours        : 3,65493333333333E-05
TotalMinutes      : 0,00219296
TotalSeconds      : 0,1315776
TotalMilliseconds : 131,5776

Throughout this blog, I will give you the execution times of the commands in my environment to show you the benefits of doing things another way.
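If you want to compare two approaches yourself, a quick pattern is to capture both measurements and compare the TotalMilliseconds values. This is a small illustrative sketch rather than a command from the chapter:

# Measure a pipeline-heavy version and a parameter-based version of the same task
$withPipe  = Measure-Command { Get-Process | Where-Object { $_.Name -eq "notepad" } }
$withParam = Measure-Command { Get-Process -Name notepad -ErrorAction SilentlyContinue }

# Show both results in milliseconds
"Where-Object   : {0:N1} ms" -f $withPipe.TotalMilliseconds
"-Name parameter: {0:N1} ms" -f $withParam.TotalMilliseconds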

Pipeline—Rules of Engagement

When I began using PowerShell, I was introduced to the pipeline immediately. I saw how easy it was and I started to pipe everything together, never looking at the execution times or performance impact. Over the last few years, I’ve learned some basic best practices that enabled me to end up with a fraction of the execution time compared to my previous scripts.

Here is an example of what you can accomplish with this:

I wrote a script to provision 1500 user objects in Active Directory by using a CSV file with more than 25 properties defined per user and to make them members of the appropriate groups. This script used to take about 12 minutes to execute, and now it takes somewhere between 55 and 60 seconds. Of course, this depends on the Active Directory server, but you get the idea.
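A minimal sketch of that kind of provisioning (not the actual script from the chapter) might look like the following. It assumes the ActiveDirectory module is available and that the CSV has hypothetical Name, SamAccountName, Path, and Group columns:

Import-Module ActiveDirectory

# Each CSV row becomes one user object; group membership is set in the same pass
Import-Csv -Path .\users.csv | ForEach-Object {
    New-ADUser -Name $_.Name -SamAccountName $_.SamAccountName -Path $_.Path
    Add-ADGroupMember -Identity $_.Group -Members $_.SamAccountName
}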

I’ll cover these best practices one by one and elaborate on them.

What is the pipeline?

Before going into the pipeline rules I’ve mentioned, it can be useful to take a look at the pipeline itself. What is the pipeline? A pipeline uses a technique called piping. In simple terms, it is the ability to pass objects from one command to the next. One way of doing this is as follows (in order): get all processes, filter based on the name of the process, and then stop the process.

Get-Process | Where-Object {$_.Name -eq "notepad"} | Stop-Process

Execution time: 61 milliseconds.

What happens here? First, all objects (in this case processes) are retrieved by the Get-Process cmdlet. Those objects are piped to the Where-Object cmdlet, where they are filtered based on their name. Only the processes named "notepad" are piped to Stop-Process, which in turn actually stops them.

Filtering objects

Rule: Filter as early as possible.

You may encounter situations where your code must handle large numbers of objects. In these cases, you will need to filter that list of objects to get the best performance. In other words, when you put a filter on a list of objects, only the ones that match your filter are passed along.

The Get-Process cmdlet has a Name parameter that you can use. This allows you to filter based on the name, but without having to use the Where-Object cmdlet:

Get-Process -Name notepad | Stop-Process

Execution time: 61 milliseconds.

So, the processes are retrieved and filtered by the Get-Process cmdlet itself, and only then are they piped to the Stop-Process cmdlet. Doing it this way means that the number of objects (processes) passed through the pipe is significantly smaller than in the first example, and the pipeline is reduced to a single pipe. This allows for shorter execution times and less resource utilization.
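As a small aside of my own, in a simple case like this you can even drop the pipeline entirely, because Stop-Process also has a Name parameter:

Stop-Process -Name notepad -ErrorAction SilentlyContinue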

So I’ve shown you how you can filter on object properties already, but let’s take a deeper look at this.

Where-Object

There are two ways to filter down a list of objects to end up with the ones you need. The first way is to use the Where-Object cmdlet in the pipeline. Let’s take an example where you would need to get all files with the .docx or .txt extensions and with “PowerShell” in their names:

PS D:\> Get-ChildItem -Recurse | Where-Object {(($_.Extension -eq ".docx") -or ($_.Extension -eq ".txt")) -and ($_.Name -like "*PowerShell*")}

 

    Directory: D:\

Mode           LastWriteTime   Length  Name
----           -------------   ------  ----
-a---     12-9-2012    10:36   510229  PowerShell ft Hyper-V.docx
-a---     12-9-2012    10:36   8233    PowerShell ft Hyper-V Notes.txt
-a---     2-9-2012     16:24   433672  PowerShell Deep Dives.docx
-a---     2-9-2012     16:24   1285    PowerShell Deep Dives Notes.txt
-a---     21-6-2012    00:52   306913  Practical PowerShell.docx
-a---     21-6-2012    00:52   9835    Practical PowerShell Notes.txt

Execution time: 162 milliseconds.

As you can see, this is done by using a pipelined expression. However, in this case there is a more efficient way to accomplish this: by using the parameters attached to the Get-ChildItem cmdlet. When you take a look at the parameters offered by this cmdlet, you’ll find the Include and Filter parameters. So let’s use those instead of the pipeline:

PS D:\> Get-ChildItem -Recurse -Include *.docx, *.txt -Filter *PowerShell*

 

    Directory: D:\

Mode           LastWriteTime   Length  Name
----           -------------   ------  ----
-a---     12-9-2012    10:36   510229  PowerShell ft Hyper-V.docx
-a---     12-9-2012    10:36   8233    PowerShell ft Hyper-V Notes.txt
-a---     2-9-2012     16:24   433672  PowerShell ft Windows.docx
-a---     2-9-2012     16:24   1285    PowerShell ft Windows Notes.txt
-a---     21-6-2012    00:52   306913  Practical PowerShell.docx
-a---     21-6-2012    00:52   9835    Practical PowerShell Notes.txt

Execution time: 82 milliseconds.

As you can see, it is possible to get the same output without using the pipeline.

In Windows PowerShell 3.0, the Get-ChildItem cmdlet also comes with File and Directory parameters, which allow you to filter for only files or directories. So, if you’re only looking for files, using the File parameter would decrease the execution time of the command because directories are skipped entirely.
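For example, a file-only variation on the command above (my illustration, not from the chapter) would be:

Get-ChildItem -Recurse -File -Include *.docx, *.txt -Filter *PowerShell*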

This is why I always find it useful to know what parameters are offered, and if I don’t know, the Get-Help cmdlet saves the day.
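For example, either of these commands shows you what Get-ChildItem has to offer:

Get-Help Get-ChildItem -Full
Get-Help Get-ChildItem -Parameter *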

Parameters vs. Where-Object

Sometimes cmdlets have parameters that can filter the objects, and therefore, completely avoid the pipeline. The following is how you could filter a list of objects based on a condition—in this case, the value of the Manufacturer property:

PS D:\> Get-WmiObject -Class win32_bios -Property manufacturer | Where-Object {$_.Manufacturer -eq "Hewlett-Packard"}

 

__GENUS          : 2
__CLASS          : Win32_BIOS
__SUPERCLASS     :
__DYNASTY        :
__RELPATH        :
__PROPERTY_COUNT : 1
__DERIVATION     : {}
__SERVER         :
__NAMESPACE      :
__PATH           :
Manufacturer     : Hewlett-Packard
PSComputerName   :

 

Execution time: 82 milliseconds.

There is, however, a more efficient way of doing this. The Get-WmiObject cmdlet offers a Query parameter. You can use this parameter to search for the object and show it, based on a condition set on the value of the Manufacturer property:

PS D:\> Get-WMIObject -Query "SELECT Manufacturer FROM Win32_BIOS WHERE Manufacturer='Hewlett-Packard'"

 

__GENUS          : 2
__CLASS          : Win32_BIOS
__SUPERCLASS     :
__DYNASTY        :
__RELPATH        :
__PROPERTY_COUNT : 1
__DERIVATION     : {}
__SERVER         :
__NAMESPACE      :
__PATH           :
Manufacturer     : Hewlett-Packard
PSComputerName   :

Execution time: 27 milliseconds.

Filtering this way is faster and uses fewer resources. More importantly, it uses the Windows PowerShell System Provider for WMI.
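Along the same lines (this variation is mine, not from the chapter), Get-WmiObject also has a Filter parameter that takes just the WHERE portion of such a query, so the filtering still happens on the WMI side rather than in the pipeline:

Get-WmiObject -Class Win32_BIOS -Property Manufacturer -Filter "Manufacturer='Hewlett-Packard'"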

Properties

When you're done filtering the objects, they still carry all of their properties. That is a lot of information you may not even need, and it consumes resources and can slow your script and system down. So how can you clean this up?

This is where the Select-Object cmdlet and the pipeline come into play: 

PS D:\> Get-ChildItem -Recurse -Include *.docx, *.txt -Filter *PowerShell* | Select-Object LastWriteTime, Name

 

    Directory: D:\

     LastWriteTime  Name
     -------------  ----
12-9-2012    10:36  PowerShell ft Hyper-V.docx
12-9-2012    10:36  PowerShell ft Hyper-V Notes.txt
2-9-2012     16:24  PowerShell ft Windows.docx
2-9-2012     16:24  PowerShell ft Windows Notes.txt
21-6-2012    00:52  Practical PowerShell.docx
21-6-2012    00:52  Practical PowerShell Notes.txt

There isn't a parameter-based alternative here for trimming the output down to only the properties you want; Select-Object is the way to go.

Piping is one of the best and most powerful features in Windows PowerShell. This blog showed you how to utilize commands, parameters, and the pipeline in the most efficient way.

Here is the code for the discount offer today at www.manning.com: scriptw2
Valid for 50% off PowerShell Deep Dives and SQL Server MVP Deep Dives

Offer valid from April 2, 2013 12:01 AM until April 3 midnight (EST)

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy 
