Back to Basics Part 4: More Ways to Manipulate Data

Doctor Scripto

Summary: Microsoft PFE, Gary Siepser, talks more about the basics of the Windows PowerShell pipeline.

Microsoft Scripting Guy, Ed Wilson, is here. Today Gary Siepser delivers Part 4 of his five-part series.


In Part 3, we introduced several cool things you can do with your data. In this post, we will continue that investigation by looking at even cooler techniques. We’ll focus on:

  • Grouping your data
  • Using your data in the pipeline while you use it another way too
  • Advanced use of Where-Object filtering that we discussed in Part 2

Grouping data

Sometimes when you are working with typical IT admin-type information, it’s useful to get counts of various things. Often there are common values on objects. For example, when you are looking at a list of users, they all might have an office location. It is a great thing to be able to get a quick count of users per office.

The cmdlet Group-Object (the alias is Group) does a wonderful job of lumping together data based on common values. In our users-per-office example from Active Directory, you can easily pipe the user objects into Group-Object and tell it to group by the Office property. You will be pleasantly surprised at how quickly you get exactly what you are looking for.

Using the Group-Object cmdlet is actually pretty easy. Identifying the properties that you want to group by is a little more challenging. If you look back at Part 2 in this series, you will see how you can use the Get-Member cmdlet to see the available properties for your objects. That is only half the battle though, because it only makes sense to group objects where the data values have commonality. This is logical, and it would be true even if you were working in Excel to do the same thing. When you know what property you want to group by, you can use it as a parameter for Group-Object.

Group-Object has a number of interesting parameters in addition to the property you choose to group by. If you are truly interested in only the group names and counts (and not the individual object data, which is the case for me most of the time), there is a –NoElement parameter that will give you only those unique group names and counts. See the Help file for more information about extra things Group-Object can do (Get-Help Group-Object –ShowWindow).
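As a minimal sketch of the users-per-office idea, the following uses a few made-up user objects in place of real Active Directory results (the names, offices, and variable names are assumptions for illustration only):

```powershell
# Hypothetical sample data standing in for Active Directory user objects.
$users = @(
    [pscustomobject]@{ Name = 'Alice'; Office = 'Seattle' }
    [pscustomobject]@{ Name = 'Bob';   Office = 'Atlanta' }
    [pscustomobject]@{ Name = 'Carol'; Office = 'Seattle' }
)

# Group by the Office property. Each group carries a Count, a Name,
# and the member objects themselves in its Group property.
$byOffice = $users | Group-Object -Property Office

# -NoElement leaves out the member objects and keeps only Count and Name.
$counts = $users | Group-Object -Property Office -NoElement
$counts
```

With real data, you would replace the sample array with whatever command returns your user objects, and the rest of the pipeline stays the same.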

Image of command output

The last example combines grouping with sorting, and it adds a Select-Object parameter that we introduced in Part 3 of the series. This parameter allows you to grab a chunk from the top of your output. Recall that Select-Object is about choosing part of your objects, but I did call it the Swiss Army knife of Windows PowerShell. The –First parameter is simply another tool in that pocket knife. This example is cool because it shows you the five most popular verbs used by cmdlets on your system.
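A pipeline along these lines produces that kind of list. This is a sketch rather than a copy of the screenshot, and the variable name $topVerbs is only for illustration:

```powershell
# Count cmdlets per verb, sort the groups by size, and keep the top five.
$topVerbs = Get-Command -CommandType Cmdlet |
    Group-Object -Property Verb -NoElement |
    Sort-Object -Property Count -Descending |
    Select-Object -First 5

$topVerbs
```

The exact verbs and counts depend on which modules are installed on your system.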

Do two things with your objects at the same time

There is a cmdlet called Tee-Object (the alias is Tee). It allows you to keep your stuff flowing down the pipe, but also store it in a variable for later or even in a text file for long-term keeping. I have found this to be a really cool trick for scripts (I know we are focused on the pipeline, but it’s a cool trick).

There are times when I have a nice pipeline that does some cool stuff, but it might take a while. This happens a lot when I use cmdlets like Get-CIMInstance or Get-WMIObject to get information from a lot of distributed systems. Often I want to output that data to a text file, or even to a variable to use later in that script.

Because the pipeline takes a long time to finish, I see nothing on the screen the entire time that I am waiting. When you see nothing for too long, you start to wonder if it’s even working. Tee-Object can make the user experience a lot better because you can see the objects being output steadily as the pipeline does its work, and still get the results in a variable or text file after.

Tee-Object simply splits the objects coming down the pipe in two directions (that is why it’s called Tee). One direction is always to continue flowing down the pipeline, and the other is a choice between a text file and a variable. It’s very simple: use the –FilePath parameter for a file and the –Variable parameter for a variable.

When you use the –FilePath parameter, there is another parameter called –Append that you can use to continue adding data to a file that already exists. Again, this is a particularly great trick for pipelines that take a long time.
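Here is a minimal sketch of all three uses. The server names and the file name tee-demo.txt are made up for illustration, and the short list stands in for a pipeline that would really take a long time:

```powershell
# Hypothetical stand-in for the output of a slow pipeline.
$servers = 'web01', 'web02', 'db01'

# Hypothetical output file in the temp folder; removed first so the demo starts clean.
$logPath = Join-Path ([System.IO.Path]::GetTempPath()) 'tee-demo.txt'
Remove-Item -Path $logPath -ErrorAction SilentlyContinue

# The names keep flowing to ForEach-Object, and a copy is saved in $seen.
# Note that -Variable takes the variable name without the dollar sign.
$servers | Tee-Object -Variable seen | ForEach-Object { "Checked $_" }

# Same idea with a file: the names show on screen and are written to $logPath.
$servers | Tee-Object -FilePath $logPath

# A later run adds to the existing file instead of overwriting it.
$servers | Tee-Object -FilePath $logPath -Append
```

After this runs, $seen holds the three names and the file holds six lines, three from each of the two file runs.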

Image of command output

Both of these examples show that we were able to see the results when running the pipeline, and then also in the text file and the variable. Look at these and imagine how this would be nice for a long pipeline. You get to see something the entire time it’s running!

Revisiting the Where-Object cmdlet

This time around, we’ll focus on the more traditional and advanced syntax of Where-Object. The advanced syntax is useful when you want more than a simple filter. You’ll remember from Part 3 of this series that the simplified syntax worked by simply picking the property you wanted to filter by, choosing a filter comparison type, and choosing the value you want to compare to.

The more advanced syntax requires a little bit more in the way of special characters. First of all, we are going to use a parameter of Where-Object called –FilterScript. The trick is that this parameter is positional and sits in the first position, meaning that if you leave off the parameter name and simply give Where-Object the value, it knows you meant to use –FilterScript.

This is the way I normally see it used. Actually, I have found that in many cases, folks don’t even realize they are using a regular old parameter when they use Where-Object. They learned how the syntax works and “went with it”—without even knowing it is a parameter or the parameter’s name.

The value we want to use with the traditional syntax of Where-Object is called a script block in Windows PowerShell. Don’t be confused by the name. It’s just a way to give Windows PowerShell some script and have that script be the argument. What is important to realize is that the code you have in the script block is going to be surrounded by {curly braces}.

Inside this script block, we need to tell Where-Object what criteria we are going to use to filter this object. Let’s take a look at a couple of comparisons of the simplified syntax vs. the equivalent filter using the traditional advanced syntax.

Image of script

Both examples are functionally the same.
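To make the comparison concrete, here is a minimal sketch that uses a few made-up service-like objects (the names and values are assumptions for illustration, not real Get-Service output):

```powershell
# Hypothetical sample objects with a Name and a Status property.
$services = @(
    [pscustomobject]@{ Name = 'Spooler'; Status = 'Running' }
    [pscustomobject]@{ Name = 'BITS';    Status = 'Stopped' }
    [pscustomobject]@{ Name = 'WinRM';   Status = 'Running' }
)

# Simplified syntax (Windows PowerShell 3.0 and later): property, operator, value.
$simple = $services | Where-Object Status -eq Running

# Traditional syntax with the parameter name spelled out.
$advanced = $services | Where-Object -FilterScript { $_.Status -eq 'Running' }

# Traditional syntax as it is usually typed, relying on the first position.
$positional = $services | Where-Object { $_.Status -eq 'Running' }
```

All three variables end up holding the same two objects, Spooler and WinRM.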

Not only do you see the script block surrounded by curly braces, but you’ll notice a few other differences. The $_ automatic variable is in use. This is a pretty simple concept. It is a placeholder that represents the objects being piped to Where-Object.

Where-Object examines each object to see if it needs to be filtered. The special variable allows you to refer to each of those objects, one at a time, in a single script block. You can also use $PSItem interchangeably with $_. Either one works the same. $PSItem was introduced in Windows PowerShell 3.0 to give a more readable name and reduce the use of the underscore special character, if desired. I’m so used to the simplicity of the short name that I stick with $_.
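A quick sketch with plain numbers shows that the two names behave the same (the variable names are only for illustration):

```powershell
# $_ refers to the current number as each one comes down the pipe.
$withUnderscore = 1..10 | Where-Object { $_ -gt 7 }

# $PSItem (Windows PowerShell 3.0 and later) is the same variable under a longer name.
$withPSItem = 1..10 | Where-Object { $PSItem -gt 7 }

$withUnderscore   # 8, 9, 10
$withPSItem       # 8, 9, 10
```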

You’ll also notice that we need to use a more traditional way to access an object’s property. That is using a single dot. The single dot gives access to any of an object’s members, that is, its properties and methods. Remember, as we learned in Part 3, you can use the Get-Member cmdlet to explore what is available for an object.

The only other difference is that when using a text value (a string) with the traditional syntax, you need to place the string in quotes.

You can see this is definitely not as pleasing to the eye, nor is it as easy for beginners who are new to Windows PowerShell. Luckily, with the simplified syntax introduced in Windows PowerShell 3.0, you can often avoid it. The traditional syntax only needs to be used when you want to do more than a simple test.

One common reason we need the traditional syntax is to combine multiple conditions at the same time. There are operators in Windows PowerShell for this purpose. The great news is they work just like you use them in spoken language. We have –and, –or, and –xor.

The –and is pretty simple: both sides need to be True. The –or means either side or both sides need to be True. Although lesser known, –xor is also really easy. It means one side or the other needs to be True, but not both at the same time.

Take a look at these examples to see how these operators can make your filters richer. In both examples, you will see I used sets of parentheses to ensure Windows PowerShell compares things in the right order. This is a good habit, though these particular examples wouldn’t have run any differently if I had left them off. Parentheses allow you to control the order of operations if you need it to differ from the default. It’s just like the algebra you learned in school as a kid.
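As a rough sketch of combined conditions, the following filters a few made-up service-like objects (the names, statuses, and start types are assumptions for illustration):

```powershell
# Hypothetical sample objects with two properties to test.
$services = @(
    [pscustomobject]@{ Name = 'Spooler'; Status = 'Running'; StartType = 'Automatic' }
    [pscustomobject]@{ Name = 'BITS';    Status = 'Stopped'; StartType = 'Manual' }
    [pscustomobject]@{ Name = 'WinRM';   Status = 'Running'; StartType = 'Manual' }
    [pscustomobject]@{ Name = 'W32Time'; Status = 'Stopped'; StartType = 'Automatic' }
)

# -and: both tests must be True (Spooler).
$both = $services | Where-Object { ($_.Status -eq 'Running') -and ($_.StartType -eq 'Automatic') }

# -or: either test, or both, must be True (Spooler, WinRM, W32Time).
$either = $services | Where-Object { ($_.Status -eq 'Running') -or ($_.StartType -eq 'Automatic') }

# -xor: exactly one of the tests must be True (WinRM, W32Time).
$onlyOne = $services | Where-Object { ($_.Status -eq 'Running') -xor ($_.StartType -eq 'Automatic') }
```

The parentheses here are for readability. The comparisons would be evaluated before the logical operators even without them.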

Image of script

In both of these examples, we have multiple tests going on, thus the traditional Where-Object syntax is required. If you are new to Windows PowerShell, and you think this looks like a lot of confusing syntax, take it slowly. You’ll get it all figured out in time. However, feel some pity for those who have been using Windows PowerShell for a while. Before version 3.0, this was the only way to filter with the Where-Object cmdlet.

There are all sorts of neat things you can do inside those curly braces. The thing to remember is that Where-Object is looking for a True or False result. You can literally put anything in those filters—they simply need to come out as True for the objects you want to keep flowing down the pipeline. Check out the Help file for more details (Get-Help Where-Object –ShowWindow).
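A few small sketches of that idea follow. The sample strings and variable names are made up for illustration:

```powershell
$fruit = 'apple', 'banana', 'blueberry', 'cherry'

# Method calls and property tests can be mixed in one filter (blueberry).
$longB = $fruit | Where-Object { $_.StartsWith('b') -and $_.Length -gt 6 }

# Regular expression matching returns True or False as well (banana).
$hasAn = $fruit | Where-Object { $_ -match 'an' }

# A bare value is converted to True or False: empty strings and $null
# count as False, so only 'a' and 'b' pass this filter.
$nonEmpty = 'a', '', $null, 'b' | Where-Object { $_ }
```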

As you can see, there are so many cool things you can do with your data when you have it in the pipeline. Parts 3 and 4 in this series have covered the basics and a few cool things, but there are many other great cmdlets that you can use to manipulate your objects. First, check out all the cmdlets that use the noun Object. Beyond them, look at the scripts you see around and check out what others are doing in their pipelines.

In Part 5, the end of this series, we’ll learn about a few of the ways to output your objects at the end of your pipeline.

~Gary

Thank you, Gary! I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy 
