September 30th, 2016

PowerShell regex crash course – Part 1 of 5

Doctor Scripto
Scripter

Summary: Thomas Rayner, Microsoft Cloud and Datacenter Management MVP, shows the basics of working with regular expressions in PowerShell.

Hello! I’m Thomas Rayner, a proud Cloud and Datacenter Management Microsoft MVP, filling in for The Scripting Guy this week. You can find me on Twitter (@MrThomasRayner) or posting on my blog, workingsysadmin.com. This week, I’m presenting a five-part crash course about how to use regular expressions in PowerShell. Regular expressions are sequences of characters that define a search pattern, mainly for use in pattern matching with strings. They are extremely useful for extracting information from text such as log files or documents. This isn’t meant to be a comprehensive series but rather, just as the name says, a crash course. So buckle up!

Many people are intimidated by regular expressions, or “regex”. If you see something like ‘(\d{1,3}\.){3}(\d{1,3})’ and your eyes start glazing over, don’t worry. By the end of this series, you’ll have the skills to recognize that this pattern matches strings shaped like IPv4 addresses. To the uninitiated, big strings of seemingly random characters appear indecipherable, but regex is an incredibly powerful tool that any PowerShell pro needs to have a grip on.
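As a quick preview, here is a minimal sketch of that pattern in action. The sample strings and variable names are made up for illustration. Note that the pattern only checks the shape of the text (four groups of one to three digits separated by periods), so it does not verify that each number is in the 0 to 255 range.

```powershell
# The pattern quoted above; the test strings below are hypothetical samples.
$pattern = '(\d{1,3}\.){3}(\d{1,3})'

$looksLikeIp = '192.168.1.10' -match $pattern       # True: four dotted groups of digits
$plainText   = 'hello world' -match $pattern        # False: no digits or dots at all
$outOfRange  = '999.999.999.999' -match $pattern    # True: the shape fits, the range is not checked
```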

Today, I’m explaining some of the basics. Regex is a really big topic, and a proper “Intro to regex” post series could probably be broken into five parts on its own. This, however, isn’t an introduction-to-regex series; it’s a regex crash course.

If you’ve dabbled in PowerShell before, you’ve probably already used rudimentary regex, maybe without even realizing it. Have you ever run something like this?

$ArrayOfStuff = @('somethingone','somethingtwo','noway')

$ArrayOfStuff | Where-Object { $_ -match 'something' }

This simple example will return the first two items in the array because they match the pattern “something”. The third array item is not returned because the pattern “something” isn’t found in the item anywhere. In other words, the first two items of the array contain a specific pattern of characters that we are interested in, but the third one doesn’t. That sounds like regex!
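As an aside, when the left side of -match is a collection, the operator itself acts as a filter and returns the matching items, so the same result can be had without Where-Object. A minimal sketch using the same array (the $filtered and $rejected variable names are just for illustration):

```powershell
$ArrayOfStuff = @('somethingone','somethingtwo','noway')

# With an array on the left, -match returns the elements that match the pattern.
$filtered = $ArrayOfStuff -match 'something'

# -notmatch is the inverse and returns the elements that do not match.
$rejected = $ArrayOfStuff -notmatch 'something'
```

Here $filtered holds the two “something” items and $rejected holds only ‘noway’.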

You can also use the -match operator without putting it inside Where-Object. In this case, you can test whether a given pattern exists within a string.

'somethingone' -match 'something'     #returns True

'somethingone' -match 'pickles'       #returns False
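Two details are worth knowing here, shown in the short sketch below. By default -match ignores case, and the -cmatch variant is case-sensitive. Also, after a successful match against a single string, PowerShell fills the automatic $Matches variable, where index 0 holds the text that matched. The variable names are only illustrative.

```powershell
# -match is case-insensitive by default; -cmatch is the case-sensitive variant.
$ignoresCase  = 'somethingone' -match 'SOMETHING'     # True
$respectsCase = 'somethingone' -cmatch 'SOMETHING'    # False

# After a successful match on a single string, $Matches[0] holds the matched text.
if ('somethingone' -match 'thing') {
    $found = $Matches[0]    # 'thing'
}
```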

Another operator that some people don’t realize works with regex is -replace. The matching principle works the same as the -match operator, but as you might guess by the name, -replace does more than just report if a pattern exists in a string. With it, you can do things like the following example.

'Here is a simple string.' -replace 'simple string','string that got something replaced'

You’ll get “Here is a string that got something replaced.” back from this. What happened is -replace matched the pattern “simple string” and replaced it with “string that got something replaced”. Neat, right?
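A few more behaviors of -replace are sketched below with made-up strings: it replaces every occurrence of the pattern rather than only the first, it removes the matched text if no replacement is given, and it returns the input unchanged when nothing matches.

```powershell
# Every occurrence of the pattern is replaced, not just the first one.
$allReplaced = 'one fish, two fish' -replace 'fish','cat'           # 'one cat, two cat'

# With no replacement string, the matched text is simply removed.
$removed = 'Here is a simple string.' -replace 'simple '            # 'Here is a string.'

# If the pattern is not found, the original string comes back untouched.
$unchanged = 'Here is a simple string.' -replace 'pickles','relish'
```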

What if you had a more complicated task? How about if you have a string that’s a file name, and you want to know if it starts with a specific letter and ends in a specific file extension? For instance, does something.txt start with an s and end with .txt? What about notthis.txt? Well for this, we need to introduce quantifiers.

The three most common regex quantifiers are *, +, and ? (star, plus and question mark for those who are new). Each one applies to the character or group immediately before it, and they all mean different things.

* Means zero or more of something
+ Means one or more of something
? Means zero or one of something
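To make the difference concrete, here is a small sketch with made-up test strings. In each pattern, the quantifier applies only to the single character right before it (the b or the u).

```powershell
# * : zero or more of the preceding character
$starZero  = 'ac'    -match 'ab*c'      # True: zero b characters is fine
$starMany  = 'abbbc' -match 'ab*c'      # True: three b characters is fine too

# + : one or more of the preceding character
$plusZero  = 'ac'    -match 'ab+c'      # False: at least one b is required
$plusMany  = 'abbbc' -match 'ab+c'      # True

# ? : zero or one of the preceding character
$optNone   = 'color'   -match 'colou?r' # True: the u is optional
$optOne    = 'colour'  -match 'colou?r' # True
$optTwo    = 'colouur' -match 'colou?r' # False: two u characters is one too many
```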

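So, if we want to examine the file names from the previous example, we can combine the star character with a few special characters that are covered in more detail in the next post.

'something.txt' -match '^s.*\.txt$'     #returns True

'notthis.txt' -match '^s.*\.txt$'       #returns False

In this example, the regex looks for the pattern “an s at the start of the string, followed by zero or more of any character, followed by .txt at the end of the string”. The ^ and $ anchor the pattern to the start and end of the string, the dot stands for any single character (so .* means zero or more of anything), and \. is a literal period. ‘something.txt’ indeed starts with an s, has zero or more of anything (“omething”) and then ends with .txt, so it matches. ‘notthis.txt’ doesn’t start with an s, so it fails to match the pattern. Note that the shorter pattern ‘s*.txt’ would not work here: the star applies only to the s, nothing ties the pattern to the start of the string, and the unescaped dot matches any character, so ‘notthis.txt’ would match it too.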
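To check the “starts with an s and ends in .txt” rule against several names at once, a sketch like the following could be used. It relies on the anchored pattern '^s.*\.txt$' (start of string, an s, zero or more of any character, a literal period, txt, end of string). The file names and variable names are hypothetical.

```powershell
# Hypothetical file names to test.
$fileNames = @('something.txt','notthis.txt','summary.txt','script.ps1','stxt')

# ^ and $ anchor to the start and end; .* is zero or more of anything; \. is a literal period.
$pattern = '^s.*\.txt$'

# Keep only the names that start with s and end in .txt.
$matching = $fileNames | Where-Object { $_ -match $pattern }
```

With these inputs, $matching contains something.txt and summary.txt. The other names are rejected because they either don’t start with an s, don’t end in .txt, or lack the literal period.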

Cool, right? We’ll dig into the other quantifiers later when we’re getting deeper into regex. Stay tuned for next week’s post where I’ll be explaining special characters!

Thanks, Thomas! We’ll be looking forward to reading up on this next week.

I invite you to follow the Scripting Guys on Twitter and Facebook. If you have any questions, send email to them at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow.

Until then, always remember that with Great PowerShell comes Great Responsibility.

Sean Kearney Honorary Scripting Guy Cloud and Datacenter Management MVP

Author

The "Scripting Guys" is a historical title passed from scripter to scripter. The current revision has morphed into our good friend Doctor Scripto who has been with us since the very beginning.
