{"id":575,"date":"2018-07-05T05:51:37","date_gmt":"2018-07-05T13:51:37","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/koryt\/?p=575"},"modified":"2019-10-09T11:33:57","modified_gmt":"2019-10-09T19:33:57","slug":"regular-expressions-regex-introduction","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/scripting\/regular-expressions-regex-introduction\/","title":{"rendered":"Regular Expressions (REGEX): Introduction"},"content":{"rendered":"<p>Hi all, this week I&#8217;ll be talking about Regular Expressions. I&#8217;ve got a few posts planned to get you set up and going with some basic Regex.<\/p>\n<p>Regex is used for extracting and validating data.\u00a0Essentially, you can think of Regex as Windows wildcards on steroids. Anytime we need to match data with a little more precision than the *s and ?s that Windows gives us, we have Regex.<\/p>\n<p>Regex has a reputation for being difficult and confusing, but it really isn&#8217;t so bad when you get used to it. The biggest contributors to Regex&#8217;s reputation are:<\/p>\n<ol>\n<li>Regex uses its own set of symbols. From PowerShell, we need to generate plain-text Regex strings and send them into the parser, which means escaping special PowerShell symbols so they get passed correctly.<\/li>\n<li>Regex is confusing to read, but easier to write. I like to joke that Regex is a write-only language, because when you see data and describe a pattern in plain English, it&#8217;s not so bad to build the pattern out of symbols. However, when you see a bunch of symbols by themselves, it looks like spaghetti code. Additionally, there are many different ways people might build a pattern for the same data. 
When you use Regex, make sure to leave friendly comments for anyone viewing your code later.<\/li>\n<\/ol>\n<p>With that in mind, let&#8217;s take a look at a sample that shows why you should care, and then in later posts we will break it down and learn more.<\/p>\n<p>Maybe we have some fake data, like this:<\/p>\n<p><a href=\"https:\/\/msdnshared.blob.core.windows.net\/media\/2018\/06\/MOCK_DATA.txt\">MOCK_DATA<\/a><\/p>\n<p>We&#8217;ll work with just numbers in this case and try to extract those phone numbers. In plain English, we can look at the data and say all the phone numbers break down like this:<\/p>\n<ol>\n<li>3 digits<\/li>\n<li>dash character<\/li>\n<li>3 more digits<\/li>\n<li>dash character<\/li>\n<li>4 more digits<\/li>\n<\/ol>\n<p>Now, we could get false positives, since any other text with this shape would match too, but since we can see the data we can call it &#8220;good enough&#8221; \ud83d\ude42<\/p>\n<p>In Regex, we can use <code>\\d<\/code> to say &#8220;look for a digit&#8221; and <code>{min,max}<\/code> to specify a quantity (a single value, like <code>{3}<\/code>, means exactly that many). We&#8217;ll talk more about these symbols later. With that in mind, our pattern could look something like <code>\\d{3}-\\d{3}-\\d{4}<\/code>.<\/p>\n<p>Now, to use Regex, I&#8217;m going to utilize the <code>-Match<\/code>\u00a0operator and the built-in variable <code>$matches<\/code>, whose first element, <code>$matches[0]<\/code>, will hold the matched data. 
All we need to do is put these pieces together:<\/p>\n<pre class=\"lang:ps decode:true\">#grab our data\r\n$file = Get-Content -Path \"$PSScriptRoot\\MOCK_DATA.txt\"\r\n\r\n#make our pattern (single quotes so PowerShell passes it to the Regex parser untouched)\r\n$regex = '\\d{3}-\\d{3}-\\d{4}'\r\n\r\n#loop through each line\r\nforeach ($line in $file)\r\n{\r\n    #if our line contains our pattern, write the matched data to the screen\r\n    if ($line -match $regex)\r\n    {\r\n        $matches[0]\r\n    }\r\n}\r\n<\/pre>\n<p>Results:<\/p>\n<pre class=\"lang:default decode:true\">982-674-7597\r\n275-545-2825\r\n275-609-0729\r\n570-808-4168\r\n726-131-4847\r\n912-974-5105\r\n351-131-8303\r\n938-281-7352\r\n737-424-9922\r\n198-238-7774\r\n199-866-6315\r\n967-153-4550\r\n730-103-5861\r\n464-747-2670\r\n473-232-5315\r\n173-795-8209\r\n424-484-7750\r\n388-383-4977\r\n328-526-8012\r\n710-232-3341\r\n537-744-9215\r\n343-679-9591\r\n404-643-4727\r\n654-476-2559\r\n986-109-0938\r\n199-790-8042\r\n340-974-7318\r\n522-411-1281\r\n874-705-5922\r\n982-223-7617\r\n456-820-5936\r\n157-781-8516\r\n508-552-8426\r\n913-814-8741\r\n318-716-1850\r\n198-231-8411\r\n148-900-9662\r\n544-416-2598\r\n353-429-1125\r\n316-568-4160\r\n425-256-2700\r\n790-673-7772\r\n493-734-9005\r\n813-496-0519\r\n981-114-6637\r\n763-797-9753\r\n820-648-4784\r\n824-511-8491\r\n293-878-6488\r\n832-704-8998<\/pre>\n<p>You can find the code in <a href=\"https:\/\/github.com\/Sambardo\/Regex-Phone-Number-Sample\">this GitHub repo<\/a>.<\/p>\n<p>Hopefully this gets you excited about Regex! In a couple of weeks I&#8217;ll do another post breaking down the basic symbols.<\/p>\n<p>As always, don&#8217;t forget to rate, comment and share! Let me know what you think of the content and what topics you&#8217;d like to see me blog about in the future.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hi all, this week I&#8217;ll be talking about Regular Expressions. I&#8217;ve got a few posts planned to get you set up and going with some basic Regex. 
Regex is used for extracting and validating data.\u00a0Essentially, you can think of Regex as Windows wildcards on steroids. Anytime we need to match data with a little [&hellip;]<\/p>\n","protected":false},"author":7300,"featured_media":87096,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1738,686],"tags":[2221,2125,377,174],"class_list":["post-575","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-powershell","category-regex","tag-kory-thacher","tag-koryt","tag-powershell","tag-regular-expressions"],"acf":[],"blog_post_summary":"<p>Hi all, this week I&#8217;ll be talking about Regular Expressions. I&#8217;ve got a few posts planned to get you set up and going with some basic Regex. Regex is used for extracting and validating data.\u00a0Essentially, you can think of Regex as Windows wildcards on steroids. Anytime we need to match data with a little 
[&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/575","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/users\/7300"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/comments?post=575"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/575\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media\/87096"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media?parent=575"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/categories?post=575"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/tags?post=575"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}