{"id":236,"date":"2014-12-06T00:01:00","date_gmt":"2014-12-06T00:01:00","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/heyscriptingguy\/2014\/12\/06\/weekend-scripter-remove-non-alphabetic-characters-from-string\/"},"modified":"2019-02-18T10:36:44","modified_gmt":"2019-02-18T17:36:44","slug":"weekend-scripter-remove-non-alphabetic-characters-from-string","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/scripting\/weekend-scripter-remove-non-alphabetic-characters-from-string\/","title":{"rendered":"Weekend Scripter: Remove Non-Alphabetic Characters from String"},"content":{"rendered":"<p><b style=\"font-size:12px\">Summary<\/b><span style=\"font-size:12px\">: Microsoft Scripting Guy, Ed Wilson, talks about using Windows PowerShell to remove all non-alphabetic characters from a string.<\/span><\/p>\n<p>Microsoft Scripting Guy, Ed Wilson, is here. This morning I am drinking a nice cup of English Breakfast tea and munching on a Biscotti. I know&#8230;Biscotti is not a very good breakfast. Oh well. I went to my favorite bakery yesterday looking for some nice scones (which do make a good breakfast). For some reason, all of the scones they had were covered with a half-inch thick gunky sugar icing. I mean, DUDE!<\/p>\n<p>There was not enough time to talk the Scripting Wife into making some nice scones, so here I am munching on Biscotti. The tea is nice anyway, and I am listening to some great Duke Ellington on my Surface Pro 3 while I am catching up on the Hey, Scripting Guy! Blog comments.<\/p>\n<p>One of the really cool things about the Hey, Scripting Guy! Blog is that I continue to get comments on posts that were written a long time ago. 
I ran across a couple of comments for a post that was written more than seven years ago.<\/p>\n<p>The post was written using VBScript, and it was titled <a href=\"https:\/\/devblogs.microsoft.com\/scripting\/how-can-i-remove-all-the-non-alphabetic-characters-in-a-string\/\" target=\"_blank\">How Can I Remove All the Non-Alphabetic Characters in a String?<\/a> The post talks about using regular expressions, and the information is still valid in a Windows PowerShell world. From that standpoint, it makes sense to take a quick look at that post before moving forward.<\/p>\n<h2>PowerShell makes using regular expressions easy<\/h2>\n<p>Lots of Windows PowerShell commands have regular expressions built into them. That means that I really do not need to do anything special to unleash the power of regular expressions. I do not really need to know regular expressions, but knowing a bit about them does make stuff easier.<\/p>\n<p>At first, it might look like there is a regular expression character class that would do what I want to do here&mdash;that is, remove non-alphabetic characters. But unfortunately, that is not the case. There is the <b>\\w<\/b> character class, which will match a <i>word <\/i>character; but here, word characters include letters, numbers, and the underscore.<\/p>\n<p><b>&nbsp; &nbsp;Note&nbsp;<\/b> Character class escapes in regular expressions are case sensitive, and it is important to remember that. Here, the <b>\\w<\/b> <br \/>&nbsp; &nbsp;character class is different from the <b>\\W<\/b> character class (non-word characters).<\/p>\n<p>According to the previously mentioned Hey, Scripting Guy! Blog post, the trick to solving the problem of removing non-alphabetic characters from a string is to create two letter ranges, a-z and A-Z, and then use the caret character in my character group to negate the group&mdash;that is, to say that I want any character that IS NOT in my two letter ranges. 
Here is the pattern I come up with:<\/p>\n<p style=\"margin-left:30px\">[^a-zA-Z]<\/p>\n<p><b>&nbsp; &nbsp;Note&nbsp;<\/b> When working with regular expressions, I like to put my RegEx pattern into single quotation marks <br \/>&nbsp; &nbsp;(string literal) to avoid any potentially unexpected string expansion issues that could arise from using <br \/>&nbsp; &nbsp;double quotation marks (expanding string).<\/p>\n<h2>How do I do RegEx?<\/h2>\n<p>I want to replace non-alphabetic characters in a string. Here is a string I can use to perform my test:<\/p>\n<p style=\"margin-left:30px\">$string = &#039;abcdefg12345HIJKLMNOP!@#$%qrs)(*&amp;^TUVWXyz&#039;<\/p>\n<p>I also assign my regular expression pattern to a variable. As I noted earlier, I use single quotation marks (like I did in my test string). Here is my RegEx pattern:<\/p>\n<p style=\"margin-left:30px\">$pattern = &#039;[^a-zA-Z]&#039;<\/p>\n<p>I happen to know that there is a <b>Replace<\/b><i> <\/i>method in the .NET Framework System.String class. Here is what it might look like if I call the System.String <b>Replace<\/b><i> <\/i>method:<\/p>\n<p style=\"margin-left:30px\">$string.Replace($pattern,&#039; &#039;)<\/p>\n<p>Unfortunately, this does not work, and all of the non-alphabetic characters are still in the output string. This is because the <b>Replace<\/b><i> <\/i>method from the System.String class replaces straight-out strings, and it does not accept a RegEx pattern.<\/p>\n<h2>Using the Replace operator<\/h2>\n<p>Luckily, I can use the <b>&ndash;Replace<\/b> operator in Windows PowerShell to do the replacement. I want to replace any non-alphabetic character with a blank space in my output. So I simply use the string that is stored in the <b>$string<\/b> variable. I call the <b>&ndash;Replace<\/b> operator, and I tell the <b>&ndash;Replace<\/b> operator to look for a match with the RegEx pattern that I stored in the <b>$pattern<\/b> variable and to replace any match with a blank space. 
Here is the syntax of that command:<\/p>\n<p style=\"margin-left:30px\">$string -replace $pattern, &#039; &#039;<\/p>\n<p>When I run the code, the following appears in the output pane of my Windows PowerShell ISE:<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/hsg-12-6-14-01.png\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/hsg-12-6-14-01.png\" alt=\"Image of command output\" title=\"Image of command output\" \/><\/a><\/p>\n<p>The complete script is shown here:<\/p>\n<p style=\"margin-left:30px\">$string = &#039;abcdefg12345HIJKLMNOP!@#$%qrs)(*&amp;^TUVWXyz&#039;<\/p>\n<p style=\"margin-left:30px\">$pattern = &#039;[^a-zA-Z]&#039;<\/p>\n<p style=\"margin-left:30px\">$string -replace $pattern, &#039; &#039;<span style=\"font-size:12px\">&nbsp;<\/span><\/p>\n<p>I invite you to follow me on <a href=\"http:\/\/bit.ly\/scriptingguystwitter\" target=\"_blank\">Twitter<\/a> and <a href=\"http:\/\/bit.ly\/scriptingguysfacebook\" target=\"_blank\">Facebook<\/a>. If you have any questions, send email to me at <a href=\"mailto:scripter@microsoft.com\" target=\"_blank\">scripter@microsoft.com<\/a>, or post your questions on the <a href=\"http:\/\/bit.ly\/scriptingforum\" target=\"_blank\">Official Scripting Guys Forum<\/a>. See you tomorrow. Until then, peace.<\/p>\n<p><b>Ed Wilson, Microsoft Scripting Guy<\/b><span style=\"font-size:12px\">&nbsp;<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Summary: Microsoft Scripting Guy, Ed Wilson, talks about using Windows PowerShell to remove all non-alphabetic characters from a string. Microsoft Scripting Guy, Ed Wilson, is here. This morning I am drinking a nice cup of English Breakfast tea and munching on a Biscotti. I know&#8230;Biscotti is not a very good breakfast. Oh well. 
I went [&hellip;]<\/p>\n","protected":false},"author":596,"featured_media":87096,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[174,3,4,61,45],"class_list":["post-236","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-scripting","tag-regular-expressions","tag-scripting-guy","tag-scripting-techniques","tag-weekend-scripter","tag-windows-powershell"],"acf":[],"blog_post_summary":"<p>Summary: Microsoft Scripting Guy, Ed Wilson, talks about using Windows PowerShell to remove all non-alphabetic characters from a string. Microsoft Scripting Guy, Ed Wilson, is here. This morning I am drinking a nice cup of English Breakfast tea and munching on a Biscotti. I know&#8230;Biscotti is not a very good breakfast. Oh well. I went [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/236","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/users\/596"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/comments?post=236"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/236\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media\/87096"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media?parent=236"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/categories?post=236"},{"taxonomy":"post_tag","embeddable
":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/tags?post=236"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}