Summary: Microsoft Scripting Guy, Ed Wilson, talks about using Windows PowerShell to trim strings and clean up data.
Microsoft Scripting Guy, Ed Wilson, is here. The Scripting Wife heads out today to spend time with her other passion at the Blue Ridge Classic Horse Show. Unfortunately, I will not get to attend much of that event due to a week of training that my group is doing. But I will have the chance to see at least one day of the events.
PowerShell Summit Europe
Speaking of the Scripting Wife’s passions, registration is open now for the Windows PowerShell Summit in Europe. The event will be held Sept 29 – Oct 1, 2014 in Amsterdam in The Netherlands. This will be an awesome event, and it will be a great chance to meet other PowerShellers. Teresa and I were in Amsterdam earlier this year. The city is beautiful, and the people were really friendly.
Trim strings using String methods
Note This post is the last in a series about strings. You might also enjoy reading:
- PowerShell String Theory
- Keep Your Hands Clean: Use PowerShell to Glue Strings Together
- Join Me in a Few String Methods Using PowerShell
- Using the Split Method in PowerShell
One of the most fundamental rules for working with data is “garbage in, garbage out.” This means it is important to groom data before persisting it. It does not matter if the persisted storage is Active Directory, SQL Server, or a simple CSV file. One of the problems with “raw data” is that it may include stuff like leading spaces or trailing spaces that can affect sort and search routines. Luckily, by using Windows PowerShell and a few String methods, I can easily correct this situation.
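To illustrate why stray spaces matter for sorting, here is a minimal sketch. The names in `$raw` and their extra spaces are made up for the example; they are not from any real data source.

```powershell
# Hypothetical raw input with stray leading and trailing spaces.
$raw = ' Zoe', 'adam ', '  Bob'

# Trim each value before sorting, so the spaces do not influence the order.
$clean = $raw | ForEach-Object { $_.Trim() } | Sort-Object

$clean   # adam, Bob, Zoe (one per line)
```

Sorting `$raw` directly would compare the leading spaces as part of each value, which is usually not what a person looking at a list of names expects.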
The System.String .NET Framework class (which is documented on MSDN) has four Trim methods that enable me to easily clean up strings by removing unwanted characters and white space. The following table lists these methods.
Method | Meaning
Trim() | Removes all leading and trailing white-space characters from the current String object.
Trim(Char[]) | Removes all leading and trailing occurrences of a set of characters specified in an array from the current String object.
TrimEnd | Removes all trailing occurrences of a set of characters specified in an array from the current String object.
TrimStart | Removes all leading occurrences of a set of characters specified in an array from the current String object.
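As a quick side-by-side comparison of the four methods, the following sketch wraps each result in brackets so that any remaining white space is visible. The sample strings are assumptions chosen only for illustration.

```powershell
$sample = '  data  '

# Brackets make the remaining white space visible.
$both  = '[{0}]' -f $sample.Trim()                  # [data]
$start = '[{0}]' -f $sample.TrimStart()             # [data  ]
$end   = '[{0}]' -f $sample.TrimEnd()               # [  data]
$chars = '[{0}]' -f 'xxdataxx'.Trim([char]'x')      # [data]

$both, $start, $end, $chars
```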
Trim white space from both ends of a string
The easiest Trim method to use is the Trim() method. It is very useful, and it is the method I use most. It easily removes all white space characters from the beginning and from the end of a string. This is shown here:
PS C:\> $string = " a String "
PS C:\> $string.Trim()
a String
The method is that easy to use. I just call Trim() on any string, and it will clean it up. Unfortunately, the previous output does not prove much, because white space is invisible in the console. So let me try a different approach. This time, I obtain the length of the string before I trim it, and I save the string that the trim operation returns back into the variable. I then obtain the length of the string a second time. Here is the command:
$string = " a String "
$string.Length
$string = $string.Trim()
$string
$string.Length
The command and the associated output from the command are shown in the following image:
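One detail worth spelling out is why the result is saved back into the variable. .NET strings are immutable, so Trim() returns a new string and leaves the original untouched. The following is a minimal sketch of that behavior; the sample string with two spaces on each side is an assumption for illustration.

```powershell
$string = '  a String  '

# Calling Trim() without keeping the result does not change $string.
$null = $string.Trim()
$before = $string.Length      # 12

# Assigning the result back replaces the stored value.
$string = $string.Trim()
$after = $string.Length       # 8

$before, $after
```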
Trim specific characters
If there are specific characters I need to remove from both ends of a string, I can use the Trim(char[]) method. This permits me to specify an array of characters to remove from both ends of the string. Here is an example in which I have a string that begins with “a ” and ends with “ a”. I use an array consisting of “a” and “ ” (a space), and the method removes those characters from both ends of the string. Here is the command:
$string = "a String a"
$string1 = $string.Trim("a"," ")
The command and its associated output are shown in the image that follows:
The cool thing is that I can also specify Unicode in my array of characters. This technique is shown here:
$string = "a String a"
$string1 = $string.Trim([char]0x0061, [char]0x0020)
Dr. Scripto says:
“Convenient Unicode tables are available on Wikipedia.”
The following image illustrates this technique by displaying the command and the associated output from that command:
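If a lookup table is not at hand, Windows PowerShell can convert between a character and its code value directly. This is a small sketch of casts that I am confident exist in the language; the variable names are only for illustration.

```powershell
# From code value to character.
$letter = [char]0x0061                      # a

# From character to decimal code value.
$code = [int][char]'a'                      # 97

# From character to a four-digit hexadecimal code value.
$hex = '{0:X4}' -f [int][char]' '           # 0020

$letter, $code, $hex
```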
Trim end characters
There are times when I know specifically that I need to trim characters from the end of a string. However, those characters might also be present at the beginning of the string, and I need to keep those characters. For these types of situations, I use the TrimEnd method. The cool thing is that I can use this method in three ways:
- I can delete a specific character if I type that specific Unicode character code.
- I can delete a specific character if I type that specific character.
- I can delete Unicode white-space characters if I do nothing but call the method.
Delete a specific character
In this example, I create a string and then specify an array of specific Unicode characters by using the Unicode code value. Because the string begins with the same characters that it ends with, this is a good test to show how I can delete specific characters from the end of the string. Here is the string:
$string = "a String a"
I now specify two Unicode characters by code value to delete from the end of the string. I store the returned string in a variable as shown here:
$string1 = $string.TrimEnd([char]0x0061, [char]0x0020)
The command and the associated output are shown in the following image:
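I can also specify the characters to trim by typing them. In this example, I type <space>a as a single string. Windows PowerShell converts that string to a character array, so the method treats it as the set of characters <space> and a, and it removes every trailing occurrence of either one from the end of the string: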
PS C:\> $string = "a String a"
PS C:\> $string.Length
10
PS C:\> $string1 = $string.TrimEnd(" a")
PS C:\> $string1
a String
PS C:\> $string1.Length
8
In the following example, I specify the characters as individual elements in an array. In fact, I do not even have them in the same order as they appear in the string. Yet, the results are the same.
PS C:\> $string = "a String a"
PS C:\> $string1 = $string.TrimEnd("a", " ")
PS C:\> $string1.Length
8
PS C:\> $string1
a String
PS C:\>
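Because the characters form a set rather than a sequence, TrimEnd can remove more than expected. The sketch below shows this with a made-up string, and then shows one alternative for removing an exact suffix. The EndsWith and Substring approach is my own substitution for that case, not something the Trim methods do.

```powershell
$text = 'data aa'

# TrimEnd keeps removing trailing characters as long as each one
# is a space or an "a", so the final "a" of "data" is removed too.
$trimmedSet = $text.TrimEnd([char]' ', [char]'a')      # dat

# To remove one exact suffix instead, test for it and cut it off.
$suffix = ' aa'
$trimmedSuffix = $text
if ($text.EndsWith($suffix, [System.StringComparison]::Ordinal)) {
    $trimmedSuffix = $text.Substring(0, $text.Length - $suffix.Length)   # data
}

$trimmedSet, $trimmedSuffix
```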
Delete white space
If I do not specify any characters, the TrimEnd method automatically deletes all Unicode white-space characters from the end of the string. In this example, a string contains both a space and a tab character at the end of the string. The string is 20 characters long. After I trim the end, the string is only 18 characters long, and both the space and the tab are gone. This technique is shown here:
PS C:\> $string = "a string and a tab `t"
PS C:\> $string1 = $string.TrimEnd()
PS C:\> $string.Length
20
PS C:\> $string1
a string and a tab
PS C:\> $string1.length
18
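Unicode white space covers more than the space and the tab. The following sketch, which uses a made-up string, appends a no-break space (U+00A0), a tab, a carriage return, and a line feed, and then removes all four with a single call.

```powershell
# Build a string that ends with four different white-space characters.
$nbsp = [char]0x00A0
$value = 'text' + $nbsp + "`t`r`n"

$trimmed = $value.TrimEnd()

$value.Length      # 8
$trimmed.Length    # 4
```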
Trim start characters
If I need to trim stuff from the beginning of a string, I use the TrimStart method. It works exactly the same as the TrimEnd method. So it also works in three ways, as follows:
- Delete a specific character from beginning of string if I type that specific Unicode character code.
- Delete a specific character from beginning of string if I type that specific character.
- Delete Unicode white-space characters from beginning of string if I do nothing but call the method.
Delete specific characters from start
As with the TrimEnd method, I can specify Unicode characters by Unicode code value. When I do this, the TrimStart method deletes those characters from the beginning of a string. This technique is shown here:
PS C:\> $string = "a String a"
PS C:\> $string.Length
10
PS C:\> $string1 = $string.TrimStart([char]0x0061, [char]0x0020)
PS C:\> $string1.Length
8
PS C:\> $string1
String a
Instead of using the Unicode code values, I can simply type the array of string characters I want to delete.
Note One disadvantage of typing specific characters is that “ ” is kind of hard to interpret. Is it a space, a tab, or an empty string? Is it a typing error, or is it intentional? By using a specific Unicode code value, I state exactly what is meant, and therefore the script is more specific.
In this example, I type specific characters that need to be removed from the beginning of the string:
PS C:\> $string = "a String a"
PS C:\> $string.Length
10
PS C:\> $string1 = $string.TrimStart(" ", "a")
PS C:\> $string1.Length
8
PS C:\> $string1
String a
PS C:\>
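One way to get the precision of code values without losing readability is to store each character in a named variable. This is only a sketch; the variable names `$space` and `$tab` and the sample string are my own choices.

```powershell
# Name each character once, by code value.
$space = [char]0x0020
$tab   = [char]0x0009

# A made-up string that begins with a tab and two spaces.
$value = "`t  Name"

$result = $value.TrimStart($space, $tab)
$result          # Name
$result.Length   # 4
```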
Delete white space
If I call the TrimStart method and do not supply specific characters, the method will simply delete all Unicode white space from the beginning of the string. This technique is shown here:
PS C:\> $string = "   String a"
PS C:\> $string.Length
11
PS C:\> $string1 = $string.TrimStart()
PS C:\> $string1.Length
8
PS C:\> $string1
String a
PS C:\>
That is all there is to using String methods. This also concludes String Week. Join me tomorrow when I will talk about exploring the Windows PowerShell $Profile variable.
I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson, Microsoft Scripting Guy