{"id":2528,"date":"2013-11-21T00:01:00","date_gmt":"2013-11-21T00:01:00","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/heyscriptingguy\/2013\/11\/21\/use-powershell-to-remove-data-from-csv-file\/"},"modified":"2013-11-21T00:01:00","modified_gmt":"2013-11-21T00:01:00","slug":"use-powershell-to-remove-data-from-csv-file","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/scripting\/use-powershell-to-remove-data-from-csv-file\/","title":{"rendered":"Use PowerShell to Remove Data from CSV File"},"content":{"rendered":"<p><strong>Summary<\/strong>: Guest blogger, Matt Tisdale, talks about using Windows PowerShell to remove data from a .csv file.\nMicrosoft Scripting Guy, Ed Wilson, is here. Welcome back <a href=\"http:\/\/blogs.technet.com\/b\/heyscriptingguy\/archive\/tags\/matt+tisdale\/\" target=\"_blank\">guest blogger, Matt Tisdale<\/a>&hellip;\nLast night a geoscientist told me that he has almost 900 .csv files, and he needs to remove two columns of data from them. He has used Windows PowerShell for some other data management tasks and he wanted to know if Windows PowerShell could help with this task. Of course it can!\nHere is the command I provided to him:<\/p>\n<p style=\"padding-left: 30px\">Get-ChildItem c:temp *.csv | foreach { $filename=$_.fullname; Import-Csv $filename | select * -ExcludeProperty column1,column2 |Export-csv $filename.Replace(&#8220;.csv&#8221;,&#8221;.new.csv&#8221;) -NoTypeInformation }\nHere is a breakdown of each component of the command&hellip;<\/p>\n<p style=\"padding-left: 30px\">Get-ChildItem c:temp *.csv\nThis finds all .csv files directories under c:temp. If we wanted to find all files anywhere in the subfolder structure below c:temp, the command would be Get-ChildItem c:temp *.csv -Recurse.<\/p>\n<p style=\"padding-left: 30px\">foreach { &#8230;. }\nWe are using a <strong>foreach<\/strong> loop to run a series of commands for each file that is enumerated by <strong>Get-ChildItem<\/strong>. In most cases with Windows PowerShell, we can simply use the pipeline ( | ) character. But in this case, we are doing something special, which requires a <strong>foreach<\/strong> loop. I&rsquo;ll discuss more details about this later.<\/p>\n<p style=\"padding-left: 30px\">$filename=$_.fullname; Import-Csv $filename | select * -ExcludeProperty column1,column2 |Export-csv $filename.Replace(&#8220;.csv&#8221;,&#8221;.new.csv&#8221;) -NoTypeInformation\nInside the <strong>foreach<\/strong> loop, we are performing three specific tasks:<\/p>\n<ul>\n<li>Importing data from the csv file<\/li>\n<li>Selecting specific pieces of the data<\/li>\n<li>Exporting the selected data to a new .csv file<\/li>\n<\/ul>\n<p>Now let&#8217;s look at why we are using a <strong>foreach<\/strong> loop here instead of just the pipeline. The pipeline is used to send output from one command to the next command. Here, we take the output from <strong>Get-ChildItem<\/strong> and send it to <strong>Import-csv<\/strong>. Then we take the output from <strong>Import-csv<\/strong> and send it to <strong>Select<\/strong>. Finally, we take the output from <strong>Select<\/strong> and send it to <strong>Export-csv<\/strong>.\nAll of this is pretty simple, but we need output from <strong>Get-ChildItem<\/strong> to make it all the way down the line to <strong>Export-csv<\/strong>. We need this because we want to maintain the original file name we are working with. If we only used the pipeline, the only input <strong>Export-csv<\/strong> would receive is what <strong>Select<\/strong> sent it.\nBecause <strong>Select<\/strong> does not know the file name, we are in trouble. The answer I chose here is to write the file name into a variable and maintain that variable inside the <strong>foreach<\/strong> loop. The next few sections explain more about this.<\/p>\n<p style=\"padding-left: 30px\">$filename=$_.fullname; Import-csv $filename\nThe very first thing we do inside the <strong>foreach<\/strong> loop is save the file name into a variable named <strong>$filename<\/strong>. By doing this, we can use this variable anywhere further down the pipeline, as long as it is inside the <strong>foreach<\/strong> loop. As you can see when looking at the entire command, we reference this variable with <strong>Import-csv<\/strong> and <strong>Export-csv<\/strong> at different levels in the pipeline process.\nNotice the semicolon at the end of this? This is used to run more than one command before passing output through the pipeline to the next command. In this section we are first setting our variable and then running <strong>Import-csv<\/strong>.<\/p>\n<p style=\"padding-left: 30px\">select * -ExcludeProperty column1,column2\nHere we are using <strong>Select-Object<\/strong> (<strong>select<\/strong>) to select only specific pieces of data (columns in this case) from the data sent to us from <strong>Import-csv<\/strong>. In the test data I created while designing this command, I had a number of .csv files with five columns each. I named the columns column1, column2, column3, and so on. Under each column I added various text strings of data.\nWith <strong>Select-Object<\/strong>, we can include specific columns or exclude specific columns. I have no idea how many columns the production data may actually have, but I know the need is to only exclude two of them (by name), so I decided to go with the exclusion method because it will be less effort.\nThere are a few different ways you can include and exclude data with <strong>Select-Object<\/strong>, but I will focus on the specific parameters that I elected to use for today. Feel free to use <strong>Get-Help<\/strong> to research the other parameters that are available with <strong>Select-Object<\/strong> (for example, <strong>-First<\/strong> and <strong>-Last<\/strong>).\nThe asterisk ( * ) initially selects all of the data that is passed from the previous command. The <strong>-ExcludeProperty<\/strong> parameter is used to specifically exclude two columns by name. The result of this command against my test data is that only data in columns 3, 4, and 5 were sent to the next command in the pipeline.<\/p>\n<p style=\"padding-left: 30px\">Export-Csv $filename.Replace(&#8220;.csv&#8221;,&#8221;.new.csv&#8221;) -NoTypeInformation\nThis command takes the data sent to us from <strong>Select-Object<\/strong> and saves it out to a new .csv file. The data sent through the pipeline from <strong>Select-Object<\/strong> does not have any details about the original file name, but because we saved this into a variable earlier, we can now access it and determine the original file name where this data came from. Two thumbs up for Windows PowerShell variables!\nWe want to save our new file by using the original file name with &#8220;.new&#8221; added, so we simply use the <strong>Replace<\/strong> method to find <strong>.csv<\/strong> in the file name and replace it with <strong>.new.csv<\/strong>. If my original file name is c:tempdata1.csv, the new file name will be c:tempdata1.new.csv. The <strong>-NoTypeInformation<\/strong> parameter is used to keep <strong>Export-csv<\/strong> from writing additional information that we do not need at the top of our .csv file.\nThere you have it. Ultimately this is a pretty simple business need, but it would take someone hours of time to manually remove two columns of data from almost 900 files. It only took me about 10 minutes to design and test this command in Windows PowerShell. Now this individual can spend more of his time working to find petroleum and less time performing tedious data management tasks.\nI really enjoy seeing Windows PowerShell provide direct benefit to our business and company bottom line! Please keep your questions flowing in. I look forward to helping IT staff and business employees save more time and money in the future.\nHappy scripting!\n~Matt\nWay cool. Thanks Matt. This is a very helpful technique. I invite you to follow me on <a href=\"http:\/\/bit.ly\/scriptingguystwitter\" target=\"_blank\">Twitter<\/a> and <a href=\"http:\/\/bit.ly\/scriptingguysfacebook\" target=\"_blank\">Facebook<\/a>. If you have any questions, send email to me at <a href=\"http:\/\/blogs.technet.commailto:scripter@microsoft.com\" target=\"_blank\">scripter@microsoft.com<\/a>, or post your questions on the <a href=\"http:\/\/bit.ly\/scriptingforum\" target=\"_blank\">Official Scripting Guys Forum<\/a>. See you tomorrow. Until then, peace.\n<strong>Ed Wilson, Microsoft Scripting Guy<\/strong>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Summary: Guest blogger, Matt Tisdale, talks about using Windows PowerShell to remove data from a .csv file. Microsoft Scripting Guy, Ed Wilson, is here. Welcome back guest blogger, Matt Tisdale&hellip; Last night a geoscientist told me that he has almost 900 .csv files, and he needs to remove two columns of data from them. He [&hellip;]<\/p>\n","protected":false},"author":596,"featured_media":87096,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[169,56,433,464,3,45],"class_list":["post-2528","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-scripting","tag-csv-and-other-delimited-files","tag-guest-blogger","tag-matt-tisdale","tag-scripti9ng-techniques","tag-scripting-guy","tag-windows-powershell"],"acf":[],"blog_post_summary":"<p>Summary: Guest blogger, Matt Tisdale, talks about using Windows PowerShell to remove data from a .csv file. Microsoft Scripting Guy, Ed Wilson, is here. Welcome back guest blogger, Matt Tisdale&hellip; Last night a geoscientist told me that he has almost 900 .csv files, and he needs to remove two columns of data from them. He [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/2528","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/users\/596"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/comments?post=2528"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/2528\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media\/87096"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media?parent=2528"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/categories?post=2528"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/tags?post=2528"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}