September 9th, 2010

Copy CSV Columns to an Excel Spreadsheet by Using PowerShell

Summary: Join the Microsoft Scripting Guys as they show you how to copy CSV columns to a Microsoft Excel spreadsheet by using Windows PowerShell.

 

Hey, Scripting Guy! Question

Hey, Scripting Guy! I love comma-separated value (CSV) files. To me, it seems as if that was the best way to store data. It is almost universally accepted and is still widely used. The files are small and simple to manipulate. I have enjoyed using Windows PowerShell to do text manipulation. And working with a CSV file is, at its core, simple text manipulation.

One reason I like CSV files is that I can easily work with them on server products. And I do not have to eat up several gigabytes worth of space to install Microsoft Office, which is not supported on a server operating system anyway. However, for all its grace and elegance, I will admit that reading a CSV file with the human eye is not the easiest process. For that I prefer to use Microsoft Excel. Though it is true that I can open a CSV file in Microsoft Excel, it is rather annoying when it comes to saving the file. There must be a dozen warnings and prompts I have to navigate. When I have to work on a large number of CSV files, this process becomes loathsome. I would love to be able to copy columns from a CSV file and save the data into an honest to goodness Microsoft Excel spreadsheet and avoid all the prompts. Can this be done?

— JM

 

Hey, Scripting Guy! Answer Hello JM,

Microsoft Scripting Guy Ed Wilson here. Well, for the first time in more than four months, I have the windows open and the air conditioner turned off. The leaves have not started changing colors yet, and the temperature at night is only mid 60s Fahrenheit (18 degrees Celsius), but it is a giant step in the right direction. Soon, American football will begin, and there will be another sport I can ignore while I sit quietly beside the fireplace and read Shakespeare.

JM, email that is sent to scripter@microsoft.com is not something I can ignore. I wrote the Import-ProcessDataToExcel.ps1 script to illustrate importing specific columns of data into a newly created Microsoft Excel spreadsheet. The complete Import-ProcessDataToExcel.ps1 script is shown here.

Import-ProcessDataToExcel.ps1

Param(
 
$csvFile = “C:\fso\process.csv”,
 
$path = “C:\fso\process.xlsx”
)
$processes = Import-Csv -Path $csvFile
$Excel = New-Object -ComObject excel.application
$Excel.visible = $false
$workbook = $Excel.workbooks.add()
$excel.cells.item(1,1) = “name”
$excel.cells.item(1,2) = “Virtual Memory”
$excel.cells.item(1,3) = “Working Set”
$excel.cells.item(1,4) = “Private Memory”
$excel.cells.item(1,5) = “Non-paged memory”
$i = 2
foreach($process in $processes)
{
 
$excel.cells.item($i,1) = $process.name
 
$excel.cells.item($i,2) = $process.vm
 
$excel.cells.item($i,3) = $process.ws
 
$excel.cells.item($i,4) = $process.pm
 
$excel.cells.item($i,5) = $process.npm
 
$i++
} #end foreach process
$workbook.saveas($path)
$Excel.Quit()
Remove-Variable -Name excel
[gc]::collect()
[gc]::WaitForPendingFinalizers()

To create an interesting CSV file to work with for today’s Hey, Scripting Guy! Blog post, I used Windows PowerShell and I exported process information. I used the Get-Process cmdlet to retrieve information about all the processes running on my computer. I then piped the object to the Export-Csv cmdlet. If you are not planning on reconstituting the object, but instead you want to create a traditional CSV file, it is important to include the notypeinformation switch. Without using this switch, type information is written to the first line of the CSV file, and it will confuse things. An example of type information is shown here:

#TYPE System.Diagnostics.Process

Here is the command I used:

Get-Process | Export-Csv -Path C:\fso\process.csv -NoTypeInformation

The resultant CSV file is shown in the following image. JM, I think you were being kind when you said reading the file is not the easiest. To me it looks like a Rorschach test. I think I see a Scripting Guy eating an ANZAC biscuit while drinking a cup of Earl Grey tea. I wonder what that means? Oh, yeah, it must be time for my afternoon snack.

Image of CSV file

The Import-ProcessDataToExcel.ps1 script begins by creating two input parameters. The first parameter is the path to the saved CSV file, and the second parameter is the path for the newly created Microsoft Excel spreadsheet. This is shown here:

Param(

  $csvFile = “C:\fso\process.csv”,

  $path = “C:\fso\process.xlsx”

)

 

One thing to keep in mind about this script—I do not do a check for an existing spreadsheet with the same name in the same location. To do this, you could use the Test-Path cmdlet to see if the file exists, and if it does not exist, use the saveas method. Otherwise, you might delete the file and then call saveas. Alternatively, you might want to update the file by using the save method. As you can tell, there are a number of decisions you might need to make depending on the behavior you want. If the file does exist and the saveas method is called, you will be prompted to overwrite the file.

The Import-ProcessDataToExcel.ps1 script uses the Import-Csv cmdlet to import the data from the CSV file. The imported data is stored in the $processes variable. This is shown here:

$processes = Import-Csv -Path $csvFile

Nearly all of the time when you are working with Microsoft Excel, you will need to create an instance of the excel.application object. There are exceptions, such as Hey, Scripting Guy! How Can I Read from Excel Without Using Excel? article or Hey, Scripting Guy! How Can I Write to Excel Without Using Excel?, but those are rare cases. After the application object has been created, set the visible property to false, and add a new workbook to the workbooks collection. This is shown here:

$Excel = New-Object -ComObject excel.application

$Excel.visible = $false

$workbook = $Excel.workbooks.add()

 

I then decide to add column headings to the first row. To do this, I use the cells collection from the application object. This is a bit of an unusual move because usually the cells collection is accessed from the worksheet object. The cells collection from the application object returns a range object that represents all the cells in the active worksheet. The active worksheet in our example is sheet1 in the newly added workbook. By accessing the cells collection from the application object, we avoid having to create a worksheet object. There is a risk in this technique in that if the active document is not a worksheet, the cells property call will fail, which would cause the entire script to not work. But in this example, there is little danger because as the script is written, the active document must be a worksheet. The first number used by the item method represents the row, and the second number represents the column. Here are the column heads:

$excel.cells.item(1,1) = “name”

$excel.cells.item(1,2) = “Virtual Memory”

$excel.cells.item(1,3) = “Working Set”

$excel.cells.item(1,4) = “Private Memory”

$excel.cells.item(1,5) = “Non-paged memory”

 

We are fished with the first row; therefore, the value of the $i variable (that will be used later to keep track of our row number) is set to 2. The foreach statement is used to walk through the collection of processes that are stored in the $processes variable. This portion of the script is shown here:

$i = 2

foreach($process in $processes)

{

 

The nice thing about working with CSV data that has been imported via the Import-Csv cmdlet is that it allows you to address the data by column name. This means the script is easier to read. To add data to the Microsoft Excel spreadsheet, use the item method from the cells collection, specify the location on the spreadsheet, and assign the data. By using the $i variable to keep track of the current row and assigning the other static columns, the following code is achieved:

$excel.cells.item($i,1) = $process.name

$excel.cells.item($i,2) = $process.vm

$excel.cells.item($i,3) = $process.ws

$excel.cells.item($i,4) = $process.pm

$excel.cells.item($i,5) = $process.npm

 

The ++ operator is used to increment the value of the $i variable by 1. The first time through the loop, the value of $i is equal to 2 because it was set prior to entering the foreach loop. The next time through, the value will be 3 and so on. The ++ operator works the same as saying $i = $i + 1. This is shown here:

PS C:\> $i = 2

PS C:\> $i++

PS C:\> $i

3

PS C:\> $a = 2

PS C:\> $a = $a + 1

PS C:\> $a

3

PS C:\>

 

The saveas method from the workbook object automatically creates the file if it does not already exist. If the file does exist, it prompts to overwrite the file. After the file is saved, the quit method from the application object is used to exit Microsoft Excel. The $excel variable is then deleted, and the garbage collection is forced by calling the collect method from the GC class. The WaitForPendingFinalizers Method causes the script to wait for all finalizers to complete before continuing.

The completed Microsoft Excel workbook is shown in the following image.

Image of completed Excel workbook

JM, that is all there is to using Windows PowerShell to create a Microsoft Excel workbook from previously existing CSV data. This also concludes Microsoft Excel Week. Join us tomorrow for Quick-Hits Friday.

We invite you to follow us on Twitter and Facebook. If you have any questions, send email to us at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

 

Ed Wilson and Craig Liebendorfer, Scripting Guys

Author

0 comments

Discussion are closed.

Feedback