The Goal:
Import data from XLSX files conveniently like import-csv lets you do with simpler data.
The preamble:
Excel is a mainstay of the business world at this point, which means a lot of the data you might have to work with will come at you as an XLSX file or need to be one. This can be a bit annoying when scripting.
If we’re just working in PowerShell-land and we can choose to use simple CSV data we have the handy import-csv and export-csv cmdlets, and those CSV files can open up in excel just fine. However, when we are forced to work with XLSX files it can lead to headaches.
If you search around online, or have worked with excel in PowerShell before, you have probably found solutions involving COM objects that tend to start a little like this:
$ExcelFile = New-Object -ComObject Excel.Application
Then there is lots of methods and properties we can interact with on that COM object. This lets us do what we need, but ultimately leverages excel and can cause huge performance issues when working with a lot of data. Often you might see the excel process stop responding for a while and then finally it finishes its work. This is clunky, confusing and un-fun.
The motivation:
I recently had a folder full of XLSX files that I needed to read in. I only cared about a couple columns, and I wanted to import them as objects like I could with import-csv and then pull only unique values out.
I leveraged the PowerShell Gallery and just searched for “Excel”. There is actually quite a few different options there now, but at the time I saw the description for a module called PSExcel, by a fellow named RamblingCookieMonster. The description was simple:
Work with Excel without installing Excel
That sounded good enough to me, so I figured I’d take it for a spin.
The meat:
Install-module PSExcel Get-command -module psexcel
CommandType Name Version Source ----------- ---- ------- ------ Function Add-PivotChart 1.0.2 psexcel Function Add-PivotTable 1.0.2 psexcel Function Add-Table 1.0.2 psexcel Function Close-Excel 1.0.2 psexcel Function ConvertTo-ExcelCoordinate 1.0.2 psexcel Function Export-XLSX 1.0.2 psexcel Function Format-Cell 1.0.2 psexcel Function Get-CellValue 1.0.2 psexcel Function Get-Workbook 1.0.2 psexcel Function Get-Worksheet 1.0.2 psexcel Function Import-XLSX 1.0.2 psexcel Function Join-Object 1.0.2 psexcel Function Join-Worksheet 1.0.2 psexcel Function New-Excel 1.0.2 psexcel Function Save-Excel 1.0.2 psexcel Function Search-CellValue 1.0.2 psexcel Function Set-CellValue 1.0.2 psexcel Function Set-FreezePane 1.0.2 psexcel
Import-XLSX sounds like exactly what I wanted.
Here I’ve generated a simple XLSX with fake people and companies. What I want to do is pull out just a list of unique company names from the company column. In my real world example the XLSX was a bit more complicated and I had dozens to read in on a loop, but that just involved scaling up these actions.
$path = "$PSScriptRoot\fakepeople.xlsx" import-module psexcel #it wasn't auto loading on my machine $people = new-object System.Collections.ArrayList foreach ($person in (Import-XLSX -Path $path -RowStart 1)) { $people.add($person) | out-null #I don't want to see the output } $people.company | select -unique
Contoso ContosoSuites Fabrikam Parnell Aerospace Humongous Insurance
Just a quick note, if you’re looking to do a bit more with Excel, I found this module as well, which seems a bit more robust.
That’s all for now. Hopefully this helps you if you need to grab some data from XLSX files!
0 comments