Grabbing Excel (XLSX) values with PowerShell

Kory Thacher

The Goal:

Import data from XLSX files conveniently like import-csv lets you do with simpler data.

The preamble:

Excel is a mainstay of the business world at this point, which means a lot of the data you might have to work with will come at you as an XLSX file or need to be one. This can be a bit annoying when scripting.

If we’re just working in PowerShell-land and we can choose to use simple CSV data we have the handy import-csv and export-csv cmdlets, and those CSV files can open up in excel just fine. However, when we are forced to work with XLSX files it can lead to headaches.

If you search around online, or have worked with excel in PowerShell before, you have probably found solutions involving COM objects that tend to start a little like this:

$ExcelFile = New-Object -ComObject Excel.Application

Then there is lots of methods and properties we can interact with on that COM object. This lets us do what we need, but ultimately leverages excel and can cause huge performance issues when working with a lot of data. Often you might see the excel process stop responding for a while and then finally it finishes its work. This is clunky, confusing and un-fun.

The motivation:

I recently had a folder full of XLSX files that I needed to read in. I only cared about a couple columns, and I wanted to import them as objects like I could with import-csv and then pull only unique values out.

I leveraged the PowerShell Gallery and just searched for “Excel”. There is actually quite a few different options there now, but at the time I saw the description for a module called PSExcel, by a fellow named RamblingCookieMonster. The description was simple:

Work with Excel without installing Excel

That sounded good enough to me, so I figured I’d take it for a spin.

The meat:

Install-module PSExcel
Get-command -module psexcel
CommandType Name Version Source
----------- ---- ------- ------
Function Add-PivotChart 1.0.2 psexcel
Function Add-PivotTable 1.0.2 psexcel
Function Add-Table 1.0.2 psexcel
Function Close-Excel 1.0.2 psexcel
Function ConvertTo-ExcelCoordinate 1.0.2 psexcel
Function Export-XLSX 1.0.2 psexcel
Function Format-Cell 1.0.2 psexcel
Function Get-CellValue 1.0.2 psexcel
Function Get-Workbook 1.0.2 psexcel
Function Get-Worksheet 1.0.2 psexcel
Function Import-XLSX 1.0.2 psexcel
Function Join-Object 1.0.2 psexcel
Function Join-Worksheet 1.0.2 psexcel
Function New-Excel 1.0.2 psexcel
Function Save-Excel 1.0.2 psexcel
Function Search-CellValue 1.0.2 psexcel
Function Set-CellValue 1.0.2 psexcel
Function Set-FreezePane 1.0.2 psexcel

Import-XLSX sounds like exactly what I wanted.

Here I’ve generated a simple XLSX with fake people and companies. What I want to do is pull out just a list of unique company names from the company column. In my real world example the XLSX was a bit more complicated and I had dozens to read in on a loop, but that just involved scaling up these actions.

$path = "$PSScriptRoot\fakepeople.xlsx"

import-module psexcel #it wasn't auto loading on my machine

$people = new-object System.Collections.ArrayList

foreach ($person in (Import-XLSX -Path $path -RowStart 1))

{

$people.add($person) | out-null #I don't want to see the output

}

$people.company | select -unique
Contoso
ContosoSuites

Fabrikam

Parnell Aerospace

Humongous Insurance

Just a quick note, if you’re looking to do a bit more with Excel, I found this module as well, which seems a bit more robust.

That’s all for now. Hopefully this helps you if you need to grab some data from XLSX files!

0 comments

Discussion is closed.

Feedback usabilla icon