Summary: Learn how to use Windows PowerShell to create a holiday wish list.
Microsoft Scripting Guy Ed Wilson here. It is almost the end of the year and we have decided to devote some posts to the holiday season. We even have guest bloggers from around the world to share some holiday spirit. Today we have Jeffery Hicks.
Jeffery Hicks is a Microsoft MVP in Windows PowerShell and an IT veteran with almost 20 years of experience. Much of it has been spent as an IT consultant specializing in Microsoft server technologies. He works today as an independent author, trainer and consultant. His latest book, with Don Jones, is Windows PowerShell 2.0: TFM (SAPIEN Press 2010). You can keep up with Jeff at http://jdhitsolutions.com/blog and twitter.com/jeffhicks. Or contact him at jhicks@jdhitsolutions.com.
Really Simple Santa Wish List
A time-honored holiday tradition, at least in the United States, is for every little boy and girl to make their wish list for Santa. In the pre-Internet age we scribbled our hopes and dreams on a piece of colored construction paper or drafted detailed missives to Jolly Old St. Nick. We also used to walk uphill to school through the snow, both ways. But that’s another story.
In the 21st century, we can derive our wish lists from new technologies such as RSS. For the scripting geek, what would be a better present than scanning an RSS feed for the latest gadget or toy using Windows PowerShell? For the sake of this post, I’m going to use the RSS feeds from NewEgg.com, although the concepts and techniques should apply to just about any RSS feed. In particular, I want to see all the latest Xbox 360 products via http://www.newegg.com/Product/RSS.aspx?Submit=RSSCategorydeals&Depa=0&Category=323&NAME=Xbox-360 .
Out of the box, Windows PowerShell has no cmdlets for working with RSS feeds. Instead we will use the System.Net.WebClient class from the .NET Framework. We’ll create such an object using the New-Object cmdlet.
PS C:\> $webclient = New-Object -typeName System.Net.WebClient
This object has several methods but we are most interested in DownloadString(). This method takes a url as a parameter. Now the fun part. The RSS page we will download is actually an XML document. Because we hope to parse information from the XML document, we should save it to a variable.
PS C:\> [xml]$rss=$webclient.downloadString("http://www.newegg.com/Product/RSS.aspx?Submit=RSSCategorydeals&Depa=0&Category=323&NAME=Xbox-360")
I specified that $rss should be treated as an XML document. Otherwise $rss would be one very long string which would require lots of old-fashioned text parsing. Having an XML document is much easier. We can navigate this document by accessing its properties.
PS C:\> $rss
xml                            rss
---                            ---
version="1.0" encoding="utf-8" rss
What we want to look at is the rss property.
PS C:\> $rss.rss
version channel
------- -------
2.0     channel
If you try to look at $rss.rss.channel, which would seem like the logical next step, you’ll receive an error. That’s because of the way the RSS document is structured. Selecting only the item property works.
PS C:\> $rss.rss.channel | select item
item
----
{item, item, item, item...}
What we want to examine are the individual items. Because the item property will be treated as an array we can get the first item like this.
PS C:\> $rss.rss.channel.item[0]
If you try this on your own with the Xbox RSS url you’ll get a screen full of data, some of which is not too useful in its current form. What would be helpful is to discover the properties of the item.
PS C:\> $rss.rss.channel.item[0] | get-member -MemberType property
TypeName: System.Xml.XmlElement
Name        MemberType Definition
----        ---------- ----------
comments    Property   System.String comments {get;set;}
description Property   System.String description {get;set;}
guid        Property   System.String guid {get;set;}
link        Property   System.String link {get;set;}
pubDate     Property   System.String pubDate {get;set;}
title       Property   System.String title {get;set;}
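The exploration steps above can be tried offline, without touching the live feed. The sketch below is only an illustration: the XML is a small invented sample (the real feed carries more elements per item), and the variable names are my own.

```powershell
# A tiny, invented RSS document; the real feed has more elements per item.
$sample = @'
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>Sample Deals</title>
    <item>
      <title>$47.99 - NBA Jam Xbox 360 Game EA</title>
      <pubDate>Tue, 23 Nov 2010 13:24:01 GMT</pubDate>
    </item>
    <item>
      <title>$199.99 - Microsoft Xbox 360 4 GB Black</title>
      <pubDate>Tue, 23 Nov 2010 13:24:01 GMT</pubDate>
    </item>
  </channel>
</rss>
'@

# Casting the string to [xml] gives an XmlDocument with dotted navigation.
[xml]$doc = $sample

# Wrapping in @() keeps this an array even if a feed has only one item.
$sampleItems = @($doc.rss.channel.item)
$firstTitle  = $sampleItems[0].title

# List the element names PowerShell exposes as properties on an item.
$propertyNames = $sampleItems[0] | Get-Member -MemberType Property |
    Select-Object -ExpandProperty Name
```

With this sample, $propertyNames holds only pubDate and title, because those are the only child elements the invented items contain.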
With this information I can get just the information I want from the RSS feed.
PS C:\> $rss.rss.channel.item | format-table pubdate,title -autosize
pubDate                      title
-------                      -----
Tue, 23 Nov 2010 13:24:01GMT $47.99 - NBA Jam Xbox 360 Game EA
Tue, 23 Nov 2010 13:24:01GMT $49.99 - Assassin’s Creed: Brotherhood Xbox 360 Game UBISOFT
Tue, 23 Nov 2010 13:24:01GMT $56.99 - Need for Speed Hot Pursuit Limited Edition Xbox 360 Game EA
Tue, 23 Nov 2010 13:24:01GMT $119.99 - Rock Band 3 Keyboard Bundle Xbox 360 Game EA
Tue, 23 Nov 2010 13:24:01GMT $199.99 - Microsoft Xbox 360 4 GB Black
Tue, 23 Nov 2010 13:24:01GMT $299.99 - Microsoft Xbox 360 (New Design) 250 GB Hard Drive Black
Tue, 23 Nov 2010 13:24:01GMT $299.99 - Microsoft Xbox 360 Kinect Bundle 4 GB Black
Tue, 23 Nov 2010 13:24:01GMT $399.99 - Microsoft Halo Reach Xbox 360 Limited Edition Bundle 250 GB HD Silver
Wow. What a wish list. But that was a lot of work to get to this point. Because there are no cmdlets for RSS, I’ll write a function.
Get-RSSList function
Function Get-RSSList {
    [cmdletBinding()]
    Param(
        [Parameter(Position=0,Mandatory=$False,ValueFromPipeline=$True,
        ValueFromPipelineByPropertyName=$True)]
        [ValidateNotNullOrEmpty()]
        [ValidatePattern("^http")]
        [string[]]$Path="http://www.newegg.com/Product/RSS.aspx?Submit=RSSCategorydeals&Depa=0&Category=323&NAME=Xbox-360"
    )
    Begin {
        Write-Verbose -Message "$(Get-Date) Starting $($myinvocation.mycommand)"
        Write-Verbose "$(Get-Date) Creating WebClient Object"
        $webclient = New-Object -typeName System.Net.WebClient
    } #close Begin
    Process {
        Foreach ($url in $Path) {
            Write-Verbose "$(Get-date) Connecting to $url"
            #clear any document left over from a previous url
            $data = $null
            #retrieve the url and save results to an XML document
            Try
            {
                #download the rss feed and save as an XML document
                [xml]$data = $webclient.downloadstring($url)
            }
            Catch
            {
                Write-Warning "Failed to retrieve any information from the url."
            }
            #only proceed if something was downloaded
            if ($data)
            {
                #save all rss feed items to a variable
                $items=$data.rss.channel.item
                #count the number of returned items
                $count=($items | measure-object).Count
                #get the rss feed title
                $channel=$data.rss.channel.title
                Write-Verbose "$(Get-Date) found $count items in $channel"
                #define a regex pattern for price to pull it from the description
                [regex]$PricePattern="\$\d*\.\d*"
                #write data to the pipeline
                $items | Select-Object @{Name="Channel";Expression={$channel}},
                @{Name="Published";Expression={$_.PubDate}},
                Title,@{Name="Price";Expression={
                $PricePattern.Match($_.Description).value }},Link
            } #if $data
            else
            {
                Write-Output "No items found"
            }
        } #close Foreach $url
    } #close process
    End {
        Write-Verbose -Message "$(Get-Date) Ending $($myinvocation.mycommand)"
    } #close End
} #close Function
This function makes it easier to receive information from a given RSS feed. The function takes a url as a parameter. You can either enter it as a parameter value, or pipe an object to the function. The parameter accepts pipelined input by value and by property name. This means that you can pipe a list of urls or an object with a Path property to the function.
[Parameter(Position=0,Mandatory=$False,ValueFromPipeline=$True,
ValueFromPipelineByPropertyName=$True)]
The parameter also includes validation to make sure something is entered and that the path begins with http. The pattern I’m using is a very simple regular expression.
[ValidateNotNullOrEmpty()]
[ValidatePattern("^http")]
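To see the validation in action without downloading anything, here is a minimal sketch. Test-FeedUrl is a hypothetical function I made up for the demonstration, and the example.com addresses are placeholders. It uses the same two attributes as Get-RSSList.

```powershell
# Test-FeedUrl is a hypothetical function used only to show the attributes.
Function Test-FeedUrl {
    [CmdletBinding()]
    Param(
        [Parameter(Position=0)]
        [ValidateNotNullOrEmpty()]
        [ValidatePattern("^http")]
        [string[]]$Path
    )
    # Only reached when every element of $Path passed validation.
    "Accepted: $($Path -join ', ')"
}

$good = Test-FeedUrl "http://example.com/feed.xml"

# A value that does not start with http is rejected before the body runs.
Try {
    $bad = Test-FeedUrl "ftp://example.com/feed.xml"
}
Catch {
    $bad = "Rejected: $($_.Exception.Message)"
}
```

Because the pattern is only ^http, an https address passes as well. A stricter pattern is possible, but for a quick wish list this is probably enough.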
I’ve given the parameter a default value since this is the RSS feed I usually want to check. I’m a big fan of default values because it means less typing at the prompt, yet you still have the option of changing the value if you have to.
[string[]]$Path="http://www.newegg.com/Product/RSS.aspx?Submit=RSSCategorydeals&Depa=0&Category=323&NAME=Xbox-360"
The [] that you see in the object type indicates that the parameter can accept an array of values. In the Process script block you will see this code:
Process {
Foreach ($url in $Path) {
Write-Verbose "$(Get-date) Connecting to $url"
I need this kind of enumeration when I want a function to process an array of values. My function will work regardless of how url strings are passed to it. Both examples would work.
PS C:\> $url1,$url2,$url3 | get-rsslist
PS C:\> get-rsslist $url1,$url2,$url3
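The difference between the two calling styles is easy to see with a stripped-down stand-in. This is a sketch under assumptions: Show-Url is a hypothetical function, and the URLs are placeholders. Only the Process and Foreach pattern is the same as in Get-RSSList.

```powershell
# Show-Url is a hypothetical stand-in that reports each URL it processes.
Function Show-Url {
    [CmdletBinding()]
    Param(
        [Parameter(Position=0,ValueFromPipeline=$True)]
        [string[]]$Path
    )
    Process {
        # Pipeline input: Process runs once per object, $Path holds one value.
        # Parameter input: Process runs once, $Path holds the whole array.
        Foreach ($url in $Path) {
            "Processing $url"
        }
    }
}

$urls = "http://example.com/a","http://example.com/b"

$fromPipeline  = $urls | Show-Url
$fromParameter = Show-Url $urls
```

Both variables end up with one line per URL. Without the Foreach loop, the parameter form would hand the whole array to the body in a single pass.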
Most of the rest of the function should look familiar. The url is downloaded to an XML document in a Try script block so that if there is an exception, I can handle it with the Catch script block.
Try
{
#download the rss feed and save as an XML document
[xml]$data =$webclient.downloadstring($url)
}
Catch
{
Write-Warning "Failed to retrieve any information from the url."
}
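The same Try and Catch pattern can be exercised offline, because the [xml] cast throws on text that is not well-formed XML, much as a failed download throws. The payloads below are invented, and the reset of $data at the top of the loop is my own precaution so that a failure cannot reuse the document from the previous pass.

```powershell
# Two invented payloads: one well-formed, one not.
$payloads = '<rss><channel><title>OK</title></channel></rss>', 'not xml at all'

$results = Foreach ($text in $payloads) {
    # Reset so a failure cannot reuse the document from the previous pass.
    $data = $null
    Try {
        # The cast throws a terminating error when the text is not valid XML.
        $data = [xml]$text
    }
    Catch {
        Write-Warning "Could not parse payload: $($_.Exception.Message)"
    }

    # Only use the document if the cast succeeded.
    if ($data) {
        "Parsed channel: $($data.rss.channel.title)"
    }
    else {
        "Skipped"
    }
}
```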
Assuming all is well, all that is left is to extract the relevant information from each element. One additional touch is that I use a regular expression object to pull the price from the description property.
[regex]$PricePattern="\$\d*\.\d*"
The pattern is used in a hash table with Select-Object, together with a few others to produce friendlier results.
#write data to the pipeline
$items | Select-Object @{Name="Channel";Expression={$channel}},
@{Name="Published";Expression={$_.PubDate}},
Title,@{Name="Price";Expression={
$PricePattern.Match($_.Description).value }},Link
An alternative would be to use New-Object to write a custom object to the pipeline.
$items | ForEach {
    New-Object -TypeName PSObject -Property @{
        Channel=$channel
        Published=$_.PubDate
        Title=$_.Title
        Price=$PricePattern.Match($_.Description).value
        Link=$_.link
    }
}
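It can help to check what the price pattern actually captures before trusting it on a live feed. The description strings below are invented for the test, and $StricterPattern is a variant of my own, not part of the function.

```powershell
# The same pattern as in the function: a dollar sign, digits, a dot, digits.
[regex]$PricePattern = '\$\d*\.\d*'

# Invented description strings; real feed descriptions may look different.
$descriptions = 'Now only $47.99 with free shipping',
                'Bundle price: $1,299.99',
                'Price not listed'

# Match() returns an empty Value when nothing matches.
$prices = $descriptions | ForEach-Object { $PricePattern.Match($_).Value }

# An assumed stricter variant that also accepts thousands separators.
[regex]$StricterPattern = '\$\d{1,3}(,\d{3})*\.\d{2}'
$strictPrices = $descriptions | ForEach-Object { $StricterPattern.Match($_).Value }
```

The original pattern finds $47.99 but returns an empty string for $1,299.99, because the comma interrupts the run of digits. That is fine for a game wish list, but worth knowing if Santa's budget is larger.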
The only other function features I want to point out are all the Write-Verbose expressions. Typically when you run the function, you won’t see any verbose output. But because this function uses cmdlet binding, if I include -Verbose when running the function, then I’ll get all that extra verbose output. I find this very useful for tracing and debugging and include it in my scripts and functions from the beginning.
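Here is a minimal sketch of that behavior. Get-Greeting is a hypothetical function, and the example assumes $VerbosePreference is at its default of SilentlyContinue. Capturing the verbose stream with 4>&1 needs PowerShell 3.0 or later, so that line will not work in version 2.0.

```powershell
# Get-Greeting is a hypothetical function; only the verbose plumbing matters.
Function Get-Greeting {
    [CmdletBinding()]
    Param([string]$Name = "Santa")

    Write-Verbose "$(Get-Date) Starting $($MyInvocation.MyCommand)"
    "Hello, $Name"
}

# Without -Verbose only the greeting is emitted.
$quiet = Get-Greeting

# 4>&1 merges the verbose stream into the output so it can be captured here.
# Stream redirection like this needs PowerShell 3.0 or later.
$chatty = Get-Greeting -Verbose 4>&1
```

$quiet holds just the greeting, while $chatty holds a verbose record followed by the greeting.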
After I’ve loaded the function in my Windows PowerShell session, all I have to do is run it. The following figure shows what I get with the default RSS path.
The output is a collection of custom objects that could be further sorted, grouped, filtered, exported or converted.
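As a sketch of that kind of follow-up processing, the example below builds a few objects shaped like the function's output and then filters, sorts, and converts them. The titles and prices come from the listing earlier in the post, but the objects are built by hand here, and the 200 dollar cutoff is just an assumption for illustration.

```powershell
# Hand-built rows shaped like the function's output (Title and Price only).
$rows = @(
    @{ Title = "NBA Jam Xbox 360 Game EA"; Price = '$47.99' },
    @{ Title = "Rock Band 3 Keyboard Bundle Xbox 360 Game EA"; Price = '$119.99' },
    @{ Title = "Microsoft Xbox 360 4 GB Black"; Price = '$199.99' },
    @{ Title = "Microsoft Xbox 360 Kinect Bundle 4 GB Black"; Price = '$299.99' }
)
$wishList = $rows | ForEach-Object { New-Object -TypeName PSObject -Property $_ }

# Price is a string such as '$47.99', so convert it before comparing numbers.
$affordable = $wishList |
    Where-Object { [decimal]($_.Price.TrimStart('$')) -lt 200 } |
    Sort-Object -Property { [decimal]($_.Price.TrimStart('$')) } -Descending

# Export-Csv would write the same objects to a file; ConvertTo-Csv keeps
# this example free of side effects.
$csv = $affordable | Select-Object Title,Price | ConvertTo-Csv -NoTypeInformation
```

Sorting on the Price property directly would compare strings, which puts $119.99 before $47.99, so the conversion to decimal matters.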
The function even works with an RSS feed from ThinkGeek.com. I’ve added one to a wishlist of RSS feeds saved to a text file which I then piped to Get-RSSList. The following figure shows the command I used to retrieve the information that is stored in the text file.
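A command along these lines would do it. This is only a sketch: the file name wishlist.txt, its location in the temp folder, and the example.com addresses are all assumptions, and the actual call to Get-RSSList appears as a comment so the example runs without a network connection.

```powershell
# wishlist.txt and its contents are assumptions; one feed URL per line.
$file = Join-Path ([System.IO.Path]::GetTempPath()) "wishlist.txt"
Set-Content -Path $file -Value "http://example.com/feed1.xml","","http://example.com/feed2.xml"

# Get-Content emits one string per line, so each URL can bind to -Path by value.
# Filtering on ^http drops blank lines that would fail parameter validation.
# With the function loaded, the full command would be:
#   Get-Content $file | Where-Object { $_ -match "^http" } | Get-RSSList
$urlsFromFile = Get-Content -Path $file | Where-Object { $_ -match "^http" }

# Clean up the temporary file.
Remove-Item -Path $file
```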
Depending on the structure of the RSS feed, you will most likely have to adjust the function to extract the relevant information. But you should at least understand now how to figure that out. And if I’ve been a good boy this year, I hope Santa will remember the USB Fishquarium.
I think Jeffery has been a good boy for sharing this wish list script. I wonder if it will work on the Scripting Wife for me like it might for Santa? Thank you Jeffery.
Join us tomorrow as we continue our holiday guest blogger weekend with guest Sean Kearney.
I invite you to follow me on Twitter or Facebook. If you have any questions, send email to me at scripter@microsoft.com or post them on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson, Microsoft Scripting Guy