March 25th, 2010

Hey, Scripting Guy! The Story of a Large Script Project

 

Hello, everyone. This is Clint Huffman, this time writing from Seattle, Washington, in the United States. I have hijacked this blog. Don’t tell Ed. Just joking. Actually, Ed Wilson allowed me the honor of filling in for him on one of the blog posts and I hope he isn’t disappointed. I’ve been working on the Performance Analysis of Logs (PAL) tool for a few years now and have some war stories to tell. Weighing in at over 4,000 lines of code, PAL is the largest Windows PowerShell 2.0 script I have ever written. I had many challenges developing this tool and I hope my experiences can help you avoid some pitfalls when taking on a large script project.

 

Why I wrote the PAL tool

Just like Indiana Jones’s father who wrote everything he researched in his diary, I wrote everything I researched in my tool. Performance counters can tell us a lot about the behavior of a computer, but it is cumbersome to look at them all and still get a good return on the investment of your time. This was a great opportunity to introduce scripting automation to the Windows performance analysis world.

Is it a script or a tool?

In my opinion, a script is something you write for yourself, such as one that shuts down unnecessary services for flight mode or queries for the latest illegal MP3 files on your network. A “tool” is a script that you write for other people. You might think that is an easy transition, but ask any developer where they spend most of their time: handling user input. You have to expect the unexpected and provide a nice, easy-to-use interface. The PAL tool is a tool because I wrote it for other people to use.

Plan the trip

Just like planning a long trip, you don’t just get in the car and go. You have to plan, pack, and give the kids a sedative. (Scripting Editor: Yikes.) Before I wrote any code for the PAL tool, I wrote a document that defined its specifications. The document specifies the goal of the project, such as what kind of work the user has to do and what they will get out of it; market analysis, such as how it compares to other products in the market; and the infrastructure needed to do the job.

Define the requirements

The first version of PAL was written in VBScript and relied on a lot of technology that was free such as the ability to execute string values as inline code, Microsoft Log Parser, and Microsoft Office Web Components 2003 (OWC11). The free technologies made it easy to create line charts for the counters (a requirement for PAL), but both Log Parser and OWC11 are aging products that might not be around much longer. For PAL v2.0 to succeed, it needed a new and free technology to create charts.

The technology options

I considered many different technologies for PAL v2.0. Here are a few of the technologies I seriously considered:

·         VBScript: PAL v1.x was already written in VBScript, but needed a major overhaul. Like any impatient developer, when writing PAL v1.x I eventually didn’t care about “doing it the best way” and cared more about just getting it working and out the door. My advice to you is to try to stick with doing it right. It helps in the long run.

In any case, I was comfortable with VBScript, so I actually invested a lot of time writing PAL v2.0 in VBScript. I wrote VBScript classes and I was very happy with how the code was turning out, but VBScript is old technology (more than 15 years old) and Windows PowerShell is the cool new scripting technology. Furthermore, I didn’t have a replacement for Log Parser and OWC11.

·         VB.NET: Many of my co-workers kept asking me, “Why not just write the whole thing in .NET?” While it sounds easy, there are challenges with this. The problem is that PAL needs to be able to run threshold code from an external XML file. For VB.NET to do this, it would have to compile the code during runtime to run it, and it would require more research on my part to get it to work.

·         VB.NET Classes (assemblies): I tried writing the core functionality of PAL in VB.NET class DLLs (assemblies), but I found that when I needed to update the assembly (DLL) I had to close the Windows PowerShell editor, recompile the assembly, and then bring the Windows PowerShell editor back up again. This became a painful process.

·         VB.NET Chart Controls: Because I needed a new technology to create line charts and I had to keep it open source, I began writing my own .NET classes that generate line charts. This was a lot of fun, but I had to keep each chart between the values of 0 and 100 because the logic of creating a chart that autoscales was beyond me.


The breakthroughs

The development of PAL v2.0 stagnated because of the technology challenges of keeping it a free and open-source tool.

MSChart Controls for .NET Framework v3.5: The first breakthrough came with the release of the MSChart Controls. Microsoft released them as free .NET classes that we can use in .NET. VBScript is unable to use .NET classes, so this was a big push for me to give up on the VBScript development of PAL v2.0.

Meeting Bruce Payette: The second breakthrough was meeting Bruce Payette who is one of the development leads for Windows PowerShell. He led an introduction to PowerShell class for my team (Premier Field Engineering), and it was amazing. Every technology challenge I had with PAL, he was able to answer in Windows PowerShell. I was finally convinced that Windows PowerShell is the true path for PAL v2.0.

Bruce’s book, PowerShell in Action, was just right for me because it is designed for people who already know other programming languages. If you are new to programming, Ed Wilson’s book, Windows PowerShell 2.0 Best Practices, is more appropriate.

With the MSChart Controls freely available to everyone on the .NET v3.5 Framework, Bruce’s assurances that Windows PowerShell can do the job, and the convenience of doing a pure Windows PowerShell solution, the path was clear.
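The requirement that made VB.NET awkward, running threshold code stored in an external XML file, takes only a few lines in Windows PowerShell. The following is a minimal sketch and not PAL's actual code: the XML layout, the element and attribute names, and the $avg variable are all invented for illustration.

```powershell
# Hypothetical threshold file content; PAL's real XML schema is not shown in this post.
$thresholdXml = [xml]@'
<THRESHOLDS>
  <THRESHOLD LABEL="High CPU">
    <CODE>$avg -gt 80</CODE>
  </THRESHOLD>
  <THRESHOLD LABEL="Low CPU">
    <CODE>$avg -lt 5</CODE>
  </THRESHOLD>
</THRESHOLDS>
'@

# Made-up average counter value that the threshold expressions refer to.
$avg = 92.5

$broken = @()
foreach ($threshold in $thresholdXml.THRESHOLDS.THRESHOLD) {
    # Turn the text from the XML into a script block, then invoke it with '&'.
    $check = [scriptblock]::Create($threshold.CODE)
    if (& $check) { $broken += $threshold.GetAttribute('LABEL') }
}

$broken
```

Because the text in the file is executed as code, this approach only makes sense when you trust whoever wrote the XML file.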

Make it easy to read

Remember that a “tool” is a script that other people will use, so make it easy to read and follow the logic. Because Windows PowerShell must have the functions defined before you can execute them, I put all of my function calls at the bottom of the script. The flow of the script is easy to follow, so if anyone needs to debug it or needs to get an idea of what it does, they just look at the Main() function.

Below is a sample of what the PAL script Main() function looks like. Each line is a function call and I am using those function calls to group portions of the script together. Think of this as the traffic cop of the script:

InitializeGlobalVariables
ShowMainHeader
ProcessArgs
CreateSessionWorkingDirectory
Analyze
PrepareDataForReport
GenerateHtml
SaveXmlReport
OpenHtmlReport

As you can see, the Main() function is the heart of the script, and I try to use easy-to-read function names, so the person reading the script should be able to follow along at least at a high level.

The output needs to be concise and easy on the eyes. Here is a screenshot of the PAL tool executing.

Image of PAL tool executing


The problem is “choice” -eq a VBScript Legacy

Just like in “The Matrix” trilogy, the problem is “choice.” One of the challenges I had when writing Windows PowerShell scripts from a VBScript background is the expression evaluation of the If statements—meaning I kept using equal signs (=) where I should have been using -eq statements. For example, I would write the following If statement and it would *always* evaluate to True:

If ($a = 1) {# Do Something}

I come from a VBScript background, so to me this looks perfectly okay, but what is happening is the equal sign (=) is assigning the value of 1 to $a instead of comparing $a to 1. After hitting this problem about 30 or so times, I am getting better at it. The correct statement should look like this:

If ($a -eq 1) {# Do Something}

Here, -eq compares the variable $a to 1 without changing its value.
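The difference is easy to see by running both forms side by side. This is a small illustrative sketch; the variable names are made up for the demonstration.

```powershell
# Pitfall: '=' inside the condition assigns first, then the assigned value is tested.
$a = 5
$tookAssignmentBranch = $false
if ($a = 1) { $tookAssignmentBranch = $true }
# $a has been overwritten with 1, and the branch ran because 1 is treated as true.
$valueAfterAssignment = $a

# Correct: '-eq' compares and leaves the variable untouched.
$b = 5
$tookComparisonBranch = $false
if ($b -eq 1) { $tookComparisonBranch = $true }
# $b is still 5, and the branch did not run.
```

After this runs, $tookAssignmentBranch is True and $a holds 1, while $tookComparisonBranch is False and $b still holds 5.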

Debugging with Windows PowerShell ISE versus PowerGUI

Coming from VBScript, I found it challenging to develop a new application in Windows PowerShell. My crutch for a lot of my work was PowerGUI because it helped a lot with If statements like the ones above. All I had to do was press Ctrl+B and type the VBScript keyword I had in mind, such as If, and it would create the equivalent Windows PowerShell code.

While I really like PowerGUI, it had problems with debugging a 5,000-line script such as PAL. I’ve heard that the scalability problem has been fixed as of the writing of this document, but at the time, it took too long to be usable. Therefore, I used PowerGUI to write my Windows PowerShell functions, and then copied the code to the main script and debugged the main script in Windows PowerShell ISE, which ran very fast when debugging.

Windows PowerShell ISE ships with Windows PowerShell v2.0. PowerGUI is a free download from Quest Software at http://www.powergui.org.


Going primitive

Early on in the development of PAL v2.0 in Windows PowerShell, I noticed that it ran a lot slower than PAL v1.x. I discovered that the slowest part was using built-in cmdlets such as Import-CSV. I was using Import-CSV to read in the performance counter log (I use relog.exe to convert the binary log from BLG to text [CSV]) and then extracting the counter data from it. It looked something like this:

$oCsvCounterLog = Import-CSV -Path '.\counterlog.csv'

The problem is that this line would cause the memory consumption on the computer to be in the gigabytes. This is because the performance counter logs can become very large. Colleagues on my team suggested piping it like this:

$counter = Import-CSV -Path '.\counterlog.csv' | ForEach-Object {$_.'\\DemoComputer\Processor(_Total)\% Processor Time'}

This solved the memory bloat problem, but every time this line was called, it would take several seconds to process. When dealing with thousands of counters, this was unacceptably slow. Therefore, I had to bite the bullet and go primitive.

What I mean by “going primitive” is that instead of using the built-in cmdlets in Windows PowerShell (in this case the Import-CSV cmdlet), I simply read the CSV file into memory as a two-dimensional array. The code is a bit too complex to post in this document, but I hope you get the idea.

The point is that, if the built-in cmdlets are too slow for you, consider parsing the data using more primitive methods.
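To give a rough idea of what “more primitive” can look like, here is a simplified sketch. It is not the code from PAL: the sample file, its contents, and the variable names are made up, and it assumes every field is double-quoted and no field contains the sequence "," inside its value. It reads the file once with the .NET File class and stores each row as an array of fields.

```powershell
# Build a tiny sample log so the sketch is self-contained (the data is invented).
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'counterlog-sample.csv'
@(
    '"Time","\\DemoComputer\Processor(_Total)\% Processor Time"'
    '"03/25/2010 10:00:00.000","12.5"'
    '"03/25/2010 10:00:15.000","47.25"'
) | Set-Content -Path $csvPath

# Read the raw lines once and split them by hand instead of calling Import-CSV.
$lines = [System.IO.File]::ReadAllLines($csvPath)
$table = New-Object 'object[]' $lines.Length
for ($row = 0; $row -lt $lines.Length; $row++) {
    # Strip the outer quotes, then split on the quote-comma-quote separator.
    $table[$row] = $lines[$row].Trim('"') -split '","'
}

# Find the counter's column once in the header row, then index into each data row.
$counterName = '\\DemoComputer\Processor(_Total)\% Processor Time'
$column = [Array]::IndexOf($table[0], $counterName)
$values = for ($row = 1; $row -lt $table.Length; $row++) { [double]$table[$row][$column] }

$average = ($values | Measure-Object -Average).Average
Remove-Item -Path $csvPath
```

The file is parsed only once, and every later counter lookup is a plain array index rather than another pass through the pipeline.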


“I” before “e” except after “c”

Another concept I had to learn was the order of functions. In VBScript, you can place your functions anywhere in the script and it would work. In Windows PowerShell (and other languages like C), the functions must be defined before you can use them. Typically, in my VBScript I would have the Main() function at the top of the script, so the human looking at the code need only scroll down a little bit to see the flow of the script. In Windows PowerShell, I had to reverse this—meaning I put the Main() function at the bottom of the script, so that all of the functions that are called are already defined before the Main() function is called.
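A stripped-down sketch of that layout looks like this. The function names are borrowed from the PAL sample earlier in this post, but the bodies are placeholders invented for the example.

```powershell
# Helper functions are defined first. The bodies here are stand-ins.
function InitializeGlobalVariables { 'initialized' }
function Analyze { 'analyzed' }
function GenerateHtml { 'report generated' }

# Main reads like a table of contents for the script.
function Main {
    InitializeGlobalVariables
    Analyze
    GenerateHtml
}

# The only top-level call sits at the very bottom, after everything is defined.
$progress = Main
$progress
```

Strictly speaking, it is the call that has to come after the definitions. The definition of Main could sit at the top of the file as long as the single line that calls it stays at the bottom.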


Conclusion

There are a lot of great technologies out there that I could have used to create PAL v2.0, but a pure Windows PowerShell solution worked best for me. With that said, I had a lot of growing pains using Windows PowerShell and I hope that you can avoid some of the pitfalls that I ran into.

Oh, and here is one of the cool charts that is generated by the PAL tool:

Image of chart generated by PAL tool

 
