Expert Solutions: Advanced Event 8 of the 2010 Scripting Games

ScriptingGuy1

 


(Note: These solutions were written for Advanced Event 8.)

Advanced Event 8 (Windows PowerShell)

Photo of Lee Holmes

Lee Holmes is a developer on the Microsoft Windows PowerShell team, author of the Windows PowerShell Cookbook and Windows PowerShell Pocket Reference. He also runs the Precision Computing Blog.

———–

New-TestFile

A problem so nice, we solved it twice.

Once in a while, you’ll run into the need to generate files of a specific size without caring much about what’s in them. We’ve been asked by our friends in the network team to help them with exactly that, and Windows PowerShell is more than up to the challenge.

In understanding their requirements, the first thing that jumps out is a perceived difference between the size of the file that we generate and its size on disk. Windows Explorer reports these as two separate numbers when you view the file properties, and they’ve asked us to guarantee that these remain within 1 percent of each other. The difference between these two numbers comes from the disk’s cluster size, also known as its allocation unit. To reduce the overhead required to manage your disk space, operating systems don’t dole out space to files on a byte-by-byte basis; they allocate space in larger chunks called (creatively) allocation units. On most systems, this is 4,096 bytes. Even if you only create a 1-byte file, Windows sets aside 4,096 bytes. If you create a file that is 4,096 bytes, Windows still sets aside 4,096 bytes. If you create a file that is 4,097 bytes, Windows sets aside 8,192 bytes. In essence, Windows rounds each file’s allocation up to the next multiple of 4,096 bytes.

To figure out how much space is being used by a file, try this simple Windows PowerShell command, replacing <length> with the length of the file in bytes:

    [Math]::Ceiling(<length> / 4096) * 4096
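As a worked example of that formula, the sketch below wraps it in a small function and runs the three cases described above. The function name Get-SizeOnDiskEstimate and its 4,096-byte default are assumptions made for illustration; real volumes can use other cluster sizes, and the file system may store very small or compressed files differently.

```powershell
# Hypothetical helper (not part of the original post): estimates the space a
# file occupies by rounding its length up to a whole number of clusters.
function Get-SizeOnDiskEstimate
{
    param(
        [long] $Length,
        [int] $ClusterSize = 4096   # assumed allocation unit size
    )

    [long] ([Math]::Ceiling($Length / $ClusterSize) * $ClusterSize)
}

Get-SizeOnDiskEstimate -Length 1      # 4096
Get-SizeOnDiskEstimate -Length 4096   # 4096
Get-SizeOnDiskEstimate -Length 4097   # 8192
```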

After clarifying this point, the networking team decided that the difference was not important and dropped the requirement that the files be within 1 percent of the “size on disk” as reported by Windows.

The broad dynamic range of Windows PowerShell makes this a fun challenge to solve in a number of ways. At the very simplest, we have a snippet to generate random content of whatever size we want:

    $random = New-Object System.Random
    $content = for($i = 0; $i -lt 1mb; $i++) { [char] $random.Next(32, 127) }
    Set-Content file.txt (-join $content)

This works okay for small files, but we’re going to need much better performance if we want to start generating larger files. In addition, this snippet only covers the simplest of the scenario requirements: filling a file one character at a time until it reaches the specified limit. Enabling user-supplied content, while optional, is an equally important problem.

To support user-supplied content, we can just keep on using the Add-Content cmdlet to fill the file with their string as long as space remains for at least one more addition. For our last addition, we’ll have to chop the string down and only add as much as will fit.
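A minimal sketch of that straightforward approach might look like the following. The path, target length, and template text are placeholder values chosen for illustration, and the sketch assumes single-byte (ASCII) text so that string length equals byte count. It also relies on the -NoNewline switch, which only exists in Windows PowerShell 5.0 and later; without it, Add-Content appends a line break on every call and those bytes would have to be counted too.

```powershell
# Placeholder values for illustration; none of these come from the original post.
$path = Join-Path ([IO.Path]::GetTempPath()) 'add-content-demo.txt'
$length = 100
$template = 'Hello World'

# Start from an empty file.
New-Item -ItemType File -Path $path -Force | Out-Null

$remaining = $length
while($remaining -gt 0)
{
    # Chop the final piece down so that it fits exactly.
    $piece = $template
    if($remaining -lt $piece.Length)
    {
        $piece = $piece.Substring(0, $remaining)
    }

    # Every call re-validates parameters and reopens the file, which is why
    # this approach is slow for large files.
    Add-Content -Path $path -Value $piece -NoNewline -Encoding Ascii
    $remaining -= $piece.Length
}

(Get-Item $path).Length   # 100
```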

As we take this approach, however, performance is still a problem. The Add-Content cmdlet is designed for interactive use, so we’re asking it to do a lot of redundant work hundreds of thousands of times: verify its parameters, open the file, find the right place to add content, close the file, and more. We’re most certainly stepping away from this interactive scenario, so using the file APIs from the .NET Framework is an attractive approach. At the most basic level, you call the [System.IO.File]::OpenWrite() method to open a file and store the result in a variable such as $file, and then call $file.Write(…) to write to that file. Finally, you call the $file.Close() method to let the .NET Framework know you’re done with it. This is seen in the following image.

Image of one method to write to file
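Because the image is not reproduced in this text, here is a minimal sketch of the same open, write, close pattern. The path and the bytes written are placeholders and are not taken from the original image.

```powershell
# Placeholder path and content for illustration.
$path = Join-Path ([IO.Path]::GetTempPath()) 'openwrite-demo.txt'
if(Test-Path $path) { Remove-Item $path }   # OpenWrite does not truncate existing files

$bytes = [Text.Encoding]::ASCII.GetBytes('Hello World')

$file = [IO.File]::OpenWrite($path)
try
{
    # Write(buffer, offset, count) writes at the stream's current position.
    for($i = 0; $i -lt 3; $i++)
    {
        $file.Write($bytes, 0, $bytes.Length)
    }
}
finally
{
    # Release the file handle even if a write fails.
    $file.Close()
}

(Get-Item $path).Length   # 33
```

The sketch passes a full path to OpenWrite because the .NET Framework resolves relative paths against the process working directory, which can differ from the current Windows PowerShell location.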

Even with this approach, we have a large opportunity for improvement. If the user gives a small string (such as “Hello World”) for their custom text, we’ll be looping and calling $file.Write(…) an enormous number of times for a large file. In addition, hard drives work best when told to read and write large chunks of data. While most APIs (those in the .NET Framework included) try to batch your work to account for this, we can do a better job ourselves.

On a 100 MB file, batching 10-character chunks up into 16 KB chunks easily takes the performance from 35 seconds to about half a second.

How?

Well, we can create a buffer of a certain disk-friendly size (for example, 16 KB), and then fill the file by writing copies of that buffer instead. To create that buffer, we need to fill it with the user’s custom text (or random data) as long as space remains for at least one more addition. For our last addition, we’ll have to chop the string down and only add as much as…wait! Isn’t that the problem we’re already trying to solve?

Indeed. This is a problem so nice, we solved it twice. The complete script is seen here.

New-TestFile.ps1

    ##############################################################################
    ##
    ## New-TestFile
    ##
    ## by Lee Holmes (http://www.leeholmes.com/blog)
    ##
    ##############################################################################

    <#

    .SYNOPSIS

    Creates a new file of the specified length. The file can be filled with the
    specified TemplateContent, or will be filled with random data if no template
    content is specified.

    .EXAMPLE

    New-TestFile -Path c:\temp\test.txt -Length 1mb -Force

    This example creates a file called test.txt with 1 megabyte of data,
    overwriting the file if it exists.

    #>

    param(
        ## The path of the destination to create
        [Parameter(Mandatory = $True)]
        [string] $Path,

        ## The size of the file to create
        [ValidateRange(0, 1gb)]
        [int] $Length,

        ## The template content to use to fill the file
        [string] $TemplateContent,

        ## Switch to overwrite the file if it exists
        [switch] $Force
    )

    Set-StrictMode -Version Latest

    ## Check if the file exists. Throw an error if it does, but overwrite the
    ## file if they used the -Force switch.
    if(Test-Path $path)
    {
        if($Force)
        {
            Remove-Item $path -Force
        }
        else
        {
            throw "The file '$path' already exists."
        }
    }

    ## Writing to the disk is terribly slow when you do it in small chunks.
    ## Since we'll usually be blasting out large streams of data, we can be
    ## much more efficient by writing it out in larger chunks.
    $chunkLength = 16kb
    $fillBytes = New-Object byte[] $chunkLength

    ## If they gave us some template content, we'll fill the 'fillBytes' buffer
    ## with their text. It's very likely that the text they give us will not
    ## completely fill the buffer, so we have to pack it ourselves.
    if($templateContent)
    {
        ## First, we convert their input to an array of bytes. Normally, we would
        ## use [System.Text.Encoding]::Unicode to get the bytes out of the string
        ## so that our network operators can fill the file with Unicode strings.
        ## However, files filled with Unicode data don't represent typical files:
        ## for most languages, half the file ends up being just zeros.
        $templateBytes = [System.Text.Encoding]::ASCII.GetBytes($templateContent)

        ## Figure out how much of the 'fillBytes' buffer is remaining. We'll start
        ## putting our content at position zero in the buffer, and we'll write
        ## as many bytes as $templateBytes holds.
        $bytesRemaining = $chunkLength
        $currentPosition = 0
        $bytesToWrite = $templateBytes.Length

        ## Now loop filling up the fillBytes buffer
        while($bytesRemaining -gt 0)
        {
            ## If their input text is larger than the remainder of the buffer,
            ## then we remember to only write as much as will fit.
            if($bytesRemaining -lt $templateBytes.Length)
            {
                $bytesToWrite = $bytesRemaining
            }

            ## Now copy bytes from the templateBytes array into the current
            ## position in the fillBytes array.
            [Array]::Copy($templateBytes, 0, $fillBytes, $currentPosition, $bytesToWrite)

            ## Update our position counter (so that we don't overwrite what we've
            ## already written), and update how much space is left.
            $currentPosition += $bytesToWrite
            $bytesRemaining -= $bytesToWrite
        }
    }
    else
    {
        ## They didn't specify any text. We'll just fill the buffer with
        ## random printable characters.
        $random = New-Object System.Random
        for($index = 0; $index -lt $chunkLength; $index++)
        {
            $fillBytes[$index] = $random.Next(32, 127)
        }
    }

    ## Now actually create the file. We put this in a try / finally block so that
    ## we have the chance to clean up after errors, or if the user hits ^C
    $file = $null
    try
    {
        ## Create the file, and use the .NET API to open the file. There are plenty
        ## of PowerShell-only flavours to this approach, but they build on cmdlets
        ## optimized for interactive use. Because we are generating so much data,
        ## these cmdlets end up spending an enormous amount of time on redundant tasks:
        ## error checking the parameters, opening the file, closing the file, etc.
        $path = (New-Item -Type File -Path $path).FullName
        $file = [IO.File]::OpenWrite($path)

        ## Now start filling the file, keeping track of how many bytes we have
        ## remaining.
        $bytesRemaining = $length
        while($bytesRemaining -gt 0)
        {
            ## If we don't have enough space left to fit the entire $fillBytes
            ## buffer, we'll create a new array that has only as much as will fit,
            ## and replace $fillBytes with that array.
            if($bytesRemaining -lt $fillBytes.Length)
            {
                $bytesToWrite = New-Object byte[] $bytesRemaining
                [Array]::Copy($fillBytes, $bytesToWrite, $bytesRemaining)
                $fillBytes = $bytesToWrite
            }

            ## Finally, write the bytes to the file, and recalculate how much
            ## we still need to fill.
            $file.Write( $fillBytes, 0, $fillBytes.Length )
            $bytesRemaining -= $fillBytes.Length
        }
    }
    finally
    {
        ## Close the file since we're done with it. The check covers the case
        ## where the file was never opened (for example, if New-Item failed).
        if($file) { $file.Close() }
    }

    ## New-* cmdlets generally emit the thing they just created. This also lets us
    ## visually verify that it was the correct length.
    Get-Item $path
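Both loops in the script rest on one idea: repeat a source array into a fixed-size destination, and truncate the final copy so that it fits. As a small illustration of that idea in isolation, the sketch below uses a hypothetical helper named New-RepeatedBytes with a 12-byte buffer instead of 16 KB. It is not part of New-TestFile.ps1.

```powershell
# Hypothetical helper, not part of New-TestFile.ps1: returns $Length bytes made
# of repeated copies of $Source, truncating the last copy if necessary.
function New-RepeatedBytes
{
    param(
        [byte[]] $Source,
        [int] $Length
    )

    if($Source.Length -eq 0) { throw 'Source must contain at least one byte.' }

    $result = New-Object byte[] $Length
    $position = 0

    while($position -lt $Length)
    {
        # Copy a whole repetition, or only what still fits at the end.
        $count = [Math]::Min($Source.Length, $Length - $position)
        [Array]::Copy($Source, 0, $result, $position, $count)
        $position += $count
    }

    # The leading comma keeps PowerShell from unrolling the array on output.
    ,$result
}

$template = [Text.Encoding]::ASCII.GetBytes('Hello')
$packed = New-RepeatedBytes -Source $template -Length 12
[Text.Encoding]::ASCII.GetString($packed)   # HelloHelloHe
```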

Advanced Event 8 (VBScript)

Photo of Jakob Gottlieb Svendsen

Jakob Gottlieb Svendsen

  • TechNet Influent Denmark, TechNet Moderator
  • Main areas are VBScript, Windows PowerShell, C#.NET, and VB.NET.
  • Working as an IT consultant/Microsoft Certified Trainer at Coretech A/S, Copenhagen, Denmark, www.coretech.dk
  • Blog about scripting and other stuff at http://blog.coretech.dk/author/jgs

How and Why

I always start by going through the job in my head.

In this script we would need something to write to the text files. At the same time we would need something to keep track of the size of the file, and stop when it is large enough.

I thought about different ways to do it. I could calculate and hard-code how many bytes one character in a text file is, and then write as many as needed.

But I decided to go for the “easy” and more direct way. I will write one line at a time, and then check the size afterward, to see if the file is big enough.

I am not using a large string, because that would make it less precise. The problem with using a small string is that creating a 100 MB file takes a lot of writes, so it will take a while. I assumed that the network department would rather have a precise file than a fast process. This means my computer could spend about 20 minutes creating a 100 MB file, but I guess the networking guys can start it and do what they do most of the time (Facebook, Twitter, etc.).

I decided to make the script require a command-line argument, making it easy for the networking guys to change the size when they need to.

The Script Sections

Header section

I always write the header section, containing information about the version history, usage, error codes, and other important information.

This makes it easier for the customers to understand how to use it, and for myself too, when I need to fix it years after coding it!

Declare section

Most of the time I use Option Explicit. This is a good idea, and I have discovered that many enterprise companies like to have control of the script content, and therefore require me to use it. This requires me to explicitly declare all variables.

I try to use Hungarian notation (or something similar) on all my variable names to make it easier for myself.

Main routines section

Because this script is very small, most of the code is in the main section.

First, I check that an argument is present and that only one argument has been given. This is important because the script would fail if it were not supplied. I read the argument and pass it through the UCase function to make sure that everything is uppercase. If no arguments or more than one argument is supplied, I quit the script with error code 1. The error code makes it easier to use the script in automated solutions and batch files.

Next up is the Select Case. Here I check the argument to see if any of the predefined sizes are specified. I always use Select Case when I need to do a simple check on a variable’s content, because I think it gives the best overview. If one of the standard sizes is specified, the nSize variable is set to the correct size in bytes. Otherwise, the argument is treated as a custom size in bytes.

I make sure the argument is an integer by enabling On Error Resume Next and trying to convert it to an integer. If an error has occurred, I quit the script (error code 2); if no error has occurred, I re-enable halt on error by using the On Error Goto 0 statement and I put a “B” at the end of the size, to make it ready for the filename.

Now I assemble the filename, using the strSelectedSize variable that contains the size of the file, prefix, and suffix needed.
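The argument handling and filename assembly described above can be sketched as follows. The sketch is written in Windows PowerShell only to match the other added examples in this post; the author's actual VBScript is in the full listing. The function name ConvertTo-TestFileSpec is hypothetical, and the sketch throws an error where the VBScript quits with error code 2.

```powershell
# Hypothetical PowerShell rendering of the logic described above; this is not
# the author's code.
function ConvertTo-TestFileSpec
{
    param([string] $Argument)

    $selected = $Argument.ToUpper()
    [long] $size = 0

    switch($selected)
    {
        '100K' { $size = 100 * 1024 }
        '1M'   { $size = 1 * 1024 * 1024 }
        '10M'  { $size = 10 * 1024 * 1024 }
        '100M' { $size = 100 * 1024 * 1024 }
        default
        {
            # Not a predefined size, so it must parse as a whole number of bytes.
            if(-not [long]::TryParse($selected, [ref] $size))
            {
                throw "Argument '$Argument' is not 100K, 1M, 10M, 100M or an integer."
            }

            # Add a B for bytes, since the value is used in the filename.
            $selected += 'B'
        }
    }

    New-Object PSObject -Property @{
        Size = $size
        FileName = "TestFile${selected}.txt"
    }
}

ConvertTo-TestFileSpec -Argument '1m'       # Size 1048576, FileName TestFile1M.txt
ConvertTo-TestFileSpec -Argument '123465'   # Size 123465, FileName TestFile123465B.txt
```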

Now I am sure that the size and filename are set correctly. Therefore, I create the objects for FSO, TextFile, and File. I could have created the objFSO in the declare section, like the other objects, but there was no reason to do so before the argument had been confirmed.

I use objFSO.CreateTextFile to create the file, supplying the filename and True, allowing the script to overwrite any existing file. I also create a file object with objFSO.GetFile. This is used to check the size of the file.

Now it is just a matter of filling the file, one line at a time, using a Do While loop and checking the size every time. Once the file has grown past the requested size, the loop ends.
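To get a feel for how many writes this loop needs and how close the result lands, the following Windows PowerShell sketch does the arithmetic. It rests on two assumptions that the original does not state: each line costs one byte per character plus a carriage return and line feed, and the size check sees every write immediately. The function name Get-FillEstimate is hypothetical.

```powershell
# Rebuild the test line from the listing: eight copies of "Test Data" joined by " - ".
$line = (@('Test Data') * 8) -join ' - '

# Assumption: one byte per character, plus the CR and LF added by WriteLine.
$bytesPerLine = $line.Length + 2

function Get-FillEstimate
{
    param([long] $TargetSize, [int] $BytesPerLine)

    # The loop writes while size <= target, so it stops at the first multiple
    # of the line length that is strictly greater than the target.
    $lines = [long] [Math]::Floor($TargetSize / $BytesPerLine) + 1
    $finalSize = $lines * $BytesPerLine

    New-Object PSObject -Property @{
        Lines = $lines
        FinalSize = $finalSize
        Overshoot = $finalSize - $TargetSize
    }
}

Get-FillEstimate -TargetSize 100kb -BytesPerLine $bytesPerLine
Get-FillEstimate -TargetSize 100mb -BytesPerLine $bytesPerLine
```

Under these assumptions, a 100 KB request takes 1,078 writes and lands 10 bytes over the target, while a 100 MB request takes 1,103,765 writes and lands 75 bytes over.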

House cleaning

I always do house cleaning (in my scripts!).

It might not be necessary in this script, but it sometimes is very important to clean up the connections/objects correctly. Therefore, I always run the Close method, and set all objects to Nothing.

When the script runs, we see the file properties detailed in the following image.

Image of files properties shown when script is run

The complete script is seen here.

AdvancedEvent8.vbs

    ' //***************************************************************************
    ' // ***** Script Header *****
    ' //
    ' // Solution: 2010 Scripting Games Advanced Event 8
    ' // File: AdvancedEvent8.vbs
    ' // Author: Jakob Gottlieb Svendsen, Coretech A/S. jgs@coretech.dk
    ' // Purpose: Create dummy text files in specified sizes
    ' //
    ' // Usage: cscript.exe AdvancedEvent8.vbs 100K
    ' //        cscript.exe AdvancedEvent8.vbs 1M
    ' //        cscript.exe AdvancedEvent8.vbs 10M
    ' //        cscript.exe AdvancedEvent8.vbs 100M
    ' //        custom size in bytes:
    ' //        cscript.exe AdvancedEvent8.vbs 123465
    ' //
    ' // CORETECH A/S History:
    ' // 0.0.1 JGS 26/03/2010 Created initial version.
    ' //
    ' // Customer History:
    ' //
    ' // ErrorCodes:
    ' // 1: Wrong number of arguments supplied
    ' // 2: Argument is not a standard size (100K etc) or an integer.
    ' // ***** End Header *****
    ' //***************************************************************************

    '//----------------------------------------------------------------------------
    '//
    '// Global constant and variable declarations
    '//
    '//----------------------------------------------------------------------------
    Option Explicit 'always using option explicit

    Dim nSize, strTargetFile, strSelectedSize
    Dim objFSO, objTextFile, objFile

    '//----------------------------------------------------------------------------
    '// Main routines
    '//----------------------------------------------------------------------------

    'Count arguments, quit with errorcode if wrong
    If WScript.Arguments.Count = 1 Then
        strSelectedSize = UCase(WScript.Arguments.Item(0)) ' Read argument to variable, using UCase to make sure it is upper case
    Else ' Wrong number of arguments in command line, quit with errorcode 1
        WScript.Echo "Please provide one argument with the preferred Size, valid arguments are: 100K, 1M, 10M, 100M or an integer number (120345 etc.)"
        WScript.Quit(1)
    End If

    'Select the correct size, otherwise use custom size in bytes.
    Select Case strSelectedSize
        Case "100K"
            nSize = 100 * 1024
        Case "1M"
            nSize = 1 * 1024 * 1024
        Case "10M"
            nSize = 10 * 1024 * 1024
        Case "100M"
            nSize = 100 * 1024 * 1024
        Case Else
            On Error Resume Next 'Enable resume on error, since we need to test the conversion of argument to integer.
            'CLng is used because CInt overflows for values above 32767.
            nSize = CLng(WScript.Arguments.Item(0))
            If Err.Number <> 0 Then ' if wrong format, write message and quit with errorcode 2
                WScript.Echo "Argument wrong format, valid formats are: 100K, 1M, 10M, 100M or an integer number (120345 etc.)"
                WScript.Quit(2)
            End If
            strSelectedSize = strSelectedSize & "B" 'add a B for bytes, since it is used in the name of the text file
            On Error Goto 0 ' reenable break on error
    End Select

    strTargetFile = "TestFile" & strSelectedSize & ".txt" 'Setup filename from argument.

    'Create FSO, Textfile and File object to keep track of the size of the file.
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objTextFile = objFSO.CreateTextFile(strTargetFile, True) 'using CreateTextFile with overwrite enabled
    Set objFile = objFSO.GetFile(strTargetFile)

    'Write lines with test data until the specified size is reached.
    Do While objFile.Size <= nSize
        objTextFile.WriteLine "Test Data - Test Data - Test Data - Test Data - Test Data - Test Data - Test Data - Test Data"
    Loop

    '//----------------------------------------------------------------------------
    '// House Cleaning
    '//----------------------------------------------------------------------------

    'Close the file
    objTextFile.Close

    'Remove objects from memory
    Set objTextFile = Nothing
    Set objFile = Nothing
    Set objFSO = Nothing

    '//----------------------------------------------------------------------------
    '// End Script
    '//----------------------------------------------------------------------------

 

If you want to know exactly what we will be looking at tomorrow, follow us on Twitter or Facebook. If you have any questions, send e-mail to us at scripter@microsoft.com or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson and Craig Liebendorfer, Scripting Guys