Expert Solutions: Advanced Event 8 of the 2010 Scripting Games
(Note: These solutions were written for Advanced Event 8.)
Advanced Event 8 (Windows PowerShell)
Lee Holmes is a developer on the Microsoft Windows PowerShell team, author of the Windows PowerShell Cookbook and Windows PowerShell Pocket Reference. He also runs the Precision Computing Blog.
———–
New-TestFile
A problem so nice, we solved it twice.
Once in a while, you’ll run into the need to generate files of a specific size without caring much about what’s in them. We’ve been asked by our friends in the network team to help them with exactly that, and Windows PowerShell is more than up to the challenge.
In understanding their requirements, the first thing that jumps out is a perceived difference between the size of the file that we generate and its size on disk. Windows Explorer reports these as two separate numbers when you view the file properties, and they’ve asked us to guarantee that these remain within 1 percent of each other. The difference in these two numbers comes from the disk’s cluster size, also known as its allocation unit. To reduce the overhead required to manage your disk space, operating systems don’t dole out space to files on a byte-by-byte basis; they allocate space in larger chunks called (creatively) allocation units. For most systems, this is 4,096 bytes. Even if you only create a 1-byte file, Windows sets aside 4,096 bytes. If you create a file that is 4,096 bytes, Windows still sets aside 4,096 bytes. If you create a file that is 4,097 bytes, Windows sets aside 8,192 bytes. In essence, Windows rounds the file size up to the next multiple of 4,096 bytes.
To figure out how much space is being used by a file, try this simple Windows PowerShell command:
[Math]::Ceiling(<length> / 4096) * 4096
After clarifying this point, the networking team decided that the difference was not important and dropped the requirement that the files be within 1 percent of the “size on disk” as reported by Windows.
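The rounding rule described above can be checked with a short function. This is only a sketch: the name Get-SizeOnDisk and the 4,096-byte default are assumptions made for illustration, and it ignores cases such as compressed or sparse files, where the size on disk does not follow this simple rule.

```powershell
# Hypothetical helper (not part of the original article): rounds a file length
# up to the next multiple of the cluster size. 4096 is an assumed default.
function Get-SizeOnDisk {
    param(
        [long] $Length,
        [long] $ClusterSize = 4096
    )

    # Cast to double so the division is never truncated before Ceiling runs.
    [long] ([Math]::Ceiling([double] $Length / $ClusterSize) * $ClusterSize)
}

Get-SizeOnDisk 1       # 4096
Get-SizeOnDisk 4096    # 4096
Get-SizeOnDisk 4097    # 8192
```

The three calls mirror the 1-byte, 4,096-byte, and 4,097-byte cases from the text.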
The broad dynamic range of Windows PowerShell makes this a fun challenge to solve in a number of ways. At the very simplest, we have a snippet to generate random content of whatever size we want:
$random = New-Object System.Random
$content = for($i = 0; $i -lt 1mb; $i++) {
    [char] $random.Next(32, 127)
}
Set-Content file.txt (-join $content)
This works okay for small files, but we’re going to need much better performance if we want to start generating larger files. In addition, this snippet only covers the simplest of the scenario requirements: filling a file one character at a time until it reaches the specified limit. Enabling user-supplied content, while optional, is an equally important problem.
To support user-supplied content, we can just keep on using the Add-Content cmdlet to fill the file with their string as long as space remains for at least one more addition. For our last addition, we’ll have to chop the string down and only add as much as will fit.
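A minimal sketch of that Add-Content loop might look like the following. The file name, the template text, and the 100-byte target are placeholders chosen for illustration. The -NoNewline switch is assumed to be available, which requires Windows PowerShell 5.0 or later; without it, Add-Content appends a line break after every addition and the byte count drifts.

```powershell
# Illustrative values; none of these come from the original article.
$path = Join-Path ([IO.Path]::GetTempPath()) 'add-content-demo.txt'
$template = 'Hello World'
$targetLength = 100

if (Test-Path $path) { Remove-Item $path }

# Keep adding the whole string while there is room for at least one more copy.
$remaining = $targetLength
while ($remaining -ge $template.Length) {
    Add-Content -Path $path -Value $template -NoNewline -Encoding ASCII
    $remaining -= $template.Length
}

# For the last addition, chop the string down to whatever still fits.
if ($remaining -gt 0) {
    Add-Content -Path $path -Value $template.Substring(0, $remaining) -NoNewline -Encoding ASCII
}

(Get-Item $path).Length    # 100
```

With an 11-character template and a 100-byte target, the loop adds nine full copies (99 bytes) and the final step adds a single "H". ASCII encoding keeps one character equal to one byte, which is what makes the arithmetic work.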
As we take this approach, however, the performance is still a problem. The Add-Content cmdlet is designed for interactive use, so we’re asking it to do a lot of redundant work hundreds of thousands of times: verify its parameters, open the file, find the right place to add content to, close the file, and more. We’re most certainly stepping away from this interactive scenario, so using the file APIs from the .NET Framework is an attractive approach. At the most basic level, you use the $file = [System.IO.File]::OpenWrite() method to open a file, and then $file.Write(…) to write to that file. Finally, you call the $file.Close() method to let the .NET Framework know you’re done with it. This is seen in the following image.
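In text form, a minimal sketch of those three calls could look like this. The file name and the repeat count are placeholders for illustration. A full path is used because the .NET Framework resolves relative paths against the process working directory, which is not necessarily the current Windows PowerShell location.

```powershell
# Illustrative values; the path and the loop count are assumptions.
$path = Join-Path ([IO.Path]::GetTempPath()) 'openwrite-demo.bin'

# OpenWrite does not truncate an existing file, so start from a clean slate.
if (Test-Path $path) { Remove-Item $path }

$bytes = [System.Text.Encoding]::ASCII.GetBytes('Hello World')

$file = [System.IO.File]::OpenWrite($path)
try {
    # The file is opened once; each iteration only writes.
    for ($i = 0; $i -lt 1000; $i++) {
        $file.Write($bytes, 0, $bytes.Length)
    }
}
finally {
    # Close the stream even if a write fails or the user presses Ctrl+C.
    $file.Close()
}

(Get-Item $path).Length    # 11000
```

The try / finally wrapper is there so the file handle is released no matter how the loop ends.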
Even with this approach, we have a large opportunity for improvement. If the user gives a small string (such as “Hello World”) for their custom text, we’ll be looping and calling $file.Write(…) an enormous number of times for a large file. In addition, hard drives work best when told to read and write large chunks of data. While most APIs (those in the .NET Framework included) try to batch your work to account for this, we can do a better job ourselves.
On a 100 MB file, batching 10-character chunks up into 16 KB chunks easily takes the performance from 35 seconds to about half a second.
How?
Well, we can create a buffer of a certain disk-friendly size (for example, 16 KB), and then fill the file by writing copies of that buffer instead. To create that buffer, we need to fill it with the user’s custom text (or random data) as long as space remains for at least one more addition. For our last addition, we’ll have to chop the string down and only add as much as…wait! Isn’t that the problem we’re already trying to solve?
Indeed. This is a problem so nice, we solved it twice. The complete script is seen here.
New-TestFile.ps1
##############################################################################
##
## New-TestFile
##
## by Lee Holmes (http://www.leeholmes.com/blog)
##
##############################################################################

<#

.SYNOPSIS

Creates a new file of the specified length. The file can be filled with the
specified TemplateContent, or will be filled with random data if no template
content is specified.

.EXAMPLE

New-TestFile -Path c:\temp\test.txt -Length 1mb -Force
This example creates a file called test.txt with 1 megabyte of data,
overwriting the file if it exists.

#>

param(
    ## The path of the destination to create
    [Parameter(Mandatory = $True)]
    [string]
    $Path,

    ## The size of the file to create
    [ValidateRange(0, 1gb)]
    [int]
    $Length,

    ## The template content to use to fill the file
    [string]
    $TemplateContent,

    ## Switch to overwrite the file if it exists
    [switch]
    $Force
)

Set-StrictMode -Version Latest

## Check if the file exists. Throw an error if it does, but overwrite the
## file if they used the -Force switch.
if(Test-Path $path)
{
    if($Force)
    {
        Remove-Item $path -Force
    }
    else
    {
        throw "The file '$path' already exists."
    }
}

## Writing to the disk is terribly slow when you do it in small chunks.
## Since we'll usually be blasting out large streams of data, we can be
## much more efficient by writing it out in larger chunks.
$chunkLength = 16kb
$fillBytes = New-Object byte[] $chunkLength

## If they gave us some template content, we'll fill the 'fillBytes' buffer
## with their text. It's very likely that the text they give us will not
## completely fill the buffer, so we have to pack it ourselves.
if($templateContent)
{
    ## First, we convert their input to an array of bytes. Normally, we would
    ## use [System.Text.Encoding]::Unicode to get the bytes out of the string
    ## so that our network operators can fill the file with Unicode strings.
    ## However, files filled with Unicode data don't represent typical files:
    ## for most languages, half the file ends up being just zeros.
    $templateBytes = [System.Text.Encoding]::ASCII.GetBytes($templateContent)

    ## Figure out how much of the 'fillBytes' buffer is remaining. We'll start
    ## putting our content at position zero in the buffer, and we'll write
    ## as many bytes as $templateBytes holds.
    $bytesRemaining = $chunkLength
    $currentPosition = 0
    $bytesToWrite = $templateBytes.Length

    ## Now loop filling up the fillBytes buffer
    while($bytesRemaining -gt 0)
    {
        ## If their input text is larger than the remainder of the buffer,
        ## then we remember to only write as much as will fit.
        if($bytesRemaining -lt $templateBytes.Length)
        {
            $bytesToWrite = $bytesRemaining
        }

        ## Now copy bytes from the templateBytes array into the current
        ## position in the fillBytes array.
        [Array]::Copy($templateBytes, 0, $fillBytes,
            $currentPosition, $bytesToWrite)

        ## Update our position counter (so that we don't overwrite what we've
        ## already written), and update how much space is left.
        $currentPosition += $bytesToWrite
        $bytesRemaining -= $bytesToWrite
    }
}
else
{
    ## They didn't specify any text. We'll just fill the buffer with completely
    ## random data.
    $random = New-Object System.Random
    for($index = 0; $index -lt $chunkLength; $index++)
    {
        $fillBytes[$index] = $random.Next(32, 127)
    }
}

## Now actually create the file. We put this in a try / finally block so that
## we have the chance to clean up after errors, or if the user hits ^C
$file = $null
try
{
    ## Create the file, and use the .NET API to open the file. There are plenty
    ## of PowerShell-only flavours to this approach, but they build on cmdlets
    ## optimized for interactive use. Because we are generating so much data,
    ## these cmdlets end up spending an enormous amount of time on redundant tasks:
    ## error checking the parameters, opening the file, closing the file, etc.
    $path = (New-Item -Type File -Path $path).FullName
    $file = [IO.File]::OpenWrite($path)

    ## Now start filling the file, keeping track of how many bytes we have
    ## remaining.
    $bytesRemaining = $length
    while($bytesRemaining -gt 0)
    {
        ## If we don't have enough space left to fit the entire $fillBytes
        ## buffer, we'll create a new array that has only as much as will fit,
        ## and replace $fillBytes with that array.
        if($bytesRemaining -lt $fillBytes.Length)
        {
            $bytesToWrite = New-Object byte[] $bytesRemaining
            [Array]::Copy($fillBytes, $bytesToWrite, $bytesRemaining)
            $fillBytes = $bytesToWrite
        }

        ## Finally, write the bytes to the file, and recalculate how much
        ## we still need to fill.
        $file.Write( $fillBytes, 0, $fillBytes.Length )
        $bytesRemaining -= $fillBytes.Length
    }
}
finally
{
    ## Close the file since we're done with it. $file is still $null if the
    ## file could not be created or opened, so check before closing.
    if($file)
    {
        $file.Close()
    }
}

## New-* cmdlets generally emit the thing they just created. This also lets us
## visually verify that it was the correct length.
Get-Item $path
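The "solved it twice" idea can also be sketched as a single helper that is called once per level. The function name New-RepeatedBytes is hypothetical and does not appear in the script above, and the 40,000-byte payload is an arbitrary small size chosen so the example stays in memory.

```powershell
# Hypothetical helper (not in the original script): returns a byte array of
# exactly $Length bytes made of repeated copies of $Pattern, with the final
# copy cut short if it does not fit.
function New-RepeatedBytes {
    param(
        [byte[]] $Pattern,
        [int] $Length
    )

    if ($Pattern.Length -eq 0) { throw 'Pattern must not be empty.' }

    $result = New-Object byte[] $Length
    $position = 0
    while ($position -lt $Length) {
        # Copy a whole pattern, or only as much as still fits.
        $count = [Math]::Min($Pattern.Length, $Length - $position)
        [Array]::Copy($Pattern, 0, $result, $position, $count)
        $position += $count
    }

    # The unary comma keeps PowerShell from unrolling the array on output.
    , $result
}

$pattern = [System.Text.Encoding]::ASCII.GetBytes('Hello World')

# First use: pack the short pattern into a 16 KB chunk.
$chunk = New-RepeatedBytes -Pattern $pattern -Length 16kb

# Second use: pack the chunk into a small, illustrative 40,000-byte payload.
$payload = New-RepeatedBytes -Pattern $chunk -Length 40000

$chunk.Length      # 16384
$payload.Length    # 40000
```

One consequence of packing in two levels: because 16,384 is not a multiple of 11, each chunk ends with a partial "Hello" and the text restarts from the beginning at every chunk boundary. For test files whose content does not matter, that seam is harmless.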
Advanced Event 8 (VBScript)
Jakob Gottlieb Svendsen
- TechNet Influent Denmark, TechNet Moderator
- Main areas are VBScript, Windows PowerShell, C#.NET, and VB.NET.
- Working as an IT consultant/Microsoft Certified Trainer at Coretech A/S, Copenhagen, Denmark, www.coretech.dk
- Blog about scripting and other stuff at http://blog.coretech.dk/author/jgs
How and Why
I always start by going through the job in my head.
In this script we would need something to write to the text files. At the same time we would need something to keep track of the size of the file, and stop when it is large enough.
I thought about different ways to do it. I could calculate and hard-code how many bytes one character in a text file is, and then write as many as needed.
But I decided to go for the “easy” and more direct way. I will write one line at a time, and then check the size afterward, to see if the file is big enough.
I am not using a large string, because that would make it less precise. The problem with using a small string is that creating a 100 MB file takes a great many writes, so it will take a while. I assumed that the network department would rather have a precise file than a fast process. This means my computer could spend about 20 minutes creating a 100 MB file, but I guess the networking guys can start it and do what they do most of the time (Facebook, Twitter, etc.).
I decided to make the script require a command-line argument, making it easy for the networking guys to change the size when they need to.
The Script Sections
Header section
I always write the header section, containing information about the version history, usage, error codes, and other important information.
This makes it easier for the customers to understand how to use it, and for myself too, when I need to fix it years after coding it!
Declare section
Most of the time I use Option Explicit. This is a good idea, and I have discovered that many enterprise companies like to have control of the script content, and therefore require me to use it. This requires me to explicitly declare all variables.
I try to use Hungarian notation (or something similar) on all my variable names to make it easier for myself.
Main routines section
Because this script is very small, most of the code is in the main section.
First, I check that exactly one argument has been given. This is important because the script would fail without it. I read the argument and pass it through the UCase function to make sure that everything is uppercase. If no arguments or more than one argument is supplied, I quit the script with error code 1. The error code makes it easier to use the script in automated solutions and batch files.
Next up is the Select Case. Here I check the argument to see if any of the predefined sizes are specified. I always use Select Case when I need to do a simple check on a variable content, because I think it has the best overview. If one of the standard sizes is specified, the nSize variable is set to the correct size in bytes. Otherwise the size is set to the specified bytes.
I make sure the argument is an integer by enabling On Error Resume Next and trying to convert it to an integer. If an error occurs, I quit the script (error code 2); if no error occurs, I append a “B” to the size, to make it ready for the file name, and I re-enable halt on error by using the On Error Goto 0 statement.
Now I assemble the filename, using the strSelectedSize variable that contains the size of the file, prefix, and suffix needed.
Now I am sure that the size and filename are set correctly. Therefore, I create the objects for FSO, TextFile, and File. I could have created the objFSO in the declare section, like the other objects, but there was no reason to do so before the argument had been confirmed.
I use objFSO.CreateTextFile to create the file, supplying the filename and True, allowing the script to overwrite any existing file. I also create a file object with objFSO.GetFile. This is used to check the size of the file.
Now it is just a matter of filling the file, one line at a time, using a while loop, and checking the size every time. When it hits the correct size, the loop ends.
House cleaning
I always do house cleaning (in my scripts!).
It might not be necessary in this script, but it sometimes is very important to clean up the connections/objects correctly. Therefore, I always run the Close method, and set all objects to Nothing.
When the script runs, we see the file properties detailed in the following image.
The complete script is seen here.
AdvancedEvent8.vbs
' //***************************************************************************
' // ***** Script Header *****
' //
' // Solution: 2010 Scripting Games Advanced Event 8
' // File: AdvancedEvent8.vbs
' // Author: Jakob Gottlieb Svendsen, Coretech A/S. jgs@coretech.dk
' // Purpose: Create dummy text files in specified sizes
' // Loops every 60 seconds
' //
' // Usage: cscript.exe AdvancedEvent8.vbs 100K
' //        cscript.exe AdvancedEvent8.vbs 1M
' //        cscript.exe AdvancedEvent8.vbs 10M
' //        cscript.exe AdvancedEvent8.vbs 100M
' //        custom size in bytes:
' //        cscript.exe AdvancedEvent8.vbs 123465
' //
' // CORETECH A/S History:
' // 0.0.1 JGS 26/03/2010 Created initial version.
' //
' // Customer History:
' //
' // ErrorCodes:
' // 1: Wrong number of arguments supplied
' // 2: Argument is not a standard size (100K etc) or an integer.
' // ***** End Header *****
' //***************************************************************************

'//----------------------------------------------------------------------------
'//
'// Global constant and variable declarations
'//
'//----------------------------------------------------------------------------
Option Explicit 'always using option explicit

Dim nSize, strTargetFile, strSelectedSize
Dim objFSO, objTextFile, objFile

'//----------------------------------------------------------------------------
'// Main routines
'//----------------------------------------------------------------------------

'Count arguments, quit with errorcode if wrong
If WScript.Arguments.Count = 1 Then
    strSelectedSize = UCase(WScript.Arguments.Item(0)) ' Read argument to variable, using UCase to make sure it is upper case
Else ' Wrong number of arguments in command line, quit with errorcode 1
    WScript.Echo "Please provide one argument with the preferred Size, valid arguments are: 100K, 1M, 10M, 100M or an integer number (120345 etc.)"
    WScript.Quit(1)
End If

'Select the correct size, otherwise use custom size in bytes.
Select Case strSelectedSize
    Case "100K"
        nSize = 100 * 1024
    Case "1M"
        nSize = 1 * 1024 * 1024
    Case "10M"
        nSize = 10 * 1024 * 1024
    Case "100M"
        nSize = 100 * 1024 * 1024
    Case Else
        On Error Resume Next 'Enable resume on error, since we need to test the conversion of argument to integer.
        'CLng is used instead of CInt, because CInt overflows for values above 32767.
        nSize = CLng(WScript.Arguments.Item(0))
        If Err.Number <> 0 Then ' if wrong format, write message and quit with errorcode 2
            WScript.Echo "Argument has the wrong format. Valid formats are: 100K, 1M, 10M, 100M or an integer number (120345 etc.)"
            WScript.Quit(2)
        End If
        strSelectedSize = strSelectedSize & "B" 'add a B for bytes, since it is used in the name of the text file
        On Error Goto 0 ' reenable break on error
End Select

strTargetFile = "TestFile" & strSelectedSize & ".txt" 'Setup filename from argument.

'Create FSO, Textfile and File object to keep track of the size of the file.
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.CreateTextFile(strTargetFile, True) 'using CreateTextFile with overwrite enabled
Set objFile = objFSO.GetFile(strTargetFile)

'Write lines with test data until the specified size is reached.
Do While objFile.Size <= nSize
    objTextFile.WriteLine "Test Data - Test Data - Test Data - Test Data - Test Data - Test Data - Test Data - Test Data"
Loop

'//----------------------------------------------------------------------------
'// House Cleaning
'//----------------------------------------------------------------------------

'Close the file
objTextFile.Close

'Remove objects from memory
Set objTextFile = Nothing
Set objFile = Nothing
Set objFSO = Nothing

'//----------------------------------------------------------------------------
'// End Script
'//----------------------------------------------------------------------------
If you want to know exactly what we will be looking at tomorrow, follow us on Twitter or Facebook. If you have any questions, send e-mail to us at scripter@microsoft.com or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson and Craig Liebendorfer, Scripting Guys