Hey, Scripting Guy! How can I import a fixed-width file into Microsoft Excel?
— JP
Hey, JP. We have a confession to make here, something we’re terribly ashamed of: for a moment – just for a moment, mind you – we doubted Microsoft Excel. Yes, we know: how could we do such a thing? All we can say is that we made a mistake, and we’re sorry. For some reason, though, when we first read this question we assumed that this would be a really hard thing to do; in fact, we were already thinking about ActiveX Data Objects (ADO) and other fancy scripting techniques that would make up for what we assumed would be Excel’s difficulty in handling fixed-width text files.
Excel: how can we ever make this up to you?
Personal Note to Excel. Actually, we’re not sure if we should recommend that people uninstall all the other software and use only you from now on; that might get Microsoft Word users and the 11 people who still use WordPerfect a little upset. We’ll have to think that over and get back to you. |
OK, it’s time to put the mistakes of the past behind us and get back to work. We’re assuming, JP, that you have a simple little fixed-width text file similar to this:
First Last Department Gail Erickson Human Resources Ken Myer Accounting Pilar Ackerman Research Jonathan Haas Accounting Syed Abbas Accounting
Interestingly, enough, we knew that Excel had a method named OpenText that could be used to open a text file as a spreadsheet. We also knew, from previous experience, that this method worked really well with comma- or tab-separated values files. What we weren’t sure of was how to make it work with a fixed-width file such as yours. We looked at the Excel documentation, tried following the instructions, and failed miserably.
Usually that’s about the time we give up. Because we thought this was going to be really challenging, however, we decided to persevere; after all, we assumed that we might end up with a Nobel Prize or something if we could somehow answer a question that required this level of expertise and sophistication.
Having failed at writing a script that used all the requisite parameters we decided to approach the problem methodically: we’d start with a script that used only a single parameter – the path to the text file – and go from there. We knew that this script would fail, but we figured that the error message would help us determine which of the other parameters we needed and how we would have to configure those parameters.
With that in mind, we started out with this little three-line script:
Set objExcel = CreateObject(“Excel.Application”) objExcel.Visible = TrueobjExcel.Workbooks.OpenText “C:\Scripts\Test.txt”
To make a long story short, the script worked perfectly: it opened the text file and put each field and record in the proper place. We were a bit surprised, but then we reasoned that this was probably due to the simplistic nature of our file. So we added more records and more fields, and we mixed in dates and numbers as well as text. We tried our simple script again and it worked flawlessly. If you have a fixed-width file in which there’s always at least one blank space between the end of one field and the beginning of the next then all you need to do is call the OpenText method and let Excel take care of the rest.
In fact, the only time we had to get a little fancier was when one field ran directly into the next. For example, consider this file (note the name Erickson-Wilkerson):
First Last Department Gail Erickson-WilkersonHuman Resources Ken Myer Accounting Pilar Ackerman Research Jonathan Haas Accounting Syed Abbas Accounting
In this case Excel stumbled when it came to Erickson-Wilkerson. Because there was no blank space to delineate the end of one field and the beginning of the next Excel simply treated the file as if it consisted of a single field; in other words, it dumped all the data into column A. Not exactly what we were hoping for.
Fortunately it was right about then that we finally figured out how to add additional parameters to the OpenText method. If you have a “messy” text file like the one shown above then you can import that file by using a script similar to this:
Const xlFixedWidth = 2Set objExcel = CreateObject(“Excel.Application”) objExcel.Visible = True objExcel.Workbooks.OpenText _ “C:\Scripts\Test.txt”,,,xlFixedWidth,,,,,,,,,Array(Array(0,1),Array(14,1),Array(32,1))
Yes, this one is a bit more complicated, so let’s walk you through how it works. We begin by defining a constant named xlFixedWidth and setting the value to 2. As we saw a moment ago, Excel has problems identifying a file like this as being a fixed-width file; therefore, we’ll use this constant to make that fact explicit.
Next we create an instance of the Excel.Application object and set the Visible property to True; this gives us a running instance of Excel that we can see on screen. Finally we call the OpenText method, passing a crazy looking parameter string:
objExcel.Workbooks.OpenText _ “C:\Scripts\Test.txt”,,,xlFixedWidth,,,,,,,,,Array(Array(0,1),Array(14,1),Array(32,1))
We won’t bother detailing all the empty parameters in this string; if you’re interested in that check out the Excel VBA Language Reference on MSDN. For now, we’ll just note that we pass along the path to the text file and then, after a pair of blank parameters, pass the constant xlFixedWidth to let Excel know what type of file we’re dealing with.
That brings us to this:
Array(Array(0,1),Array(14,1),Array(32,1))
What we’re doing here is defining each field in the text file (and thus each column in the spreadsheet). The entire set of field definitions must be passed as an array, which is why we have the outer Array function. But what about the individual elements within the array, elements like this: Array(0,1)?
We’re glad you asked that. Each individual column definition must also be passed as an array; in this case, a two-dimensional array. The first number in the array represents the starting character position for the field. The very first field starts in character position 1 so – well, so we set the starting position in the Array as 0. That’s because arrays tend to start at 0; because of that, the starting position in the array will always be one less than the character position in the file. Our second field starts off at character position 15; hence the second element in our array shows 14 (15 – 1) as the starting point.
We don’t really like it, either. But to quote the immortal John Prine, that’s the way that the world goes round.
The second number in each element – that is, the 1 in Array(0,1) – indicates the data type for the field. We used the value 1 (representing generic data) for each field. However, you can specify any of the values found in the following table:
Constant |
Value |
xlDMYFormat |
4 |
xlDYMFormat |
7 |
xlEMDFormat |
10 |
xlGeneralFormat |
1 |
xlMDYFormat |
3 |
xlMYDFormat |
6 |
xlSkipColumn |
9 |
xlTextFormat |
2 |
xlYDMFormat |
8 |
xlYMDFormat |
5 |
Like we said, this approach is a tad bit more complicated, but it will guarantee that Excel will correctly open and parse your text file, even if the data is a bit messy. And even here we’re talking only a few lines of code, and without having to rely on ADO or any other technology. In other words, we’re sorry, Excel: we’ll never doubt you again.
0 comments