{"id":55313,"date":"2008-06-20T02:15:00","date_gmt":"2008-06-20T02:15:00","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/heyscriptingguy\/2008\/06\/20\/hey-scripting-guy-how-can-i-modify-a-text-file-based-on-the-value-of-a-particular-column-in-the-file\/"},"modified":"2008-06-20T02:15:00","modified_gmt":"2008-06-20T02:15:00","slug":"hey-scripting-guy-how-can-i-modify-a-text-file-based-on-the-value-of-a-particular-column-in-the-file","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/scripting\/hey-scripting-guy-how-can-i-modify-a-text-file-based-on-the-value-of-a-particular-column-in-the-file\/","title":{"rendered":"Hey, Scripting Guy! How Can I Modify a Text File Based on the Value of a Particular Column in the File?"},"content":{"rendered":"<p><img decoding=\"async\" height=\"34\" width=\"34\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/q-for-powertip.jpg\" align=\"left\" alt=\"Hey, Scripting Guy! Question\" border=\"0\" title=\"Hey, Scripting Guy! Question\" class=\"nearGraphic\" \/><\/p>\n<p>Hey, Scripting Guy! I need some help in cleaning up a text file. I have a file that contains multiple columns of data, and I need to either delete or modify lines in the file based on the value of the 13<sup>th<\/sup> column. Can you help me?<\/p>\n<p>&#8212; AM<\/p>\n<p><img decoding=\"async\" height=\"5\" width=\"5\" src=\"https:\/\/devblogs.microsoft.com\/scripting\/wp-content\/uploads\/sites\/29\/2019\/05\/spacer.gif\" alt=\"Spacer\" border=\"0\" \/><img decoding=\"async\" height=\"34\" width=\"34\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/a-for-powertip.jpg\" align=\"left\" alt=\"Hey, Scripting Guy! Answer\" border=\"0\" title=\"Hey, Scripting Guy! Answer\" class=\"nearGraphic\" \/><a href=\"http:\/\/go.microsoft.com\/fwlink\/?linkid=68779&amp;clcid=0x409\"><img decoding=\"async\" height=\"288\" width=\"120\" src=\"http:\/\/img.microsoft.com\/library\/media\/1033\/technet\/images\/scriptcenter\/ad.jpg\" align=\"right\" alt=\"Script Center\" border=\"0\" title=\"Script Center\" class=\"farGraphic\" \/><\/a><\/p>\n<p>Hey, AM. You know, usually we start off this column with some amusing (well, OK, <i>semi-<\/i>amusing) anecdote or story. Not today, though; today nothing seems particularly amusing to the Scripting Guy who writes this column. Is that because this day is the absolute <i>worst<\/i> day of his life? Yes.<\/p>\n<p>No, wait; we mean: no. The absolute <i>worst<\/i> day of his life came when he was a junior high school and his baseball team was playing Hanford High, undefeated and ranked number 1 in the state. With the score tied in the bottom of the seventh, the Scripting Guy who writes this column came to bat with one out and a runner on third. The base coach gave him the signal to take the first pitch; he took a first-pitch fastball for a strike. On the second pitch the base coach called for a suicide squeeze, in which the runner from third takes off on a mad dash for home plate and it&rsquo;s up to the batter to just put his bat on the ball, some way somehow. <\/p>\n<p>Unfortunately, though, Hanford High suspected that something was up. The Falcons called for a pitchout, in which the pitcher throws the ball far enough away from the plate that the batter can&rsquo;t possibly hit it. The Scripting Guy who writes this column gamely flailed at the ball, missing it by a good two feet. The catcher caught the pitchout and tagged out the runner who was trying to slide in with the winning run. On the next pitch the Scripting Guy who writes this column swung and missed at a fastball, the only time he struck out that entire season. Inning over, and Hanford ended up winning in extra innings.<\/p>\n<p><i>That<\/i> was the worst day of the Scripting Guy who writes this column&rsquo;s life. And nothing else comes close. Except maybe the time he played in a slo-pitch softball tournament and got a base hit to begin the day and another base hit to end the day. In between those two at-bats he hit 12 line drives, each hit harder than the other, and each hit right at someone. 12 consecutive line shots, and 12 consecutive outs. The Scripting Guy who writes this column didn&rsquo;t speak to anyone for two days after <i>that<\/i> debacle.<\/p>\n<p>So maybe today isn&rsquo;t such a bad day after all. And besides, how bad a day could it be when you have the chance to write a script that can delete or modify lines in a text file based on the value of the 13<sup>th<\/sup> column in each line, eh?<\/p>\n<p>Before we take a look at the script let&rsquo;s take a look at AM&rsquo;s text file:<\/p>\n<pre class=\"codeSample\">RD|I|42|101|0047|42603|1|1|008|099|20060123|11:00|24.00000|||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|12:00|9993||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|13:00|17.80000|||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|14:00|-17.30000|||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|15:00|-28.90000|||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|16:00|-30.80000|||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|17:00|-.40000|||||||||||||||<\/pre>\n<p>Yes, it <i>is<\/i> a very nice little text file, isn&rsquo;t it? However, we&rsquo;re interested in only one small portion of the text file: we&rsquo;re interested in only the 13<sup>th<\/sup> field (column) in each line. For example, in line 1 that&rsquo;s the value 24.00000; in line 2, that&rsquo;s the value 9993. What we need to do is look at this value for each line in the file and take action if:<\/p>\n<table cellpadding=\"0\" cellspacing=\"0\" border=\"0\">\n<tbody>\n<tr>\n<td valign=\"top\" class=\"listBullet\">&bull;<\/td>\n<td class=\"listItem\">\n<p>The value begins with the characters 999.<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" class=\"listBullet\">&bull;<\/td>\n<td class=\"listItem\">\n<p>The value is greater than -.51 but less than 0.<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" class=\"listBullet\">&bull;<\/td>\n<td class=\"listItem\">\n<p>The value is less than -.51.<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>How are we going to do all that? Why, by running this script, of course: <\/p>\n<pre class=\"codeSample\"><pre class=\"codeSample\">Const ForReading = 1\nConst ForWriting = 2\n\nSet objFSO = CreateObject(\"Scripting.FileSystemObject\")\nSet objFile = objFSO.OpenTextFile(\"C:\\Scripts\\Test.txt\", ForReading)\n\nDo Until objFile.AtEndOfStream\n    strLine = objFile.ReadLine\n    arrItems = Split(strLine, \"|\")\n    dblValue = arrItems(12)\n    dblValue = CDbl(dblValue)\n\n    i = 0\n\n    If Left(dblValue, 3) = \"999\" Then\n        i = 1\n    End If\n\n    If dblValue &gt; -.51 AND dblValue &lt; 0 Then\n        i = 1\n    End If\n\n    If  dblValue &lt; -.51 Then\n        arrItems(12) = 0\n    End If\n\n    If i = 0 Then\n        strLine = Join(arrItems, \"|\")\n        strNewContents = strNewContents &amp; strLine &amp; vbCrLf\n    End If\nLoop\n\nobjFile.Close\n\nSet objFile = objFSO.OpenTextFile(\"C:\\Scripts\\Test.txt\", ForWriting)\nobjFile.Write strNewContents\nobjFile.Close<\/pre>\n<p>As you can see, we kick things off in the same way we kick off many of our scripts: by defining a pair of constants (ForReading and ForWriting) that we&rsquo;ll need in order to open our text file in the correct mode. (That is, do we want to open the text file in order to read from it or write to it?) After defining the two constants we create an instance of the <b>Scripting.FileSystemObject<\/b>, then use the following line of code to open the file C:\\Scripts\\Test.txt for reading:<\/p>\n<pre class=\"codeSample\">Set objFile = objFSO.OpenTextFile(\"C:\\Scripts\\Test.txt\", ForReading)<\/pre>\n<p>And yes, in and of itself that <i>is<\/i> pretty exciting, isn&rsquo;t it? But trust us: we can do more with this text file than just open it for reading.<\/p>\n<p>Like what, you ask? Well, how about setting up a Do Until loop that loops until the file&rsquo;s <b>AtEndOfStream<\/b> property is True? (That is, until we&rsquo;ve read every last bit and byte in the file.) Setting up a loop like this one enables us to use the <b>ReadLine<\/b> method to read in the entire contents of the file, line-by-line. In fact, this bit of code uses ReadLine to read the first line in the file and store that information in a variable named strLine:<\/p>\n<pre class=\"codeSample\">strLine = objFile.ReadLine<\/pre>\n<p>And <i>that<\/i> turns out to be everything that we can do with a text file, and everything we can do with this script.<\/p>\n<p>Oh, wait, never mind; now that we think about it, it turns out there are a few other things we can do, such as modify the text file to meet AM&rsquo;s needs. Sorry for the misunderstanding.<\/p>\n<p>In order to modify the text file to meet AM&rsquo;s needs the first thing we do is use the <b>Split<\/b> function to split the file into an array named arrItems; by splitting on the pipe separator (the <b>|<\/b> character) we end up with an array with items similar to these (plus a bunch of empty items at the end of the array):<\/p>\n<ul>\n<li>RD<\/li>\n<li>I <\/li>\n<li>42<\/li>\n<li>101<\/li>\n<li>0047<\/li>\n<li>42603<\/li>\n<li>1<\/li>\n<li>1<\/li>\n<li>008<\/li>\n<li>099<\/li>\n<li>20060123<\/li>\n<li>11:00<\/li>\n<li>24.00000<\/li>\n<\/ul>\n<p>The item of most interest to us is the last item in the list, item 12. (Remember, the first item in an array always has an index number of 0. That means that the 13<sup>th<\/sup> item in the array &ndash; the item we care about &ndash; has an index number of 12.) In order to isolate that value, and make it easy for us to work with it, our next step is to execute these two lines of code:<\/p>\n<pre class=\"codeSample\"><pre class=\"codeSample\">dblValue = arrItems(12)\ndblValue = CDbl(dblValue)\n<\/pre>\n<p>In the first line we&rsquo;re simply assigning the value of array item 12 to a variable named dblValue. In the second line we use the <b>CDbl<\/b> function to &ldquo;cast&rdquo; the value of dblValue to a numeric (double) data type; that ensures that VBScript will treat this value as a number and not as a string. <\/p>\n<table cellpadding=\"0\" cellspacing=\"0\" class=\"dataTable\" id=\"EXG\">\n<thead><\/thead>\n<tbody>\n<tr valign=\"top\" class=\"record\">\n<td>\n<p class=\"lastInCell\"><b>Note<\/b>. Is that really necessary? Well, it was for us: we got some very strange results before we explicitly cast dblValue as a numeric variable.<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"dataTableBottomMargin\"><\/div>\n<p>After setting the value of a counter variable named i to 0 we&rsquo;re ready to begin applying AM&rsquo;s criteria to the current line of text. For starters, AM wants to delete any lines where the 13<sup>th<\/sup> column in the text file starts with the characters <i>999<\/i>. That&rsquo;s what this block of code is for:<\/p>\n<pre class=\"codeSample\"><pre class=\"codeSample\">If Left(dblValue, 3) = \"999\" Then\n    i = 1\nEnd If\n<\/pre>\n<p>All we&rsquo;re doing here is using the <b>Left<\/b> function to determine whether or not the first three characters in dblValue are <i>999<\/i>. What if they are? Well, in that case we set the value of our counter variable to 1; that&rsquo;s how we&rsquo;ll know that this is a line that needs to be deleted from the text file.<\/p>\n<p>That brings us to this chunk of code:<\/p>\n<pre class=\"codeSample\"><pre class=\"codeSample\">If dblValue &gt; -.51 AND dblValue &lt; 0 Then\n    i = 1\nEnd If\n<\/pre>\n<p>In addition to deleting any lines where the 13<sup>th<\/sup> column begins with 999 AM also wants to delete any lines where that 13<sup>th<\/sup> column has a value greater than -.51 but less than 0. (For example, a value like -.40000.) That&rsquo;s what we&rsquo;re checking for here: if we have a value less than 0 but greater than -.51 (that is, a negative number larger than -.51) then set the value of the counter variable i to 1. Why? You got it: because this, too is a line that we need to delete from the text file.<\/p>\n<p>Last, but surely not least, we encounter this test:<\/p>\n<pre class=\"codeSample\">If  dblValue &lt; -.51 Then\n    arrItems(12) = 0\nEnd If<\/pre>\n<p>As you probably noticed in our previous check we don&rsquo;t delete just <i>any<\/i> line that contains a negative value in column 13; instead, we delete only those lines that have a negative value greater than -.51. So what happens if we encounter a negative value that&rsquo;s <i>less<\/i> than -.51 (for example, -17.30000)? In that case, we want to set the value of the 13<sup>th<\/sup> column to 0; that&rsquo;s something we can do by assigning 0 to array item 12:<\/p>\n<pre class=\"codeSample\">arrItems(12) = 0<\/pre>\n<p>And no, in this case we don&rsquo;t set the value of the counter variable i to 1. Why not? Right again: because this is <i>not<\/i> a line that we want to discard. This one&rsquo;s a keeper.<\/p>\n<p>Our next step is to check and see if our counter variable is equal to 0:<\/p>\n<pre class=\"codeSample\">If i = 0 Then<\/pre>\n<p>If it is, then arrItems represents a line of text that we want to keep. With that in mind we use the <b>Join<\/b> function to magically turn our array back into a single line of text, separating the individual columns with a pipe separator:<\/p>\n<pre class=\"codeSample\">strLine = Join(arrItems, \"|\")<\/pre>\n<p>And once we&rsquo;ve done that we append this line of text (and a carriage return-linefeed character) to a variable named strNewContents:<\/p>\n<pre class=\"codeSample\">strNewContents = strNewContents &amp; strLine &amp; vbCrLf<\/pre>\n<p>And then it&rsquo;s back to the top of the loop, where we repeat the process with the next line in the text file.<\/p>\n<p>So what happens when the file&rsquo;s AtEndOfStream property <i>is<\/i> True? Well, at that point we automatically exit the loop, then use the <b>Close<\/b> method to close the file Test.txt. No sooner do we close the file then we immediately reopen it, this time for writing:<\/p>\n<pre class=\"codeSample\">Set objFile = objFSO.OpenTextFile(\"C:\\Scripts\\Test.txt\", ForWriting)<\/pre>\n<p>From there we call the <b>Write<\/b> method to replace the existing contents of the file with the value of strNewContents; we call the Close method to close the file once again; and then we call it a day. If we were to go ahead now and echo back the contents of Test.txt we should see this:<\/p>\n<pre class=\"codeSample\">RD|I|42|101|0047|42603|1|1|008|099|20060123|11:00|24.00000|||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|13:00|17.80000|||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|14:00|0|||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|15:00|0|||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|16:00|0|||||||||||||||<\/pre>\n<p>As you can see, lines where the 13<sup>th<\/sup> column started with the characters 999 have been deleted; so have any lines where the 13<sup>th<\/sup> column had a value less than 0 but greater than -.51. Oh, and any lines that had a 13<sup>th<\/sup> column less than -.51? In those cases, we&rsquo;ve changed the value of column 13 to 0:<\/p>\n<pre class=\"codeSample\">RD|I|42|101|0047|42603|1|1|008|099|20060123|14:00|0|||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|15:00|0|||||||||||||||\nRD|I|42|101|0047|42603|1|1|008|099|20060123|16:00|0|||||||||||||||<\/pre>\n<p>Which is just exactly what AM was hoping we would do.<\/p>\n<p>That should do it for now; we&rsquo;ll be back again tomorrow with yet another <i>Hey, Scripting Guy!<\/i> Are we concerned that tomorrow will be another bad day? Nah. What we neglected to tell you is that the Scripting Guy who writes this column got another crack at Hanford High School later in the year. In that game he had two singles and a double, scoring two runs and driving in two more; in addition, he made a dramatic running catch with two outs in the bottom of the seventh that saved the game and sent Hanford High to their first loss of the season.<\/p>\n<p>Not only that, but one week after his 2-for-14 softball disaster he played in another tournament, this time getting 14 hits in 19 at-bats. So, see: nothing to worry about. Although, just in case, he&rsquo;ll probably go out tonight and buy himself a new baseball bat, which is what he did the other two times in his life that he&rsquo;s had a bad day.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hey, Scripting Guy! I need some help in cleaning up a text file. I have a file that contains multiple columns of data, and I need to either delete or modify lines in the file based on the value of the 13th column. Can you help me? &#8212; AM Hey, AM. You know, usually we [&hellip;]<\/p>\n","protected":false},"author":595,"featured_media":87096,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[3,4,14,5],"class_list":["post-55313","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-scripting","tag-scripting-guy","tag-scripting-techniques","tag-text-files","tag-vbscript"],"acf":[],"blog_post_summary":"<p>Hey, Scripting Guy! I need some help in cleaning up a text file. I have a file that contains multiple columns of data, and I need to either delete or modify lines in the file based on the value of the 13th column. Can you help me? &#8212; AM Hey, AM. You know, usually we [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/55313","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/users\/595"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/comments?post=55313"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/55313\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media\/87096"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media?parent=55313"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/categories?post=55313"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/tags?post=55313"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}