{"id":56353,"date":"2008-01-24T22:06:00","date_gmt":"2008-01-24T22:06:00","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/heyscriptingguy\/2008\/01\/24\/hey-scripting-guy-how-can-i-sort-the-contents-of-a-text-file-in-numerical-order\/"},"modified":"2008-01-24T22:06:00","modified_gmt":"2008-01-24T22:06:00","slug":"hey-scripting-guy-how-can-i-sort-the-contents-of-a-text-file-in-numerical-order","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/scripting\/hey-scripting-guy-how-can-i-sort-the-contents-of-a-text-file-in-numerical-order\/","title":{"rendered":"Hey, Scripting Guy! How Can I Sort the Contents of a Text File in Numerical Order?"},"content":{"rendered":"<p><img decoding=\"async\" class=\"nearGraphic\" title=\"Hey, Scripting Guy! Question\" border=\"0\" alt=\"Hey, Scripting Guy! Question\" align=\"left\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/q-for-powertip.jpg\" width=\"34\" height=\"34\" \/><\/p>\n<p>Hey, Scripting Guy! I have a text file in which each line begins with a number. How can I sort the contents of that file by those numbers?<\/p>\n<p>&#8212; KT<\/p>\n<p><img decoding=\"async\" border=\"0\" alt=\"Spacer\" src=\"https:\/\/devblogs.microsoft.com\/scripting\/wp-content\/uploads\/sites\/29\/2019\/05\/spacer.gif\" width=\"5\" height=\"5\" \/><img decoding=\"async\" class=\"nearGraphic\" title=\"Hey, Scripting Guy! Answer\" border=\"0\" alt=\"Hey, Scripting Guy! Answer\" align=\"left\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/a-for-powertip.jpg\" width=\"34\" height=\"34\" \/><a href=\"http:\/\/go.microsoft.com\/fwlink\/?linkid=68779&amp;clcid=0x409\"><img decoding=\"async\" class=\"farGraphic\" title=\"Script Center\" border=\"0\" alt=\"Script Center\" align=\"right\" src=\"http:\/\/img.microsoft.com\/library\/media\/1033\/technet\/images\/scriptcenter\/ad.jpg\" width=\"120\" height=\"288\" \/><\/a><\/p>\n<p>Hey, KT. Tell you what: we&rsquo;ll go ahead and answer your question. However, in return, you have to promise to forget everything we tell you. (Which is usually not a problem for most people.) What do you say: do we have a deal?<\/p>\n<p>Admittedly, that&rsquo;s kind of an odd bargain we&rsquo;re trying to strike: we agree to answer someone&rsquo;s question on the condition that they immediately forget the answer. Why would we try to make a deal like that? Well, to tell you the truth, our first thought upon reading this question was, &ldquo;Wow. That would make a good <a href=\"http:\/\/www.microsoft.com\/technet\/scriptcenter\/funzone\/games\/default.mspx\"><strong>Scripting Games<\/strong><\/a> event. In fact, we should use this in the 2009 Games.&rdquo; The only problem, of course, is that we can&rsquo;t provide the answer today, then try and use this question as a Scripting Games event tomorrow. Talk about a quandary. On the one hand, we need everyone to read today&rsquo;s column; after all, if nobody reads the column then there&rsquo;s not much use for the Scripting Guy who writes this column, is there? <\/p>\n<table id=\"EDD\" class=\"dataTable\" cellspacing=\"0\" cellpadding=\"0\">\n<thead><\/thead>\n<tbody>\n<tr class=\"record\" valign=\"top\">\n<td>\n<p class=\"lastInCell\"><strong>Note.<\/strong> Yes, there are plenty of people &ndash; including his fellow Scripting Guys &ndash; who wonder if there&rsquo;s <i>ever<\/i> any need for the Scripting Guy who writes this column.<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"dataTableBottomMargin\"><\/div>\n<p>On the other hand, we need you to forget both the question and the answer; that way we can surprise everyone by using this query as an event in the 2009 Scripting Games. (The events have already been finalized for the 2008 Games, February 15<sup>th<\/sup> through March 3<sup>rd<\/sup> right here in the <a href=\"http:\/\/www.microsoft.com\/technet\/scriptcenter\/funzone\/games\/default.mspx\"><strong>Script Center<\/strong><\/a>.)<\/p>\n<p>To make a long story short, that&rsquo;s why we need everyone to &ndash; you know what? Never mind; go ahead and remember both the question and the answer if you want to. After all, the odds are pretty good that even if you upheld <i>your<\/i> end of the bargain the Scripting Guys would forget all about it and wouldn&rsquo;t use it as an event in the 2009 Scripting Games anyway.<\/p>\n<p>Sadly, forgetfulness is a part of growing old. For some reason, that&rsquo;s something we can never remember to tell the other Scripting Guys.<\/p>\n<p>Fortunately, though, we <i>did<\/i> remember to wear pants this morning (we&rsquo;d rather not talk about that) and to write a script that can sort the lines in a text file based on a number that appears at the beginning of each line. As a matter of fact, we remembered to write <i>this<\/i> script:<\/p>\n<pre class=\"codeSample\"><pre class=\"codeSample\">Const ForReading = 1\nConst ForWriting =2\n\nConst adVarChar = 200\nConst MaxCharacters = 255\nConst adFldIsNullable = 32\nConst adInteger = 3\n\nSet DataList = CreateObject(\"ADOR.Recordset\")\nDataList.Fields.Append \"TextLine\", adVarChar, MaxCharacters, adFldIsNullable\nDataList.Fields.Append \"NumberLine\", adInteger, , adFldIsNullable\nDataList.Open\n\nSet objFSO = CreateObject(\"Scripting.FileSystemObject\")\nSet objFile = objFSO.OpenTextFile(\"C:\\Scripts\\Test.txt\", ForReading)\n\nDo Until objFile.AtEndOfStream\n    strLine = objFile.ReadLine\n    intLength = Len(strLine) \n    strValue = \"\"\n    For i = 1 to intLength\n        If IsNumeric(Mid(strLine, i, 1)) Then\n            strValue = strValue &amp; Mid(strLine, i, 1)\n        Else\n            Exit For\n        End If\n    Next\n    DataList.AddNew\n    DataList(\"TextLine\") = strLine\n    DataList(\"NumberLine\") = strValue\n    DataList.Update\nLoop\n\nobjFile.Close\n\nSet objFile = objFSO.OpenTextFile(\"C:\\Scripts\\Test.txt\", ForWriting)\n\nDataList.Sort = \"NumberLine ASC\"\n\nDataList.MoveFirst\n\nDo Until DataList.EOF\n    objFile.WriteLine DataList.Fields.Item(\"TextLine\") \n    DataList.MoveNext\nLoop\n\nobjFile.Close<\/pre>\n<p>Before we go much further we should note that this is actually a slightly-more complicated script than KT originally asked for. Why did we give him more than he asked for? As most of <br \/>you know, sorting data in VBScript is never much fun; that&rsquo;s because VBScript has no built-in mechanism for sorting data. On top of that, even when you <i>can<\/i> sort data you run into the <br \/>problem of whether VBScript sees your data as being composed of numbers or letters. For example, suppose we have the following values:<\/p>\n<pre class=\"codeSample\">2 BBB\n1 AAA\n10 DDD\n3 CCC<\/pre>\n<p>If VBScript sees these values as being numbers, then it will sort the values like this:<\/p>\n<pre class=\"codeSample\">1 AAA\n2 BBB\n3 CCC\n10 DDD<\/pre>\n<p>However, if VBScript thinks it&rsquo;s dealing with string values, then we&rsquo;re going to get back a list like this one:<\/p>\n<pre class=\"codeSample\">1 AAA\n10 DDD\n2 BBB\n3 CCC<\/pre>\n<p>Yuck. <\/p>\n<p>In KT&rsquo;s case this isn&rsquo;t a problem; that&rsquo;s because the numbers in his text file only go up to 9. As he noted, he needs to sort the lines in the file &ldquo;based on <br \/>the value of the first character in the line, which is always a number.&rdquo; If you&rsquo;re only sorting on the first character, well, there aren&rsquo;t many single-digit numbers <br \/>greater than 9.<\/p>\n<p>Well, OK, math fans: there aren&rsquo;t many <i>base-10<\/i> single-digit numbers greater than 9.<\/p>\n<p>We have a feeling, however, that many of you have text files that include line numbers <i>greater <\/i>than 9; in that case, sorting on just the first digit of a number <br \/>like 147 doesn&rsquo;t do you a lot of good. Therefore, we came up with a script that&rsquo;s a tiny bit more complicated, but way more flexible: it should work in any <br \/>situation and with any numbers.<\/p>\n<p>But don&rsquo;t worry; if <i>do<\/i> you want a script that only works with values up to 9, well, we&rsquo;ve tacked that on to the end of today&rsquo;s column.<\/p>\n<p>As for the fancier and slightly-more complicated script, we start out by defining a bunch of constants. The first two &ndash; ForReading and ForWriting &ndash; are needed <br \/>when we read from and write to the text file. The other four fill the following roles: <\/p>\n<ul>\n<li>\n<p><strong>adVarChar<\/strong>. This constant enables us to create a recordset field that uses the Variant data type. This field will hold the actual text of each line in the text <br \/>file.<\/p>\n<\/li>\n<li>\n<p><strong>MaxCharacters<\/strong>. This constant enables us to put as many as 255 characters in that Variant field. <\/p>\n<\/li>\n<li>\n<p><strong>adFldIsNullable<\/strong>. This constant allows us to have Null values in a field. We don&rsquo;t really need it for today&rsquo;s script, but it&rsquo;s a handy thing to know about. (And <br \/>putting it in doesn&rsquo;t hurt anything.)<\/p>\n<\/li>\n<li>\n<p><strong>adInteger<\/strong>. This constant lets us create a recordset field that uses the Integer data type. We&rsquo;ll use this field to keep track of the numbers that appear at <br \/>the start of each line.<\/p>\n<\/li>\n<\/ul>\n<p>Our next task is to create a &ldquo;disconnected recordset,&rdquo; which is &ndash; for all intents and purposes &ndash; a database table that exists only in memory (that is, it&rsquo;s not tied <br \/>to a physical database). We aren&rsquo;t going to discuss disconnected recordsets in any detail today; if you&rsquo;d like more information, you might take a look at <br \/><a href=\"http:\/\/www.microsoft.com\/technet\/scriptcenter\/guide\/sas_ent_piij.mspx\"><strong>this section<\/strong><\/a> of the <i>Microsoft Windows 2000 Scripting Guide<\/i>. For now, all we&rsquo;ll do is point out that the following chunk of code creates a new disconnected <br \/>recordset, adds a variant field named TextLine and an integer field named NumberLine, and then opens the recordset for use:<\/p>\n<pre class=\"codeSample\">Set DataList = CreateObject(\"ADOR.Recordset\")\nDataList.Fields.Append \"TextLine\", adVarChar, MaxCharacters, adFldIsNullable\nDataList.Fields.Append \"NumberLine\", adInteger, , adFldIsNullable\nDataList.Open<\/pre>\n<p>As soon as we have our recordset we can go ahead open the text file C:\\Scripts\\Test.txt for reading; that&rsquo;s what these two lines of code are for:<\/p>\n<pre class=\"codeSample\">Set objFSO = CreateObject(\"Scripting.FileSystemObject\")\nSet objFile = objFSO.OpenTextFile(\"C:\\Scripts\\Test.txt\", ForReading)<\/pre>\n<p>Once the file is open, we then set up a Do loop designed to run until we&rsquo;ve read every line in the file (in other words, until the file&rsquo;s <strong>AtEndOfStream<\/strong> property <br \/>is True):<\/p>\n<pre class=\"codeSample\">Do Until objFile.AtEndOfStream<\/pre>\n<p>As you might expect, all the excitement (well, most of it, anyway) takes place inside this Do loop. To begin with, we use the <strong>ReadLine<\/strong> method to read the first <br \/>line in the text file, storing that value in a variable named strLine. We then use VBScript&rsquo;s <strong>Len<\/strong> function to determine the total number of characters in that <br \/>line:<\/p>\n<pre class=\"codeSample\">intLength = Len(strLine)<\/pre>\n<p>Why do we need to know that? Well, we know that the first <i>x<\/i> characters in the line are numbers; however, we have no idea what <i>x<\/i> is equal to. Therefore, <br \/>we&rsquo;re going to examine each line character-by-character, not stopping until we encounter a non-numeric value. For example, suppose the first line in our text <br \/>file looks like this:<\/p>\n<pre class=\"codeSample\">124This is a line in the text file.<\/pre>\n<p>We&rsquo;ll grab the 1, the 2, and the 4, and won&rsquo;t stop grabbing characters until we reach the T, the first non-numeric value.<\/p>\n<p>So how do we do all <i>that<\/i>? Well, for starters, we assign an empty string to a variable named strValue:<\/p>\n<pre class=\"codeSample\">strValue = \"\"<\/pre>\n<p>We then execute the following For Next loop:<\/p>\n<pre class=\"codeSample\">For i = 1 to intLength\n    If IsNumeric(Mid(strLine, i, 1)) Then\n        strValue = strValue &amp; Mid(strLine, i, 1)\n    Else\n        Exit For\n    End If\nNext<\/pre>\n<p>What we&rsquo;re doing here is looping from 1 to the total number of characters in the line (intLength). Inside <i>this<\/i> loop, we use the <strong>IsNumeric<\/strong> function to determine <br \/>whether or not the first character in the line is a number. If it is, we append that character to the variable strValue:<\/p>\n<pre class=\"codeSample\">strValue = strValue &amp; Mid(strLine, i, 1)<\/pre>\n<p>In other words, the first time through the loop, strValue will be equal to 1. We then repeat the process with the next character in the line. If this, too is a <br \/>number (and it is) we append <i>that<\/i> value to the variable strValue. That makes strValue equal to this: 12. This continues until we find a non-numeric character. <br \/>As soon as we encounter a non-numeric character we call the <strong>Exit For<\/strong> statement and exit the For Next loop.<\/p>\n<p>So what did that do for us? That enabled us to extract the number that appears at the beginning of each line. Now that we know that number we can use this <br \/>block of code to add a new record to our recordset, a record consisting of the entire text of the line (TextLine) and the number that appears at the beginning <br \/>of the line:<\/p>\n<pre class=\"codeSample\">DataList.AddNew\nDataList(\"TextLine\") = strLine\nDataList(\"NumberLine\") = strValue\nDataList.Update<\/pre>\n<p>In other words, our recordset now contains the following record:<\/p>\n<table id=\"EEAAC\" class=\"dataTable\" cellspacing=\"0\" cellpadding=\"0\">\n<thead><\/thead>\n<tbody>\n<tr class=\"record\" valign=\"top\">\n<td>\n<p class=\"lastInCell\"><strong>TextLine<\/strong><\/p>\n<\/td>\n<td>\n<p class=\"lastInCell\"><strong>NumberLine<\/strong><\/p>\n<\/td>\n<\/tr>\n<tr class=\"evenRecord\" valign=\"top\">\n<td>\n<p class=\"lastInCell\">124This is a line in the text file.<\/p>\n<\/td>\n<td>\n<p class=\"lastInCell\">124<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"dataTableBottomMargin\"><\/div>\n<p>And then it&rsquo;s on to the next line in the text file.<\/p>\n<p>After we&rsquo;ve read all the lines from the text file (and added each line to our recordset) we close the file and then immediately reopen it, this time for writing. All <br \/>it takes now is one little line of code to sort the recordset:<\/p>\n<pre class=\"codeSample\">DataList.Sort = \"NumberLine ASC\"<\/pre>\n<p>As you can see, we&rsquo;re sorting by the field NumberLine; that&rsquo;s the field that contains the number that appears at the beginning of each line. We&rsquo;ve also chosen <br \/>to sort the data in ascending (1 to 100) order; that&rsquo;s what the <strong>ASC<\/strong> is for. Would you rather sort the recordset in <i>descending<\/i> order (100 to 1)? Then use this <br \/>line of code instead:<\/p>\n<pre class=\"codeSample\">DataList.Sort = \"NumberLine DESC\"<\/pre>\n<p>All we have to do now is replace the existing contents of our text file with these new, sorted contents. To do <i>that<\/i>, we use the <strong>MoveFirst<\/strong> method to move to <br \/>the first record in our recordset. We then use this block of code to loop through all the items in that recordset, using the <strong>WriteLine<\/strong> method to add the values <br \/>of the TextLine field to the file:<\/p>\n<pre class=\"codeSample\"><pre class=\"codeSample\">Do Until DataList.EOF\n    objFile.WriteLine DataList.Fields.Item(\"TextLine\") \n    DataList.MoveNext\nLoop\n<\/pre>\n<p>Will that really work? Well, suppose we have the following text file:<\/p>\n<pre class=\"codeSample\"><pre class=\"codeSample\">14tttt\n2aaaa\n8ffff\n6vvvv\n4cccc\n12xxxx\n11zzzz\n9qqqq\n<\/pre>\n<p>Here&rsquo;s what the same file looks like after we run our script:<\/p>\n<pre class=\"codeSample\"><pre class=\"codeSample\">2aaaa\n4cccc\n6vvvv\n8ffff\n9qqqq\n11zzzz\n12xxxx\n14tttt\n<\/pre>\n<p>Well, what do you know: it <i>did<\/i> work!<\/p>\n<p>That should take care of your problem, KT. In a way, though, that&rsquo;s a shame: like we said, that would have been a pretty good event in the 2009 Scripting <br \/>Games. After all, it&rsquo;s not impossible to solve, but it&rsquo;s also not immediately obvious what the solution might be. That, of course, is what the Scripting Games are all about. <\/p>\n<p>Speaking of which, have we mentioned that the <a href=\"http:\/\/www.microsoft.com\/technet\/scriptcenter\/funzone\/games\/default.mspx\"><strong>2008 Winter Scripting Games<\/strong><\/a> begin February 15<sup>th<\/sup>? You don&rsquo;t want to miss it; after all, we expect this <br \/>year&rsquo;s Games to be the biggest and the best competition ever. <\/p>\n<p>OK, admittedly, we&rsquo;ve only had two prior editions of the Scripting Games, and the first one was tossed together in about a week. Nevertheless, this year&rsquo;s <br \/>Games &ndash; which includes a Perl division as well as the VBScript and Windows PowerShell divisions &ndash; should be a lot of fun: the events are a bit more challenging <br \/>(but still doable); we have some &ldquo;bonus&rdquo; activities planned for those of you who run through all 10 events in a single day (unlike the Scripting Guys, who <br \/>typically take 10 days to run through a single event); and we have a lot of <a href=\"http:\/\/www.microsoft.com\/technet\/scriptcenter\/funzone\/games\/games08\/prizes.mspx\"><strong>cool prizes<\/strong><\/a> to give away. And remember, we have a Beginners division as well as <br \/>an Advanced division. That means there&rsquo;s a little something for everyone. Even you. <\/p>\n<p>OK, OK, maybe not him, and maybe not her. But there&rsquo;s definitely something for you. <\/p>\n<p>In other words, we&rsquo;d like as many people as possible to participate in the Scripting Games. Mark your calendar now. Or, even better, wait till tomorrow; in <br \/>tomorrow&rsquo;s column, we&rsquo;ll give you a script that can mark your calendar for you.<\/p>\n<p>Hmmm &hellip;. You know, that might be a good Scripting Games event, too &hellip;.<\/p>\n<p>At any rate, that should do it for today &ndash; oh, right. As promised, here&rsquo;s a simpler version of the script, one that works only with values from 0 to 9:<\/p>\n<pre class=\"codeSample\">Const ForReading = 1\nConst ForWriting =2\n\nConst adVarChar = 200\nConst MaxCharacters = 255\nConst adFldIsNullable = 32\n\nSet DataList = CreateObject(\"ADOR.Recordset\")\nDataList.Fields.Append \"TextLine\", adVarChar, MaxCharacters, adFldIsNullable\nDataList.Open\n\nSet objFSO = CreateObject(\"Scripting.FileSystemObject\")\nSet objFile = objFSO.OpenTextFile(\"C:\\Scripts\\Test.txt\", ForReading)\n\nDo Until objFile.AtEndOfStream\n    strLine = objFile.ReadLine\n    DataList.AddNew\n    DataList(\"TextLine\") = strLine\n    DataList.Update\nLoop\n\nobjFile.Close\n\nSet objFile = objFSO.OpenTextFile(\"C:\\Scripts\\Test.txt\", ForWriting)\n\nDataList.Sort = \"TextLine ASC\"\n\nDataList.MoveFirst\n\nDo Until DataList.EOF\n    objFile.WriteLine DataList.Fields.Item(\"TextLine\") \n    DataList.MoveNext\nLoop\n\nobjFile.Close<\/pre>\n<p>And <i>that<\/i> should do it for today. See you all tomorrow.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hey, Scripting Guy! I have a text file in which each line begins with a number. How can I sort the contents of that file by those numbers? &#8212; KT Hey, KT. Tell you what: we&rsquo;ll go ahead and answer your question. However, in return, you have to promise to forget everything we tell you. [&hellip;]<\/p>\n","protected":false},"author":595,"featured_media":87096,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[3,4,14,5],"class_list":["post-56353","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-scripting","tag-scripting-guy","tag-scripting-techniques","tag-text-files","tag-vbscript"],"acf":[],"blog_post_summary":"<p>Hey, Scripting Guy! I have a text file in which each line begins with a number. How can I sort the contents of that file by those numbers? &#8212; KT Hey, KT. Tell you what: we&rsquo;ll go ahead and answer your question. However, in return, you have to promise to forget everything we tell you. [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/56353","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/users\/595"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/comments?post=56353"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/56353\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media\/87096"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media?parent=56353"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/categories?post=56353"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/tags?post=56353"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}