{"id":70033,"date":"2005-04-13T16:56:00","date_gmt":"2005-04-13T16:56:00","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/heyscriptingguy\/2005\/04\/13\/how-can-i-eliminate-duplicate-names-in-a-text-file\/"},"modified":"2005-04-13T16:56:00","modified_gmt":"2005-04-13T16:56:00","slug":"how-can-i-eliminate-duplicate-names-in-a-text-file","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/scripting\/how-can-i-eliminate-duplicate-names-in-a-text-file\/","title":{"rendered":"How Can I Eliminate Duplicate Names in a Text File?"},"content":{"rendered":"<p><IMG class=\"nearGraphic\" title=\"Hey, Scripting Guy! Question\" border=\"0\" alt=\"Hey, Scripting Guy! Question\" align=\"left\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/q-for-powertip.jpg\" width=\"34\" height=\"34\"> \n<P>Hey, Scripting Guy! I have a text file that contains a bunch of names. How can I read through that text file and eliminate all the duplicate names?<BR><BR>&#8212; MW<\/P><IMG border=\"0\" alt=\"Spacer\" src=\"https:\/\/devblogs.microsoft.com\/scripting\/wp-content\/uploads\/sites\/29\/2019\/05\/spacer.gif\" width=\"5\" height=\"5\"><IMG class=\"nearGraphic\" title=\"Hey, Scripting Guy! Answer\" border=\"0\" alt=\"Hey, Scripting Guy! Answer\" align=\"left\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/a-for-powertip.jpg\" width=\"34\" height=\"34\"><A href=\"http:\/\/go.microsoft.com\/fwlink\/?linkid=68779&amp;clcid=0x409\"><IMG class=\"farGraphic\" title=\"Script Center\" border=\"0\" alt=\"Script Center\" align=\"right\" src=\"http:\/\/img.microsoft.com\/library\/media\/1033\/technet\/images\/scriptcenter\/ad.jpg\" width=\"120\" height=\"288\"><\/A> \n<P>Hey, MW. 
We\u2019re assuming you have a text file that looks something like this:<\/P><PRE class=\"codeSample\">Ken Myer\nDean Tsaltas\nJonathan Haas\nKen Myer\nDean Tsaltas\nSyed Abbas\nGail Erickson\nCarol Phillips\nDean Tsaltas\nDylan Miller\nKim Abercrombie\nDylan Miller\n<\/PRE>\n<P>As you can see, there are a number of duplicate names in this list. For example, you have three Dean Tsaltases. We actually <I>know<\/I> Dean Tsaltas and &#8211; take it from us &#8211; one Dean Tsaltas is enough for any organization! But how can you get rid of all those duplicate names?<\/P>\n<P>Well, to tell you the truth, there are several different ways. We decided to use the Script Runtime\u2019s <B>Dictionary<\/B> object because we thought that, all things considered, it was the simplest and easiest approach. What we\u2019re going to do is read in the names from the text file, one by one. We\u2019ll read in the first name &#8211; Ken Myer &#8211; and then store that name in the Dictionary. We\u2019ll then read in the second name but &#8211; before we add it to the Dictionary &#8211; we\u2019ll check to see if the name is already <I>in<\/I> the Dictionary. If it\u2019s not, we\u2019ll add it. If it <I>is<\/I> in the Dictionary then we won\u2019t add it and, instead, we\u2019ll just go ahead and read in the third name. 
When we\u2019re done, our Dictionary will contain a list of all the unique names, no duplicates.<\/P>\n<P>Here\u2019s a script that will echo back the unique names (we\u2019ll modify this script in a minute so that it writes the unique names back to the text file):<\/P><PRE class=\"codeSample\">Const ForReading = 1\n\nSet objDictionary = CreateObject(&quot;Scripting.Dictionary&quot;)\nSet objFSO = CreateObject(&quot;Scripting.FileSystemObject&quot;)\n\nSet objFile = objFSO.OpenTextFile _\n    (&quot;c:\\scripts\\namelist.txt&quot;, ForReading)\n\n' Add each name to the Dictionary only the first time it appears\nDo Until objFile.AtEndOfStream\n    strName = objFile.ReadLine\n    If Not objDictionary.Exists(strName) Then\n        objDictionary.Add strName, strName\n    End If\nLoop\n\nobjFile.Close\n\n' Echo the unique names\nFor Each strKey in objDictionary.Keys\n    Wscript.Echo strKey\nNext\n<\/PRE>\n<P>We begin by defining a constant &#8211; ForReading &#8211; that we need when we open the text file. Next we create instances of two different objects: the <B>Dictionary<\/B> object and the <B>FileSystemObject<\/B>. Finally, we use the <B>OpenTextFile<\/B> method to open the file C:\\Scripts\\Namelist.txt for reading.<\/P>\n<P>Now we\u2019re ready to have some fun. What we do next is set up a Do Loop that enables us to read the text file line by line (and, by extension, name by name). Each time we read in a line, we store that value in a variable named strName. We then use this line of code to see if that particular name is already in the Dictionary:<\/P><PRE class=\"codeSample\">If Not objDictionary.Exists(strName) Then\n<\/PRE>\n<P>Yes, kind of clumsy syntax, but this can be read as \u201cIf the name stored in the variable strName does not exist in the Dictionary, then do the following line of code.\u201d In that following line of code, we simply add that name to the Dictionary, using the value of strName for both the item value and item key. 
(For more information about the Dictionary object, see <A href=\"http:\/\/null\/technet\/scriptcenter\/guide\/sas_scr_ildk.mspx\" target=\"_blank\"><B>this section<\/B><\/A> of the <I>Microsoft Windows 2000 Scripting Guide<\/I>.) If the name <I>does<\/I> exist in the Dictionary then we simply loop around and read in the next line. This continues until we\u2019ve processed each line in the text file.<\/P>\n<P>What does that give us? Well, it gives us a Dictionary that contains a set of keys, each key representing a unique name. If we want to see those names we can simply loop through the <B>Keys<\/B> collection and echo the value of each Dictionary key:<\/P><PRE class=\"codeSample\">For Each strKey in objDictionary.Keys\n    Wscript.Echo strKey\nNext\n<\/PRE>\n<P>Child\u2019s play.<\/P>\n<P>Of course, you didn\u2019t necessarily want to <I>see<\/I> the unique names; you wanted to delete duplicate names from your text file. That\u2019s easy enough. Our Dictionary object now contains the unique names, with all the duplicate names left out. To remove the duplicate names from the file itself all we have to do is open that file (for writing this time) and replace the existing contents with our Dictionary keys. 
Because the Dictionary keys represent the unique names, writing those keys to the text file will give us a text file with unique names, no duplicates.<\/P>\n<P>Here\u2019s a modified script that writes the Dictionary keys back to the text file:<\/P><PRE class=\"codeSample\">Const ForReading = 1\nConst ForWriting = 2\n\nSet objDictionary = CreateObject(&quot;Scripting.Dictionary&quot;)\n\nSet objFSO = CreateObject(&quot;Scripting.FileSystemObject&quot;)\nSet objFile = objFSO.OpenTextFile _\n    (&quot;c:\\scripts\\namelist.txt&quot;, ForReading)\n\n' Add each name to the Dictionary only the first time it appears\nDo Until objFile.AtEndOfStream\n    strName = objFile.ReadLine\n    If Not objDictionary.Exists(strName) Then\n        objDictionary.Add strName, strName\n    End If\nLoop\n\nobjFile.Close\n\n' Reopen the same file for writing, which replaces its existing contents\nSet objFile = objFSO.OpenTextFile _\n    (&quot;c:\\scripts\\namelist.txt&quot;, ForWriting)\n\nFor Each strKey in objDictionary.Keys\n    objFile.WriteLine strKey\nNext\n\nobjFile.Close\n<\/PRE>\n<P>If you run the script and then open the text file you should see something like this:<\/P><PRE class=\"codeSample\">Ken Myer\nDean Tsaltas\nJonathan Haas\nSyed Abbas\nGail Erickson\nCarol Phillips\nDylan Miller\nKim Abercrombie\n<\/PRE>\n<P>See? We\u2019re down to one Dean Tsaltas. Now if we can just figure out a way to get rid of <I>that<\/I> Dean Tsaltas, then we\u2019d have something. (Just kidding, Dean.)<\/P><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hey, Scripting Guy! I have a text file that contains a bunch of names. How can I read through that text file and eliminate all the duplicate names?&#8212; MW Hey, MW. 
We\u2019re assuming you have a text file that looks something like this:Ken Myer Dean Tsaltas Jonathan Haas Ken Myer Dean Tsaltas Syed Abbas Gail [&hellip;]<\/p>\n","protected":false},"author":595,"featured_media":87096,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[3,4,14,5],"class_list":["post-70033","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-scripting","tag-scripting-guy","tag-scripting-techniques","tag-text-files","tag-vbscript"],"acf":[],"blog_post_summary":"<p>Hey, Scripting Guy! I have a text file that contains a bunch of names. How can I read through that text file and eliminate all the duplicate names?&#8212; MW Hey, MW. We\u2019re assuming you have a text file that looks something like this:Ken Myer Dean Tsaltas Jonathan Haas Ken Myer Dean Tsaltas Syed Abbas Gail [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/70033","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/users\/595"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/comments?post=70033"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/70033\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media\/87096"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media?parent=70033"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting
\/wp-json\/wp\/v2\/categories?post=70033"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/tags?post=70033"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}