{"id":55193,"date":"2008-07-29T01:40:00","date_gmt":"2008-07-29T01:40:00","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/heyscriptingguy\/2008\/07\/29\/hey-scripting-guy-how-can-i-compare-files-with-the-same-name-but-two-different-extensions\/"},"modified":"2008-07-29T01:40:00","modified_gmt":"2008-07-29T01:40:00","slug":"hey-scripting-guy-how-can-i-compare-files-with-the-same-name-but-two-different-extensions","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/scripting\/hey-scripting-guy-how-can-i-compare-files-with-the-same-name-but-two-different-extensions\/","title":{"rendered":"Hey, Scripting Guy! How Can I Compare Files With the Same Name But Two Different Extensions?"},"content":{"rendered":"<p><span class=\"Apple-style-span\"><\/p>\n<h2><font class=\"Apple-style-span\" face=\"Verdana\" size=\"3\"><span class=\"Apple-style-span\"><font class=\"Apple-style-span\" face=\"Arial\" size=\"6\"><span class=\"Apple-style-span\"><\/span><\/font><\/span><\/font><\/h2>\n<p><img decoding=\"async\" class=\"nearGraphic\" title=\"Hey, Scripting Guy! Question\" height=\"34\" alt=\"Hey, Scripting Guy! Question\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/q-for-powertip.jpg\" width=\"34\" align=\"left\" border=\"0\" \/> <\/p>\n<p>Hey, Scripting Guy! After doing several restores we have folders with files named name.doc AND name.do (without &#8220;C&#8221;). How can we identify such pairs of files and do something to them based on a comparison?<\/p>\n<p>&#8212; TN<\/p>\n<p><img decoding=\"async\" height=\"5\" alt=\"Spacer\" src=\"https:\/\/devblogs.microsoft.com\/scripting\/wp-content\/uploads\/sites\/29\/2019\/05\/spacer.gif\" width=\"5\" border=\"0\" \/><img decoding=\"async\" class=\"nearGraphic\" title=\"Hey, Scripting Guy! Answer\" height=\"34\" alt=\"Hey, Scripting Guy! 
Answer\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/29\/2019\/02\/a-for-powertip.jpg\" width=\"34\" align=\"left\" border=\"0\" \/><a href=\"http:\/\/go.microsoft.com\/fwlink\/?linkid=68779&amp;clcid=0x409\"><\/a> <\/p>\n<p>Hi TN,<\/p>\n<p>Scripts like the one we\u2019ll develop in this article are the best tools I know of for dealing with this sort of problem. Dealing with something like this manually just isn\u2019t feasible. If the initial time investment isn\u2019t costly enough, the resulting therapy bills will be. That said, be careful when you run these scripts. Make sure you test them on some representative samples of your \u2018problem\u2019 before you unleash them on entire servers. If you aren\u2019t careful, you could easily worsen your situation.<\/p>\n<p>With that disclaimer out of the way, here\u2019s a script that provides a framework for taking an action on pairs of files with the same name, but different extensions.<\/p>\n<pre class=\"codeSample\">strComputer = \".\"\nSet objWMIService = GetObject(\"winmgmts:\" _\n    &amp; \"{impersonationLevel=impersonate}!\\\\\" &amp; strComputer &amp; \"\\root\\cimv2\")\n\nSet colFileList = objWMIService.ExecQuery _\n    (\"ASSOCIATORS OF {Win32_Directory.Name='C:\\Scripts\\HSG\\July28\\A'} Where \" _\n        &amp; \"ResultClass = CIM_DataFile\")\n\nNumberOfFiles = colFileList.Count\n\nFor i=0 to NumberOfFiles-1\n   If i+1 &lt;= NumberOfFiles-1 Then\n      If colFileList.ItemIndex(i).FileName = colFileList.ItemIndex(i+1).FileName Then\n          DealWithDups colFileList.ItemIndex(i), colFileList.ItemIndex(i+1)\n      End If\n   End If\nNext\n\nSub DealWithDups (objFile1, objFile2)\n      strFile1 = objFile1.FileName &amp; \".\" &amp; objFile1.Extension\n      strFile2 = objFile2.FileName &amp; \".\" &amp; objFile2.Extension\n      WScript.Echo strFile1 &amp; \":\" &amp; strFile2\nEnd Sub<\/pre>\n<p>This first section of the script is just the boilerplate code required to get started working with WMI.<\/p>\n<pre 
class=\"codeSample\">strComputer = \".\"\nSet objWMIService = GetObject(\"winmgmts:\" _\n    &amp; \"{impersonationLevel=impersonate}!\\\\\" &amp; strComputer &amp; \"\\root\\cimv2\")<\/pre>\n<p>The next section of code uses the Win32_Directory WMI class. It is an association query that retrieves a collection of instances of CIM_DataFile corresponding to all the files in the specified directory: C:\\Scripts\\HSG\\July28\\A. That last sentence is a terrible mouthful. Basically, this section of code enables us to work on all the files in a directory. What we need to work with the files is stored in the variable named colFileList.<\/p>\n<pre class=\"codeSample\">Set colFileList = objWMIService.ExecQuery _\n    (\"ASSOCIATORS OF {Win32_Directory.Name='C:\\Scripts\\HSG\\July28\\A'} Where \" _\n        &amp; \"ResultClass = CIM_DataFile\")<\/pre>\n<p>It\u2019s in the next section of code that we introduce the logic that identifies our pairs.<\/p>\n<pre class=\"codeSample\">NumberOfFiles = colFileList.Count\n\nFor i=0 to NumberOfFiles-1\n   If i+1 &lt;= NumberOfFiles-1 Then\n      If colFileList.ItemIndex(i).FileName = colFileList.ItemIndex(i+1).FileName Then\n          DealWithDups colFileList.ItemIndex(i), colFileList.ItemIndex(i+1)\n      End If\n   End If\nNext<\/pre>\n<p>Instead of using our tried-and-true For Each loop, we use a For loop. We do this so that we can use indexes i and i+1 to access two files in the colFileList collection each time through the loop. So, we end up stepping through each file in the folder, always accessing the file &#8220;to its right&#8221; as well. Since the files are returned in alphabetical order (WMI does not guarantee an order, but on NTFS volumes the files typically come back sorted by name), any files that have the same name but differing extensions will be beside each other in the collection. 
For example, suppose you have these files in the collection:<\/p>\n<pre class=\"codeSample\">(a.doc, a.do, b.doc, c.doc, d.doc, d.do, e.doc, f.doc)<\/pre>\n<p>Then (a.doc, a.do) will be compared, as will (a.do, b.doc), (b.doc, c.doc), (c.doc, d.doc), (d.doc, d.do), (d.do, e.doc) and (e.doc, f.doc). We probably shouldn\u2019t expect&nbsp;<a href=\"http:\/\/www-cs-staff.stanford.edu\/~uno\/\">Professor Knuth<\/a>&nbsp;to send us a check for $2.56 in celebration of the elegance of this algorithm. But it does seem to get the job done.<\/p>\n<p>This If statement ensures that we don\u2019t try to test the last file in the collection against the one to the right of it (since there isn\u2019t one to the right of it).<\/p>\n<pre class=\"codeSample\">If i+1 &lt;= NumberOfFiles-1 Then<\/pre>\n<p>Then, this line of code determines whether the names of the pair of files being examined are the same.<\/p>\n<pre class=\"codeSample\">If colFileList.ItemIndex(i).FileName = colFileList.ItemIndex(i+1).FileName Then<\/pre>\n<p>Many scripters are familiar with WMI classes and how to look up properties of a class. In this case, we\u2019re using CIM_DataFile (look back at the query involving Win32_Directory) and, once you know that, it\u2019s straightforward to look up&nbsp;<a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/aa387236.aspx\">the properties of that class<\/a>&nbsp;and find that FileName is the one you need. Less straightforward is how to figure out that you need to use ItemIndex to access individual files in the collection.<\/p>\n<p>You have to first recognize that colFileList was populated as a result of a call to ExecQuery. Looking up the&nbsp;<a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/aa393866(VS.85).aspx\">reference information<\/a>&nbsp;for this method in the WMI Scripting API Objects reference, we find that it returns something called an&nbsp;<a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/aa393762(VS.85).aspx\">SWbemObjectSet<\/a>. 
And, looking at the documentation for SWbemObjectSet, you find out that ItemIndex is the method you need to call if you want to index into the collection using an integer index. Yes, it\u2019s a little difficult to follow. You have WMI classes on one hand and the WMI Scripting Object library on the other; so there are two different sets of WMI reference documentation that you need to use. Of course, you might also need to look up VBScript information as well. Geesh, this scripting stuff is almost like hard work.<\/p>\n<p>OK, so once you\u2019ve done the comparison to see if you\u2019re dealing with a pair of files that have the same name (they must have different extensions&#8211;you can\u2019t have two files named exactly the same thing in a single folder), you can then take some action on them. You might want to delete the oldest or decide which to archive based on some other criteria. The Hey, Scripting Guy&nbsp;<a href=\"http:\/\/www.microsoft.com\/technet\/scriptcenter\/resources\/qanda\/files.mspx\">archives<\/a>&nbsp;of issues dealing with files provide a good starting point to give you some idea of what\u2019s possible.<\/p>\n<p>In this article, we\u2019re going to just set things up by creating a subroutine called DealWithDups where you can add whatever sort of processing you might want. Here\u2019s the line of code that calls our subroutine, passing file objects that represent each member of the pair:<\/p>\n<pre class=\"codeSample\">DealWithDups colfilelist.ItemIndex(i), colfilelist.ItemIndex(i+1)<\/pre>\n<p>And here\u2019s the subroutine. 
In this case we are just rebuilding the complete name of each file and displaying them both on the screen, separated by a colon.<\/p>\n<pre class=\"codeSample\">Sub DealWithDups (objFile1, objFile2)\n      strFile1 = objFile1.FileName &amp; \".\" &amp; objFile1.Extension\n      strFile2 = objFile2.FileName &amp; \".\" &amp; objFile2.Extension\n      WScript.Echo strFile1 &amp; \":\" &amp; strFile2\nEnd Sub<\/pre>\n<p>This lets you test that the pairs are being correctly identified. It also forces you to follow the advice we started out with&#8211;to test things before running scripts of this sort. Running the above script against a test folder will show you which pairs of files objFile1 and objFile2 actually point to.<\/p>\n<p>Once you\u2019re confident that the identified pairs are the ones you want to act on, you can add logic similar to the following to actually take action. This deletes the larger of the two files; if they are the same size, neither is touched. The CDbl calls are there because WMI hands FileSize to a script as a string, and comparing the raw strings would rank &#8220;9&#8221; above &#8220;10&#8221;.<\/p>\n<pre class=\"codeSample\">Sub DealWithDups (objFile1, objFile2)\n   ' FileSize arrives as a string, so convert to a number before comparing\n   If CDbl(objFile1.FileSize) &gt; CDbl(objFile2.FileSize) Then\n      objFile1.Delete\n   ElseIf CDbl(objFile2.FileSize) &gt; CDbl(objFile1.FileSize) Then\n      objFile2.Delete\n   End If\nEnd Sub<\/pre>\n<p>Now, this only works on pairs of duplicates. What if you have triplets? What if you don\u2019t know ahead of time what you have? Hey, TN only asked for pairs. If you want to get fancy, you\u2019ve come to the wrong place.<\/p>\n<p>Maybe try&nbsp;<a href=\"http:\/\/www-cs-staff.stanford.edu\/~uno\/\">Professor Knuth<\/a>.<\/p>\n<p><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hey, Scripting Guy! After doing several restores we have folders with files named name.doc AND name.do (without &#8220;C&#8221;). How can we identify such pairs of files and do something to them based on a comparison? 
&#8212; TN Hi TN, Scripts like the one we\u2019ll develop in this article are the best tools I know of [&hellip;]<\/p>\n","protected":false},"author":595,"featured_media":87096,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[38,3,12,5],"class_list":["post-55193","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-scripting","tag-files","tag-scripting-guy","tag-storage","tag-vbscript"],"acf":[],"blog_post_summary":"<p>Hey, Scripting Guy! After doing several restores we have folders with files named name.doc AND name.do (without &#8220;C&#8221;). How can we identify such pairs of files and do something to them based on a comparison? &#8212; TN Hi TN, Scripts like the one we\u2019ll develop in this article are the best tools I know of [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/55193","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/users\/595"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/comments?post=55193"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/posts\/55193\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media\/87096"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/media?parent=55193"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/categories?post=55193"},{"taxonom
y":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/scripting\/wp-json\/wp\/v2\/tags?post=55193"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}