October 27th, 2006

How Can I Delete Duplicate Items From an Array?


Hey, Scripting Guy! How can I delete duplicate items from an array?

— WR


Hey, WR. You know, if Microsoft has a weakness – and we’re not saying that it does, we’re just saying if it does – it’s this: we never make mistakes. (What’s that? What about Windows ME? That’s a weird name; couldn’t possibly be one of our products.)

Take arrays, for example. When someone at Microsoft creates an array it has exactly the right items in it; no more, no less. And that’s a bit of a problem: because our arrays are always perfect it never occurred to us to come up with an easy way to delete items from an array. After all, if you create a perfect array right off the bat then why would you ever need to delete anything from that array?

Of course, we also recognize that not everyone has a spotless record of success after success like we do. (Pardon? Microsoft Bob? No, sorry; never heard of him. Do you know what department he works in here?) Because of that, we do have workarounds for people who end up with duplicate items in their arrays. Can you delete duplicate items from an array? Sure. The process is a bit convoluted, but it’ll work.

Let’s show you a script that deletes duplicate items from an array named arrItems, then see if we can explain how it works:

Set objDictionary = CreateObject("Scripting.Dictionary")

arrItems = Array("a","b","b","c","c","c","d","e","e","e")

For Each strItem in arrItems
    If Not objDictionary.Exists(strItem) Then
        objDictionary.Add strItem, strItem
    End If
Next

intItems = objDictionary.Count - 1

ReDim arrItems(intItems)

i = 0

For Each strKey in objDictionary.Keys
    arrItems(i) = strKey
    i = i + 1
Next

For Each strItem in arrItems
    Wscript.Echo strItem
Next

As you can see, things start out in fairly straightforward fashion: in line 1 we simply create an instance of the Scripting.Dictionary object. (Why do we create a Dictionary object? Sit tight; we’ll explain that in a minute.) We then use this line of code to create an array named arrItems, an array that – alas – includes a bunch of duplicate items:

arrItems = Array("a","b","b","c","c","c","d","e","e","e")

As we intimated earlier, there really is no way – at least no simple, intuitive way – to delete items from an array, let alone to detect duplicate items within an array. Therefore we’re going to go a different route. What we decided to do here is take the items in the array and copy them into a Dictionary; we’re going to do that because it is possible to prevent duplicate items from being entered into a Dictionary. Once the Dictionary has been populated we’ll redimension our array and then (in essence) export the items in the Dictionary back to the array. Like we said, it’s a bit convoluted, but the net effect is what we’re looking for: when we’re all done the array arrItems will contain only unique values. No more duplicates.

Of course, all that hinges on us first getting the items in the array into the Dictionary. To do that we begin by setting up a For Each loop that loops through each and every item in the array:

For Each strItem in arrItems

Inside that loop we then use this line of code to determine whether the array item has already been added to the Dictionary:

If Not objDictionary.Exists(strItem) Then

It’s kind of crazy syntax, but If Not objDictionary.Exists(strItem) Then can be read like this: “If the array item does not already exist in the Dictionary then do the following.” And what is the following? This line of code, which adds the array item to the Dictionary, using the same value for both the Dictionary key and item:

objDictionary.Add strItem, strItem

Note. If you aren’t familiar with the Dictionary object, or with terminology like key and item, then check out this section of the Microsoft Windows 2000 Scripting Guide or the Sesame Script article titled, appropriately, The Dictionary Object.

That’s fine, but what if the item does exist in the Dictionary? No problem; in that case we simply loop around and tackle the next item in the array.

Let’s take a brief timeout to show you how this works. The first item in our array is a. When we check the Dictionary the first time through the loop, we won’t find a Dictionary key equal to a. Therefore, we add a to the Dictionary. We then loop around and repeat this process, this time with item b. That makes sense, right?

Now, notice that the third item in the array is also a b, a duplicate item. What happens when we encounter this duplicate item on our third time through the loop? Nothing; because b is already in the Dictionary we won’t add a second instance of b. (Actually we can’t add a second instance of b; the Dictionary object doesn’t allow duplicate keys.) Instead, we simply loop around and repeat this process with the fourth item in the array.

Etc.
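If you'd like to see for yourself that the Dictionary refuses duplicate keys, here is a minimal sketch. It is not part of the original script: it deliberately skips the Exists check and traps the resulting error. The variable names intError and strError are our own inventions for this illustration.

```vbscript
' Sketch: try to add the same key twice without checking Exists first.
Set objDictionary = CreateObject("Scripting.Dictionary")
objDictionary.Add "b", "b"

On Error Resume Next          ' Trap the error instead of halting the script
objDictionary.Add "b", "b"    ' Second Add with the same key fails
intError = Err.Number         ' Capture the details before they are cleared
strError = Err.Description
On Error GoTo 0               ' Restore normal error handling (this clears Err)

If intError <> 0 Then
    Wscript.Echo "Add failed: " & intError & " - " & strError
End If

Wscript.Echo "Keys in Dictionary: " & objDictionary.Count
```

The second Add should fail and the Dictionary should still hold just the one key, which is why the Exists check in our script is the polite way to handle duplicates.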

When we’re all done we’ll have a Dictionary containing the following keys:

a
b
c
d
e

Those are your unique items right there, WR, which means you could simply work with the Dictionary object at this point. Just for the heck of it, though, let’s talk about how you could reconfigure the array arrItems so that it contains just those 5 values.
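If working with the Dictionary itself is good enough, a shortcut along these lines might be all you need. This is a sketch rather than the approach the rest of this column walks through: the Keys method hands back a zero-based array of the keys, so you can assign that to a new variable (here the made-up name arrUnique) and skip the redimensioning altogether.

```vbscript
' Sketch: take the unique values straight from the Dictionary's Keys method.
Set objDictionary = CreateObject("Scripting.Dictionary")

For Each strItem In Array("a","b","b","c","c","c","d","e","e","e")
    If Not objDictionary.Exists(strItem) Then
        objDictionary.Add strItem, strItem
    End If
Next

arrUnique = objDictionary.Keys          ' Keys returns an array of the keys
strUnique = Join(arrUnique, ",")        ' Flatten the array for display

Wscript.Echo strUnique
Wscript.Echo "Highest index: " & UBound(arrUnique)
```

With the sample data this should echo a,b,c,d,e followed by a highest index of 4.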

In order to reconfigure arrItems we first need to redimension the array; that is, we need to reset arrItems so that it will hold only as many items as we have in the Dictionary. To do that we first use this line of code to determine the number of items in the Dictionary, minus 1:

intItems = objDictionary.Count - 1

Note. Why “minus 1?” Well, when you redimension an array you must specify the index number for the last allowable item. The first item in an array is always given the index number 0; that means that, in an array with 5 items, the last item will have an index number of 4 (the total number of items minus 1). Thus we take the total number of items and subtract 1; that gives us the index number that will be assigned to the last item in the array.

After assigning the variable intItems the number of items in the Dictionary (that is, the Dictionary Count) minus 1, we then use this line of code to redimension the array arrItems and, while we’re at it, delete all the existing data in the array:

ReDim arrItems(intItems)

In turn, that gives us an empty array that has reserved spots for 5 values. Now all we have to do is fill each of those 5 spots.
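To make the effect of ReDim a little more concrete, here is a small sketch that checks the array before and after. The use of UBound, IsEmpty, and ReDim Preserve is our addition for illustration; the script in this column uses only a plain ReDim.

```vbscript
' Sketch: what ReDim does to the size and contents of an array.
arrItems = Array("a","b","b","c","c","c","d","e","e","e")
intBefore = UBound(arrItems)            ' Highest index of the 10-item array
Wscript.Echo "Before ReDim, highest index: " & intBefore

ReDim arrItems(4)                       ' 5 slots, indexes 0 through 4; old data is discarded
intAfter = UBound(arrItems)
blnEmpty = IsEmpty(arrItems(0))         ' A fresh slot holds no value yet
Wscript.Echo "After ReDim, highest index: " & intAfter
Wscript.Echo "First slot empty? " & blnEmpty

arrItems(0) = "a"
ReDim Preserve arrItems(2)              ' Preserve keeps existing values while resizing
Wscript.Echo "After ReDim Preserve, first item: " & arrItems(0)
```

A plain ReDim wipes the slate clean, which is exactly what we want here; ReDim Preserve is the one to reach for when you need to resize an array without losing what is already in it.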

Is that going to be hard? No, not really. After assigning the value 0 to a counter variable named i we set up a For Each loop to loop through each of the keys in the Dictionary:

For Each strKey in objDictionary.Keys
    arrItems(i) = strKey
    i = i + 1
Next

Inside that loop we assign the first item in the array (remember, the first item has an index number of 0) the value of the first key in the Dictionary. We then increment our counter variable by 1, loop around, and repeat the process, this time assigning the second item in the array (index number 1) the value of the second key in the Dictionary. We simply repeat this process until each key in the Dictionary has been assigned a spot in the array.

Is that really going to work? You bet it will. But just in case you don’t believe us, we’ve tacked some code onto the end of the script that echoes back all the items in the array arrItems. Here’s what we get back when we execute that block of code:

a
b
c
d
e

Absolutely perfect. But, then again, what else would you expect from Microsoft?
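If you find yourself doing this a lot, the same idea could be wrapped in a function. What follows is only a sketch under a few assumptions: the function name RemoveDuplicates and its blnIgnoreCase parameter are hypothetical, and instead of redimensioning the original array it simply returns the array produced by the Dictionary's Keys method. By default the Dictionary compares keys case-sensitively, so b and B count as different items; setting CompareMode to vbTextCompare (which must be done while the Dictionary is still empty) treats them as the same.

```vbscript
' Sketch: a reusable wrapper around the Dictionary technique.
' RemoveDuplicates and blnIgnoreCase are illustrative names, not a built-in API.
Function RemoveDuplicates(arrSource, blnIgnoreCase)
    Dim objDictionary, strItem
    Set objDictionary = CreateObject("Scripting.Dictionary")

    If blnIgnoreCase Then
        ' Must be set before any keys are added
        objDictionary.CompareMode = vbTextCompare
    End If

    For Each strItem In arrSource
        If Not objDictionary.Exists(strItem) Then
            objDictionary.Add strItem, strItem
        End If
    Next

    RemoveDuplicates = objDictionary.Keys   ' Array of the unique values
End Function

arrMixed = Array("a","B","b","c","C")
strIgnoreCase = Join(RemoveDuplicates(arrMixed, True), ",")
strExactCase = Join(RemoveDuplicates(arrMixed, False), ",")

Wscript.Echo "Ignoring case: " & strIgnoreCase
Wscript.Echo "Exact case:    " & strExactCase
```

When case is ignored, the first spelling encountered is the one that is kept, so the sample data should come back as a,B,c; with exact matching all five items survive.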
