Since .NET 4’s release, I’ve received several questions about a peculiar behavior of ConcurrentQueue<T> having to do with memory management.
With Queue<T>, List<T>, and other such data structures in the .NET Framework, when you remove an element from the collection, the collection internally wipes out its reference to the stored element, e.g.
class MyQueue<T>
{
private T[] m_data;
private int m_head;
…
public T Dequeue()
{
…
T value = m_data[m_head];
m_data[m_head] = default(T);
m_head = (m_head + 1) % m_data.Length;
…
return value;
}
}
This ensures that the data structure won’t hold on to the dequeued data in the structure after the data has been removed, such that if the consumer also drops all references, any referenced data can be garbage collected.
This behavior is intuitive, which is a key reason it’s concerned some folks that ConcurrentQueue<T> in .NET 4 doesn’t behave exactly like this. For its implementation in .NET 4 (as with all discussions of implementation, these details can change in the future), ConcurrentQueue<T> is made up of a linked list of array segments, each of which can store up to 32 elements. As elements are enqueued, they fill up the tail array segment, and when it’s full, a new tail segment is tacked on to the linked list. When elements are dequeued, they’re taken from the head segment, and when all elements have been taken from the segment, the segment is removed from the linked list and all references to it dropped. Since the queue only stores a reference to an enqueued item in one segment, dropping that segment removes all references to the enqueued item (for the purposes of this description, if the same item is enqueued twice, I’m treating that as two distinct items). However, multiple items are stored per segment, so dequeuing an element doesn’t cause the segment to be removed from the linked list unless the element is the last remaining one in the segment. And in .NET 4, ConcurrentQueue<T> doesn’t null out individual items in the segment as they’re dequeued. All of this means that the queue might hold on to the last <= 31 dequeued elements.
If the elements are small, you’ll probably never notice this. If, however, the elements hold on to large resources (e.g. each element is a huge image bitmap), it’s possible you could see the impact of this (one workaround is to queue a wrapper object, e.g. have a ConcurrentQueue<StrongBox<T>> rather than a ConcurrentQueue<T>, and null out the wrapper’s reference to the T value after the wrapper has been dequeued).
For better or worse, this behavior in .NET 4 is actually “by design.” The reason for this has to do with enumeration semantics. ConcurrentQueue<T> provides “snapshot semantics” for enumeration, meaning that the instant you start enumerating, ConcurrentQueue<T> captures the current head and tail of what’s currently in the queue, and even if those elements are dequeued after the capture or if new elements are enqueued after the capture, the enumeration will still return all of and only what was in the queue at the time the enumeration began. If elements in the segments were to be nulled out when they were dequeued, that would impact the veracity of these enumerations.
For .NET 4.5, we’ve changed the design to strike what we believe to be a good balance. Dequeued elements are now nulled out as they’re dequeued, unless there’s a concurrent enumeration happening, in which case the element isn’t nulled out and the same behavior as in .NET 4 is exhibited. So, if you never enumerate your ConcurrentQueue<T>, dequeues will result in the queue immediately dropping its reference to the dequeued element. Only if when the dequeue is issued someone happens to be enumerating the queue (i.e. having called GetEnumerator on the queue and not having traversed the enumerator or disposed of it yet) will the null’ing out not happen; as with .NET 4, at that point the reference will remain until the containing segment is removed.
0 comments