To box or not to Box? That is the question!

Sergey Tepliakov

Sergey

Discussions on reddit, hacker news.

Recently I’ve noticed that the Equal method from our ValueTuple (*) struct generates significant memory traffic (~1Gb). That was a bit of a surprise to me. This struct is well designed and was used pretty heavily in many performance critical scenarios. Here how the struct looks like:

public struct ValueTuple<TItem1, TItem2> : IEquatable<ValueTuple<TItem1, TItem2>>
{
    public TItem1 Item1 { get; }
    public TItem2 Item2 { get; }

    public ValueTuple(TItem1 item1, TItem2 item2)
        : this()
    {
        Item1 = item1;
        Item2 = item2;
    }

    public override int GetHashCode()
    {
        // Real implementation is a bit more complicated. Not a simple XOR
        return EqualityComparer<TItem1>.Default.GetHashCode(Item1) ^
            EqualityComparer<TItem2>.Default.GetHashCode(Item2);
    }

    public bool Equals(ValueTuple<TItem1, TItem2> other)
    {
        return (Item1 != null && Item1.Equals(other.Item1)) &&
                (Item2 != null && Item2.Equals(other.Item2));
    }

    public override bool Equals(object obj)
    {
        return obj is ValueTuple<TItem1, TItem2> &&
            Equals((ValueTuple<TItem1, TItem2>)obj);
    }
    // Other members, like equality operators are omitted for brevity
}

(*) Our ValueTuple was implemented way before the one that was shipped with VS 2017, and our implementation is immutable. So even today we can’t switch to the implementation from the .NET Framework.

Nothing special. Methods Equals and GetHashCode are overriden to avoid boxing allocation (more on this later in the post). Null check to avoid NullReferenceExcecption if Item1 or Item2 are reference types. Null check potentially could have caused a boxing allocation if the JIT would have leave a check for null if Item1 or Item2 are structs. But luckily it is smart enough to eliminate the check in this case. Everything is simple and straightforward. Right? Almost.

In our case the memory traffic was generated not for every ValueTuple case, but only for a specific one: HashSet<ValueTuple<int, MyEnum>>. Ok. This helps a bit. Right?

Let’s see what will happen when the Equals method is called to compare two instances of ValueTuple<int, MyEnm>> type:

// Compiler's view of the world
public bool Equals(ValueTuple<int, MyEnum> rhs)
{
    return Item1.Equals(rhs.Item1) && Item2.Equals(rhs.Item2);
}

For the Item1 we’d have a call to int.Equals(int), and for Item2 we’d have a call to MyEnum.Equals(MyEnum) method. For the first case nothing special will happen, but for the second case the method call will cause a boxing allocation!

“When” and “Why” boxing happens?

We used to think that boxing happens only when an instance of a value type is explicitly or implicitly converts to an object or to an interface:

int n = 42;
object o = n; // Boxing
IComparable c = n; // Boxing

But the reality is slightly more complicated. The JIT and the CLR will have to box an instance in some other cases: for instance, to invoke a method defined in “reference type” section of a value type.

All custom structs are implicitly sealed and derived from the special class: System.ValueType. All value types have “value semantic” meaning that the object’s default referential equality is no longer suitable for them. To enforce the new semantic System.ValueType provides a special implementation for two methods: GetHashCode and Equals.

But default implementation has two problems: the performance is very bad (**) because it may use reflection and boxing will happen for every method invocation.

(**) The performance of the default implementation of ValueType.Equals and ValueType.GetHashCode can vary significantly based on the structure of a given value type. If the struct doesn’t have pointers and is “packed” correctly, then bit-wise comparison is possible. Otherwise, reflection will be used that will cause a drastic performance degradation. See CanCompareBits implementation at the coreclr repo.

The first issue is pretty well-known but the other one is more subtle: if the struct does not override a method, boxing will happen!

struct MyStruct
{
    public int N { get; }
 
    public override int GetHashCode() => N.GetHashCode();
 
    public override bool Equals(object obj)
    {
        return obj is MyStruct && Equals((MyStruct)obj);
    }
 
    public bool Equals(MyStruct other) => N == other.N;
}
 

var myStruct = new MyStruct();
 
// no boxing: MyStruct overrides GetHashCode
var hc = myStruct.GetHashCode();
 
// no boxing: MyStruct overrides Equals
var equality = myStruct.Equals(myStruct);
 
// boxing: MyStruct does not override ToString
var s = myStruct.ToString();
 
// boxing: GetType is not virtual
var t = myStruct.GetType();

In the example above, the boxing will not happen in the first 2 cases, but will happen in the last two. Call to a method, defined in System.ValueType (like ToString, and GetType in this case) will cause boxing and call to an overridden method (like Equals and GetHashCode) – will not.

Now let’s switch back to our example with ValueTuple<int, MyEnum>. User-defined enums are value types without an ability to override GetHashCode and Equals, and this means that every call to MyEnum.GetHashCode or MyEnum.Equals will cause an allocation.

But can we avoid it? Yes, by using EqualityComparer<T>.Default instance.

How EqualityComparer avoids allocations?

Let’s get slightly simplified example and compare two ways of comparing enum values (pun intended): using Equals method and using EqualityComparer<T>.Default:

var s1 = new ValueTuple<int, MyEnum>(42, MyEnum.Foo);
var s2 = new ValueTuple<int, MyEnum>(42, MyEnum.Bar);
 
// Will cause a boxing allocation
bool r = s1.Equals(s2);

Let’s use BenchmarkDotNet to prove that the first case causes an allocation and the other one does not (to avoid iterator allocation I’m using a simple foreach loop and not something like Enumerable.Any or Enumerable.Contains):

[MemoryDiagnoser]
public class EnumComparisonBenchmark
{
    public MyEnum[] values = 
            Enumerable.Range(1, 1_000_000)
              .Select(n => MyEnum.Foo).ToArray();
 
    public EnumComparisonBenchmark()
    {
        values[values.Length - 1] = MyEnum.Bar;
    }
 
    [Benchmark]
    public bool UsingEquals()
    {
        foreach(var n in values)
        {
            if (n.Equals(MyEnum.Bar)) return true;
        }
        return false;
    }
 
    [Benchmark]
    public bool UsingEqualityComparer()
    {
            var comparer = EqualityComparer<MyEnum>.Default;
        foreach (var n in values)
        {
            if (comparer.Equals(n, MyEnum.Bar)) return true;
        }
        return false;
    }
}
                Method |      Mean |      Gen 0 |  Allocated |
---------------------- |----------:|-----------:|-----------:|
           UsingEquals | 13.300 ms | 15195.9459 | 48000597 B |
 UsingEqualityComparer |  4.659 ms |          - |       58 B |

As we can see, the regular Equals call causes a lot of allocations for no reason. Moreover, EqualityComparer is way faster, although in my case I haven’t seen any difference after I’ve switched the implementation to use EqualityComparer. The main question is: how does EqualityComparer do this?

EqualityComparer is an abstract class that provides the most suitable comparer based on a given type argument via EqualityComparer<T>.Default property. Main logic resides in method ComparerHelpers.CreateDefaultEqualityComparer and in case of enums it delegates it to another helper method – TryCreateEnumEqualityComparer. The last method then checks the underlying type of the enum and creates special comparer instance that does some nasty tricks:

[Serializable]
internal class EnumEqualityComparer<T> 
    : EqualityComparer<T> where T : struct
{
    [Pure]
    public override bool Equals(T x, T y)
    {
        int x_final = System.Runtime.CompilerServices
           .JitHelpers.UnsafeEnumCast(x);
        int y_final = System.Runtime.CompilerServices
           .JitHelpers.UnsafeEnumCast(y);
        return x_final == y_final;
    }
 
    [Pure]
    public override int GetHashCode(T obj)
    {
        int x_final = System.Runtime.CompilerServices
           .JitHelpers.UnsafeEnumCast(obj);
        return x_final.GetHashCode();
    }
}

EnumEqualityComparer converts a nenum instance to its underlying numeric value using JitHelpers.UnsafeEnumCast with a following comparison of two numbers.

So, what was the final fix?

The fix was very simple: instead of comparing values using Item1.Equals we’ve switched to EqualityComparer<T>.Default.Equals(Item1, other.Item1).

Additional References

·        Remove usage of EqualityComparer from ValueTuple’s GetHashCode.

·        GetHashCode should not box enums in generic methods.

·        Optimize calls to enum’s GetHashCode/Equals.

0 comments

Leave a comment