Mikey S. Mikey S. - 4 months ago 23
C# Question

Efficiency of using IEqualityComparer in Dictionary vs HashCode and Equals()

The title is pretty much clear I think.

I was wondering if there's a certain efficiency overhead when using

IEqualityComparer
in a
Dictionary<K,V>
how does it all work when providing one?

Thanks

Answer

Is it Faster?

Coming from a gamedev perspective, if your key is a value type (struct, primitive, enum, etc.) providing your own EqualityComparer<T> is significantly faster - due to the fact the EqualityComparer<T>.Default boxes the value.

As a real-world example, the Managed DirectX billboard sample used to run at ~30% of the speed of the C++ version; where all the other samples were running at ~90%. The reason for this was that the billboards were being sorted using the default comparer (and thus being boxed), as it turns out 4MB of data was being copied around every frame thanks to this.

How does it work?

Dictionary<K,V> will provide EqualityComparer<T>.Default to itself via the default constructor. What the default equality comparer does is (basically, notice how much boxing occurs):

public void GetHashCode(T value)
{
   return ((object)value).GetHashCode();
}

public void Equals(T first, T second)
{
   return ((object)first).Equals((object)second);
}

Why would I ever use it?

It's quite common to see this kind of code (when trying to have case-insensitive keys):

var dict = new Dictionary<string, int>();
dict.Add(myParam.ToUpperInvariant(), fooParam);
// ...
var val = dict[myParam.ToUpperInvariant()];

This is really wasteful, it is better to just use a StringComparer on the constructor:

var dict = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

Is it faster (redux)?

In this specific scenario it is a lot faster, because ordinal string comparisons are the fastest type of string comparison you can do. A quick benchmark:

static void Main(string[] args)
{
    var d1 = new Dictionary<string, int>();
    var d2 = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

    d1.Add("FOO", 1);
    d2.Add("FOO", 1);

    Stopwatch s = new Stopwatch();
    s.Start();
    RunTest1(d1, "foo");
    s.Stop();
    Console.WriteLine("ToUpperInvariant: {0}", s.Elapsed);

    s.Reset();
    s.Start();
    RunTest2(d2, "foo");
    s.Stop();
    Console.WriteLine("OrdinalIgnoreCase: {0}", s.Elapsed);

    Console.ReadLine();
}

static void RunTest1(Dictionary<string, int> values, string val)
{
    for (var i = 0; i < 10000000; i++)
    {
        values[val.ToUpperInvariant()] = values[val.ToUpperInvariant()];
    }
}

static void RunTest2(Dictionary<string, int> values, string val)
{
    for (var i = 0; i < 10000000; i++)
    {
        values[val] = values[val];
    }
}

// ToUpperInvariant: 00:00:04.5084119
// OrdinalIgnoreCase: 00:00:02.1211549
// 2x faster.

Reservations

It is possible to eliminate the boxing overhead by implementing an interface on a struct (such as IEquatable<T>). However, there are many surprising rules for when boxing occurs under these circumstances so I would recommend using the paired interface (e.g. IEqualityComparer<T> in this case) if at all possible.