RSS 2.0
Sign In
# Monday, 13 January 2014

Dealing recently with some task (the same that inspired us to implement WeakTable), we were in a position to use a dictionary as a key in another dictionary.

What are the rules for the class to be used as key:

  • key should be immutable;
  • key should implement a GetHashCode() method;
  • key should implement a Equals() method.

The first requirement is usually implemented as a documentation contract like this:

As long as an object is used as a key in the Dictionary<TKey, TValue>, it must not change in any way that affects its hash value.

The third requirement about equals is trivially implemented as a method:

public bool Equals(IDictionary<K, V> x, IDictionary<K, V> y)
{
  if (x == y)
  {
    return true;
  }

  if ((x == null) || (y == null) || (x.Count != y.Count))
  {
    return false;
  }

  foreach(var entry in x)
  {
    V value;

    if (!y.TryGetValue(entry.Key, out value) ||
      !valueComparer.Equals(entry.Value, value))
    {
      return false;
    }
  }

  return true;
}

But how would you implement hash code?

We argued like this.

1. Let's consider the dictionary as a sparse array of values with only populated items that correspond to key hash codes.

2. Hash code is constructed using some fair algorithm. E.g like that used in java to calculate string's hash code:

       n-1
h(s) = SUM (s[i]*p^(n-1-i)) mod m,  where m = 2^31
       i=0

In our case:

  • n can be arbitrary large int value, so in fact it's 2^32;
  • items are enumerated in unknown order;
  • there is only limited set of items, so most s[i] are zeros.

As result we cannot use recurrent function to calculate a power p^k mod m. Fortunately one can build fast exponentiation arguing like this:

        32/s - 1
p^k = p^  SUM  2^((s*i)*k[i]) mod m,  where s some int: 1, 2, 4, 8, 16, or 32.
          i=0

Thus

     32/s - 1
p^k = PRODUCT (p^(2^(s*i)))^k[i] mod m
        i=0

If s = 1 then k[i] is either 1 or 0 (a bit), and there is 32 different p^(2^i) mod m values, which can be precalculated.

On the other hand, if we select s = 8 we can write the formula as:

p^k = p^k[0] * (p^(2^8))^k[1] * (p^(2^16))^k[2] * (p^(2^24))^k[3] mod m

where k[i] is a 8 bit value (byte).

Precalculating all values p^n, (p^(2^8))^n, (p^(2^16))^n, (p^(2^24))^n for n in 0 to 255 we reach the formula with 4 multiplications and with 1024 precalculated values.

Here is the whole utility to calculate hash factors:

/// <summary>
/// Hash utilities.
/// </summary>
public class Hash
{
  /// <summary>
  /// Returns a P^value mod 2^31, where P is hash base.
  /// </summary>
  /// <param name="value">A value to get hash factor for.</param>
  /// <returns>A hash factor value.</returns>
  public static int GetHashFactor(int value)
  {
    return factors[(uint)value & 0xff] *
      factors[(((uint)value >> 8) & 0xff) | 0x100] *
      factors[(((uint)value >> 16) & 0xff) | 0x200] *
      factors[(((uint)value >> 24) & 0xff) | 0x300];
  }

  /// <summary>
  /// Initializes hash factors.
  /// </summary>
  static Hash()
  {
    var values = new int[4 * 256];
    var value = P;
    var current = 1;
    var i = 0;

    do
    {
      values[i++] = current;
      current *= value;
    }
    while(i < 256);

    value = current;
    current = 1;

    do
    {
      values[i++] = current;
      current *= value;
    }
    while(i < 512);

    value = current;
    current = 1;

    do
    {
      values[i++] = current;
      current *= value;
    }
    while(i < 768);

    value = current;
    current = 1;

    do
    {
      values[i++] = current;
      current *= value;
    }
    while(i < 1024);

    factors = values;
  }

  /// <summary>
  /// A base to calculate hash factors.
  /// </summary>
  public const int P = 1103515245;

  /// <summary>
  /// Hash factors.
  /// </summary>
  private static readonly int[] factors;
}

With this API hash code for a dictionary is a trivial operation:

public int GetHashCode(IDictionary<K, V> dictionary)
{
  if (dictionary == null)
  {
    return 0;
  }

  var result = 0;

  foreach(var entry in dictionary)
  {
    if ((entry.Key == null) || (entry.Value == null))
    {
      continue;
    }

    result += Hash.GetHashFactor(keyComparer.GetHashCode(entry.Key)) *
      valueComparer.GetHashCode(entry.Value);
  }

  return result;
}

And finally, here is a reference to a class DictionaryEqualityComparer<K, V>: IEqualityComparer<IDictionary<K, V>> that allows a dictionary to be a key in another dictionary.

Update

We have commited some tests, and have found that with suffiently "good" implementation of GetHashCode() of key or value we achieve results almost of the same quality, as the results of the algorithm we have outlined above with much simpler and straightforward algorithm like this:

public int GetHashCode(IDictionary<K, V> dictionary)
{
  if (dictionary == null)
  {
    return 0;
  }

  var result = 0;

  foreach(var entry in dictionary)
  {
    if ((entry.Key == null) || (entry.Value == null))
    {
      continue;
    }

    var k = entry.Key.GetHashCode();
    var v = entry.Value.GetHashCode();

    k = (k << 5) + k;
    v = (v << (k >> 3)) + v;

    result += k ^ v;

    //result += Hash.GetHashFactor(keyComparer.GetHashCode(entry.Key)) *
    // valueComparer.GetHashCode(entry.Value);
  }

  return result;
}

It was worth to blog about this just to find out that we have outwitted ourselves, and finally to reach to a trivial hash code implementation for the dictionary.

Monday, 13 January 2014 20:33:31 UTC  #    Comments [0] -
.NET | Thinking aloud | Tips and tricks
All comments require the approval of the site owner before being displayed.
Name
E-mail
Home page

Comment (Some html is allowed: a@href@title, b, blockquote@cite, em, i, strike, strong, sub, super, u) where the @ means "attribute." For example, you can use <a href="" title=""> or <blockquote cite="Scott">.  

[Captcha]Enter the code shown (prevents robots):

Live Comment Preview
Archive
<2014 January>
SunMonTueWedThuFriSat
2930311234
567891011
12131415161718
19202122232425
2627282930311
2345678
Statistics
Total Posts: 387
This Year: 3
This Month: 0
This Week: 0
Comments: 1877
Locations of visitors to this page
Disclaimer
The opinions expressed herein are our own personal opinions and do not represent our employer's view in anyway.

© 2024, Nesterovsky bros
All Content © 2024, Nesterovsky bros
DasBlog theme 'Business' created by Christoph De Baene (delarou)