Dealing recently with some task (the same that inspired us to implement WeakTable), we were in a position to use a dictionary as a key in another dictionary.
What are the rules for the class to be used as key:
GetHashCode()
Equals()
The first requirement is usually implemented as a documentation contract like this:
As long as an object is used as a key in the Dictionary<TKey, TValue>, it must not change in any way that affects its hash value.
Dictionary<TKey, TValue>
The third requirement about equals is trivially implemented as a method:
public bool Equals(IDictionary<K, V> x, IDictionary<K, V> y) { if (x == y) { return true; } if ((x == null) || (y == null) || (x.Count != y.Count)) { return false; } foreach(var entry in x) { V value; if (!y.TryGetValue(entry.Key, out value) || !valueComparer.Equals(entry.Value, value)) { return false; } } return true; }
But how would you implement hash code?
We argued like this.
1. Let's consider the dictionary as a sparse array of values with only populated items that correspond to key hash codes.
2. Hash code is constructed using some fair algorithm. E.g like that used in java to calculate string's hash code:
n-1 h(s) = SUM (s[i]*p^(n-1-i)) mod m, where m = 2^31 i=0
In our case:
n
int
2^32
s[i]
As result we cannot use recurrent function to calculate a power p^k mod m. Fortunately one can build fast exponentiation arguing like this:
p^k mod m.
32/s - 1 p^k = p^ SUM 2^((s*i)*k[i]) mod m, where s some int: 1, 2, 4, 8, 16, or 32. i=0
Thus
32/s - 1 p^k = PRODUCT (p^(2^(s*i)))^k[i] mod m i=0
If s = 1 then k[i] is either 1 or 0 (a bit), and there is 32 different p^(2^i) mod m values, which can be precalculated.
s = 1
k[i]
1
0
32
p^(2^i) mod m
On the other hand, if we select s = 8 we can write the formula as:
s = 8
p^k = p^k[0] * (p^(2^8))^k[1] * (p^(2^16))^k[2] * (p^(2^24))^k[3] mod m
where k[i] is a 8 bit value (byte).
8
Precalculating all values p^n, (p^(2^8))^n, (p^(2^16))^n, (p^(2^24))^n for n in 0 to 255 we reach the formula with 4 multiplications and with 1024 precalculated values.
p^n, (p^(2^8))^n
(p^(2^16))^n
(p^(2^24))^n
255
4
1024
Here is the whole utility to calculate hash factors:
/// <summary> /// Hash utilities. /// </summary> public class Hash { /// <summary> /// Returns a P^value mod 2^31, where P is hash base. /// </summary> /// <param name="value">A value to get hash factor for.</param> /// <returns>A hash factor value.</returns> public static int GetHashFactor(int value) { return factors[(uint)value & 0xff] * factors[(((uint)value >> 8) & 0xff) | 0x100] * factors[(((uint)value >> 16) & 0xff) | 0x200] * factors[(((uint)value >> 24) & 0xff) | 0x300]; } /// <summary> /// Initializes hash factors. /// </summary> static Hash() { var values = new int[4 * 256]; var value = P; var current = 1; var i = 0; do { values[i++] = current; current *= value; } while(i < 256); value = current; current = 1; do { values[i++] = current; current *= value; } while(i < 512); value = current; current = 1; do { values[i++] = current; current *= value; } while(i < 768); value = current; current = 1; do { values[i++] = current; current *= value; } while(i < 1024); factors = values; } /// <summary> /// A base to calculate hash factors. /// </summary> public const int P = 1103515245; /// <summary> /// Hash factors. /// </summary> private static readonly int[] factors; }
With this API hash code for a dictionary is a trivial operation:
public int GetHashCode(IDictionary<K, V> dictionary) { if (dictionary == null) { return 0; } var result = 0; foreach(var entry in dictionary) { if ((entry.Key == null) || (entry.Value == null)) { continue; } result += Hash.GetHashFactor(keyComparer.GetHashCode(entry.Key)) * valueComparer.GetHashCode(entry.Value); } return result; }
And finally, here is a reference to a class DictionaryEqualityComparer<K, V>: IEqualityComparer<IDictionary<K, V>> that allows a dictionary to be a key in another dictionary.
DictionaryEqualityComparer<K, V>: IEqualityComparer<IDictionary<K, V>>
Update
We have commited some tests, and have found that with suffiently "good" implementation of GetHashCode() of key or value we achieve results almost of the same quality, as the results of the algorithm we have outlined above with much simpler and straightforward algorithm like this:
public int GetHashCode(IDictionary<K, V> dictionary) { if (dictionary == null) { return 0; } var result = 0; foreach(var entry in dictionary) { if ((entry.Key == null) || (entry.Value == null)) { continue; } var k = entry.Key.GetHashCode(); var v = entry.Value.GetHashCode(); k = (k << 5) + k; v = (v << (k >> 3)) + v; result += k ^ v; //result += Hash.GetHashFactor(keyComparer.GetHashCode(entry.Key)) * // valueComparer.GetHashCode(entry.Value); } return result; }
It was worth to blog about this just to find out that we have outwitted ourselves, and finally to reach to a trivial hash code implementation for the dictionary.
Remember Me
a@href@title, b, blockquote@cite, em, i, strike, strong, sub, super, u