覆盖System.Object.GetHashCode的最佳算法是什么？-Java 学习之路

1271

在.NET中， System.Object.GetHashCode 方法在很多地方使用，遍及.NET基类库 . 特别是在快速查找集合中的项目或确定相等性时 . 是否有关于如何为我的自定义类实现 GetHashCode 覆盖的标准算法/最佳实践，因此我不会降低性能？

17 回答

9
从https://github.com/dotnet/coreclr/pull/14863开始，有一种新方法可以生成超级简单的哈希码！写吧
```
public override int GetHashCode()
    => HashCode.Combine(field1, field2, field3);
```
这将生成高质量的哈希码，而无需担心实现细节 .
回复于 2024-04-25T23:52:13+08:00
1

在大多数情况下，Equals（）比较多个字段，如果你的GetHash（）在一个字段或多个字段上进行哈希处理并不重要 . 你只需要确保计算哈希值非常便宜（ No allocations ，请）和快速（ No heavy computations ，当然没有数据库连接）并提供良好的分发 .

繁重的工作应该是Equals（）方法的一部分;哈希应该是一个非常便宜的操作，以便在尽可能少的项目上调用Equals（） .

最后一个提示： Don't rely on GetHashCode() being stable over multiple aplication runs . 许多.Net类型不保证其哈希码在重新启动后保持不变，因此您只应在内存数据结构中使用GetHashCode（）的值 .

回复于 2024-04-25T23:52:13+08:00

这是我的简单方法 . 我正在使用经典的构建器模式 . 它是类型安全（没有装箱/拆箱），也适用于.NET 2.0（没有扩展方法等） .

它使用如下：

public override int GetHashCode()
{
    HashBuilder b = new HashBuilder();
    b.AddItems(this.member1, this.member2, this.member3);
    return b.Result;
}

这是真正的建设者类：

internal class HashBuilder
{
    private const int Prime1 = 17;
    private const int Prime2 = 23;
    private int result = Prime1;

    public HashBuilder()
    {
    }

    public HashBuilder(int startHash)
    {
        this.result = startHash;
    }

    public int Result
    {
        get
        {
            return this.result;
        }
    }

    public void AddItem<T>(T item)
    {
        unchecked
        {
            this.result = this.result * Prime2 + item.GetHashCode();
        }
    }

    public void AddItems<T1, T2>(T1 item1, T2 item2)
    {
        this.AddItem(item1);
        this.AddItem(item2);
    }

    public void AddItems<T1, T2, T3>(T1 item1, T2 item2, T3 item3)
    {
        this.AddItem(item1);
        this.AddItem(item2);
        this.AddItem(item3);
    }

    public void AddItems<T1, T2, T3, T4>(T1 item1, T2 item2, T3 item3, 
        T4 item4)
    {
        this.AddItem(item1);
        this.AddItem(item2);
        this.AddItem(item3);
        this.AddItem(item4);
    }

    public void AddItems<T1, T2, T3, T4, T5>(T1 item1, T2 item2, T3 item3, 
        T4 item4, T5 item5)
    {
        this.AddItem(item1);
        this.AddItem(item2);
        this.AddItem(item3);
        this.AddItem(item4);
        this.AddItem(item5);
    }        

    public void AddItems<T>(params T[] items)
    {
        foreach (T item in items)
        {
            this.AddItem(item);
        }
    }
}

回复于 2024-04-25T23:52:13+08:00

我在Helper库中有一个Hashing类，我将它用于此目的 .

/// <summary> 
/// This is a simple hashing function from Robert Sedgwicks Hashing in C book.
/// Also, some simple optimizations to the algorithm in order to speed up
/// its hashing process have been added. from: www.partow.net
/// </summary>
/// <param name="input">array of objects, parameters combination that you need
/// to get a unique hash code for them</param>
/// <returns>Hash code</returns>
public static int RSHash(params object[] input)
{
    const int b = 378551;
    int a = 63689;
    int hash = 0;

    // If it overflows then just wrap around
    unchecked
    {
        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] != null)
            {
                hash = hash * a + input[i].GetHashCode();
                a = a * b;
            }
        }
    }

    return hash;
}

然后，只需将其用作：

public override int GetHashCode()
{
    return Hashing.RSHash(_field1, _field2, _field3);
}

我没有评估其表现，所以欢迎任何反馈 .

回复于 2024-04-25T23:52:13+08:00

这是the algorithm posted above by Jon Skeet的另一个流畅的实现，但不包括分配或装箱操作：

public static class Hash
{
    public const int Base = 17;

    public static int HashObject(this int hash, object obj)
    {
        unchecked { return hash * 23 + (obj == null ? 0 : obj.GetHashCode()); }
    }

    public static int HashValue<T>(this int hash, T value)
        where T : struct
    {
        unchecked { return hash * 23 + value.GetHashCode(); }
    }
}

用法：

public class MyType<T>
{
    public string Name { get; set; }

    public string Description { get; set; }

    public int Value { get; set; }

    public IEnumerable<T> Children { get; set; }

    public override int GetHashCode()
    {
        return Hash.Base
            .HashObject(this.Name)
            .HashObject(this.Description)
            .HashValue(this.Value)
            .HashObject(this.Children);
    }
}

由于泛型类型约束，编译器将确保不使用类调用 HashValue . 但是没有编译器支持 HashObject ，因为添加泛型参数也会添加装箱操作 .

回复于 2024-04-25T23:52:13+08:00

27
如果我们不超过8个属性（希望如此），这是另一种选择 .

ValueTuple 是一个结构，似乎具有可靠的 GetHashCode 实现 .

这意味着我们可以简单地这样做：
```
// Yay, no allocations and no custom implementations!
public override int GetHashCode() => (this.PropA, this.PropB).GetHashCode();
```
让's take a look at .NET Core'为 ValueTuple 的 GetHashCode 的当前实现 .

这是来自ValueTuple：
```
internal static int CombineHashCodes(int h1, int h2)
    {
        return HashHelpers.Combine(HashHelpers.Combine(HashHelpers.RandomSeed, h1), h2);
    }

    internal static int CombineHashCodes(int h1, int h2, int h3)
    {
        return HashHelpers.Combine(CombineHashCodes(h1, h2), h3);
    }
```
这是来自HashHelper：
```
public static readonly int RandomSeed = Guid.NewGuid().GetHashCode();

    public static int Combine(int h1, int h2)
    {
        unchecked
        {
            // RyuJIT optimizes this to use the ROL instruction
            // Related GitHub pull request: dotnet/coreclr#1830
            uint rol5 = ((uint)h1 << 5) | ((uint)h1 >> 27);
            return ((int)rol5 + h1) ^ h2;
        }
    }
```
用英语讲：
- 左旋（循环移位）h1 5个位置 .
- 将结果和h1一起添加 .
- 与h2的XOR结果 .
- 首先对{static random seed，h1}执行上述操作 .
- 对于每个其他项目，对前一个结果和下一个项目（例如h2）执行操作 .
很高兴知道这个ROL-5哈希码算法的属性 .

遗憾的是，推迟到 ValueTuple 为我们自己的 GetHashCode 可能没有我们想要和期望的那么快 . This comment在相关讨论中说明直接调用 HashHelpers.Combine 更具性能 . 另一方面，那个是内部的，所以我们负责用随机种子记住第一个 Combine . 如果我们跳过这一步，我不知道后果是什么 .
回复于 2024-04-25T23:52:13+08:00
1413
我通常会选择Josh Bloch精彩的Effective Java中给出的实现 . 它很快并且创建了一个非常好的哈希，不太可能导致冲突 . 选择两个不同的素数，例如17和23，并做：
```
public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = 17;
        // Suitable nullity checks etc, of course :)
        hash = hash * 23 + field1.GetHashCode();
        hash = hash * 23 + field2.GetHashCode();
        hash = hash * 23 + field3.GetHashCode();
        return hash;
    }
}
```
正如评论中所指出的，你可能会发现小数字看起来倾向于使用素数，至少有类似的算法，其中经常使用非素数 . 例如，在稍后的非常FNV示例中，我've used numbers which apparently work well - but the initial value isn' t是素数 . （虽然乘法常数是素数 . 我不知道它有多重要 . ）

由于两个主要原因，这比 XOR 哈希码的常见做法更好 . 假设我们有一个包含两个 int 字段的类型：
```
XorHash(x, x) == XorHash(y, y) == 0 for all x, y
XorHash(x, y) == XorHash(y, x) for all x, y
```
顺便说一下，早期的算法是C＃编译器当前用于匿名类型的算法 .

This page提供了不少选项 . 我认为对于大多数情况来说，上面是"good enough"并且它非常容易记住并且正确 . FNV替代方法同样简单，但使用不同的常量和 XOR 而不是 ADD 作为组合操作 . 它看起来像下面的代码，但普通的FNV算法对单个字节进行操作，因此需要修改每个字节执行一次迭代，而不是每32位散列值 . FNV也是针对可变长度的数据而设计的，而我们的实际工作方式（在样本案例中也是如此）作为上述添加方法 .
```
// Note: Not quite FNV!
public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = (int) 2166136261;
        // Suitable nullity checks etc, of course :)
        hash = (hash * 16777619) ^ field1.GetHashCode();
        hash = (hash * 16777619) ^ field2.GetHashCode();
        hash = (hash * 16777619) ^ field3.GetHashCode();
        return hash;
    }
}
```
请注意，有一点需要注意的是，理想情况下，在将其添加到依赖于哈希的集合之后，应该防止对等式敏感（因此对哈希码敏感）状态的更改码 .

根据documentation：

您可以为不可变引用类型重写GetHashCode . 通常，对于可变引用类型，只有在以下情况下才应覆盖GetHashCode：您可以从不可变的字段计算哈希代码;或者您可以确保在对象包含在依赖于其哈希代码的集合中时，可变对象的哈希码不会更改 .
回复于 2024-04-25T23:52:13+08:00

与nightcoder的解决方案非常相似，只是如果你愿意，它更容易提升素数 .

PS：这是你在嘴里呕吐一点的时间之一，知道这可以被重构成一种方法，有9个默认值，但它会慢一点，所以你只是闭上眼睛试着忘掉它 .

/// <summary>
/// Try not to look at the source code. It works. Just rely on it.
/// </summary>
public static class HashHelper
{
    private const int PrimeOne = 17;
    private const int PrimeTwo = 23;

    public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8, T9, T10>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8, T9 arg9, T10 arg10)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();
            hash = hash * PrimeTwo + arg6.GetHashCode();
            hash = hash * PrimeTwo + arg7.GetHashCode();
            hash = hash * PrimeTwo + arg8.GetHashCode();
            hash = hash * PrimeTwo + arg9.GetHashCode();
            hash = hash * PrimeTwo + arg10.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8, T9>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8, T9 arg9)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();
            hash = hash * PrimeTwo + arg6.GetHashCode();
            hash = hash * PrimeTwo + arg7.GetHashCode();
            hash = hash * PrimeTwo + arg8.GetHashCode();
            hash = hash * PrimeTwo + arg9.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();
            hash = hash * PrimeTwo + arg6.GetHashCode();
            hash = hash * PrimeTwo + arg7.GetHashCode();
            hash = hash * PrimeTwo + arg8.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();
            hash = hash * PrimeTwo + arg6.GetHashCode();
            hash = hash * PrimeTwo + arg7.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4, T5, T6>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();
            hash = hash * PrimeTwo + arg6.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4, T5>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4>(T1 arg1, T2 arg2, T3 arg3, T4 arg4)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3>(T1 arg1, T2 arg2, T3 arg3)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2>(T1 arg1, T2 arg2)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();

            return hash;
        }
    }
}

回复于 2024-04-25T23:52:13+08:00

3
直到最近，我的答案将非常接近Jon Skeet在这里 . 但是，我最近启动了一个使用二次幂哈希表的项目，即哈希表，其中内部表的大小为8,16,32等 . 有一个很好的理由支持素数大小，但那里对于两种尺寸的尺寸也是一些优点 .

而且它非常糟糕 . 经过一些实验和研究后，我开始用以下方法重新哈希哈希：
```
public static int ReHash(int source)
{
  unchecked
  {
    ulong c = 0xDEADBEEFDEADBEEF + (ulong)source;
    ulong d = 0xE2ADBEEFDEADBEEF ^ c;
    ulong a = d += c = c << 15 | c >> -15;
    ulong b = a += d = d << 52 | d >> -52;
    c ^= b += a = a << 26 | a >> -26;
    d ^= c += b = b << 51 | b >> -51;
    a ^= d += c = c << 28 | c >> -28;
    b ^= a += d = d << 9 | d >> -9;
    c ^= b += a = a << 47 | a >> -47;
    d ^= c += b << 54 | b >> -54;
    a ^= d += c << 32 | c >> 32;
    a += d << 25 | d >> -25;
    return (int)(a >> 1);
  }
}
```
然后我的二次幂哈希表再也没有了 .

这让我感到不安，因为上述情况不应该起作用，除非原始的 GetHashCode() 以非常特殊的方式表现不佳 .

重新混合哈希码不能改进很好的哈希码，因为唯一可能的效果是我们引入了一些冲突 .

重新混合哈希码不能改善可怕的哈希码，因为唯一可能的效果是我们改变例如哈希码 . 值53的大量碰撞到大量的值18,3487,291 .

重新混合哈希码只能改进一个哈希码，它至少可以很好地避免整个范围内的绝对冲突（232个可能的值），但是当模数明显变大时，很难避免冲突（重新加载的额外工作将超过好处，但好处仍然存在） .

编辑：我也使用开放寻址，这也会增加对碰撞的敏感度，或许比它是二次幂的事实更多 .

好吧，令人不安的是.NET（或研究here）中的 string.GetHashCode() 实现可以通过这种方式改进多少（由于冲突更少，测试运行速度提高了约20-30倍），并且我自己的哈希码更加令人不安可以改进（远不止于此） .

All the GetHashCode() implementations I'd coded in the past, and indeed used as the basis of answers on this site, were much worse than I'd throught . 很多时候，大部分用途都是"good enough"，但我想要更好的东西 .

所以我把这个项目放在一边（无论如何都是一个宠物项目）并开始研究如何在.NET中快速生成一个好的，分布良好的哈希代码 .

最后我决定将SpookyHash移植到.NET . 实际上，上面的代码是使用SpookyHash从32位输入产生32位输出的快速路径版本 .

现在，SpookyHash不是一个很好的快速记住代码片段 . 我的端口更是如此，因为我为了更好的速度而手动插入了大量的内容* . 但这就是代码重用的目的 .

然后我将该项目放在一边，因为正如原始项目产生了如何生成更好的哈希代码的问题，因此该项目产生了如何生成更好的.NET memcpy的问题 .

然后我回来了，产生了很多重载，可以轻松地将所有本机类型（除了 decimal †）整合到哈希码中 .

它的速度很快，Bob Jenkins应该得到大部分功劳，因为我移植的原始代码更快，特别是在64位机器上，该算法针对‡进行了优化 .

完整的代码可以在https://bitbucket.org/JonHanna/spookilysharp/src看到，但请考虑上面的代码是它的简化版本 .

但是，由于它现在已经编写，因此可以更容易地使用它：
```
public override int GetHashCode()
{
  var hash = new SpookyHash();
  hash.Update(field1);
  hash.Update(field2);
  hash.Update(field3);
  return hash.Final().GetHashCode();
}
```
它还需要种子值，因此如果您需要处理不受信任的输入并希望防止Hash DoS攻击，您可以根据正常运行时间或类似情况设置种子，并使攻击者无法预测结果：
```
private static long hashSeed0 = Environment.TickCount;
private static long hashSeed1 = DateTime.Now.Ticks;
public override int GetHashCode()
{
  //produce different hashes ever time this application is restarted
  //but remain consistent in each run, so attackers have a harder time
  //DoSing the hash tables.
  var hash = new SpookyHash(hashSeed0, hashSeed1);
  hash.Update(field1);
  hash.Update(field2);
  hash.Update(field3);
  return hash.Final().GetHashCode();
}
```
*一个很大的惊喜就是手动内联一个返回 (x << n) | (x >> -n) 改进的旋转方法 . 我本可以肯定的是，对于我来说，抖动会内联，但是分析显示不然 .

† decimal 虽然来自C＃，但它不是.NET的本机 . 它的问题在于它自己的 GetHashCode() 将精度视为重要，而它自己的 Equals() 则不然 . 两者都是有效的选择，但不是那样的混合 . 在实现你自己的版本时，你需要选择做一个或另一个，但我可以't know which you'd想要 .

‡通过比较 . 如果在字符串上使用，64位上的SpookyHash比32位上的 string.GetHashCode() 快得多，这比64位上的 string.GetHashCode() 稍快，这比32位上的SpookyHash快得多，但仍然足够快，是一个合理的选择 .
回复于 2024-04-25T23:52:13+08:00

ReSharper用户可以使用 ReSharper -> Edit -> Generate Code -> Equality Members 生成GetHashCode，Equals和其他人 .

// ReSharper's GetHashCode looks like this
public override int GetHashCode() {
    unchecked {
        int hashCode = Id;
        hashCode = (hashCode * 397) ^ IntMember;
        hashCode = (hashCode * 397) ^ OtherIntMember;
        hashCode = (hashCode * 397) ^ (RefMember != null ? RefMember.GetHashCode() : 0);
        // ...
        return hashCode;
    }
}

回复于 2024-04-25T23:52:13+08:00

我使用选择作为上面答案的实现遇到浮点数和小数的问题 .

此测试失败（浮点数;哈希值相同，即使我将2个值切换为负值）：

var obj1 = new { A = 100m, B = 100m, C = 100m, D = 100m};
        var obj2 = new { A = 100m, B = 100m, C = -100m, D = -100m};
        var hash1 = ComputeHash(obj1.A, obj1.B, obj1.C, obj1.D);
        var hash2 = ComputeHash(obj2.A, obj2.B, obj2.C, obj2.D);
        Assert.IsFalse(hash1 == hash2, string.Format("Hashcode values should be different   hash1:{0}  hash2:{1}",hash1,hash2));

但是这个测试通过了（带有整数）：

var obj1 = new { A = 100m, B = 100m, C = 100, D = 100};
        var obj2 = new { A = 100m, B = 100m, C = -100, D = -100};
        var hash1 = ComputeHash(obj1.A, obj1.B, obj1.C, obj1.D);
        var hash2 = ComputeHash(obj2.A, obj2.B, obj2.C, obj2.D);
        Assert.IsFalse(hash1 == hash2, string.Format("Hashcode values should be different   hash1:{0}  hash2:{1}",hash1,hash2));

我改变了我的实现，不使用GetHashCode作为原始类型，它似乎工作得更好

private static int InternalComputeHash(params object[] obj)
    {
        unchecked
        {
            var result = (int)SEED_VALUE_PRIME;
            for (uint i = 0; i < obj.Length; i++)
            {
                var currval = result;
                var nextval = DetermineNextValue(obj[i]);
                result = (result * MULTIPLIER_VALUE_PRIME) + nextval;

            }
            return result;
        }
    }



    private static int DetermineNextValue(object value)
    {
        unchecked
        {

                int hashCode;
                if (value is short
                    || value is int
                    || value is byte
                    || value is sbyte
                    || value is uint
                    || value is ushort
                    || value is ulong
                    || value is long
                    || value is float
                    || value is double
                    || value is decimal)
                {
                    return Convert.ToInt32(value);
                }
                else
                {
                    return value != null ? value.GetHashCode() : 0;
                }
        }
    }

回复于 2024-04-25T23:52:13+08:00

这个不错：

/// <summary>
/// Helper class for generating hash codes suitable 
/// for use in hashing algorithms and data structures like a hash table. 
/// </summary>
public static class HashCodeHelper
{
    private static int GetHashCodeInternal(int key1, int key2)
    {
        unchecked
        {
           var num = 0x7e53a269;
           num = (-1521134295 * num) + key1;
           num += (num << 10);
           num ^= (num >> 6);

           num = ((-1521134295 * num) + key2);
           num += (num << 10);
           num ^= (num >> 6);

           return num;
        }
    }

    /// <summary>
    /// Returns a hash code for the specified objects
    /// </summary>
    /// <param name="arr">An array of objects used for generating the 
    /// hash code.</param>
    /// <returns>
    /// A hash code, suitable for use in hashing algorithms and data 
    /// structures like a hash table. 
    /// </returns>
    public static int GetHashCode(params object[] arr)
    {
        int hash = 0;
        foreach (var item in arr)
            hash = GetHashCodeInternal(hash, item.GetHashCode());
        return hash;
    }

    /// <summary>
    /// Returns a hash code for the specified objects
    /// </summary>
    /// <param name="obj1">The first object.</param>
    /// <param name="obj2">The second object.</param>
    /// <param name="obj3">The third object.</param>
    /// <param name="obj4">The fourth object.</param>
    /// <returns>
    /// A hash code, suitable for use in hashing algorithms and
    /// data structures like a hash table.
    /// </returns>
    public static int GetHashCode<T1, T2, T3, T4>(T1 obj1, T2 obj2, T3 obj3,
        T4 obj4)
    {
        return GetHashCode(obj1, GetHashCode(obj2, obj3, obj4));
    }

    /// <summary>
    /// Returns a hash code for the specified objects
    /// </summary>
    /// <param name="obj1">The first object.</param>
    /// <param name="obj2">The second object.</param>
    /// <param name="obj3">The third object.</param>
    /// <returns>
    /// A hash code, suitable for use in hashing algorithms and data 
    /// structures like a hash table. 
    /// </returns>
    public static int GetHashCode<T1, T2, T3>(T1 obj1, T2 obj2, T3 obj3)
    {
        return GetHashCode(obj1, GetHashCode(obj2, obj3));
    }

    /// <summary>
    /// Returns a hash code for the specified objects
    /// </summary>
    /// <param name="obj1">The first object.</param>
    /// <param name="obj2">The second object.</param>
    /// <returns>
    /// A hash code, suitable for use in hashing algorithms and data 
    /// structures like a hash table. 
    /// </returns>
    public static int GetHashCode<T1, T2>(T1 obj1, T2 obj2)
    {
        return GetHashCodeInternal(obj1.GetHashCode(), obj2.GetHashCode());
    }
}

以下是如何使用它：

private struct Key
{
    private Type _type;
    private string _field;

    public Type Type { get { return _type; } }
    public string Field { get { return _field; } }

    public Key(Type type, string field)
    {
        _type = type;
        _field = field;
    }

    public override int GetHashCode()
    {
        return HashCodeHelper.GetHashCode(_field, _type);
    }

    public override bool Equals(object obj)
    {
        if (!(obj is Key))
            return false;
        var tf = (Key)obj;
        return tf._field.Equals(_field) && tf._type.Equals(_type);
    }
}

回复于 2024-04-25T23:52:13+08:00

96
匿名类型

Microsoft已经提供了一个很好的通用HashCode生成器：只需将属性/字段值复制到匿名类型并对其进行哈希：
```
new { PropA, PropB, PropC, PropD }.GetHashCode();
```
这适用于任何数量的属性 . 它不使用拳击 . 它只是使用已在框架中为匿名类型实现的算法 .

ValueTuple - C＃7的更新

正如@cactuaroid在评论中提到的，可以使用值元组 . 这样可以节省一些按键，更重要的是可以在堆栈上执行（无垃圾）：
```
(PropA, PropB, PropC, PropD).GetHashCode();
```
（注意：使用匿名类型的原始技术似乎在堆上创建了一个对象，即垃圾，因为匿名类型是作为类实现的，尽管这可能会被编译器优化 . 对这些选项进行基准测试会很有趣，但是元组选项应该是优越的 . ）
回复于 2024-04-25T23:52:13+08:00

微软领先几种哈希方式......

//for classes that contain a single int value
return this.value;

//for classes that contain multiple int value
return x ^ y;

//for classes that contain single number bigger than int    
return ((int)value ^ (int)(value >> 32)); 

//for classes that contain class instance fields which inherit from object
return obj1.GetHashCode();

//for classes that contain multiple class instance fields which inherit from object
return obj1.GetHashCode() ^ obj2.GetHashCode() ^ obj3.GetHashCode();

我可以猜测，对于多个大的int你可以使用它：

int a=((int)value1 ^ (int)(value1 >> 32));
int b=((int)value2 ^ (int)(value2 >> 32));
int c=((int)value3 ^ (int)(value3 >> 32));
return a ^ b ^ c;

对于多类型也是如此：所有使用 GetHashCode() 首先转换为 int 然后int值将被xor'ed，结果就是你的哈希值 .

对于那些使用哈希作为ID（我的意思是一个唯一值）的人来说，哈希自然局限于多个数字，我认为哈希算法是5个字节，至少是MD5 .

您可以将多个值转换为散列值，其中一些值相同，因此不要将其用作标识符 . （也许有一天我会使用你的组件）

回复于 2024-04-25T23:52:13+08:00

319

这是我的哈希码助手 .
它的优点是它使用泛型类型参数，因此不会导致装箱：

public static class HashHelper
{
    public static int GetHashCode<T1, T2>(T1 arg1, T2 arg2)
    {
         unchecked
         {
             return 31 * arg1.GetHashCode() + arg2.GetHashCode();
         }
    }

    public static int GetHashCode<T1, T2, T3>(T1 arg1, T2 arg2, T3 arg3)
    {
        unchecked
        {
            int hash = arg1.GetHashCode();
            hash = 31 * hash + arg2.GetHashCode();
            return 31 * hash + arg3.GetHashCode();
        }
    }

    public static int GetHashCode<T1, T2, T3, T4>(T1 arg1, T2 arg2, T3 arg3, 
        T4 arg4)
    {
        unchecked
        {
            int hash = arg1.GetHashCode();
            hash = 31 * hash + arg2.GetHashCode();
            hash = 31 * hash + arg3.GetHashCode();
            return 31 * hash + arg4.GetHashCode();
        }
    }

    public static int GetHashCode<T>(T[] list)
    {
        unchecked
        {
            int hash = 0;
            foreach (var item in list)
            {
                hash = 31 * hash + item.GetHashCode();
            }
            return hash;
        }
    }

    public static int GetHashCode<T>(IEnumerable<T> list)
    {
        unchecked
        {
            int hash = 0;
            foreach (var item in list)
            {
                hash = 31 * hash + item.GetHashCode();
            }
            return hash;
        }
    }

    /// <summary>
    /// Gets a hashcode for a collection for that the order of items 
    /// does not matter.
    /// So {1, 2, 3} and {3, 2, 1} will get same hash code.
    /// </summary>
    public static int GetHashCodeForOrderNoMatterCollection<T>(
        IEnumerable<T> list)
    {
        unchecked
        {
            int hash = 0;
            int count = 0;
            foreach (var item in list)
            {
                hash += item.GetHashCode();
                count++;
            }
            return 31 * hash + count.GetHashCode();
        }
    }

    /// <summary>
    /// Alternative way to get a hashcode is to use a fluent 
    /// interface like this:

    /// return 0.CombineHashCode(field1).CombineHashCode(field2).
    ///     CombineHashCode(field3);
    /// </summary>
    public static int CombineHashCode<T>(this int hashCode, T arg)
    {
        unchecked
        {
            return 31 * hashCode + arg.GetHashCode();   
        }
    }

它还有扩展方法来提供流畅的界面，所以你可以像这样使用它：

public override int GetHashCode()
{
    return HashHelper.GetHashCode(Manufacturer, PartN, Quantity);
}

或者像这样：

public override int GetHashCode()
{
    return 0.CombineHashCode(Manufacturer)
        .CombineHashCode(PartN)
        .CombineHashCode(Quantity);
}

回复于 2024-04-25T23:52:13+08:00

这是我使用Jon Skeet's implementation的助手类 .

public static class HashCode
{
    public const int Start = 17;

    public static int Hash<T>(this int hash, T obj)
    {
        var h = EqualityComparer<T>.Default.GetHashCode(obj);
        return unchecked((hash * 31) + h);
    }
}

用法：

public override int GetHashCode()
{
    return HashCode.Start
        .Hash(_field1)
        .Hash(_field2)
        .Hash(_field3);
}

如果要避免为System.Int32编写扩展方法：

public struct HashCode
{
    private readonly int _value;

    public HashCode(int value) => _value = value;

    public static HashCode Start { get; } = new HashCode(17);

    public static implicit operator int(HashCode hash) => hash._value;

    public HashCode Hash<T>(T obj)
    {
        var h = EqualityComparer<T>.Default.GetHashCode(obj);
        return unchecked(new HashCode((_value * 31) + h));
    }

    public override int GetHashCode() => _value;
}

它仍然是通用的，它仍然避免任何堆分配，它使用完全相同的方式：

public override int GetHashCode()
{
    // This time `HashCode.Start` is not an `Int32`, it's a `HashCode` instance.
    // And the result is implicitly converted to `Int32`.
    return HashCode.Start
        .Hash(_field1)
        .Hash(_field2)     
        .Hash(_field3);
}

Update after Martin's comment:

obj != null 导致拳击所以我切换到默认的比较器 .

有关默认比较器的性能，请参阅this answer .
有关空值的哈希码的讨论，请参阅this question .

Edit (May 2018):

EqualityComparer<T>.Default getter现在是一个JIT内在 - 由Stephen Toub在this blog post中提到pull request .

回复于 2024-04-25T23:52:13+08:00

0
我的大部分工作都是通过数据库连接完成的，这意味着我的类都具有数据库中的唯一标识符 . 我总是使用数据库中的ID来生成哈希码 .
```
// Unique ID from database
private int _id;

...    
{
  return _id.GetHashCode();
}
```
回复于 2024-04-25T23:52:13+08:00

覆盖System.Object.GetHashCode的最佳算法是什么？

17 回答

匿名类型

ValueTuple - C＃7的更新

相关问题