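To support equality comparison for your own objects, either implement IEquatable<T> on the type or write a separate class that implements IEqualityComparer<T>.
For methods such as List<T>.Contains, if a custom type does not implement IEquatable<T>, the method falls back to object.Equals, which compares references for classes and therefore produces unexpected results.
First, define a class: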
public class DaichoKey
{
    public int ID { get; set; }
    public int SubID { get; set; }
}

List<DaichoKey> lst = new List<DaichoKey>()
{
    new DaichoKey(){ID = 1,SubID =2},
    new DaichoKey(){ID = 1,SubID = 3}
};
var newItem = new DaichoKey() { ID = 1, SubID = 2 };
bool isContains = lst.Contains(newItem);//false
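The call to Contains above returns false, yet an object with ID 1 and SubID 2 is already in the list, so we expected true.
To get that behavior, the class needs to implement IEquatable<T>.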
public class DaichoKey : IEquatable<DaichoKey>
{
    public int ID { get; set; }
    public int SubID { get; set; }

    public bool Equals(DaichoKey other)
    {
        return this.ID == other.ID && this.SubID == other.SubID;
    }
}
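With this change the result matches our expectation, but the class is still incomplete. Microsoft recommends also overriding object's Equals and GetHashCode methods so that the semantics stay consistent, which leads to the following code: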
public class DaichoKey : IEquatable<DaichoKey>
{
    public int ID { get; set; }
    public int SubID { get; set; }

    public bool Equals(DaichoKey other)
    {
        // A null argument is never equal to an existing instance.
        if (other == null) return false;
        return this.ID == other.ID && this.SubID == other.SubID;
    }

    public override bool Equals(object obj)
    {
        // Null or an object of another type yields false; Equals should not throw.
        return Equals(obj as DaichoKey);
    }

    public override int GetHashCode()
    {
        return base.GetHashCode();//return object's hashcode
    }
}
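The code above still has a gap: it does not overload the == and != operators, but that is not the focus of this post. After this long detour we finally arrive at GetHashCode. It seems to have no effect on Contains, so why bother overriding it? Let's try Distinct, an extension method (defined on IEnumerable<T>) that is available on List<T>: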
List<DaichoKey> lst = new List<DaichoKey>()
{
    new DaichoKey(){ID = 1,SubID =2},
    new DaichoKey(){ID = 1,SubID = 3}
};
var newItem = new DaichoKey() { ID = 1, SubID = 2 };
lst.Add(newItem);
if (lst != null)
{
    lst = lst.Distinct<DaichoKey>().ToList();
}
//result:
//1 2
//1 3
//1 2
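Disaster: the duplicate (1, 2) entry was not removed, even though we implemented IEquatable<T>. On cnblogs I found an article (c# 擴展方法奇思妙用基礎篇八:Distinct 擴展) where one of the replies suggests making GetHashCode return a fixed value, which forces the IEquatable<T> Equals method to be called. Like this: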
public class DaichoKey : IEquatable<DaichoKey>
{
    public int ID { get; set; }
    public int SubID { get; set; }

    public bool Equals(DaichoKey other)
    {
        // A null argument is never equal to an existing instance.
        if (other == null) return false;
        return this.ID == other.ID && this.SubID == other.SubID;
    }

    public override bool Equals(object obj)
    {
        // Null or an object of another type yields false; Equals should not throw.
        return Equals(obj as DaichoKey);
    }

    public override int GetHashCode()
    {
        return 0;//base.GetHashCode();
    }
}
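The result was immediately correct. Could it be that Distinct compares hash codes first?
With that question in mind I decompiled Distinct, and it works just as I guessed. The decompiled source follows for anyone who is interested: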
public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source)
{
    if (source == null) throw Error.ArgumentNull("source");
    return DistinctIterator<TSource>(source, null);
}

private static IEnumerable<TSource> DistinctIterator<TSource>(IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)
{
    <DistinctIterator>d__81<TSource> d__ = new <DistinctIterator>d__81<TSource>(-2);
    d__.<>3__source = source;
    d__.<>3__comparer = comparer;
    return d__;
}

private sealed class <DistinctIterator>d__81<TSource> : IEnumerable<TSource>, IEnumerable, IEnumerator<TSource>, IEnumerator, IDisposable
{
    // Fields
    private int <>1__state;
    private TSource <>2__current;
    public IEqualityComparer<TSource> <>3__comparer;
    public IEnumerable<TSource> <>3__source;
    public IEnumerator<TSource> <>7__wrap84;
    private int <>l__initialThreadId;
    public TSource <element>5__83;
    public Set<TSource> <set>5__82;
    public IEqualityComparer<TSource> comparer;
    public IEnumerable<TSource> source;

    // Methods
    [DebuggerHidden]
    public <DistinctIterator>d__81(int <>1__state);
    private void <>m__Finally85();
    private bool MoveNext();
    [DebuggerHidden]
    IEnumerator<TSource> IEnumerable<TSource>.GetEnumerator();
    [DebuggerHidden, TargetedPatchingOptOut("Performance critical to inline this type of method across NGen image boundaries")]
    IEnumerator IEnumerable.GetEnumerator();
    [DebuggerHidden]
    void IEnumerator.Reset();
    void IDisposable.Dispose();

    // Properties
    TSource IEnumerator<TSource>.Current { [DebuggerHidden] get; }
    object IEnumerator.Current { [DebuggerHidden] get; }
}

private bool MoveNext()
{
    bool flag;
    try
    {
        switch (this.<>1__state)
        {
            case 0:
                this.<>1__state = -1;
                this.<set>5__82 = new Set<TSource>(this.comparer);
                this.<>7__wrap84 = this.source.GetEnumerator();
                this.<>1__state = 1;
                goto Label_0092;
            case 2:
                this.<>1__state = 1;
                goto Label_0092;
            default:
                goto Label_00A5;
        }
    Label_0050:
        this.<element>5__83 = this.<>7__wrap84.Current;
        if (this.<set>5__82.Add(this.<element>5__83))
        {
            this.<>2__current = this.<element>5__83;
            this.<>1__state = 2;
            return true;
        }
    Label_0092:
        if (this.<>7__wrap84.MoveNext()) goto Label_0050;
        this.<>m__Finally85();
    Label_00A5:
        flag = false;
    }
    fault
    {
        this.System.IDisposable.Dispose();
    }
    return flag;
}

internal class Set<TElement>
{
    // Fields
    private int[] buckets;
    private IEqualityComparer<TElement> comparer;
    private int count;
    private int freeList;
    private Slot<TElement>[] slots;

    // Methods
    [TargetedPatchingOptOut("Performance critical to inline this type of method across NGen image boundaries")]
    public Set();
    public Set(IEqualityComparer<TElement> comparer);
    public bool Add(TElement value);
    [TargetedPatchingOptOut("Performance critical to inline this type of method across NGen image boundaries")]
    public bool Contains(TElement value);
    private bool Find(TElement value, bool add);
    internal int InternalGetHashCode(TElement value);
    public bool Remove(TElement value);
    private void Resize();

    // Nested Types
    [StructLayout(LayoutKind.Sequential)]
    internal struct Slot
    {
        internal int hashCode;
        internal TElement value;
        internal int next;
    }
}

public bool Add(TElement value)
{
    return !this.Find(value, true);
}

public bool Contains(TElement value)
{
    return this.Find(value, false);
}

private bool Find(TElement value, bool add)
{
    int hashCode = this.InternalGetHashCode(value);
    for (int i = this.buckets[hashCode % this.buckets.Length] - 1; i >= 0; i = this.slots[i].next)
    {
        if (this.slots[i].hashCode == hashCode && this.comparer.Equals(this.slots[i].value, value))
            return true;// this is the line that matters
    }
    if (add)
    {
        int freeList;
        if (this.freeList >= 0)
        {
            freeList = this.freeList;
            this.freeList = this.slots[freeList].next;
        }
        else
        {
            if (this.count == this.slots.Length) this.Resize();
            freeList = this.count;
            this.count++;
        }
        int index = hashCode % this.buckets.Length;
        this.slots[freeList].hashCode = hashCode;
        this.slots[freeList].value = value;
        this.slots[freeList].next = this.buckets[index] - 1;
        this.buckets[index] = freeList + 1;
    }
    return false;
}
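This code shows that the Distinct extension method internally uses a Set<T> class to weed out duplicates, and that this internal class stores its data as a hash table. It therefore calls the GetHashCode method of our custom class, and when the stored hash code and the incoming hash code differ, it never goes on to call Equals.
The cause is now plain, and the conclusions are: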
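The hash-first behavior can also be observed without a decompiler. The following is a minimal sketch, not code from the original post: CountingKey and DistinctProbe are hypothetical names, and the hash formula is just one reasonable choice. The type counts how often GetHashCode and Equals are invoked while Distinct runs over three keys, two of which are equal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo type: counts calls to the hashing and equality members.
public class CountingKey : IEquatable<CountingKey>
{
    public static int HashCalls;
    public static int EqualsCalls;

    public int ID { get; set; }
    public int SubID { get; set; }

    public bool Equals(CountingKey other)
    {
        EqualsCalls++;
        return other != null && ID == other.ID && SubID == other.SubID;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as CountingKey);
    }

    public override int GetHashCode()
    {
        HashCalls++;
        // Derived from the same fields that Equals compares.
        unchecked { return (ID * 397) ^ SubID; }
    }
}

public static class DistinctProbe
{
    // Runs Distinct over three keys (the first and third are equal)
    // and returns how many items survive.
    public static int Run()
    {
        CountingKey.HashCalls = 0;
        CountingKey.EqualsCalls = 0;
        var list = new List<CountingKey>
        {
            new CountingKey { ID = 1, SubID = 2 },
            new CountingKey { ID = 1, SubID = 3 },
            new CountingKey { ID = 1, SubID = 2 }
        };
        return list.Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine("distinct items: " + Run());
        Console.WriteLine("GetHashCode calls: " + CountingKey.HashCalls);
        Console.WriteLine("Equals calls: " + CountingKey.EqualsCalls);
    }
}
```

GetHashCode is called for every element, while Equals is only reached for elements whose hash codes already match, which is why a hash code based on object identity hides the duplicates from Distinct.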
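1. When you override Equals, override GetHashCode as well, and do not simply call object's GetHashCode. Return a well-designed hash value, derived from the same fields that Equals compares, so that results match expectations. Returning 0 as above does solve the problem, but it gives every object the same hash code, so every lookup degrades to comparing against all stored items. It is a poor approach.
2. List<T>'s Contains and IndexOf methods do not use GetHashCode.
3. The extension methods Distinct and Except do use GetHashCode, so it must be overridden for them to work with value equality. I will add other methods that use GetHashCode as I come across them; just stay alert when using such methods.
4. If an object is used as the key of a Dictionary and should be matched by value, GetHashCode must be overridden.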
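As an illustration of conclusions 1, 3 and 4, here is a minimal sketch of a DaichoKey variant whose hash code is built from the same fields that Equals compares. The constants 17 and 31 are a common convention rather than something the original post prescribes, and HashDemo is a hypothetical helper added only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class DaichoKey : IEquatable<DaichoKey>
{
    public int ID { get; set; }
    public int SubID { get; set; }

    public bool Equals(DaichoKey other)
    {
        return other != null && ID == other.ID && SubID == other.SubID;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as DaichoKey);
    }

    public override int GetHashCode()
    {
        // Combine the fields used by Equals; unchecked lets the
        // arithmetic wrap around instead of throwing on overflow.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + ID;
            hash = hash * 31 + SubID;
            return hash;
        }
    }
}

// Hypothetical helper, for illustration only.
public static class HashDemo
{
    // Distinct now removes the duplicate (1, 2) entry.
    public static int DistinctCount()
    {
        var list = new List<DaichoKey>
        {
            new DaichoKey { ID = 1, SubID = 2 },
            new DaichoKey { ID = 1, SubID = 3 },
            new DaichoKey { ID = 1, SubID = 2 }
        };
        return list.Distinct().Count();
    }

    // A Dictionary finds the value through a different but equal key instance.
    public static string Lookup()
    {
        var map = new Dictionary<DaichoKey, string>();
        map[new DaichoKey { ID = 1, SubID = 2 }] = "first";
        string value;
        return map.TryGetValue(new DaichoKey { ID = 1, SubID = 2 }, out value) ? value : null;
    }

    public static void Main()
    {
        Console.WriteLine(DistinctCount());
        Console.WriteLine(Lookup());
    }
}
```

On newer .NET versions, HashCode.Combine(ID, SubID) is an alternative to the hand-written arithmetic.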
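Addendum, 2014/07/08
5. The Add method of containers such as HashSet also checks GetHashCode first. Only when the hash codes are equal does it go on to call Equals to decide whether the objects are equal.
So if Equals reports two objects as equal, their GetHashCode values must be equal too. The converse is not required: two objects may share a hash code and still be unequal according to Equals.
6. Changing a field that affects the return value of GetHashCode changes the object's hash code. If the object is already stored in a container such as HashSet, the HashSet can no longer find it, and methods such as Remove fail.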
Point a = new Point(1, 2);
Point b = new Point(1, 2);
HashSet<Point> hashSet = new HashSet<Point>();
hashSet.Add(a);
hashSet.Remove(b);
// Can this remove a? Yes.
// The HashSet's Count becomes 0, because we overrode Equals,
// so a and b are considered equal.
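The failure described in point 6 can be reproduced with a mutable key. This is a minimal sketch under assumed names (MutableKey and MutationDemo are hypothetical and do not appear in the original post): the key is added to a HashSet, then the field that feeds GetHashCode is changed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable key whose hash code depends on a settable property.
public sealed class MutableKey : IEquatable<MutableKey>
{
    public int ID { get; set; }

    public bool Equals(MutableKey other)
    {
        return other != null && ID == other.ID;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as MutableKey);
    }

    public override int GetHashCode()
    {
        return ID;
    }
}

public static class MutationDemo
{
    // Returns { found before mutation, found after mutation,
    //           Remove succeeded, item still stored in the set }.
    public static bool[] Run()
    {
        var key = new MutableKey { ID = 1 };
        var set = new HashSet<MutableKey> { key };

        bool foundBefore = set.Contains(key);

        // The set recorded the hash code at Add time; changing ID makes
        // later lookups compute a different hash code.
        key.ID = 5;

        bool foundAfter = set.Contains(key);
        bool removed = set.Remove(key);
        return new[] { foundBefore, foundAfter, removed, set.Count == 1 };
    }

    public static void Main()
    {
        // Prints: True, False, False, True
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

The object is still inside the set, but it can no longer be reached through Contains or Remove, which is one reason keys are usually made immutable.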
7. For reference, here is a complete implementation of a custom value type that overrides GetHashCode and the related members.

public struct Point
{
    private int x;
    private int y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    public int X
    {
        get { return x; }
    }

    public int Y
    {
        get { return y; }
    }

    public static bool operator ==(Point left, Point right)
    {
        // Point is a struct and can never be null, so no null checks are needed.
        return left.Equals(right);
    }

    public static bool operator !=(Point left, Point right)
    {
        return !(left == right);
    }

    public override bool Equals(object obj)
    {
        // Returns false for null and for any other type instead of throwing.
        if (!(obj is Point))
            return false;
        Point other = (Point)obj;
        return this.x == other.x && this.y == other.y;
    }

    public override int GetHashCode()
    {
        return x.GetHashCode() ^ y.GetHashCode();
    }
}
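The XOR-based hash above also illustrates point 5: equal hash codes do not imply equal objects. The sketch below uses a hypothetical GridPoint struct with the same design as Point (it is not part of the original post) to show that (1, 2) and (2, 1) collide yet remain distinct in a HashSet. OrderedHash is an assumed alternative that takes field order into account, shown only for comparison.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical struct mirroring the Point design, for illustration only.
public struct GridPoint : IEquatable<GridPoint>
{
    public readonly int X;
    public readonly int Y;

    public GridPoint(int x, int y)
    {
        X = x;
        Y = y;
    }

    public bool Equals(GridPoint other)
    {
        return X == other.X && Y == other.Y;
    }

    public override bool Equals(object obj)
    {
        return obj is GridPoint && Equals((GridPoint)obj);
    }

    public override int GetHashCode()
    {
        // Symmetric: (x, y) and (y, x) always produce the same value.
        return X ^ Y;
    }
}

public static class CollisionDemo
{
    // Order-sensitive alternative (an assumption, not the post's method).
    public static int OrderedHash(int x, int y)
    {
        unchecked { return (17 * 31 + x) * 31 + y; }
    }

    // Adds two colliding but unequal points and returns the set size.
    public static int CountAfterAdds()
    {
        var set = new HashSet<GridPoint>();
        set.Add(new GridPoint(1, 2));
        set.Add(new GridPoint(2, 1));
        return set.Count;
    }

    public static void Main()
    {
        var a = new GridPoint(1, 2);
        var b = new GridPoint(2, 1);
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True: the hash codes collide
        Console.WriteLine(a.Equals(b));                        // False: Equals still tells them apart
        Console.WriteLine(CountAfterAdds());                   // 2: both points are kept
    }
}
```

Collisions like this are harmless for correctness because Equals has the final say, but many of them in one bucket slow lookups down, so an order-sensitive combination is often preferred.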