none
Overriding GetHashCode, again... RRS feed

  • Question

  • Im having some trouble understanding how to override GetHashCode properly. My struct has two value, a Guid and a Int32. The struct overrides ToString as {guid}#{int}. How should the GetHasCode-override look? My options:

    1. return _guid.GetHashCode() + _int
    2. return _guid.GetHashCode() ^ _int
    3. return this.ToString().GetHashCode()
    4. int hash = 27;
    hash
    = (13 * hash) + _guid.GetHashCode();
    hash
    = (13 * hash) + _int
    return hash;
    5. other suggestions?
    Thursday, September 24, 2009 6:35 AM

Answers

  • Jeffrey Richter, in CLR Via C#, offers the following [paraphrased] guidelines when designing your algorithm
     - for performance use an algorithm that gives a good random distribution
     - you can call the base type's GetHashCode but avoid calling System.Object's GetHashCode (not performant)
     - use at least one instance field
     - if your objects are going to be used as keys in a hashtable/dictionary the fields you use should ideally be immutable.
     - objects that have the same value should return the same hash code value.
    • Proposed as answer by eryang Monday, September 28, 2009 7:52 AM
    • Marked as answer by eryang Friday, October 9, 2009 6:08 AM
    Thursday, September 24, 2009 7:17 AM
  • It rarely matters much.  All that GetHashCode() has to do is to generate the exact same number for the exact same value.  A great bonus is when it can do so by generating consistently different numbers for different values.  That will make your code more efficient.  All of the ones you've listed will get the job done, pick the fast one (not ToString).  I always use the xor one.

    Hans Passant.
    • Proposed as answer by eryang Monday, September 28, 2009 7:52 AM
    • Marked as answer by nobugzModerator Sunday, October 4, 2009 4:37 PM
    Friday, September 25, 2009 1:04 AM
    Moderator

All replies

  • I don't have much experience with this.  I have only done an override of GetHashCode() and just to shut up the compiler (it was throwing a warning).

    From what I read in MSDN Online at that time, GetHashCode() does not have many rules.  Only 1:  That two classes with the same internal (and meaningful) state return the same hash value every single time.  Or put differently, any given object must return the same hash value every time is called so long its internal, meaninful state remains unchanged.

    Since there is only one rule, you don't need to worry about different instances returning the same hash value.  This is allowed.

    Besides this, just go for it.  All of your options seem to be deterministic (return the same value given the same input), and therefore should serve as good hashing algorithms.

    Finally, consider the application of the value.  Is the value going to be used for something?  The HashTable collection class uses this function.  If you are using it, your hashing method will have an impact in the performance of this class.  Study it and base your decision on your analysis.
    MCP
    Thursday, September 24, 2009 6:55 AM
  • Jeffrey Richter, in CLR Via C#, offers the following [paraphrased] guidelines when designing your algorithm
     - for performance use an algorithm that gives a good random distribution
     - you can call the base type's GetHashCode but avoid calling System.Object's GetHashCode (not performant)
     - use at least one instance field
     - if your objects are going to be used as keys in a hashtable/dictionary the fields you use should ideally be immutable.
     - objects that have the same value should return the same hash code value.
    • Proposed as answer by eryang Monday, September 28, 2009 7:52 AM
    • Marked as answer by eryang Friday, October 9, 2009 6:08 AM
    Thursday, September 24, 2009 7:17 AM
  • I like the (a * 37 + b) method that's popular among Java users, but remember to use unchecked so you won't get an Int32 overflow:

    public override int GetHashCode() {
        unchecked { return _guid.GetHashCode() * 37 + _int; }
    }

    And remember that you should only use immutable fields as the basis for your hash code, or you'll get in trouble when the struct is modified after being put in a hashtable.  (But since it's a struct it's probably immutable anyway.)
    Thursday, September 24, 2009 10:43 AM
  • It rarely matters much.  All that GetHashCode() has to do is to generate the exact same number for the exact same value.  A great bonus is when it can do so by generating consistently different numbers for different values.  That will make your code more efficient.  All of the ones you've listed will get the job done, pick the fast one (not ToString).  I always use the xor one.

    Hans Passant.
    • Proposed as answer by eryang Monday, September 28, 2009 7:52 AM
    • Marked as answer by nobugzModerator Sunday, October 4, 2009 4:37 PM
    Friday, September 25, 2009 1:04 AM
    Moderator