locked
uniqueness of GetHashCode RRS feed

  • Question

  • If I understand the docs correctly the Hashcodes returned by GetHashCode of two given objects have to be the same if they are equal (i.e. .Equals and .Operator== return true). However if the two objects are not equal (i.e. .Equals and .Operator== return false).  the Hashcodes don't necessarily have to be different.

    In other words: based on GetHashCode you can only find out if two objects are not equal but not if two objects are equal, because if two objects are different it is theoretically still possible that they have the same Hashcode. So it is not suggested to use GetHashCode to uniquely identify an object.  

    This of course makes sense just because of the simple fact that there are only a limited number of possible hashcode values available (2^32).

    But what about object.GetHashCode(). How is it implemented? How exactly does it build the hashcode? Does it use the actual adress (reference handle) of that object? May I use this to uniquely Identify a reference to an object? i.e. if two instances do not "point" to same reference (object.ReferenceEquals returns false), is it guaranteed that their Hashcodes are not equal?

     

    Where in the Rotor sscli do I find the object.getHashcode() implementation ?

    Tuesday, January 9, 2007 12:40 PM

Answers

  • The hash is generated from the SyncBlock's hash which is generated by ObjHeader::GetSyncBlock() which is called by ObjectNative::GetHashCodeEx().  The hash has 26 relevant bits.
    Tuesday, January 9, 2007 7:53 PM

All replies

  • Rotor Implementation of GetHashCode:


    // follow the necessary rules to get a new valid hashcode for an object

    DWORD Object::ComputeHashCode()
    {
    DWORD hashCode;

    // note that this algorithm now uses at most HASHCODE_BITS so that it will

    // fit into the objheader if the hashcode has to be moved back into the objheader

    // such as for an object that is being frozen

    do

    {
    // we use the high order bits in this case because they're more random

    hashCode = GetThread()->GetNewHashCode() >> (32-HASHCODE_BITS);
    }
    while (hashCode == 0); // need to enforce hashCode != 0

    // verify that it really fits into HASHCODE_BITS

    _ASSERTE((hashCode & ((1<<HASHCODE_BITS)-1)) == hashCode);
    return hashCode;
    }

    inline DWORD GetNewHashCode()
    {
    LEAF_CONTRACT;
    // Every thread has its own generator for hash codes so that we won't get into a situation

    // where two threads consistently give out the same hash codes.

    // Choice of multiplier guarantees period of 2**32 - see Knuth Vol 2 p16 (3.2.1.2 Theorem A).

    DWORD multiplier = m_ThreadId*4 + 5;
    m_dwHashCodeSeed = m_dwHashCodeSeed*multiplier + 1;
    return m_dwHashCodeSeed;
    }


     

     

    It seems that there object refernce is not involved in any way , when computing the hashcode for an object. Instead some sort of dedicated hashcode generator of the thread is used ...

    Tuesday, January 9, 2007 1:15 PM
  • Whatever Rotor does in GetHashCode is irrelevant because that may change from one version of the framework to another. From Object.GetHasCode documentation in MSDN:

    "The default implementation of the GetHashCode method does not guarantee unique return values for different objects. Furthermore, the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value it returns will be the same between different versions of the .NET Framework. Consequently, the default implementation of this method must not be used as a unique object identifier for hashing purposes."

     

    Tuesday, January 9, 2007 1:59 PM
  • Yeah, GetHashCode doesn't always return different values, as I found yesterday.
    Tuesday, January 9, 2007 4:02 PM
  • The hash is generated from the SyncBlock's hash which is generated by ObjHeader::GetSyncBlock() which is called by ObjectNative::GetHashCodeEx().  The hash has 26 relevant bits.
    Tuesday, January 9, 2007 7:53 PM
  • Hello, everybody
    i have question regarding uniqueness of hashcode. i have a class point in which i calculate hash code of the point object as

                return x.GetHashCode() ^ y.GetHashCode() ^ z.GetHashCode();

    where as i found a problem that i have two different objects with different values but their hashcode is same... i don't know if it is possible that we have different values but same hashcode.
    (-1.5,-16,19) and (-1.5,-18,17).

        double x1=-1.5,x2=-16 ,x3=19 ,x4=-6,x5=-18,x6=17 ;

                Console.WriteLine(x1.GetHashCode() ^ x2.GetHashCode() ^ x3.GetHashCode());
                Console.WriteLine(x1.GetHashCode() ^ x5.GetHashCode() ^ x6.GetHashCode());
                Console.WriteLine(x4.GetHashCode() ^ x5.GetHashCode() ^ x6.GetHashCode()); 
    Output:

        //1073414144
                //1073414144
                //1075511296

    can anybody tell me if it is the special case that we have same hashcode, or my method of calculating hashcode is wrong.

    yasar
    Monday, March 23, 2009 3:28 PM
  • >Whatever Rotor does in GetHashCode is irrelevant because that may change from one version of the framework to another.

    I do want to specifically point out that, in a roundabout way, the documentation does state that the implementation is consistent with reference equality.  The algorithm might change, but we should be able to rely on the consistency with reference equality.  This feature is incredibly useful as it allows dictionaries to be keyed by arbitrary objects.

    Here is the text (http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx):
    [quote]
    For derived classes of Object, the GetHashCode method can delegate to the Object.GetHashCode implementation, if and only if that derived class defines value equality to be reference equality and the type is not a value type.
    [/quote]

    Sometimes developers seem afraid of Object.GetHashCode due to that line of documentation that Mike pointed out.  Basically, that is really just saying that the values are not unique though people interpret it as saying that the hash might not be consistent.  I think that part could have been written better.

    • Edited by Jason Kresowaty Wednesday, April 1, 2009 12:18 AM Fix posting font
    Wednesday, April 1, 2009 12:14 AM