Discussion on the correct evaluation method of hashCode in Java

  • 2021-01-19 22:13:51
  • OfStack

This paper mainly studies the relevant content of the correct evaluation method of hashCode in Java, as follows.

The hash table has an optimization that caches the hash codes (hashCode) of the objects. If the hash codes do not match, the objects are considered to be different objects without checking their equality. If the hash codes (hashCode) are equal, the equality of the object is detected (equals).

If objects have the same hash code (hashCode), they are mapped to the same hash bucket. If all the objects in the hash table have the same hash code (hashCode), the hash table will degenerate to a linked list (linked list), greatly reducing its query efficiency.

A good hash function tends to "produce unequal hash codes for objects that don't want to wait". Ideally, the hash function should distribute all the unwanted instances of the set evenly over all possible hashes. However, it is very difficult to achieve this ideal. Here is a relatively simple and effective hash method:

1. Store a non-zero constant value, such as 17, in a variable of type int called result.

2. For each key domain f in the object (each domain involved in the equals method), complete the following steps:

The hash code c of type int is computed for this domain If the field is of type boolean, the calculation (f ? 1:0) If the domain is of type byte, char, short, or int, calculate ((int) f) If the field is of type long, evaluate (int) (f ^ (f > > > 32 ) ) If the field is of type float, calculate Float.floatToIntBits(f) If the field is of type double, calculate Double.doubleToLongBits(f), and then follow the steps above to calculate the hash value for the resulting long value If the field is an object reference and the equals method of that class compares its fields by recursively calling equals, then recursively calling hashCode for the same field as above If the field is an array, then each element is treated as a separate field, recursively applying the above principles, or using the Arrays.hashCode method directly if each element in the array is important. According to the following formula, the hash code c obtained in the above steps is successively merged into result: result = 31 * result + c;  The multiplication is to get a better hash function. For example, if String's hash function omits multiplication, then all strings that differ in alphabetical order will have the same hash code. I chose 31 because it's an odd prime number. If the multiplier is even, and the multiplication overflows, the information is lost, because multiplying by two is equivalent to displacement. The benefits of using prime numbers are not obvious, but it is customary to use prime numbers to calculate hash results. 31 has a nice feature that replaces multiplication with shift and subtraction, which gives better performance: 31 * i == (i < < 5) - i. The current VM can automatically achieve this optimization.

If a class is immutable (all fields are ES69en modifier, and all fields are primitive or immutable classes), and the overhead of calculating the hash code is high, then you should consider caching the hash code inside the object.


public class HashCodeDemo {
  static class HashCodeClass {
    private final boolean bResult;
    private final byte byteValue;
    private final char charValue;
    private final short shortValue;
    private final int intValue;
    private final long longValue;
    private final float floatValue;
    private final double doubleValue;
    private final String str;
    private final int[] arrayValue;

    //volatile Indicates that the variable is accessed in memory each time to ensure that the variable is up to date 
    private volatile int hashCode;

    public HashCodeClass() {
      bResult = false;
      byteValue = 1;
      charValue = 'a';
      shortValue = 1;
      intValue = 1;
      longValue = 1l;
      floatValue = 1.0f;
      doubleValue = 1.0d;
      str = getClass().getName();
      arrayValue = new int[] {1,2,3,4,5};
    }

    @Override
    public int hashCode() {
      if(hashCode == 0) {
        //  Set up the 1 A non-zero initial value that increases the conflict in the zero domain 
        int result = 17;
        //  If you omit the multiplier, then all strings that differ only in alphabetical order will have the same hash code 
        final int HASH_CODE = 31;
        result = HASH_CODE * result + (bResult ? 1 : 0);
        result = HASH_CODE * result + byteValue;
        result = HASH_CODE * result + charValue;
        result = HASH_CODE * result + shortValue;
        result = HASH_CODE * result + intValue;
        result = HASH_CODE * result + (int) (longValue ^ (longValue >>> 32));
        result = HASH_CODE * result + Float.floatToIntBits(floatValue);
        long doubleLongValue = Double.doubleToLongBits(doubleValue);
        result = HASH_CODE * result + (int) (doubleLongValue ^ (doubleLongValue >>> 32));
        result = HASH_CODE * result + (str == null ? 0 : str.hashCode());
        System.out.println("str=" + str + ", str.hashCode=" + str.hashCode());
        result = HASH_CODE * result + arrayValue.hashCode();
        return result;
      } 
      return hashCode;
    }
  }

  public static void main(String[] args) {
    HashCodeClass obj = new HashCodeClass();
    System.out.println("obj.hashCode=" + obj.hashCode());
    System.out.println("obj="+obj.toString());
  }
}

The output


str=com.demo.test.HashCodeDemo$HashCodeClass, str.hashCode=-205823051
obj.hashCode=946611167
str=com.demo.test.HashCodeDemo$HashCodeClass, str.hashCode=-205823051
obj=com.demo.test.HashCodeDemo$HashCodeClass@386c23df

conclusion

The above is all the content of this article about the correct evaluation method of hashCode in Java, I hope to be helpful to you. Interested friends can continue to refer to the site of other related topics, if there are shortcomings, welcome to leave a message to point out. Thank you for your support!


Related articles: