Comparison of the Java8 and Java7 HashMap source implementations

  • 2020-06-01 09:38:10
  • OfStack

1. Introduction to the principle of HashMap

This is too old a story to be explained in detail.

1 sentence summary: HashMap is a hash table that stores a key-value pair (key-value) mapping.

2. Source code analysis of HashMap in Java 7

The first is HashMap The constructor code block 1, according to the initialization Capacity with loadFactor (load factor) initialization HashMap .


// The code block 1
 public HashMap(int initialCapacity, float loadFactor) {
 if (initialCapacity < 0)
  throw new IllegalArgumentException("Illegal initial capacity: " +
      initialCapacity);
 if (initialCapacity > MAXIMUM_CAPACITY)
  initialCapacity = MAXIMUM_CAPACITY;
 if (loadFactor <= 0 || Float.isNaN(loadFactor))
  throw new IllegalArgumentException("Illegal load factor: " +loadFactor);

 this.loadFactor = loadFactor;
 threshold = initialCapacity;
 init();
 }

For in the Java7 <key1,value1> the put Method implementation is relatively simple, according to the first key1 the key Value calculation hash Value, and then according to this hash Values and Capacity0 the Capacity1 Determine if the key Where the Capacity3 , if the current location Capacity4 Don't for Capacity5 In this Capacity4 Traverse the chain, if found hash Values and key If all values are the same, the value will be Capacity9 Overwrite, return loadFactor0 ; If the current location Capacity4 for Capacity5 The direct loadFactor3 .


 The code block 2
public V put(K key, V value) {
 if (table == EMPTY_TABLE) {
  inflateTable(threshold);
 }
 if (key == null)
  return putForNullKey(value);
 int hash = hash(key);
 int i = indexFor(hash, table.length);
 for (Entry<K,V> e = table[i]; e != null; e = e.next) {
  Object k;
  if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
  V oldValue = e.value;
  e.value = value;
  e.recordAccess(this);
  return oldValue;
  }
 }

 modCount++;
 addEntry(hash, key, value, i);
 return null;
 }

//addEntry The current is checked in the method table If you need resize
 void addEntry(int hash, K key, V value, int bucketIndex) {
 if ((size >= threshold) && (null != table[bucketIndex])) {
  resize(2 * table.length); // The current map In the size  If more than threshole Is the threshold value of resize will table the length expand 2 Times. 
  hash = (null != key) ? hash(key) : 0;
  bucketIndex = indexFor(hash, table.length);
 }

 createEntry(hash, key, value, bucketIndex);
 }

In the Java7 loadFactor4 Method implementation is relatively simple, will loadFactor5 The length of the extension and will loadFactor6 In the Capacity4 According to the loadFactor8 Marks are recalculated hash Values and Capacity3 Move to the HashMap1 .

The code is shown in code block 3,


// The code block 3 --JDK7 In the HashMap.resize() methods 
void resize(int newCapacity) {
 Entry[] oldTable = table;
 int oldCapacity = oldTable.length;
 if (oldCapacity == MAXIMUM_CAPACITY) {
  threshold = Integer.MAX_VALUE;
  return;
 }

 Entry[] newTable = new Entry[newCapacity];
 transfer(newTable, initHashSeedAsNeeded(newCapacity));
 table = newTable;
 threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
 }

 /**
 *  The current table the Entry Transfer to the new table In the 
 */
 void transfer(Entry[] newTable, boolean rehash) {
 int newCapacity = newTable.length;
 for (Entry<K,V> e : table) {
  while(null != e) {
  Entry<K,V> next = e.next;
  if (rehash) {
   e.hash = null == e.key ? 0 : hash(e.key);
  }
  int i = indexFor(e.hash, newCapacity);
  e.next = newTable[i];
  newTable[i] = e;
  e = next;
  }
 }
 }

HashMap performance has two parameters: initial capacity ( HashMap2 ) and load factor ( loadFactor ). The capacity is the number of buckets in the hash table, and the initial capacity is just the capacity of the hash table when it was created. The load factor is a scale that the hash table can reach before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is performed rehash Operation (that is, rebuilding the internal data structure) so that the hash table will have about twice the number of buckets.

According to the source code analysis can be seen: HashMap in Java7 entry Is in accordance with the Capacity3 Index store, when encountering hash conflict, use zipper method to resolve the conflict, will conflict key and Capacity9 Insert into a linked list list In the.

However, there is one drawback to this solution, if key If all the values are in conflict, HashMap will degenerate into a linked list, get The complexity is going to be zero O(n) .

In Java8, to optimize the performance in this worst-case scenario, a balance tree is used to store these hash conflicted key-value pairs, so that the performance can be improved to O(logn) .


 The code block 4 -- JDK8 In the HashMap Middle constant definition 
 static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; 
 static final int TREEIFY_THRESHOLD = 8; //  Whether or not to list Converted to tree The threshold 
 static final int UNTREEIFY_THRESHOLD = 6; //  in resize During operation, decide whether or not untreeify The threshold 
 static final int MIN_TREEIFY_CAPACITY = 64; //  Decide whether to convert to tree Minimum capacity 
 static final float DEFAULT_LOAD_FACTOR = 0.75f; // default Load factor of 

At Java 8 HashMap put The method implementation is shown in code block 5,


 The code block 5 --JDK8 HashMap.put methods 
 public V put(K key, V value) {
 return putVal(hash(key), key, value, false, true);
 }

 final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
   boolean evict) {
 Node<K,V>[] tab; Node<K,V> p; int n, i;
 if ((tab = table) == null || (n = tab.length) == 0)
  n = (tab = resize()).length; //table When it's empty, n for table The length of the 
 if ((p = tab[i = (n - 1) & hash]) == null)
  tab[i] = newNode(hash, key, value, null); // (n - 1) & hash  with Java7 In the indexFor The implementation of the method is the same if i If the value at the location is empty, new is created 1 a Node . table[i] Points to the Node . 
 else {
  //  if i The value at the position is not empty, and it determines the value at the current position Node p  Whether to be inserted with key the hash and key The same 
  Node<K,V> e; K k;
  if (p.hash == hash &&
  ((k = p.key) == key || (key != null && key.equals(k))))
  e = p;// The same covers it 
  else if (p instanceof TreeNode)
  //  Different, and of the current position node p Is already TreeNode , and insert a new one into the tree node . 
  e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
  else {
  //  in i Location found in the linked list p.next for null The location of the, binCount Calculate the length of the current list, and if you continue to insert the conflicting nodes into the list, the length of the list will be greater than tree To convert the list to tree . 
  for (int binCount = 0; ; ++binCount) {
   if ((e = p.next) == null) {
   p.next = newNode(hash, key, value, null);
   if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
    treeifyBin(tab, hash);
   break;
   }
   if (e.hash == hash &&
   ((k = e.key) == key || (key != null && key.equals(k))))
   break;
   p = e;
  }
  }
  if (e != null) { // existing mapping for key
  V oldValue = e.value;
  if (!onlyIfAbsent || oldValue == null)
   e.value = value;
  afterNodeAccess(e);
  return oldValue;
  }
 }
 ++modCount;
 if (++size > threshold)
  resize();
 afterNodeInsertion(evict);
 return null;
 }

Look at the resize Method, because it needs to be considered when hash conflict resolution may be adopted list May also be balance tree The way, therefore resize The method is a little more complicated than JDK7,


  The code block 6 -- JDK8 the resize methods 
 inal Node<K,V>[] resize() {
 Node<K,V>[] oldTab = table;
 int oldCap = (oldTab == null) ? 0 : oldTab.length;
 int oldThr = threshold;
 int newCap, newThr = 0;
 if (oldCap > 0) {
  if (oldCap >= MAXIMUM_CAPACITY) {
  threshold = Integer.MAX_VALUE;// If the maximum capacity is exceeded, it cannot be expanded table
  return oldTab;
  }
  else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
   oldCap >= DEFAULT_INITIAL_CAPACITY)
  newThr = oldThr << 1; // threshold Threshold expanded to 2 times 
 }
 else if (oldThr > 0) // initial capacity was placed in threshold
  newCap = oldThr;
 else {  // zero initial threshold signifies using defaults
  newCap = DEFAULT_INITIAL_CAPACITY;
  newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
 }
 if (newThr == 0) {
  float ft = (float)newCap * loadFactor;
  newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
   (int)ft : Integer.MAX_VALUE);
 }
 threshold = newThr;
 @SuppressWarnings({"rawtypes","unchecked"})
  Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];//  Create capacity as newCap the newTab And will oldTab In the Node So let's move over here, so we need to think about linked lists and tree Two cases. 
 table = newTab;
 if (oldTab != null) {
  for (int j = 0; j < oldCap; ++j) {
  Node<K,V> e;
  if ((e = oldTab[j]) != null) {
   oldTab[j] = null;
   if (e.next == null)
   newTab[e.hash & (newCap - 1)] = e;
   else if (e instanceof TreeNode)
   ((TreeNode<K,V>)e).split(this, newTab, j, oldCap); 
   // split The method splits the tree into lower  and upper tree Two trees, 
 If the number of nodes in the subtree is less than 1 UNTREEIFY_THRESHOLD Threshold, then the tree untreeify , and store all the nodes newTab In the. 
   else { // preserve order
   Node<K,V> loHead = null, loTail = null;
   Node<K,V> hiHead = null, hiTail = null;
   Node<K,V> next;
   do {
    next = e.next;
    if ((e.hash & oldCap) == 0) {
    if (loTail == null)
     loHead = e;
    else
     loTail.next = e;
    loTail = e;
    }
    else {
    if (hiTail == null)
     hiHead = e;
    else
     hiTail.next = e;
    hiTail = e;
    }
   } while ((e = next) != null);
   if (loTail != null) {
    loTail.next = null;
    newTab[j] = loHead;
   }
   if (hiTail != null) {
    hiTail.next = null;
    newTab[j + oldCap] = hiHead;
   }
   }
  }
  }
 }
 return newTab;
 }

Let's take another look at tree treeifyBin Methods and put0 Method implementation, the bottom layer adopts the red-black tree method.


 //  The code block 7 
 //MIN_TREEIFY_CAPACITY  The value of 64 If the current table the length Is not enough, resize (a) 
 final void treeifyBin(Node<K,V>[] tab, int hash) {
 int n, index; Node<K,V> e;
 if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
  resize();
 else if ((e = tab[index = (n - 1) & hash]) != null) {
  TreeNode<K,V> hd = null, tl = null;
  do {
  TreeNode<K,V> p = replacementTreeNode(e, null);
  if (tl == null)
   hd = p;
  else {
   p.prev = tl;
   tl.next = p;
  }
  tl = p;
  } while ((e = e.next) != null);
  if ((tab[index] = hd) != null)
  hd.treeify(tab);
 }
 }
// putVal  the tree version  
 final TreeNode<K,V> putTreeVal(HashMap<K,V> map, Node<K,V>[] tab,
     int h, K k, V v) {
  Class<?> kc = null;
  boolean searched = false;
  TreeNode<K,V> root = (parent != null) ? root() : this;
  for (TreeNode<K,V> p = root;;) {
  int dir, ph; K pk;
  if ((ph = p.hash) > h)
   dir = -1;
  else if (ph < h)
   dir = 1;
  else if ((pk = p.key) == k || (k != null && k.equals(pk)))
   return p;
  else if ((kc == null &&
    (kc = comparableClassFor(k)) == null) ||
    (dir = compareComparables(kc, k, pk)) == 0) {
   if (!searched) {
   TreeNode<K,V> q, ch;
   searched = true;
   if (((ch = p.left) != null &&
    (q = ch.find(h, k, kc)) != null) ||
    ((ch = p.right) != null &&
    (q = ch.find(h, k, kc)) != null))
    return q;
   }
   dir = tieBreakOrder(k, pk);
  }
  TreeNode<K,V> xp = p;
  if ((p = (dir <= 0) ? p.left : p.right) == null) {
   Node<K,V> xpn = xp.next;
   TreeNode<K,V> x = map.newTreeNode(h, k, v, xpn);
   if (dir <= 0)
   xp.left = x;
   else
   xp.right = x;
   xp.next = x;
   x.parent = x.prev = xp;
   if (xpn != null)
   ((TreeNode<K,V>)xpn).prev = x;
   moveRootToFront(tab, balanceInsertion(root, x));
   return null;
  }
  }
 }

After looking at the source code, and 11 did a comparison, marvel at the source code, a lot of benefits.

conclusion

The above is the whole content of this article, I hope the content of this article to your study or work can bring 1 definite help, if you have questions you can leave a message to communicate.


Related articles: