Java HashMap深度剖析

openkk 14年前發布 | 19K 次閱讀 Java Java開發

一、首先再簡單重復一下Hash算法
簡單的說就是一種將任意內容的輸入轉換成相同長度輸出（有個范圍，假設10位的數字，用一個稱之為HashTable的容器來存放）的加密方式------hash
如（假設）：
“a”---?10位數1
123---?10位數2
…
注意：任意內容的輸入，范圍是無窮無盡，肯定比相同長度輸出（如10位數）要大很多，那么就會造成不同的輸入，會得到相同的輸出（值）----hash沖突
HashMap當然也無法避免沖突問題

二、HashMap源碼片段
public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR;//負載因子，默認0.75
        threshold = (int)(DEFAULT_INITIAL_CAPACITY DEFAULT_LOAD_FACTOR);
        table = new Entry[DEFAULT_INITIAL_CAPACITY];
//使用的是Entry數組
        init();
}

public V put(K key, V value) {
        if (key == null)//空的情況，允許存空
            return putForNullKey(value);
        int hash = hash(key.hashCode());//根據key的hashCode再來計算一個hash值----根據hash沖突可以知道不同的key對應的hashCode可能一樣（如果不被重寫的話，Object的hashCode()生成的hashCode存放在java底層的一個大的HashTable上，不過我想JDK應該已經做過沖突處理，不會使用這么簡單的hash算法，除非自己重寫hashCode(),否則應該不會有沖突，真的發生沖突估計要內存溢出了）
//就算hashCode不同，通過hash()出來的hash值也可能沖突(后面會講到)
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {//該位置已經有了，且key也完全是同一個，就覆蓋，否則也是走下面的新開(后面會有例子)
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }//已經存在的情況
//以下是新增的情況
        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

/**
為給定的hashCode增加一個hash方法，用于抵消低質量的hash函數。這是很關鍵的，因為HashMap使用power-of-two長度的hash表，否則遇到沖突的hashCode無法區分
     /</span>
    static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

//根據hashCode及table的長度計算index
static int indexFor(int h, int length) {
        return h & (length-1);
    }

void addEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];//先取原有的元素作為next，index沖突的時候用，形成一個“鏈條”
        table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
        if (size++ >= threshold)
            resize(2 table.length);
    }

static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;//存放自己的下一個
        final int hash;
...
}

//------再看一下get方法
public V get(Object key) {
        if (key == null)
            return getForNullKey();
        int hash = hash(key.hashCode());
//根據key的hashCode計算出“鏈條”在table中的index，再到“鏈條”中查找符合自己key的對象
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
}

三、用個簡單的例子詳細解讀HashMap的運作
public static void main(String[] args) {
HashMap map = new HashMap();
//map里包含一個Entry數組table（默認長度16）

//測試1：測試hashMap的存儲 put(K,V)
//--------存放第1個對象----------
String s1 = "123";
map.put(s1, "123");
//s1的hashCode = 48690
//hash(hashCode) = 46246 ---HashMap的再次hash()
//indexFor(hash, table.length)=6 得出存放在數組中的位置為6+1（從0開始）
//雖然存放的是第一個元素，但卻不是放在第一位，不是順序存放，
//我們看table的結果為：[null, null, null, null, null, null, 123=123, null, null, null, null, null, null, null, null, null]
//讀取的時候不需要按順序對比KEY，只需通過hashCode計算出其位置就可以快速讀取。

//--------存放第2個對象----------
String s2 = "123";
//s1和s2使用編譯常量區，指向同一個常量
System.out.println((s1==s2) +" s1.hashCode="+s1.hashCode());//肯定是true
map.put(s2, "124");//s2和s1指向同一個對象，無疑是覆蓋

//--------存放第3個對象----------
         String s3 = new String("123");
//String已經重寫hashCode,相同內容的字符串hashCode一定是固定相同的。
System.out.println((s1==s3) +" s3.hashCode="+s3.hashCode());//此時就是false了
//s3指向了“堆”區里的一個新建對象，但其值和s1一樣，故hashCode也是一樣的
map.put(s3, "125");//雖然s3和s1不是一個對象，但hashCode一樣，map里只比較hashCode，故認為是同一個

System.out.println(map.size());//size只有一個

//測試2：hashMap沖突解決
//-----再存11個，正好達到threshold
for(int i=0;i<11;i++){
map.put("i"+i, i);
//有趣的事情來了
//當存到i9的時候，"i9"的hashCode=3312，hash(hashCode)=3110，----跟s3明顯不同
//indexFor(hash, table.length)=6，也就是說存放在table的位置和之前的s3重了，咋辦？？？
//繼續執行，判斷key的hash和值發現不是同一個，所以還是執行addEntry方法，
//addEntry也毫不客氣的把位置讓給"i9", }
//--此時的table為：
//[i0=0, null, null, null, i10=10, null, i9=9, null,
// i8=8, i7=7, i6=6, i5=5, i4=4, i3=3, i2=2, i1=1]
//此時的table中已經看不到s3（123=125）的存在了，那么s3哪去了呢？
//查看上面的源碼段，原來每個Entry除了自己的信息外，還包含了個next（自己的下一個Entry，【可以理解為縱向的Entry，而不是table中的下一個】，
//該next是不存在table中【如果再重復，還可能再包含更深一級的next，這就跟“鏈”一樣】）
//也就是說此時s1含在了i9中了，2個共用table的第7個位置
//我們取map元素的時候會取table中的Entry+逐級遞歸取每個Entry的next，這樣就不會丟失了

//測試3：測試HashMap的空間擴展
                  //負載因子0.75，threshold（閥，table存儲邊界） = table容量負載因子 = 160.75 = 12
//當map的實際長度超過threshold，則新建一個size2的table，把原有的小table數據都放進來
//再存第13個，超過12，將會執行resize并transfer-data</span>
map.put("a", "a");
//resize table----即定義一個新的2倍size大小的Entry[]數組，沒什么好講
//transfer data---將舊數組數據存放到新數組
//1、讀取全部oldTable的數據【注意：是全部，含next】
//2、并不是簡單的存放，將每一個Entry在新的數組中重新計算位置,再來一個indexFor(hash, table.length)，這時本來2個一樣的index此時很可能就不一樣了，當然也不排除本來不一樣變成一樣的。
//HashMap代碼段：
/do {
            Entry<K,V> next = e.next;
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        } while (e != null);/

//就本例解析如下：
//取到“i9”的時候，新的indexFor算出來也是6（巧合吧），放到了newTable[6]
//取“i9”的next（s3），計算s3的indexFor也正好是6（又是巧合吧，如果不是6那他們就可以撇開關系各自存儲了），但是此時的newTable[6]換成了s3,而i9只能作為s3的next來存儲了

//--最后table如下
//[null, null, null, null, null, null, 123=125, a=a,
//null, null, null, null, null, null, null, null, i0=0,
// null, null, null, i10=10, null, null, null, i8=8,
//i7=7, i6=6, i5=5, i4=4, i3=3, i2=2, i1=1]
//---i9不見了，s3出來了
}

轉自：http://alex-lin.iteye.com/blog/1403404

本文由用戶 openkk 自行上傳分享，僅供網友學習交流。所有權歸原作者，若您的權利被侵害，請聯系管理員。

轉載本站原創文章，請注明出處，并保留原始鏈接、圖片水印。

本站是一個以用戶分享為主的開源技術平臺，歡迎各類分享！

本文地址：http://www.baiduhome.net/lib/view/open1329182556187.html

Java Java開發

相關經驗

相關資訊

相關文檔