Java Collections: Source Code Analysis of the Arrays Class

Reposted article · 2016-05-01 16:35 · Java

Reposted from: 牛奶、不加糖

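I. Arrays.sort(): sorting arrays

The Arrays class provides sort methods for every sortable element type. They fall into two groups: primitive arrays (seven of the eight primitive types; boolean has no sort overload) and Object arrays. The analysis below follows the implementation shipped up to JDK 6; JDK 7 and later replaced it with a dual-pivot quicksort and TimSort.

Primitive types: a tuned quicksort;

Object types: a modified merge sort.

1. Source analysis for primitive types (using int[] as the example):

Java sorts arrays of primitives (int, float, and so on) with quicksort and arrays of Objects with merge sort. Sun explains the choice of merge sort in The Java Tutorial as follows: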

　　The sort operation uses a slightly optimized merge sort algorithm that is fast and stable:

　　* Fast: It is guaranteed to run in n log(n) time and runs substantially faster on nearly sorted lists. Empirical tests showed it to be as fast as a highly optimized quicksort. A quicksort is generally considered to be faster than a merge sort but isn't stable and doesn't guarantee n log(n) performance.

　　* Stable: It doesn't reorder equal elements. This is important if you sort the same list repeatedly on different attributes. If a user of a mail program sorts the inbox by mailing date and then sorts it by sender, the user naturally expects that the now-contiguous list of messages from a given sender will (still) be sorted by mailing date. This is guaranteed only if the second sort was stable.

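In other words, the optimized merge sort is both fast (n log(n)) and stable.

Stability matters when sorting objects. Take a grade sheet that is already ordered by student ID: if we now sort it by score, then Zhang San, who came before Li Si, must still come before Li Si whenever the two have the same score.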
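The grade-sheet scenario can be sketched as follows. This is only an illustration, not part of the analysed source: the `Row` class and the `idsAfterSortingByScore` helper are hypothetical names, and the sketch relies on the documented stability of `Arrays.sort` for object arrays.

```java
import java.util.Arrays;
import java.util.Comparator;

class StableSortDemo {
    // Hypothetical grade-sheet row: student id plus score.
    static final class Row {
        final int id;
        final int score;
        Row(int id, int score) { this.id = id; this.score = score; }
    }

    // Sorts a copy of the rows by score only; ids are never compared.
    static int[] idsAfterSortingByScore(Row[] rows) {
        Row[] copy = rows.clone();
        Arrays.sort(copy, Comparator.comparingInt((Row r) -> r.score));
        int[] ids = new int[copy.length];
        for (int i = 0; i < copy.length; i++) {
            ids[i] = copy[i].id;
        }
        return ids;
    }

    public static void main(String[] args) {
        // Input is already ordered by id, as in the grade-sheet example.
        Row[] rows = {
            new Row(1001, 90), new Row(1002, 85),
            new Row(1003, 90), new Row(1004, 85)
        };
        // Rows with equal scores keep their id order.
        System.out.println(Arrays.toString(idsAfterSortingByScore(rows)));
        // prints [1002, 1004, 1001, 1003]
    }
}
```

With an unstable sort, 1004 could legally appear before 1002, because nothing in the comparator distinguishes them.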

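Quicksort, by contrast, is not stable, and its worst-case time complexity is O(n^2).

In addition, an object array holds only references to objects, so moving elements repeatedly costs little. Object arrays are, however, sensitive to the number of comparisons, because comparing two objects can be far more expensive than comparing two numbers. Merge sort does better than quicksort in this respect, which is another important reason it was chosen for sorting objects.

Sort optimization: both quicksort and merge sort are implemented recursively, but at the bottom of the recursion, once the range to sort is shorter than 7 elements, the code switches to insertion sort instead of recursing further.

Analysis: insertion sort on an array of length 6 makes at most 1+2+3+4+5=15 comparisons, and only 5 in the best case (input already sorted). Quicksort and merge sort carry overhead such as recursive calls, so they lose out when n is small. Using insertion sort here is therefore a very important optimization for quicksort.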

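The quicksort in the source is optimized in the following ways:

1) When the range holds few elements (the threshold in the source is 7), insertion sort is used. Insertion sort is O(n^2), but for small ranges it beats quicksort, whose recursive calls hurt performance.

2) A better pivot (partition element) is chosen, one that splits the array into two roughly equal parts and avoids the worst case. For example, if the array is already sorted and the first element is always chosen as the pivot, the running time degrades to O(n^2).

How the source chooses the pivot:

    When size = 7, the middle element is the pivot: int m = off + (len >> 1); (a shift instead of a division, a trick worth borrowing)

    When 7 < size <= 40, the pivot is the median of the first, middle, and last elements.

    When size > 40, nine elements spread fairly evenly across the range are sampled, and a pseudo-median of them is the pivot.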
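The three pivot rules can be isolated into a small sketch. The `med3` logic mirrors the helper of the same name in the source listing; the `choosePivotIndex` wrapper and the sample arrays are introduced here purely for illustration.

```java
class PivotSelectionDemo {
    // Index of the median of x[a], x[b], x[c] (same logic as med3 in the listing).
    static int med3(int[] x, int a, int b, int c) {
        return x[a] < x[b]
                ? (x[b] < x[c] ? b : x[a] < x[c] ? c : a)
                : (x[b] > x[c] ? b : x[a] > x[c] ? c : a);
    }

    // Returns the pivot index for the range x[off .. off + len - 1].
    static int choosePivotIndex(int[] x, int off, int len) {
        int m = off + (len >> 1);      // middle element, used as-is when len == 7
        if (len > 7) {
            int l = off;
            int n = off + len - 1;
            if (len > 40) {            // pseudo-median of nine samples
                int s = len / 8;
                l = med3(x, l, l + s, l + 2 * s);
                m = med3(x, m - s, m, m + s);
                n = med3(x, n - 2 * s, n - s, n);
            }
            m = med3(x, l, m, n);      // median of three
        }
        return m;
    }

    public static void main(String[] args) {
        int[] small = {5, 1, 4, 9, 2, 8, 3};                    // len == 7
        int[] medium = {15, 93, 15, 41, 6, 15, 22, 7, 15, 20};  // 7 < len <= 40
        System.out.println(small[choosePivotIndex(small, 0, small.length)]);    // 9
        System.out.println(medium[choosePivotIndex(medium, 0, medium.length)]); // 15
    }
}
```

Note that for a range of exactly 7 elements the middle element is taken without any comparison, so a poor pivot such as the maximum (9 above) is possible; the cost is bounded because the sub-ranges then fall below the insertion-sort threshold.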

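3) Using the pivot v, the partition loop establishes the invariant v* (<v)* (>v)* v*

An ordinary quicksort ends each partition pass with the pivot somewhere near the middle, smaller elements to its left and larger ones to its right, but it does not gather the elements equal to the pivot next to it. Arrays.sort() improves on this considerably.

Example: 15, 93, 15, 41, 6, 15, 22, 7, 15, 20

Because 7 < size <= 40, the pivot is the median of the first, middle, and last elements (15, 15, and 20), so v = 15.

After one partition pass: 15, 15, 7, 6, 41, 20, 22, 93, 15, 15. The elements equal to the pivot have been moved to the two ends of the array.

Next, the elements equal to the pivot are swapped back into the middle, giving: 7, 6, 15, 15, 15, 15, 22, 93, 41, 20.

Finally, the two remaining ranges, [7, 6] and [22, 93, 41, 20], are sorted recursively.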
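To trace a partition pass by hand, the loop can be lifted out of the sort into a standalone sketch. The loop body and the swap-back step follow the source listing; the `partition` method, its whole-array scope, and its return value (the bounds of the block equal to the pivot) are assumptions made here for illustration.

```java
import java.util.Arrays;

class ThreeWayPartitionDemo {
    static void swap(int[] x, int a, int b) {
        int t = x[a];
        x[a] = x[b];
        x[b] = t;
    }

    // Swaps x[a .. a+n-1] with x[b .. b+n-1].
    static void vecswap(int[] x, int a, int b, int n) {
        for (int i = 0; i < n; i++, a++, b++) {
            swap(x, a, b);
        }
    }

    // One partition pass over the whole array around pivot value v.
    // Returns {start, end} of the block equal to v (end exclusive).
    static int[] partition(int[] x, int v) {
        int off = 0, len = x.length;
        int a = off, b = a, c = off + len - 1, d = c;
        while (true) {
            while (b <= c && x[b] <= v) {
                if (x[b] == v) swap(x, a++, b);   // park equal elements on the left end
                b++;
            }
            while (c >= b && x[c] >= v) {
                if (x[c] == v) swap(x, c, d--);   // park equal elements on the right end
                c--;
            }
            if (b > c) break;
            swap(x, b++, c--);
        }
        // Layout here: v* (<v)* (>v)* v*
        int n = off + len;
        int s = Math.min(a - off, b - a);
        vecswap(x, off, b - s, s);
        s = Math.min(d - c, n - d - 1);
        vecswap(x, b, n - s, s);
        // Layout now: (<v)* v* (>v)*
        return new int[] {off + (b - a), n - (d - c)};
    }

    public static void main(String[] args) {
        int[] x = {15, 93, 15, 41, 6, 15, 22, 7, 15, 20};
        int[] equalBlock = partition(x, 15);
        System.out.println(Arrays.toString(x));
        System.out.println("equal block: " + Arrays.toString(equalBlock));
    }
}
```

Only the ranges outside the returned block still need sorting, which is why inputs with many duplicate keys recurse on much less data than a two-way partition would.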

Partial source code (1):

package com.util;

public class ArraysPrimitive {
    private ArraysPrimitive() {}

    /**
     * Sorts the specified int array into ascending numerical order.
     */
    public static void sort(int[] a) {
        sort1(a, 0, a.length);
    }

    /**
     * Sorts the specified range of the int array into ascending numerical order.
     */
    public static void sort(int[] a, int fromIndex, int toIndex) {
        rangeCheck(a.length, fromIndex, toIndex);
        sort1(a, fromIndex, toIndex - fromIndex);
    }

    private static void sort1(int x[], int off, int len) {
        /*
         * Fewer than 7 elements: use insertion sort.
         *
         * Insertion sort is O(n^2), but for small ranges it beats quicksort,
         * whose recursive calls hurt performance.
         */
        if (len < 7) {
            for (int i = off; i < len + off; i++)
                for (int j = i; j > off && x[j - 1] > x[j]; j--)
                    swap(x, j, j - 1);
            return;
        }
        /*
         * 7 or more elements: use quicksort.
         *
         * Choose a partition element, v.
         *
         * A good pivot splits the array into two roughly equal parts and
         * avoids the worst case. For example, always choosing the first
         * element of an already sorted array degrades the sort to O(n^2).
         */
        // size == 7: the middle element is the pivot.
        int m = off + (len >> 1);
        // 7 < size <= 40: median of the first, middle, and last elements.
        if (len > 7) {
            int l = off;
            int n = off + len - 1;
            /*
             * size > 40: sample nine elements spread evenly over the range
             * and use their pseudo-median as the pivot.
             */
            if (len > 40) {
                int s = len / 8;
                l = med3(x, l, l + s, l + 2 * s);
                m = med3(x, m - s, m, m + s);
                n = med3(x, n - 2 * s, n - s, n);
            }
            // Index of the middle-valued element.
            m = med3(x, l, m, n); // Mid-size, med of 3
        }

        // The pivot value v.
        int v = x[m];

        // Establish Invariant: v* (<v)* (>v)* v*
        int a = off, b = a, c = off + len - 1, d = c;
        while (true) {
            while (b <= c && x[b] <= v) {
                if (x[b] == v)
                    swap(x, a++, b);
                b++;
            }
            while (c >= b && x[c] >= v) {
                if (x[c] == v)
                    swap(x, c, d--);
                c--;
            }
            if (b > c)
                break;
            swap(x, b++, c--);
        }
        // Swap partition elements back to middle
        int s, n = off + len;
        s = Math.min(a - off, b - a);
        vecswap(x, off, b - s, s);
        s = Math.min(d - c, n - d - 1);
        vecswap(x, b, n - s, s);
        // Recursively sort non-partition-elements
        if ((s = b - a) > 1)
            sort1(x, off, s);
        if ((s = d - c) > 1)
            sort1(x, n - s, s);
    }

    /**
     * Swaps x[a] with x[b].
     */
    private static void swap(int x[], int a, int b) {
        int t = x[a];
        x[a] = x[b];
        x[b] = t;
    }

    /**
     * Swaps x[a .. (a+n-1)] with x[b .. (b+n-1)].
     */
    private static void vecswap(int x[], int a, int b, int n) {
        for (int i = 0; i < n; i++, a++, b++)
            swap(x, a, b);
    }

    /**
     * Returns the index of the median of the three indexed integers.
     */
    private static int med3(int x[], int a, int b, int c) {
        return (x[a] < x[b] ? (x[b] < x[c] ? b : x[a] < x[c] ? c : a)
                : (x[b] > x[c] ? b : x[a] > x[c] ? c : a));
    }

    /**
     * Check that fromIndex and toIndex are in range, and throw an
     * appropriate exception if they aren't.
     */
    private static void rangeCheck(int arrayLen, int fromIndex, int toIndex) {
        if (fromIndex > toIndex)
            throw new IllegalArgumentException("fromIndex(" + fromIndex
                    + ") > toIndex(" + toIndex + ")");
        if (fromIndex < 0)
            throw new ArrayIndexOutOfBoundsException(fromIndex);
        if (toIndex > arrayLen)
            throw new ArrayIndexOutOfBoundsException(toIndex);
    }
}

Test code:

package com.test;

import com.util.ArraysPrimitive;

public class ArraysTest {
    public static void main(String[] args) {
        int[] a = {15, 93, 15, 41, 6, 15, 22, 7, 15, 20};
        ArraysPrimitive.sort(a);
        for (int i = 0; i < a.length; i++) {
            System.out.print(a[i] + ",");
        }
        // Output: 6,7,15,15,15,15,20,22,41,93,
    }
}

2. Source analysis for Object types:

Partial source code (2):

package com.util;

import java.lang.reflect.Array;

public class ArraysObject {
    private static final int INSERTIONSORT_THRESHOLD = 7;

    private ArraysObject() {}

    public static void sort(Object[] a) {
        // See java.lang.Object.clone() and the difference between deep and
        // shallow copies: cloning an array is a shallow copy.
        Object[] aux = (Object[]) a.clone();
        mergeSort(aux, a, 0, a.length, 0);
    }

    public static void sort(Object[] a, int fromIndex, int toIndex) {
        rangeCheck(a.length, fromIndex, toIndex);
        Object[] aux = copyOfRange(a, fromIndex, toIndex);
        mergeSort(aux, a, fromIndex, toIndex, -fromIndex);
    }

    /**
     * Src is the source array that starts at index 0
     * Dest is the (possibly larger) array destination with a possible offset
     * low is the index in dest to start sorting
     * high is the end index in dest to end sorting
     * off is the offset to generate corresponding low, high in src
     */
    private static void mergeSort(Object[] src, Object[] dest, int low,
            int high, int off) {
        int length = high - low;

        // Insertion sort on smallest arrays
        if (length < INSERTIONSORT_THRESHOLD) {
            for (int i = low; i < high; i++)
                for (int j = i; j > low &&
                        ((Comparable) dest[j - 1]).compareTo(dest[j]) > 0; j--)
                    swap(dest, j, j - 1);
            return;
        }

        // Recursively sort halves of dest into src
        int destLow = low;
        int destHigh = high;
        low += off;
        high += off;
        /*
         * >>> is the unsigned right-shift operator.
         * expression1 >>> expression2 shifts the bits of expression1 right by
         * the number of positions given by expression2. Vacated bits on the
         * left are filled with 0; bits shifted out on the right are discarded.
         * Example: -14 >>> 2 evaluates to 1073741820.
         */
        int mid = (low + high) >>> 1;
        mergeSort(dest, src, low, mid, -off);
        mergeSort(dest, src, mid, high, -off);

        // If list is already sorted, just copy from src to dest. This is an
        // optimization that results in faster sorts for nearly ordered lists.
        if (((Comparable) src[mid - 1]).compareTo(src[mid]) <= 0) {
            System.arraycopy(src, low, dest, destLow, length);
            return;
        }

        // Merge sorted halves (now in src) into dest
        for (int i = destLow, p = low, q = mid; i < destHigh; i++) {
            if (q >= high || p < mid
                    && ((Comparable) src[p]).compareTo(src[q]) <= 0)
                dest[i] = src[p++];
            else
                dest[i] = src[q++];
        }
    }

    /**
     * Check that fromIndex and toIndex are in range, and throw an appropriate
     * exception if they aren't.
     */
    private static void rangeCheck(int arrayLen, int fromIndex, int toIndex) {
        if (fromIndex > toIndex)
            throw new IllegalArgumentException("fromIndex(" + fromIndex
                    + ") > toIndex(" + toIndex + ")");
        if (fromIndex < 0)
            throw new ArrayIndexOutOfBoundsException(fromIndex);
        if (toIndex > arrayLen)
            throw new ArrayIndexOutOfBoundsException(toIndex);
    }

    public static <T> T[] copyOfRange(T[] original, int from, int to) {
        return copyOfRange(original, from, to, (Class<T[]>) original.getClass());
    }

    public static <T, U> T[] copyOfRange(U[] original, int from, int to,
            Class<? extends T[]> newType) {
        int newLength = to - from;
        if (newLength < 0)
            throw new IllegalArgumentException(from + " > " + to);
        T[] copy = ((Object) newType == (Object) Object[].class)
                ? (T[]) new Object[newLength]
                : (T[]) Array.newInstance(newType.getComponentType(), newLength);
        System.arraycopy(original, from, copy, 0,
                Math.min(original.length - from, newLength));
        return copy;
    }

    /**
     * Swaps x[a] with x[b].
     */
    private static void swap(Object[] x, int a, int b) {
        Object t = x[a];
        x[a] = x[b];
        x[b] = t;
    }
}
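The `>>>` in the midpoint computation of mergeSort is more than a style choice. The sketch below, with hypothetical helper names, shows that `(low + high) >>> 1` still yields the correct midpoint when `low + high` overflows int, whereas a signed division does not. The index values are chosen only to force the overflow.

```java
class MidpointDemo {
    // Midpoint via unsigned shift: the overflowed sum is read as an unsigned value.
    static int midUnsignedShift(int low, int high) {
        return (low + high) >>> 1;
    }

    // Midpoint via signed division: an overflowed sum stays negative.
    static int midDivide(int low, int high) {
        return (low + high) / 2;
    }

    public static void main(String[] args) {
        int low = 1_500_000_000;
        int high = 2_000_000_000;   // low + high exceeds Integer.MAX_VALUE
        System.out.println(midUnsignedShift(low, high)); // 1750000000
        System.out.println(midDivide(low, high));        // -397483648
        System.out.println(-14 >>> 2);                   // 1073741820
    }
}
```

Both inputs must be non-negative for the shift form to be correct, which holds for array indices.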

Test code:

package com.test;

import com.util.ArraysObject;

public class ArraysObjectSortTest {
    public static void main(String[] args) {
        Student stu1 = new Student(1001, 100.0F);
        Student stu2 = new Student(1002, 90.0F);
        Student stu3 = new Student(1003, 90.0F);
        Student stu4 = new Student(1004, 95.0F);
        Student[] stus = {stu1, stu2, stu3, stu4};
        //Arrays.sort(stus);
        ArraysObject.sort(stus);
        for (int i = 0; i < stus.length; i++) {
            System.out.println(stus[i].getId() + " : " + stus[i].getScore());
        }
        /* 1002 : 90.0
         * 1003 : 90.0
         * 1004 : 95.0
         * 1001 : 100.0
         */
    }
}

class Student implements Comparable<Student> {
    private int id;       // student ID
    private float score;  // score

    public Student() {}

    public Student(int id, float score) {
        this.id = id;
        this.score = score;
    }

    @Override
    public int compareTo(Student s) {
        // Float.compare avoids the truncation of (int)(this.score - s.getScore()),
        // which would report scores less than 1.0 apart as equal.
        return Float.compare(this.score, s.getScore());
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public float getScore() {
        return score;
    }

    public void setScore(float score) {
        this.score = score;
    }
}
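One pitfall when writing compareTo for float scores: a comparison written as `(int)(a - b)` truncates toward zero, so two scores that differ by less than 1.0 compare as equal. The sketch below contrasts that form with `Float.compare`; the class and method names are hypothetical.

```java
class ScoreCompareDemo {
    // Truncating comparison: any difference smaller than 1.0 collapses to 0.
    static int compareByCast(float a, float b) {
        return (int) (a - b);
    }

    // Float.compare returns a negative, zero, or positive value without truncation.
    static int compareSafely(float a, float b) {
        return Float.compare(a, b);
    }

    public static void main(String[] args) {
        System.out.println(compareByCast(90.5f, 90.0f));      // 0, wrongly "equal"
        System.out.println(compareSafely(90.5f, 90.0f) > 0);  // true
    }
}
```

With the truncating form, a stable sort would leave 90.5 and 90.0 in their input order instead of ordering them by score.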

Supporting code for understanding:

package com.lang;

public final class System {
    // The System class cannot be instantiated.
    private System() {}

    // Facilities provided by the System class include standard input, standard
    // output, and error output streams; access to externally defined properties
    // and environment variables; a means of loading files and libraries; and a
    // utility method for quickly copying a portion of an array.
    /**
     * src and dest must both be arrays of the same type, or of types that can
     * be converted to one another.
     * @param      src      the source array.
     * @param      srcPos   starting position in the source array.
     * @param      dest     the destination array.
     * @param      destPos  starting position in the destination data.
     * @param      length   the number of array elements to be copied.
     */
    public static native void arraycopy(Object src, int srcPos, Object dest,
            int destPos, int length);
}

package com.lang.reflect;

public final class Array {
    private Array() {}

    // Creates a new array with the specified component type and length.
    public static Object newInstance(Class<?> componentType, int length)
            throws NegativeArraySizeException {
        return newArray(componentType, length);
    }

    private static native Object newArray(Class componentType, int length)
            throws NegativeArraySizeException;
}


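II. Arrays.asList

Use ArrayList's contains method with caution; prefer HashSet's contains

While starting an application, I found that one data-loading step took several minutes. At first I assumed there was simply a lot of data to load, and a check of the database showed about 60,000 records. But a standalone method that read all 60,000-plus records took under 10 seconds, so something else had to be wrong. Reading the logic carefully, I found a de-duplication step: records whose values match in certain fields are treated as duplicates and filtered out. A List stored the key built from those fields, and a record whose key was already in the list was skipped. The code was essentially this:

List<String> uniqueKeyList = new ArrayList<String>();
//......
if (uniqueKeyList.contains(uniqueKey)) {
    continue;
}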

Whether a record is a duplicate is decided by looking up its key. Analysis showed that this step consumed a great deal of time, so I looked at the source of ArrayList's contains method and found that it ends up calling the list's own indexOf method:

public int indexOf(Object elem) {
    if (elem == null) {
        for (int i = 0; i < size; i++)
            if (elementData[i] == null)
                return i;
    } else {
        for (int i = 0; i < size; i++)
            if (elem.equals(elementData[i]))
                return i;
    }
    return -1;
}

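So it searches by walking the list. A single key lookup can take more than 60,000 comparisons, that is, a scan of the entire List. No wonder it was so slow.

So I replaced the List with a Set:

Set<String> uniqueKeySet = new HashSet<String>();
//......
if (uniqueKeySet.contains(uniqueKey)) {
    continue;
}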

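The speed-up was immediate: de-duplication now took at most one second. HashSet is so much faster because it is backed by a hash table (a HashMap), so a lookup hashes the key instead of scanning every element. This is the source of HashSet's contains:

public boolean contains(Object o) {
    return map.containsKey(o);
}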
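A de-duplication loop built on a hash-based set might look like the sketch below. It is not the application's original code: the `dedup` helper and the generated keys are assumptions for illustration. It relies on `Set.add` returning false when the element is already present, which folds the lookup and the insert into one call.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DedupDemo {
    // Keeps the first occurrence of each key, preserving input order.
    static List<String> dedup(List<String> keys) {
        Set<String> seen = new HashSet<>();
        List<String> unique = new ArrayList<>();
        for (String key : keys) {
            if (!seen.add(key)) {
                continue;   // add returned false: key was already seen
            }
            unique.add(key);
        }
        return unique;
    }

    public static void main(String[] args) {
        // 60,000 keys, each distinct key repeated three times.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 60_000; i++) {
            keys.add("key-" + (i % 20_000));
        }
        System.out.println(dedup(keys).size()); // 20000
    }
}
```

Each lookup costs roughly constant time on average, so the whole pass is about linear in the number of records rather than quadratic.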

About UnsupportedOperationException

Calling add or remove on the list returned by Arrays.asList() throws java.lang.UnsupportedOperationException. The reason is that Arrays.asList() returns a java.util.Arrays$ArrayList, not a java.util.ArrayList. Both classes extend AbstractList, whose default add and remove implementations do nothing except throw UnsupportedOperationException. java.util.ArrayList overrides these methods to actually modify the list, but Arrays$ArrayList does not override add() or remove(), so the exception is thrown.
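The behaviour is easy to reproduce, and a common workaround is to copy the fixed-size view into a java.util.ArrayList. The sketch below is illustrative only; the `addThrows` helper is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AsListDemo {
    // Returns true if adding an element to the list is rejected.
    static boolean addThrows(List<String> list) {
        try {
            list.add("d");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> fixed = Arrays.asList("a", "b", "c");
        System.out.println(addThrows(fixed));      // true: fixed-size view

        // set() is supported, because it does not change the size.
        fixed.set(0, "z");
        System.out.println(fixed);                 // [z, b, c]

        // Copying into a java.util.ArrayList gives a resizable list.
        List<String> growable = new ArrayList<>(fixed);
        System.out.println(addThrows(growable));   // false
        System.out.println(growable);              // [z, b, c, d]
    }
}
```

The copy is independent of the original array, whereas the list returned by Arrays.asList writes through to the array it wraps.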
