Java Concurrency (3): How Thread Pools Work

Reposted article. 2019-04-17 23:22 · 1216 · Concurrent Programming

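Thread pools are the most widely used concurrency framework in Java; almost any program that needs to run tasks asynchronously or concurrently can use one. Used sensibly, a thread pool brings 3 benefits.

1. Lower resource consumption. Reusing threads that already exist reduces the cost of creating and destroying threads;

2. Faster response. When a task arrives, it can run immediately without waiting for a thread to be created;

3. Better manageability of threads. Threads are a scarce resource; creating them without limit consumes system resources and also makes the system less stable. A thread pool lets you allocate, tune, and monitor threads in one place. To use a thread pool well, however, you need to know how it is implemented.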

How the Thread Pool Is Implemented

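What does a thread pool do with a task after it is submitted? This section walks through the pool's main processing flow, shown in the figure below:

As the figure shows, when a new task is submitted, the pool handles it as follows.

1. The pool checks whether the core pool is full, that is, whether the number of worker threads has reached corePoolSize. If not, it creates a new worker thread to run the task, even when existing core threads are idle. If the core pool is full, it moves on to the next step.

2. The pool checks whether the work queue is full. If the queue is not full, the newly submitted task is stored in the queue. If the queue is full, it moves on to the next step.

3. The pool checks whether the pool as a whole is full, that is, whether the number of worker threads has reached maximumPoolSize. If not, it creates a new worker thread to run the task. If the pool is full, the task is handed to the saturation (rejection) policy.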

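The figure below illustrates how ThreadPoolExecutor runs the execute() method.

ThreadPoolExecutor handles an execute() call in one of the following 4 ways.

1) If fewer than corePoolSize threads are running, it creates a new thread to run the task (note that this step needs the global lock, mainLock).

2) If corePoolSize or more threads are running, it adds the task to the BlockingQueue.

3) If the task cannot be added to the BlockingQueue (the queue is full), it creates a new thread to handle the task (note that this step also needs the global lock).

4) If creating a new thread would push the number of running threads above maximumPoolSize, the task is rejected and RejectedExecutionHandler.rejectedExecution() is called.

ThreadPoolExecutor is designed this way so that execute() avoids acquiring the global lock wherever possible (the lock would be a serious scalability bottleneck). Once the pool has warmed up (the number of running threads is at least corePoolSize), almost every execute() call takes step 2, which does not need the global lock.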

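Source code analysis: the walk-through above gives an intuitive picture of how the thread pool works. Now let's see how it is implemented in the source code.

I. Variables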

public class ThreadPoolExecutor extends AbstractExecutorService {
    /**
     * The main pool control state, ctl, is an atomic integer packing
     * two conceptual fields
     *   workerCount, indicating the effective number of threads
     *   runState,    indicating whether running, shutting down etc
     *
     * In order to pack them into one int, we limit workerCount to
     * (2^29)-1 (about 500 million) threads rather than (2^31)-1 (2
     * billion) otherwise representable. If this is ever an issue in
     * the future, the variable can be changed to be an AtomicLong,
     * and the shift/mask constants below adjusted. But until the need
     * arises, this code is a bit faster and simpler using an int.
     *
     * The workerCount is the number of workers that have been
     * permitted to start and not permitted to stop.  The value may be
     * transiently different from the actual number of live threads,
     * for example when a ThreadFactory fails to create a thread when
     * asked, and when exiting threads are still performing
     * bookkeeping before terminating. The user-visible pool size is
     * reported as the current size of the workers set.
     *
     * The runState provides the main lifecycle control, taking on values:
     *
     *   RUNNING:  Accept new tasks and process queued tasks
     *   SHUTDOWN: Don't accept new tasks, but process queued tasks
     *   STOP:     Don't accept new tasks, don't process queued tasks,
     *             and interrupt in-progress tasks
     *   TIDYING:  All tasks have terminated, workerCount is zero,
     *             the thread transitioning to state TIDYING
     *             will run the terminated() hook method
     *   TERMINATED: terminated() has completed
     *
     * The numerical order among these values matters, to allow
     * ordered comparisons. The runState monotonically increases over
     * time, but need not hit each state. The transitions are:
     *
     * RUNNING -> SHUTDOWN
     *    On invocation of shutdown(), perhaps implicitly in finalize()
     * (RUNNING or SHUTDOWN) -> STOP
     *    On invocation of shutdownNow()
     * SHUTDOWN -> TIDYING
     *    When both queue and pool are empty
     * STOP -> TIDYING
     *    When pool is empty
     * TIDYING -> TERMINATED
     *    When the terminated() hook method has completed
     *
     * Threads waiting in awaitTermination() will return when the
     * state reaches TERMINATED.
     *
     * Detecting the transition from SHUTDOWN to TIDYING is less
     * straightforward than you'd like because the queue may become
     * empty after non-empty and vice versa during SHUTDOWN state, but
     * we can only terminate if, after seeing that it is empty, we see
     * that workerCount is 0 (which sometimes entails a recheck -- see
     * below).
     */
    /**
     * ctl is an atomic variable that packs two conceptual fields:
     * workerCount, the effective number of threads
     * runState, the pool state (running, shutting down, etc.)
     */
    private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
    // 29
    private static final int COUNT_BITS = Integer.SIZE - 3;
    // Capacity: 2^29 - 1
    private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

    // runState is stored in the high-order bits
    // The five pool states
    // High 3 bits 111: accept new tasks and process queued tasks
    private static final int RUNNING    = -1 << COUNT_BITS;
    // High 3 bits 000: accept no new tasks, but process queued tasks
    private static final int SHUTDOWN   =  0 << COUNT_BITS;
    // High 3 bits 001: accept no new tasks, process no queued tasks, and interrupt in-progress tasks
    private static final int STOP       =  1 << COUNT_BITS;
    // High 3 bits 010: all tasks have terminated and workerCount is 0; the thread that moves the pool to TIDYING runs the terminated() hook
    private static final int TIDYING    =  2 << COUNT_BITS;
    // High 3 bits 011: terminated() has completed
    private static final int TERMINATED =  3 << COUNT_BITS;

    // Packing and unpacking ctl: helpers that extract or combine runState and workerCount
    private static int runStateOf(int c)     { return c & ~CAPACITY; }
    private static int workerCountOf(int c)  { return c & CAPACITY; }
    private static int ctlOf(int rs, int wc) { return rs | wc; }

    ... ...
}

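ctl is a single field that controls both the pool's run state and its number of effective threads. It holds two pieces of information: the run state (runState) and the worker count (workerCount). As the code shows, both are stored in one int (held in an AtomicInteger): the high 3 bits hold runState and the low 29 bits hold workerCount. COUNT_BITS is 29, and CAPACITY is 1 shifted left by 29 bits, minus 1 (29 one-bits); this constant is the upper limit of workerCount, about 500 million.

Next, the pool's run states. There are five: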

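State	Description
RUNNING	Accepts newly submitted tasks and also processes tasks in the blocking queue
SHUTDOWN	Shut down: accepts no newly submitted tasks but keeps processing tasks already in the blocking queue. Calling shutdown() while the pool is RUNNING moves it to this state. (finalize() also calls shutdown() and so enters this state.)
STOP	Accepts no new tasks, does not process queued tasks, and interrupts threads that are processing tasks. Calling shutdownNow() while the pool is RUNNING or SHUTDOWN moves it to this state
TIDYING	All tasks have terminated and workerCount (the number of effective threads) is 0. On entering this state the pool calls terminated() and then moves to TERMINATED
TERMINATED	Entered after terminated() has run; by default terminated() does nothing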

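The conditions for entering TERMINATED are:

The pool is not in the RUNNING state;
The pool is not already in the TIDYING or TERMINATED state;
If the pool is in the SHUTDOWN state, workQueue is empty;
workerCount is 0;
The state is successfully set to TIDYING.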

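The figure below shows the pool's state transitions:

The methods that pack and unpack ctl:

Method	Description
runStateOf	Extracts the run state
workerCountOf	Extracts the worker count
ctlOf	Packs a run state and a worker count into one ctl value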
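
The bit layout can be tried out in isolation. The sketch below copies the constants and the three helper methods from the excerpt above into a standalone class; the class name CtlPackingDemo and the highBits helper are made up for this illustration and are not part of the JDK.

```java
class CtlPackingDemo {
    // Constants and helpers copied from the ThreadPoolExecutor excerpt above.
    static final int COUNT_BITS = Integer.SIZE - 3;        // 29
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;   // 29 one-bits

    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    static int runStateOf(int c)     { return c & ~CAPACITY; }
    static int workerCountOf(int c)  { return c & CAPACITY; }
    static int ctlOf(int rs, int wc) { return rs | wc; }

    /** Hypothetical helper: the high 3 bits of a ctl value as a 3-character string. */
    static String highBits(int c) {
        String bits = Integer.toBinaryString(c >>> COUNT_BITS);
        return "000".substring(bits.length()) + bits;
    }

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);                          // RUNNING pool with 5 workers
        System.out.println(highBits(c));                    // high 3 bits of ctl
        System.out.println(workerCountOf(c));               // low 29 bits of ctl
        System.out.println(runStateOf(c) == RUNNING);
        // RUNNING is negative, so the states are ordered numerically.
        System.out.println(RUNNING < SHUTDOWN && SHUTDOWN < STOP
                && STOP < TIDYING && TIDYING < TERMINATED);
    }
}
```

Because RUNNING is the only negative state, a check such as c < SHUTDOWN is enough to test whether the pool is running, which is how isRunning is used in execute.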

II. The execute method

/**
 * Executes the given task sometime in the future.  The task
 * may execute in a new thread or in an existing pooled thread.
 *
 * If the task cannot be submitted for execution, either because this
 * executor has been shutdown or because its capacity has been reached,
 * the task is handled by the current {@code RejectedExecutionHandler}.
 *
 * @param command the task to execute
 * @throws RejectedExecutionException at discretion of
 *         {@code RejectedExecutionHandler}, if the task
 *         cannot be accepted for execution
 * @throws NullPointerException if {@code command} is null
 */
public void execute(Runnable command) {
    // Throw if the task is null
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
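    /*
     * Read the current pool state.
     * ctl holds both runState and workerCount.
     */
    int c = ctl.get();
    /*
     * Compute the worker count and check whether it is below corePoolSize.
     * workerCountOf extracts the low 29 bits, the current worker count.
     * If it is below corePoolSize, create a new thread in the pool
     * and hand the task to that thread.
     */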
    if (workerCountOf(c) < corePoolSize) {
        // Submit via addWorker; return if it succeeds
        /*
         * The second argument of addWorker selects the bound used to limit thread creation:
         * true checks against corePoolSize;
         * false checks against maximumPoolSize.
         */
        if (addWorker(command, true))
            return;
        // addWorker failed: re-read the current state
        c = ctl.get();
    }
    // Check the pool state and enqueue the task
    /*
     * If the pool is running and the task was added to the queue
     */
    if (isRunning(c) && workQueue.offer(command)) {
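        // Re-read the state
        int recheck = ctl.get();
        // If the pool is no longer RUNNING and the task was removed successfully
        /*
         * Re-check the run state. If the pool is no longer running, the command
         * that was just added to workQueue has to be removed again.
         * The handler then applies the rejection policy to the task and the method returns.
         */
        if (! isRunning(recheck) && remove(command))
            // Invoke the rejection policy
            reject(command);
        // If there are no worker threads, call addWorker
        /*
         * Read the worker count; if it is 0, call addWorker.
         * The arguments mean:
         * 1. null first argument: create and start a thread with no initial task; it takes its work from workQueue;
         * 2. false second argument: bound the worker count by maximumPoolSize when adding the thread.
         * If workerCount is greater than 0, nothing more is done; the command added to workQueue will be run at some later point.
         */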
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
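    // The task could not be queued: try a new thread, otherwise reject
    /*
     * There are two ways to get here:
     * 1. the pool is no longer RUNNING;
     * 2. the pool is RUNNING, but workerCount >= corePoolSize and workQueue is full.
     * addWorker is called again, this time with false as the second argument, so the worker count is bounded by maximumPoolSize;
     * if that fails, the task is rejected.
     */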
    else if (!addWorker(command, false))
        reject(command);
}

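In short, if the state stays RUNNING for the whole call, execute() proceeds as follows:

If workerCount < corePoolSize, it creates and starts a thread to run the newly submitted task;
If workerCount >= corePoolSize and the pool's blocking queue is not full, it adds the task to the queue;
If workerCount >= corePoolSize && workerCount < maximumPoolSize and the blocking queue is full, it creates and starts a thread to run the newly submitted task;
If workerCount >= maximumPoolSize and the blocking queue is full, it handles the task with the rejection policy; the default policy throws an exception.

Note the call addWorker(null, false): it creates a thread without passing a task, because the task is already in workQueue, so the worker takes its task straight from workQueue when it runs. Calling addWorker(null, false) when workerCountOf(recheck) == 0 therefore guarantees that a pool in the RUNNING state has at least one thread to run the queued task.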
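
The four outcomes can be observed with a deliberately tiny pool. The following is a minimal sketch, not taken from the original article: the class name ExecuteFlowDemo and the pool sizes (corePoolSize 1, maximumPoolSize 2, queue capacity 1) are chosen only for illustration. Each task blocks on a latch so that no worker becomes free while the tasks are being submitted.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecuteFlowDemo {
    /**
     * Submits four blocking tasks and returns
     * {pool size after task 1, queue size after task 2, pool size after task 3, rejected count}.
     */
    static int[] run() {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(1),
                new ThreadPoolExecutor.AbortPolicy());
        Runnable blocker = () -> {
            try {
                release.await();                 // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int[] seen = new int[4];
        try {
            pool.execute(blocker);               // workerCount < corePoolSize: new core thread
            seen[0] = pool.getPoolSize();
            pool.execute(blocker);               // core pool full: task goes into the queue
            seen[1] = pool.getQueue().size();
            pool.execute(blocker);               // queue full: new non-core thread
            seen[2] = pool.getPoolSize();
            try {
                pool.execute(blocker);           // queue full and pool at maximum: rejected
            } catch (RejectedExecutionException e) {
                seen[3]++;
            }
        } finally {
            release.countDown();                 // let the blocked tasks finish
            pool.shutdown();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

With AbortPolicy, the fourth execute() call throws RejectedExecutionException, which matches the "default policy throws an exception" case in the list above.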

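The flow of the execute method is as follows:

III. The addWorker method

The main job of addWorker is to create a new thread in the pool and start it. The firstTask parameter specifies the first task the new thread runs. If the core parameter is true, the method checks that the current worker count is below corePoolSize before adding the thread; if it is false, the check is against maximumPoolSize. The code is as follows: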

/**
 * Checks if a new worker can be added with respect to current
 * pool state and the given bound (either core or maximum). If so,
 * the worker count is adjusted accordingly, and, if possible, a
 * new worker is created and started, running firstTask as its
 * first task. This method returns false if the pool is stopped or
 * eligible to shut down. It also returns false if the thread
 * factory fails to create a thread when asked.  If the thread
 * creation fails, either due to the thread factory returning
 * null, or due to an exception (typically OutOfMemoryError in
 * Thread.start()), we roll back cleanly.
 *
 * @param firstTask the task the new thread should run first (or
 * null if none). Workers are created with an initial first task
 * (in method execute()) to bypass queuing when there are fewer
 * than corePoolSize threads (in which case we always start one),
 * or when the queue is full (in which case we must bypass queue).
 * Initially idle threads are usually created via
 * prestartCoreThread or to replace other dying workers.
 *
 * @param core if true use corePoolSize as bound, else
 * maximumPoolSize. (A boolean indicator is used here rather than a
 * value to ensure reads of fresh values after checking other pool
 * state).
 * @return true if successful
 */
/**
 * Checks whether the task can be submitted, that is, whether a new worker may be added
 */
private boolean addWorker(Runnable firstTask, boolean core) {
    retry:
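    // Outer loop
    for (;;) {
        // Read the run state
        int c = ctl.get();
        int rs = runStateOf(c);

        /*
         * About this if:
         * rs >= SHUTDOWN means the pool no longer accepts new tasks.
         * It then checks the following 3 conditions and returns false if any one of them fails:
         * 1. rs == SHUTDOWN: the pool is shut down and accepts no new tasks, but still processes tasks already in the queue
         * 2. firstTask is null
         * 3. the blocking queue is not empty
         *
         * Consider rs == SHUTDOWN first.
         * No new tasks are accepted in this state, so a non-null firstTask returns false.
         * If firstTask is null and workQueue is also empty, it returns false as well,
         * because the queue has no tasks left and no more threads are needed.
         */
        // Check if queue empty only if necessary. Checks whether the pool is shutting down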
        if (rs >= SHUTDOWN &&
            ! (rs == SHUTDOWN &&
               firstTask == null &&
               ! workQueue.isEmpty()))
            return false;
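        // Inner loop
        for (;;) {
            // Read the worker count
            int wc = workerCountOf(c);
            // The worker count has reached CAPACITY, or the core/maximum bound
            /*
             * If wc has reached CAPACITY, the largest value of ctl's low 29 bits (29 one-bits), return false.
             * core is the second argument of addWorker: true compares against corePoolSize,
             * false compares against maximumPoolSize.
             */
            if (wc >= CAPACITY ||
                wc >= (core ? corePoolSize : maximumPoolSize))
                return false;
            // CAS-increment the worker count; on success, break out of the outer loop
            /*
             * Try to increment workerCount; on success, leave the outer for loop
             */
            if (compareAndIncrementWorkerCount(c))
                break retry;
            // The increment failed: re-read ctl
            c = ctl.get();  // Re-read ctl
            // If the run state no longer equals rs, the state has changed; restart the outer for loop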
            if (runStateOf(c) != rs)
                continue retry;
            // else CAS failed due to workerCount change; retry inner loop
        }
    }

    /**
     * Create the new worker and start its thread
     */
    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        // Create a Worker for firstTask
        w = new Worker(firstTask);
        // Each Worker creates its own thread
        final Thread t = w.thread;
        if (t != null) {
            final ReentrantLock mainLock = this.mainLock;
            // Acquire the lock
            mainLock.lock();
            try {
                // Recheck while holding lock.
                // Back out on ThreadFactory failure or if
                // shut down before lock acquired.
                int rs = runStateOf(ctl.get());

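                /*
                 * rs < SHUTDOWN means the pool is RUNNING.
                 * If rs is RUNNING, or rs is SHUTDOWN and firstTask is null, add the thread to the pool.
                 * In SHUTDOWN no new tasks are accepted, but tasks already in workQueue are still run.
                 */
                if (rs < SHUTDOWN ||
                    (rs == SHUTDOWN && firstTask == null)) {
                    // If the thread is already alive, throw IllegalThreadStateException
                    if (t.isAlive()) // precheck that t is startable
                        throw new IllegalThreadStateException();
                    // workers is the HashSet of all worker threads in the pool; accessed only while holding mainLock
                    workers.add(w);
                    int s = workers.size();
                    // Update largestPoolSize and set workerAdded to true
                    // largestPoolSize records the largest number of threads the pool has ever held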
                    if (s > largestPoolSize)
                        largestPoolSize = s;
                    workerAdded = true;
                }
            } finally {
                // Release the lock
                mainLock.unlock();
            }
            // Added successfully: start the thread and set workerStarted to true
            if (workerAdded) {
                // Start the thread
                t.start();
                workerStarted = true;
            }
        }
    } finally {
        // The thread failed to start
        if (! workerStarted)
            addWorkerFailed(w);
    }
    return workerStarted;
}

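Note the t.start() call here. Starting the thread runs the run method of the Worker class: Worker implements Runnable and passes itself to the ThreadFactory when its thread is created, so the Worker object is the run target of that thread.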
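
The first half of addWorker reserves a slot with a CAS retry loop before any thread is created. The sketch below isolates that pattern with a plain AtomicInteger; the class BoundedCounterDemo and its methods are hypothetical names for this illustration and leave out the run-state checks that the real method also performs.

```java
import java.util.concurrent.atomic.AtomicInteger;

class BoundedCounterDemo {
    private final AtomicInteger count = new AtomicInteger(0);
    private final int bound;

    BoundedCounterDemo(int bound) {
        this.bound = bound;
    }

    /** Reserves one slot with a CAS retry loop; returns false once the bound is reached. */
    boolean tryReserve() {
        for (;;) {
            int c = count.get();
            if (c >= bound)
                return false;                    // like wc >= corePoolSize / maximumPoolSize
            if (count.compareAndSet(c, c + 1))
                return true;                     // like compareAndIncrementWorkerCount(c)
            // The CAS lost a race with another thread: re-read and retry.
        }
    }

    /** Starts several threads that all try to reserve slots; returns the number of successes. */
    static int race(int bound, int threads, int attemptsPerThread) {
        BoundedCounterDemo counter = new BoundedCounterDemo(bound);
        AtomicInteger successes = new AtomicInteger();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < attemptsPerThread; j++)
                    if (counter.tryReserve())
                        successes.incrementAndGet();
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return successes.get();
    }

    public static void main(String[] args) {
        // However the threads interleave, only `bound` reservations can succeed.
        System.out.println(race(10, 8, 100));
    }
}
```

Because the bound check and the increment are tied together by the compare-and-set, the count never exceeds the bound without any lock being taken, which is why only the second half of addWorker needs mainLock.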

IV. The Worker class

Worker threads: when the pool creates a thread, it wraps it in a Worker. Here is the source:

private final class Worker
    extends AbstractQueuedSynchronizer
    implements Runnable
{
    /**
     * This class will never be serialized, but we provide a
     * serialVersionUID to suppress a javac warning.
     */
    private static final long serialVersionUID = 6138294804551838833L;

    /** Thread this worker is running in.  Null if factory fails. */
    final Thread thread;
    /** Initial task to run.  Possibly null. */
    Runnable firstTask;
    /** Per-thread task counter */
    volatile long completedTasks;

    /**
     * Creates with given first task and thread from ThreadFactory.
     * @param firstTask the first task (null if none)
     */
    Worker(Runnable firstTask) {
        setState(-1); // inhibit interrupts until runWorker
        this.firstTask = firstTask;
        this.thread = getThreadFactory().newThread(this);
    }

    /** Delegates main run loop to outer runWorker  */
    public void run() {
        runWorker(this);
    }

    // Lock methods
    //
    // The value 0 represents the unlocked state.
    // The value 1 represents the locked state.

    protected boolean isHeldExclusively() {
        return getState() != 0;
    }

    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }

    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    public void lock()        { acquire(1); }
    public boolean tryLock()  { return tryAcquire(1); }
    public void unlock()      { release(1); }
    public boolean isLocked() { return isHeldExclusively(); }

    void interruptIfStarted() {
        Thread t;
        if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
            try {
                t.interrupt();
            } catch (SecurityException ignore) {
            }
        }
    }
}

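Worker extends AQS (AbstractQueuedSynchronizer) and implements Runnable. Note its firstTask and thread fields: firstTask holds the task passed in; thread is the thread created through the ThreadFactory in the constructor, and it is the thread that processes tasks.

The constructor takes the task and creates a thread with getThreadFactory().newThread(this). The argument to newThread is this because Worker implements Runnable, so when the thread starts it runs the Worker's run method.

Worker extends AQS to implement an exclusive lock. Why not use ReentrantLock? Look at tryAcquire: it does not allow reentrant acquisition, whereas ReentrantLock does:

Once lock() acquires the exclusive lock, the current thread is executing a task;
A thread that is executing a task should not be interrupted;
If the worker's lock is not held, the thread is idle and not processing a task, so it may be interrupted;
shutdown and tryTerminate call interruptIdleWorkers to interrupt idle threads, and interruptIdleWorkers uses tryLock to decide whether each thread in the pool is idle;
The lock is non-reentrant because a task must not be able to reacquire it when it calls a pool-control method such as setCorePoolSize. Such methods may call interruptIdleWorkers; with a reentrant lock like ReentrantLock, the tryLock on the calling worker would succeed and the running thread would interrupt itself.

So Worker extends AQS in order to tell whether a thread is idle and whether it may be interrupted.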
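
The difference between the two kinds of lock shows up as soon as one thread tries to take the same lock twice. The sketch below is only an illustration: SimpleLock is a made-up class that copies the tryAcquire and tryRelease logic of Worker, and it is compared with java.util.concurrent.locks.ReentrantLock.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
import java.util.concurrent.locks.ReentrantLock;

class NonReentrantLockDemo {
    /** Minimal exclusive lock modeled on Worker's lock methods: state 0 = unlocked, 1 = locked. */
    static final class SimpleLock extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int unused) {
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;                        // already held, even if held by this thread
        }

        @Override
        protected boolean tryRelease(int unused) {
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }

        boolean tryLock() { return tryAcquire(1); }
        void unlock()     { release(1); }
    }

    /** Result of a second tryLock by the thread that already holds a SimpleLock. */
    static boolean secondTryLockOnSimpleLock() {
        SimpleLock lock = new SimpleLock();
        lock.tryLock();
        boolean second = lock.tryLock();
        lock.unlock();
        return second;
    }

    /** Result of a second tryLock by the thread that already holds a ReentrantLock. */
    static boolean secondTryLockOnReentrantLock() {
        ReentrantLock lock = new ReentrantLock();
        lock.tryLock();
        boolean second = lock.tryLock();
        lock.unlock();                           // one unlock per successful acquisition
        lock.unlock();
        return second;
    }

    public static void main(String[] args) {
        System.out.println(secondTryLockOnSimpleLock());
        System.out.println(secondTryLockOnReentrantLock());
    }
}
```

With the non-reentrant lock, a failed tryLock reliably means "this worker is busy", even when the caller is the worker's own thread; with ReentrantLock that signal would be lost.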

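In addition, the constructor calls setState(-1), setting the state variable to -1. Why? The default AQS state is 0, and a Worker that has just been created and has not yet run a task should not be interrupted. Look at the tryAcquire method: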

protected boolean tryAcquire(int unused) {
    if (compareAndSetState(0, 1)) {
        setExclusiveOwnerThread(Thread.currentThread());
        return true;
    }
    return false;
}

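tryAcquire succeeds only when state is 0, so setState(-1) prevents the thread from being interrupted before it starts running tasks.

For the same reason, runWorker first calls the Worker's unlock method to set state to 0.

V. The runWorker method

The Worker's run method calls runWorker to execute tasks. The code of runWorker is as follows: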

/**
 * Main worker run loop.  Repeatedly gets tasks from queue and
 * executes them, while coping with a number of issues:
 *
 * 1. We may start out with an initial task, in which case we
 * don't need to get the first one. Otherwise, as long as pool is
 * running, we get tasks from getTask. If it returns null then the
 * worker exits due to changed pool state or configuration
 * parameters.  Other exits result from exception throws in
 * external code, in which case completedAbruptly holds, which
 * usually leads processWorkerExit to replace this thread.
 *
 * 2. Before running any task, the lock is acquired to prevent
 * other pool interrupts while the task is executing, and then we
 * ensure that unless pool is stopping, this thread does not have
 * its interrupt set.
 *
 * 3. Each task run is preceded by a call to beforeExecute, which
 * might throw an exception, in which case we cause thread to die
 * (breaking loop with completedAbruptly true) without processing
 * the task.
 *
 * 4. Assuming beforeExecute completes normally, we run the task,
 * gathering any of its thrown exceptions to send to afterExecute.
 * We separately handle RuntimeException, Error (both of which the
 * specs guarantee that we trap) and arbitrary Throwables.
 * Because we cannot rethrow Throwables within Runnable.run, we
 * wrap them within Errors on the way out (to the thread's
 * UncaughtExceptionHandler).  Any thrown exception also
 * conservatively causes thread to die.
 *
 * 5. After task.run completes, we call afterExecute, which may
 * also throw an exception, which will also cause thread to
 * die. According to JLS Sec 14.20, this exception is the one that
 * will be in effect even if task.run throws.
 *
 * The net effect of the exception mechanics is that afterExecute
 * and the thread's UncaughtExceptionHandler have as accurate
 * information as we can provide about any problems encountered by
 * user code.
 *
 * @param w the worker
 */
final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
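    // Take the first task
    Runnable task = w.firstTask;
    w.firstTask = null;
    // Allow interrupts
    w.unlock(); // allow interrupts
    // Whether the loop was left because of an exception
    boolean completedAbruptly = true;
    try {
        // If task is null, fetch one through getTask
        // getTask() keeps fetching tasks from the work queue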
        while (task != null || (task = getTask()) != null) {
            w.lock();
            // If pool is stopping, ensure thread is interrupted;
            // if not, ensure thread is not interrupted.  This
            // requires a recheck in second case to deal with
            // shutdownNow race while clearing interrupt
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() &&
                  runStateAtLeast(ctl.get(), STOP))) &&
                !wt.isInterrupted())
                wt.interrupt();
            try {
                beforeExecute(wt, task);
                Throwable thrown = null;
                try {
                    task.run();
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    afterExecute(task, thrown);
                }
            } finally {
                task = null;
                w.completedTasks++;
                w.unlock();
            }
        }
        completedAbruptly = false;
    } finally {
        processWorkerExit(w, completedAbruptly);
    }
}

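About the first if statement. Its purpose is:

If the pool is stopping, make sure the current thread is interrupted;
Otherwise, make sure the current thread is not interrupted.

shutdownNow may run while this if statement is executing, and shutdownNow sets the state to STOP. Recall the STOP state:

No new tasks are accepted, queued tasks are not processed, and threads that are processing tasks are interrupted. Calling shutdownNow() while the pool is RUNNING or SHUTDOWN moves it to this state.

STOP has to interrupt every thread in the pool. Thread.interrupted() is used here because it clears the interrupt status, which ensures the thread is not interrupted while the pool is RUNNING or SHUTDOWN; the state is then checked a second time in case shutdownNow ran in the meantime.

To summarize how runWorker executes:

The while loop keeps fetching tasks through getTask();
getTask() takes tasks from the blocking queue;
If the pool is stopping, the current thread is made interrupted; otherwise it is made not interrupted;
task.run() executes the task;
When getTask() returns null, the loop exits and processWorkerExit() runs;
When runWorker returns, the Worker's run method returns as well and the thread terminates.

beforeExecute and afterExecute are empty in ThreadPoolExecutor and are left for subclasses to implement.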
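
As an illustration of these two hooks, the sketch below overrides them in a subclass to count how often each one runs. CountingPool and HookDemo are made-up names, and the pool size and task count are arbitrary.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class HookDemo {
    /** Hypothetical subclass that counts calls to the two hooks. */
    static final class CountingPool extends ThreadPoolExecutor {
        final AtomicInteger before = new AtomicInteger();
        final AtomicInteger after = new AtomicInteger();

        CountingPool() {
            super(2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        }

        @Override
        protected void beforeExecute(Thread t, Runnable r) {
            super.beforeExecute(t, r);
            before.incrementAndGet();            // runs in the worker thread, before task.run()
        }

        @Override
        protected void afterExecute(Runnable r, Throwable thrown) {
            super.afterExecute(r, thrown);
            after.incrementAndGet();             // runs in the worker thread, after task.run()
        }
    }

    /** Runs three empty tasks and returns {beforeExecute calls, afterExecute calls}. */
    static int[] run() {
        CountingPool pool = new CountingPool();
        for (int i = 0; i < 3; i++)
            pool.execute(() -> { });
        pool.shutdown();                         // queued tasks still run after shutdown()
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { pool.before.get(), pool.after.get() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]);
    }
}
```

Both hooks run on the worker thread inside runWorker, so an exception thrown from either one ends that worker just as the Javadoc above describes.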

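The completedAbruptly variable records whether an exception occurred while executing tasks; processWorkerExit checks its value.

VI. The getTask method

getTask takes tasks from the blocking queue. The code is as follows: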

/**
 * Performs blocking or timed wait for a task, depending on
 * current configuration settings, or returns null if this worker
 * must exit because of any of:
 * 1. There are more than maximumPoolSize workers (due to
 *    a call to setMaximumPoolSize).
 * 2. The pool is stopped.
 * 3. The pool is shutdown and the queue is empty.
 * 4. This worker timed out waiting for a task, and timed-out
 *    workers are subject to termination (that is,
 *    {@code allowCoreThreadTimeOut || workerCount > corePoolSize})
 *    both before and after the timed wait, and if the queue is
 *    non-empty, this worker is not the last thread in the pool.
 *
 * @return task, or null if the worker must exit, in which case
 *         workerCount is decremented
 */
private Runnable getTask() {
    // timedOut records whether the last attempt to take a task from the blocking queue timed out
    boolean timedOut = false; // Did the last poll() time out?

    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
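        /*
         * If rs >= SHUTDOWN, i.e. the pool is not RUNNING, check further:
         * 1. rs >= STOP: whether the pool is stopping;
         * 2. whether the blocking queue is empty.
         * If either one holds, decrement workerCount and return null.
         * Once the state is SHUTDOWN or above, no more tasks may be added to the blocking queue.
         */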
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            decrementWorkerCount();
            return null;
        }

        int wc = workerCountOf(c);

        // Are workers subject to culling?
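        // Core threads may time out, or the worker count exceeds corePoolSize
        /* timed decides whether this wait is subject to a timeout.
         * allowCoreThreadTimeOut defaults to false, so core threads do not time out;
         * wc > corePoolSize means the pool holds more threads than corePoolSize;
         * the threads beyond corePoolSize are subject to the timeout.
         */
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
        /*
         * wc > maximumPoolSize can happen when setMaximumPoolSize runs concurrently with this method;
         * timed && timedOut being true means this wait is subject to a timeout and the previous poll timed out.
         * Then, if the worker count is greater than 1 or the blocking queue is empty, try to decrement workerCount;
         * if the decrement fails, loop and retry.
         * wc == 1 means the current thread is the only thread left in the pool.
         */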
        if ((wc > maximumPoolSize || (timed && timedOut))
            && (wc > 1 || workQueue.isEmpty())) {
            if (compareAndDecrementWorkerCount(c))
                return null;
            continue;
        }

        try {
            /*
             * If timed is true, use the queue's poll method with a timeout: it returns null if no task arrives within keepAliveTime;
             * otherwise use take, which blocks until the queue is non-empty.
             *
             */
            Runnable r = timed ?
                    // Wait for a task for at most keepAliveTime
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                    // Block until a task is available
                workQueue.take();
            if (r != null)
                return r;
            // r == null means the wait timed out, so set timedOut to true
            timedOut = true;
        } catch (InterruptedException retry) {
            // If the thread was interrupted while waiting, reset timedOut to false and loop again
            timedOut = false;
        }
    }
}

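The important part here is the second if statement, which controls the number of effective threads in the pool. From the analysis above, execute can add worker threads when the worker count is at least corePoolSize but below maximumPoolSize and workQueue is full. If such a thread then times out waiting for a task (timedOut is true), workQueue is empty and the pool no longer needs that many threads, so the threads beyond corePoolSize can be destroyed, keeping the count at corePoolSize.

When is a thread destroyed? When runWorker finishes, that is, when the Worker's run method returns; the thread then terminates and the JVM reclaims it.

When getTask returns null, runWorker leaves its while loop and calls processWorkerExit.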
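
The shrinking behaviour can be watched from outside the pool. The sketch below is an illustration under assumed settings (corePoolSize 1, maximumPoolSize 2, a 100 ms keepAliveTime, and a SynchronousQueue so that the second task forces a non-core thread); the class name KeepAliveDemo is made up. It polls getPoolSize() instead of sleeping for a fixed time, because the exact moment at which the idle thread exits is not deterministic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class KeepAliveDemo {
    /** Returns {largest pool size seen, pool size after the surplus idle thread has timed out}. */
    static int[] run() {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 100L, TimeUnit.MILLISECONDS, new SynchronousQueue<Runnable>());
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(blocker);               // core thread
            pool.execute(blocker);               // no idle consumer on the queue: non-core thread
            int largest = pool.getLargestPoolSize();
            release.countDown();                 // both workers finish and wait in getTask()
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 1 && System.nanoTime() < deadline)
                Thread.sleep(10);                // wait for one worker's poll() to time out
            return new int[] { largest, pool.getPoolSize() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new int[] { -1, -1 };
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("largest=" + r[0] + " afterIdle=" + r[1]);
    }
}
```

Even if both idle workers time out at the same moment, only one compareAndDecrementWorkerCount can succeed; the other worker re-reads ctl, finds that wc is no longer above corePoolSize, and falls back to the untimed take().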

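The figure below shows how threads in ThreadPoolExecutor execute tasks.

A thread in the pool executes tasks in two ways:

1) When execute() creates a thread, that thread runs the task that was just submitted.

2) After finishing the task from step 1 in the figure, the thread repeatedly takes tasks from the BlockingQueue and runs them.

VII. The processWorkerExit method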

private void processWorkerExit(Worker w, boolean completedAbruptly) {
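    // completedAbruptly == true means the thread hit an exception while running, so workerCount must be decremented here;
    // otherwise getTask() has already decremented workerCount and nothing more needs to be done.
    if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
        decrementWorkerCount();
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        // Add this worker's completed tasks to the pool total
        completedTaskCount += w.completedTasks;
        // Remove the worker from workers, i.e. remove one worker thread from the pool
        workers.remove(w);
    } finally {
        mainLock.unlock();
    }
    // Decide from the pool state whether to terminate the pool
    tryTerminate();
    int c = ctl.get();
    /*
     * While the pool is RUNNING or SHUTDOWN, a worker that ended abruptly is replaced directly through addWorker;
     * if allowCoreThreadTimeOut is true and the queue still holds tasks, at least one worker is kept;
     * if allowCoreThreadTimeOut is false, workerCount is kept at no less than corePoolSize.
     */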
    if (runStateLessThan(c, STOP)) {
        if (!completedAbruptly) {
            int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
            if (min == 0 && ! workQueue.isEmpty())
                min = 1;
            if (workerCountOf(c) >= min)
                return; // replacement not needed
        }
        addWorker(null, false);
    }
}

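At this point processWorkerExit has finished and the worker thread is destroyed. That is the whole life cycle of a worker thread: starting from execute, the Worker creates a new thread through the ThreadFactory; runWorker fetches tasks through getTask and runs them; when getTask returns null, processWorkerExit runs and the thread ends, as shown in the figure:

VIII. The tryTerminate method

tryTerminate decides from the pool state whether to terminate the pool. The code is as follows: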

final void tryTerminate() {
    for (;;) {
        int c = ctl.get();
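        /*
         * Return immediately if the pool is in one of these states:
         * 1. RUNNING: the pool is still running and must not be stopped;
         * 2. TIDYING or TERMINATED: termination is already under way or finished, and no threads are running in the pool;
         * 3. SHUTDOWN with a non-empty queue: the tasks in workQueue must be run first.
         */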
        if (isRunning(c) ||
            runStateAtLeast(c, TIDYING) ||
            (runStateOf(c) == SHUTDOWN && ! workQueue.isEmpty()))
            return;
        // If the worker count is not 0, interrupt one idle worker thread and return
        if (workerCountOf(c) != 0) { // Eligible to terminate
            interruptIdleWorkers(ONLY_ONE);
            return;
        }
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            // Try to set the state to TIDYING; on success, call terminated()
            if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
                try {
                    // terminated() does nothing by default; subclasses may override it
                    terminated();
                } finally {
                    // Set the state to TERMINATED
                    ctl.set(ctlOf(TERMINATED, 0));
                    termination.signalAll();
                }
                return;
            }
        } finally {
            mainLock.unlock();
        }
        // else retry on failed CAS
    }
}

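interruptIdleWorkers(ONLY_ONE) is needed because a worker blocked in workQueue.take() inside getTask stays blocked until it is interrupted. The shutdown method described below interrupts all idle worker threads. If a worker was busy when shutdown ran and then calls getTask while workQueue is empty, it could block in workQueue.take() forever. So every worker calls tryTerminate as it exits, which tries to interrupt one idle worker and avoids threads blocking forever on an empty queue.

IX. The shutdown method

shutdown switches the pool to the SHUTDOWN state, calls interruptIdleWorkers to request interruption of all idle workers, and finally calls tryTerminate to try to terminate the pool.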

public void shutdown() {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        // Security check
        checkShutdownAccess();
        // Switch the state to SHUTDOWN
        advanceRunState(SHUTDOWN);
        // Interrupt idle threads
        interruptIdleWorkers();
        onShutdown(); // hook for ScheduledThreadPoolExecutor
    } finally {
        mainLock.unlock();
    }
    // Try to terminate the pool
    tryTerminate();
}

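A question to think about: runWorker locks the Worker object w while it executes a task. Why does every worker thread take a lock while running a task?

Let's analyze it step by step:

In getTask, if the pool is SHUTDOWN and workQueue is empty, the method should return null to end the worker thread; the pool enters SHUTDOWN through a call to shutdown;
shutdown calls interruptIdleWorkers to interrupt idle threads. interruptIdleWorkers holds mainLock and iterates over workers, checking each worker thread for idleness. getTask, however, does not hold mainLock;
In getTask, if the pool state is RUNNING and the blocking queue is empty, the thread blocks in workQueue.take();
If shutdown changes the state to SHUTDOWN right after getTask has seen RUNNING, and no interrupt is delivered, the worker blocks in workQueue.take() and is never destroyed, because no new tasks can be added to workQueue in SHUTDOWN; the pool could then never finish shutting down;
So shutdown and getTask (while taking a task from the queue) are in a race;
Thread interruption resolves this, which is why interruptIdleWorkers exists. If the thread is interrupted before or during workQueue.take(), take throws InterruptedException and the thread stops blocking;
But before interrupting a worker thread, the pool has to know whether it is idle; a worker that is processing a task should not be interrupted;
That is why Worker extends AQS: a worker locks itself while processing a task, and interruptIdleWorkers uses tryLock to test whether the worker is processing a task. If tryLock returns true, the worker is not running a task and may be interrupted.

Next, the interruptIdleWorkers method.

X. The interruptIdleWorkers method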

private void interruptIdleWorkers() {
    interruptIdleWorkers(false);
}
private void interruptIdleWorkers(boolean onlyOne) {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        for (Worker w : workers) {
            Thread t = w.thread;
            if (!t.isInterrupted() && w.tryLock()) {
                try {
                    t.interrupt();
                } catch (SecurityException ignore) {
                } finally {
                    w.unlock();
                }
            }
            if (onlyOne)
                break;
        }
    } finally {
        mainLock.unlock();
    }
}

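interruptIdleWorkers iterates over all worker threads in workers; if a thread has not been interrupted and tryLock succeeds, it interrupts that thread.

Why hold mainLock? Because workers is a HashSet, which is not thread-safe.

XI. The shutdownNow method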

public List<Runnable> shutdownNow() {
    List<Runnable> tasks;
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        checkShutdownAccess();
        advanceRunState(STOP);
        // Interrupt all worker threads, idle or not
        interruptWorkers();
        // Drain the tasks that have not been executed from the queue
        tasks = drainQueue();
    } finally {
        mainLock.unlock();
    }
    tryTerminate();
    return tasks;
}

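shutdownNow is similar to shutdown, with these differences:

It sets the state to STOP;
It interrupts all worker threads, idle or not;
It removes the tasks that have not been executed from the blocking queue and returns them.

After that, shutdownNow calls tryTerminate, analyzed above, whose goal is to move the pool to the TERMINATED state.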
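
These three differences can be checked with a single-threaded pool. The following is a small sketch with made-up names (ShutdownNowDemo, run): one long-running task occupies the only worker, two more tasks wait in the queue, and shutdownNow() is then called.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class ShutdownNowDemo {
    /** Returns {tasks drained from the queue, 1 if the running task was interrupted, 1 if terminated}. */
    static int[] run() {
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        pool.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000);            // stands in for a long-running task
            } catch (InterruptedException e) {
                interrupted.set(true);           // shutdownNow() interrupted this worker
            }
        });
        pool.execute(() -> { });                 // stays in the queue behind the first task
        pool.execute(() -> { });                 // stays in the queue behind the first task
        try {
            started.await();                     // make sure the first task is running
            List<Runnable> drained = pool.shutdownNow();
            boolean terminated = pool.awaitTermination(5, TimeUnit.SECONDS);
            return new int[] { drained.size(), interrupted.get() ? 1 : 0, terminated ? 1 : 0 };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
            return new int[] { -1, -1, -1 };
        }
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("drained=" + r[0] + " interrupted=" + r[1] + " terminated=" + r[2]);
    }
}
```

The interrupt only ends the first task because Thread.sleep responds to interruption; a task that ignores interrupts keeps running after shutdownNow(), and the pool does not reach TERMINATED until that task returns.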

Reference: 深入理解Java線程池：ThreadPoolExecutor
