記一次 HttpClient 死鎖問題


原文:http://blog.kail.xyz/post/2019-04-21/tools/httpclient-lock.html

 

最近遇到一個使用 Apache HttpClient 過程中的問題,具體場景是

  • 通過 Spring @Scheduled(cron = "..") 方式執行定時任務
  • 定時任務中並發使用 HttpClient 拉取數據
  • 但是定時任務只會執行一次
  • 因為 Spring 基於注解的定時任務,在非異步的情況的,上一次任務執行完之前不會執行下一個任務
  • 所以懷疑是第一次執行的任務一直沒有執行完,卡在了某個地方

還原場景

maven 依賴

1
2
3
4
5
6
7
8
9
10
11
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpcore</artifactId>
<version>4.4.6</version>
</dependency>

<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.5.3</version>
</dependency>

程序簡化后,代碼如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
package xyz.kail.demo.java.se.temp;

import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HttpClientMain {


public static void main(String[] args) throws IOException, InterruptedException {

int count = 20;

CountDownLatch countDownLatch = new CountDownLatch(count);

CloseableHttpClient httpClient = HttpClients.createDefault();

ExecutorService executorService = Executors.newFixedThreadPool(count);

for (int i = 0; i < count; i++) {

executorService.submit(() -> {
countDownLatch.countDown(); // line num: 32
try {
request(httpClient);
} catch (IOException | InterruptedException e) {
e.printStackTrace();
}
});

}

countDownLatch.await(); // line num: 42
System.out.println("countDownLatch.await();");

httpClient.close(); // line num: 45
System.out.println("httpClient.close();");

executorService.shutdown();
System.out.println("executorService.shutdown();");


}

private static void request(CloseableHttpClient client) throws IOException, InterruptedException {

HttpGet request = new HttpGet("http://blog.kail.xyz");
RequestConfig requestConfig = RequestConfig.custom()
.setConnectTimeout(1_000)
.setSocketTimeout(5_000)
.build();

request.setConfig(requestConfig);
try (CloseableHttpResponse response = client.execute(request)) { // line num: 63
String data = EntityUtils.toString(response.getEntity(), StandardCharsets.UTF_8);
} finally {
request.releaseConnection();
}
}
}

正常情況下,以上程序會輸出

countDownLatch.await();
httpClient.close();
executorService.shutdown();

但是多運行幾次,會發現有時候會只輸出 countDownLatch.await();,程序會卡在 httpClient.close();

查看線程信息

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
$ jcmd
...
51051 xyz.kail.demo.java.se.temp.HttpClientMain

$ jcmd 51051 Thread.print
...

"pool-1-thread-20" #30 prio=5 os_prio=31 tid=0x00007fbb5b22c000 nid=0x6803 waiting on condition [0x0000700005997000]
java.lang.Thread.State: WAITING (parking) # ❤❤❤❤ 關注 WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076c0e7488> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
# ❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤ 關注
at org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:377)
at org.apache.http.pool.AbstractConnPool.access$200(AbstractConnPool.java:67)
at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:243)
- locked <0x000000076de511b8> (a org.apache.http.pool.AbstractConnPool$2)
at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:191)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:282)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:269)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:191)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
# ❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤ 關注
at xyz.kail.demo.java.se.temp.HttpClientMain.request(HttpClientMain.java:63)
at xyz.kail.demo.java.se.temp.HttpClientMain.lambda$main$0(HttpClientMain.java:34)
at xyz.kail.demo.java.se.temp.HttpClientMain$$Lambda$2/2137589296.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

...

"main" #1 prio=5 os_prio=31 tid=0x00007fbb5a802000 nid=0x1903 waiting for monitor entry [0x0000700003732000]
java.lang.Thread.State: BLOCKED (on object monitor) # ❤❤❤❤ 關注 BLOCKED
# ❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤ 關注
at org.apache.http.pool.AbstractConnPool$2.cancel(AbstractConnPool.java:207)
- waiting to lock <0x000000076c88eb20> (a org.apache.http.pool.AbstractConnPool$2)
at org.apache.http.pool.RouteSpecificPool.shutdown(RouteSpecificPool.java:155)
at org.apache.http.pool.AbstractConnPool.shutdown(AbstractConnPool.java:152)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.shutdown(PoolingHttpClientConnectionManager.java:396)
at org.apache.http.impl.client.HttpClientBuilder$2.close(HttpClientBuilder.java:1225)
at org.apache.http.impl.client.InternalHttpClient.close(InternalHttpClient.java:201)
# ❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤ 關注
at xyz.kail.demo.java.se.temp.HttpClientMain.main(HttpClientMain.java:45)

...

回憶一下線程狀態

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
package java.lang;

...

public class Thread implements Runnable {

...

public enum State {
/**
* 新創建了一個線程對象,但還沒有調用start()方法
*/
NEW,

/**
* Thead.start
*/
RUNNABLE,

/**
* 等待/阻塞 獲取 synchronized 鎖
*/
BLOCKED,

/**
* Object.wait / Thread.join / LockSupport.park 未設置超時時間
*/
WAITING,

/**
* 以下的幾種情況
* <li>{@link #sleep Thread.sleep}</li>
* <li>{@link Object#wait(long) Object.wait} with timeout</li>
* <li>{@link #join(long) Thread.join} with timeout</li>
* <li>{@link LockSupport#parkNanos LockSupport.parkNanos}</li>
* <li>{@link LockSupport#parkUntil LockSupport.parkUntil}</li>
*/
TIMED_WAITING,

/**
* 線程已經執行完成
*/
TERMINATED;
}
}

Java線程的6種狀態及切換

根據 Thread.print 信息找到源碼位置

AbstractConnPool.java:377

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// 入口
// xyz.kail.demo.java.se.temp.HttpClientMain.request(HttpClientMain.java:63)
// at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
private E getPoolEntryBlocking(...){
...
this.lock.lock();
try {
...
} else {
... // 5️⃣ 線程2 wait,但是這時候線程1已經 BLOCKED
this.condition.await(); // line num: 377 [WAITING (parking)]
success = true;
}
...
} finally {
this.lock.unlock();
}
}
}

AbstractConnPool.java:207

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
@Override
public Future<E> lease(final T route, final Object state, final FutureCallback<E> callback) {

...

return new Future<E>() {

...
// 入口
// at xyz.kail.demo.java.se.temp.HttpClientMain.main(HttpClientMain.java:45)
// at org.apache.http.impl.client.InternalHttpClient.close(InternalHttpClient.java:201)
@Override
public boolean cancel(final boolean mayInterruptIfRunning) {
cancelled = true;
lock.lock();
try {
condition.signalAll(); // 1️⃣ 線程1
} finally {
lock.unlock();
}
// 3️⃣ 線程1 這時候線程2已經獲取到鎖,這里 BLOCKED
synchronized (this) { // line num 207 BLOCKED (on object monitor)
...
}
}

...

@Override
public E get(final long timeout, final TimeUnit tunit) throws InterruptedException, ExecutionException, TimeoutException {
...
synchronized (this) { // 2️⃣ 線程2 獲取鎖
... // 4️⃣ 線程2 執行 getPoolEntryBlocking 方法
final E leasedEntry = getPoolEntryBlocking(...); // 調用 377 行的代碼
...
}
}

};
}

可能原因分析

根據調用入口 大致可以確定 是 close 釋放 HttpClient 資源的時候 和 execute 請求獲取資源的時候 產生了死鎖。

模擬可能的執行流程如下:

  • 線程1:condition.signalAll()
  • 線程2: 獲取 this 鎖
  • 線程1:獲取 this 鎖 失敗,BLOCKED
  • 線程2:執行 getPoolEntryBlocking 方法
    • 線程2:condition.wait (WAITING (parking))
  • 最終兩個線程 在互相等待對方釋放鎖/喚醒,產生死鎖

分析到這基本上可以確定 應該是 httpcore 中 org.apache.http.pool.AbstractConnPool 這個類的Bug

如何解決

如果是 HttpClient (httpcore 模塊) 的 Bug,可以看一下官方有沒有修復,到 Github 官方倉庫 httpcomponents-core 找到指定的文件 org/apache/http/pool/AbstractConnPool.java 查看 提交歷史, Ctrl + F 搜索 關鍵字 fix,最終找到了這次提交 HTTPCORE-446: fixed deadlock in AbstractConnPool shutdown code

點擊這次提交 右側的 <> 按鈕(Browse the repository at this point in the history) 查看這次提交后的 git 倉庫,發現修復之后 httpcore 的版本是 4.4.7-SNAPSHOT 。

升級到 httpcore maven 版本到 4.4.7+ 后重試最初的代碼,發現死鎖問題已經解決,但是會拋出以下異常:

1
2
3
4
5
6
7
8
9
10
11
12
org.apache.http.impl.execchain.RequestAbortedException: Request aborted
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:194)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
...
Caused by: java.lang.InterruptedException: Operation interrupted
at org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:384)
...

如何避免

  • 多線程並發使用共享資源的時候,如果不了解共享資源的內部機制,不了解是否存在並發問題的時候,一定要小心,如果不分析源碼,最好也上網查一下相關的問題,如:”httpclient 並發問題” 等
  • CountDownLatch 的使用方式也存在問題,比如這個示例程序中,countDownLatch.countDown() 寫在了線程執行邏輯的第一行,真正的邏輯還沒開始執行,就已經 countDown,實際上並沒有起到相應的作用
  • 如果確定共享資源存在並發問題,並且不確定官方有沒有提供相應的解決方案的話,最快但不是最好的方式是:把共享資源放到線程內作為線程內部的資源,避免並發問題

其它收獲


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM