Reference: https://www.jianshu.com/p/3ef261ab157c
Reference: https://www.jianshu.com/p/89033630ab7a
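Discovering the problem
During development we noticed that a network request could leave the Loading indicator on screen indefinitely, even though we had set a 30s request timeout when initializing OkHttp. How could that happen? It turned out to be a pitfall in OkHttp's retry mechanism.
Analysis of the OkHttp retry mechanism
OkHttp can retry when a network connection fails: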
OkHttp perseveres when the network is troublesome: it will silently recover from common connection problems. If your service has multiple IP addresses OkHttp will attempt alternate addresses if the first connect fails. This is necessary for IPv4+IPv6 and for services hosted in redundant data centers. OkHttp initiates new connections with modern TLS features (SNI, ALPN), and falls back to TLS 1.0 if the handshake fails.
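To understand OkHttp's retry mechanism, the class we care about most is `RetryAndFollowUpInterceptor`. When a network exception occurs, all of OkHttp's retry handling for network failures happens in `RetryAndFollowUpInterceptor`. Let's start with its `intercept(Chain chain)` method: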
```java
public Response intercept(Chain chain) throws IOException {
    Request request = chain.request();
    this.streamAllocation = new StreamAllocation(this.client.connectionPool(), this.createAddress(request.url()));
    int followUpCount = 0;
    Response priorResponse = null;
    // while loop
    while (!this.canceled) {
        Response response = null;
        boolean releaseConnection = true;

        try {
            response = ((RealInterceptorChain) chain).proceed(request, this.streamAllocation, (HttpStream) null, (Connection) null);
            releaseConnection = false;
        } catch (RouteException var12) {
            if (!this.recover(var12.getLastConnectException(), true, request)) {
                throw var12.getLastConnectException();
            }

            releaseConnection = false;
            continue;
        } catch (IOException var13) {
            if (!this.recover(var13, false, request)) {
                throw var13;
            }

            releaseConnection = false;
            continue;
        } finally {
            if (releaseConnection) {
                this.streamAllocation.streamFailed((IOException) null);
                this.streamAllocation.release();
            }
        }

        if (priorResponse != null) {
            response = response.newBuilder().priorResponse(priorResponse.newBuilder().body((ResponseBody) null).build()).build();
        }

        Request followUp = this.followUpRequest(response);
        if (followUp == null) {
            if (!this.forWebSocket) {
                this.streamAllocation.release();
            }

            return response;
        }

        Util.closeQuietly(response.body());
        ++followUpCount;
        if (followUpCount > 20) {
            this.streamAllocation.release();
            throw new ProtocolException("Too many follow-up requests: " + followUpCount);
        }

        if (followUp.body() instanceof UnrepeatableRequestBody) {
            throw new HttpRetryException("Cannot retry streamed HTTP body", response.code());
        }

        if (!this.sameConnection(response, followUp.url())) {
            this.streamAllocation.release();
            this.streamAllocation = new StreamAllocation(this.client.connectionPool(), this.createAddress(followUp.url()));
        } else if (this.streamAllocation.stream() != null) {
            throw new IllegalStateException("Closing the body of " + response + " didn't close its backing stream. Bad interceptor?");
        }

        request = followUp;
        priorResponse = response;
    }

    this.streamAllocation.release();
    throw new IOException("Canceled");
}
```
With the non-core logic stripped out of the snippet:
```java
// StreamAllocation init...
Response priorResponse = null;
while (true) {
    if (canceled) {
        streamAllocation.release();
        throw new IOException("Canceled");
    }

    Response response;
    boolean releaseConnection = true;
    try {
        response = realChain.proceed(request, streamAllocation, null, null);
        releaseConnection = false;
    } catch (RouteException e) {
        // During the socket connect phase, any connection failure is wrapped in this exception and thrown.
        // RouteException: the attempt via this route failed and the request will not have been sent;
        // recover() is called to try to recover.
        // The attempt to connect via a route failed. The request will not have been sent.
        if (!recover(e.getLastConnectException(), false, request)) {
            throw e.getLastConnectException();
        }
        releaseConnection = false;
        continue;
    } catch (IOException e) {
        // Network exceptions thrown during the request phase, after the socket has connected.
        // An attempt to communicate with a server failed. The request may have been sent.
        boolean requestSendStarted = !(e instanceof ConnectionShutdownException);
        if (!recover(e, requestSendStarted, request)) throw e;
        releaseConnection = false;
        continue;
    } finally {
        // We're throwing an unchecked exception. Release any resources.
        if (releaseConnection) {
            streamAllocation.streamFailed(null);
            streamAllocation.release();
        }
    }
    // ... follow-up handling omitted ...
}
```
So the while loop simply keeps running: when a network request fails, OkHttp sends it again, and ends up repeatedly executing this branch:
```java
catch (IOException var13) {
    if (!this.recover(var13, false, request)) {
        throw var13;
    }

    releaseConnection = false;
    continue;
}
```
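To see why a per-attempt timeout does not bound the whole call, here is a minimal JDK-only model of that loop. It is not OkHttp code: `RetryLoopModel`, `Attempt` and `Recovery` are hypothetical names, the 30s figure comes from the setup described earlier, and the timeout is simulated by counting attempts rather than by actually waiting.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

/** Toy model of a retry loop; every name here is hypothetical, not an OkHttp API. */
final class RetryLoopModel {
    /** One network attempt; throws when the attempt fails. */
    interface Attempt {
        String run() throws IOException;
    }

    /** Decides whether a failure may be retried (stands in for recover()). */
    interface Recovery {
        boolean canRecover(IOException e);
    }

    /** Number of attempts made by the most recent call(). */
    static int attempts;

    static String call(Attempt attempt, Recovery recovery) throws IOException {
        attempts = 0;
        while (true) {
            attempts++;
            try {
                return attempt.run();
            } catch (IOException e) {
                // Same shape as the interceptor: give up only when recovery says no.
                if (!recovery.canRecover(e)) throw e;
            }
        }
    }

    public static void main(String[] args) {
        final int timeoutSeconds = 30;
        final int[] routesLeft = {3}; // assumption: three alternative routes remain
        try {
            call(
                () -> { throw new SocketTimeoutException("connect timed out"); },
                e -> routesLeft[0]-- > 0);
        } catch (IOException e) {
            // Each attempt waited for its own timeout before failing.
            System.out.println(attempts + " attempts, about "
                + attempts * timeoutSeconds + "s in total");
        }
    }
}
```

The point of the sketch is only that the waiting time scales with the number of attempts; as long as the recovery check keeps answering yes, the caller keeps waiting.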
Next, let's look at the core recover method:
```java
/**
 * Report and attempt to recover from a failure to communicate with a server. Returns true if
 * {@code e} is recoverable, or false if the failure is permanent. Requests with a body can only
 * be recovered if the body is buffered or if the failure occurred before the request has been
 * sent.
 */
private boolean recover(IOException e, boolean requestSendStarted, Request userRequest) {
    streamAllocation.streamFailed(e);

    // The application layer has forbidden retries, so don't retry.
    if (!client.retryOnConnectionFailure()) return false;

    // We can't send the request body again: the request has already been sent
    // and its body does not support being resent.
    if (requestSendStarted && userRequest.body() instanceof UnrepeatableRequestBody) return false;

    // This exception is fatal.
    if (!isRecoverable(e, requestSendStarted)) return false;

    // No more routes to attempt.
    if (!streamAllocation.hasMoreRoutes()) return false;

    // For failure recovery, use the same route selector with a new connection.
    return true;
}
```
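The method first calls `streamAllocation.streamFailed(e)` to record the failure, which in turn records the failed route in `RouteDatabase` so that it is deprioritized and the next request to the same address avoids the route that just failed. If no other routes are available, the connection cannot be retried.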
```java
public final class RouteDatabase {
    private final Set<Route> failedRoutes = new LinkedHashSet<>();

    /** Records a failure connecting to {@code failedRoute}. */
    public synchronized void failed(Route failedRoute) {
        failedRoutes.add(failedRoute);
    }

    /** Records success connecting to {@code route}. */
    public synchronized void connected(Route route) {
        failedRoutes.remove(route);
    }

    /** Returns true if {@code route} has failed recently and should be avoided. */
    public synchronized boolean shouldPostpone(Route route) {
        return failedRoutes.contains(route);
    }
}
```
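As an illustration of what "lowering the priority" of a failed route can look like, here is a simplified JDK-only model. It is a sketch under assumptions, not OkHttp's actual route selection: routes are plain strings, and `RouteOrderingSketch` and its `order` method are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Simplified model of route postponement; routes are plain strings here. */
final class RouteOrderingSketch {
    private final Set<String> failedRoutes = new LinkedHashSet<>();

    synchronized void failed(String route) {
        failedRoutes.add(route);
    }

    synchronized void connected(String route) {
        failedRoutes.remove(route);
    }

    synchronized boolean shouldPostpone(String route) {
        return failedRoutes.contains(route);
    }

    /** Returns the candidates with recently failed routes moved to the end. */
    List<String> order(List<String> candidates) {
        List<String> preferred = new ArrayList<>();
        List<String> postponed = new ArrayList<>();
        for (String route : candidates) {
            if (shouldPostpone(route)) {
                postponed.add(route);
            } else {
                preferred.add(route);
            }
        }
        preferred.addAll(postponed); // failed routes are still tried, just last
        return preferred;
    }

    public static void main(String[] args) {
        RouteOrderingSketch db = new RouteOrderingSketch();
        List<String> routes = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");
        db.failed("10.0.0.1");
        System.out.println(db.order(routes));
        db.connected("10.0.0.1"); // a later success clears the failure record
        System.out.println(db.order(routes));
    }
}
```

Postponing rather than dropping a failed route keeps it available as a last resort when every other route also fails.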
Next, let's focus on the isRecoverable method:
```java
private boolean isRecoverable(IOException e, boolean requestSendStarted) {
    // If there was a protocol problem, don't recover.
    if (e instanceof ProtocolException) {
        return false;
    }

    // If there was an interruption don't recover, but if there was a timeout connecting to a route
    // we should try the next route (if there is one)
    if (e instanceof InterruptedIOException) {
        return e instanceof SocketTimeoutException && !requestSendStarted;
    }

    // Look for known client-side or negotiation errors that are unlikely to be fixed by trying
    // again with a different route.
    if (e instanceof SSLHandshakeException) {
        // If the problem was a CertificateException from the X509TrustManager,
        // do not retry.
        if (e.getCause() instanceof CertificateException) {
            return false;
        }
    }
    // HostnameVerifier checks whether the host is valid; if it is not, SSLPeerUnverifiedException is thrown.
    // This exception is thrown from Handshake#getSession, which is part of the handshake process.
    if (e instanceof SSLPeerUnverifiedException) {
        // e.g. a certificate pinning error.
        return false;
    }

    // An example of one we might want to retry with a different route is a problem connecting to a
    // proxy and would manifest as a standard IOException. Unless it is one we know we should not
    // retry, we return true and try a new route.
    return true;
}
```
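To experiment with these rules outside of OkHttp, the checks can be copied into a standalone class that depends only on the JDK. The class name `RecoverableCheck` and the sample exceptions in `main` are assumptions made for illustration; the logic mirrors the method shown above.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.ConnectException;
import java.net.ProtocolException;
import java.net.SocketTimeoutException;
import java.security.cert.CertificateException;
import javax.net.ssl.SSLHandshakeException;
import javax.net.ssl.SSLPeerUnverifiedException;

/** Standalone copy of the classification rules, for experimentation only. */
final class RecoverableCheck {
    static boolean isRecoverable(IOException e, boolean requestSendStarted) {
        // Protocol problems are never retried.
        if (e instanceof ProtocolException) return false;
        // Interruptions are not retried, except a timeout before the request was sent.
        if (e instanceof InterruptedIOException) {
            return e instanceof SocketTimeoutException && !requestSendStarted;
        }
        // An untrusted certificate will not be fixed by another route.
        if (e instanceof SSLHandshakeException
                && e.getCause() instanceof CertificateException) {
            return false;
        }
        // Hostname verification or pinning failures are not retried.
        if (e instanceof SSLPeerUnverifiedException) return false;
        return true;
    }

    public static void main(String[] args) {
        SSLHandshakeException badCert = new SSLHandshakeException("untrusted");
        badCert.initCause(new CertificateException("no trust anchor"));

        // Timeout while connecting (request not yet sent): retried.
        System.out.println(isRecoverable(new SocketTimeoutException("connect"), false));
        // Timeout while reading (request already sent): not retried.
        System.out.println(isRecoverable(new SocketTimeoutException("read"), true));
        // Certificate rejected by the trust manager: not retried.
        System.out.println(isRecoverable(badCert, false));
        // Plain connection failure: retried on another route.
        System.out.println(isRecoverable(new ConnectException("refused"), false));
    }
}
```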
Solving the problem
You can turn off OkHttp's retries by making retryOnConnectionFailure return false:
```java
sClient = builder.retryOnConnectionFailure(false).build();
```
Update
This issue was fixed in version 3.4.2:
https://github.com/square/okhttp/issues/2756
Analysis of common network exceptions:
UnknownHostException
Causes:
- Network outage
- DNS server failure
- DNS hijacking
Solutions:
- HttpDNS
- A sensible fallback strategy
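One possible shape for such a fallback strategy is sketched below using only the JDK. Everything in it is an assumption made for illustration: `FallbackResolver` and `Resolver` are hypothetical names, `api.example.com` is a placeholder host, and `192.0.2.1` is a documentation-only address. In an OkHttp client, a resolver like this could be adapted to the library's `Dns` interface.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Resolver that falls back to a preconfigured address table when lookup fails. */
final class FallbackResolver {
    /** Hypothetical lookup abstraction so the primary source can be swapped out. */
    interface Resolver {
        List<InetAddress> lookup(String host) throws UnknownHostException;
    }

    private final Resolver primary;
    private final Map<String, List<InetAddress>> fallback;

    FallbackResolver(Resolver primary, Map<String, List<InetAddress>> fallback) {
        this.primary = primary;
        this.fallback = fallback;
    }

    /** Resolver backed by the platform's normal DNS lookup. */
    static Resolver system() {
        return host -> Arrays.asList(InetAddress.getAllByName(host));
    }

    List<InetAddress> lookup(String host) throws UnknownHostException {
        try {
            return primary.lookup(host);
        } catch (UnknownHostException e) {
            // Primary lookup failed: use known-good addresses if we have any.
            List<InetAddress> known = fallback.get(host);
            if (known == null || known.isEmpty()) throw e;
            return known;
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        Map<String, List<InetAddress>> table = new HashMap<>();
        table.put("api.example.com", Collections.singletonList(
            InetAddress.getByAddress(new byte[] {(byte) 192, 0, 2, 1})));
        // Simulate a broken DNS server with a primary resolver that always fails.
        FallbackResolver resolver = new FallbackResolver(
            host -> { throw new UnknownHostException(host); }, table);
        System.out.println(resolver.lookup("api.example.com"));
    }
}
```

A hard-coded table goes stale when servers move, so in practice the fallback addresses would need to be refreshed from a trusted source such as an HttpDNS service.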
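InterruptedIOException
Causes:
- The request thread was interrupted during the read/write phase of the request
Solutions:
- Check whether the interruption is consistent with the business logic
SocketTimeoutException
Causes:
- Low bandwidth, high latency
- Congested network path, overloaded server
- Transient failure at a routing node
Solutions:
- Configure retries sensibly
- Retry against a different IP
Take special note: OkHttp does not internally retry a SocketTimeoutException caused by a read or write timeout during the request.
So if the app layer particularly cares about this exception, it should define custom interceptors to handle it.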
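The core of such handling is a bounded retry around the call. The sketch below shows that logic with JDK types only; `TimeoutRetry`, `Call` and `withRetries` are hypothetical names, and in a real OkHttp interceptor the retried operation would be `chain.proceed(chain.request())` inside `Interceptor.intercept`. It assumes the request is safe to repeat, which is generally true only for idempotent requests.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

/** Bounded retry for operations that fail with a socket timeout. */
final class TimeoutRetry {
    /** Hypothetical stand-in for the operation being retried. */
    interface Call<T> {
        T execute() throws IOException;
    }

    /**
     * Runs the call up to maxAttempts times, retrying only on SocketTimeoutException.
     * Any other IOException is propagated immediately.
     */
    static <T> T withRetries(Call<T> call, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SocketTimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.execute();
            } catch (SocketTimeoutException e) {
                last = e; // timed out: try again while attempts remain
            }
        }
        throw last; // every attempt timed out
    }

    public static void main(String[] args) throws IOException {
        final int[] calls = {0};
        String result = withRetries(() -> {
            // Simulated server: the first two attempts time out.
            if (++calls[0] < 3) throw new SocketTimeoutException("read timed out");
            return "ok";
        }, 3);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Capping the number of attempts keeps the worst-case wait predictable, which is exactly what the unbounded loop described earlier lacked.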
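SSLHandshakeException
Causes:
- TLS protocol negotiation failed / incompatible handshake format
- The CA that issued the server certificate is unknown
- The server certificate is self-signed rather than signed by a CA
- The server configuration is missing an intermediate CA (incomplete certificate chain)
- Server hostname mismatch (SNI)
- A man-in-the-middle attack
Solutions:
- Specify the SNI
- Certificate pinning
- Downgrade to HTTP...
- Contact the SA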
SSLPeerUnverifiedException
Causes:
- Certificate hostname verification failed
Solutions:
- Specify the SNI
- Certificate pinning
- Downgrade to HTTP...
- Contact the SA