HttpClient is an HTTP/1.1 compliant HTTP agent implementation based on HttpCore.
It also provides reusable components for client-side authentication, HTTP state management, and HTTP connection management.
http://hc.apache.org/index.html
Why use an HttpClient connection pool?
The main benefits of a connection pool:
Within the keep-alive window, the same TCP connection can be used for multiple HTTP requests.
Without a pool, under heavy concurrency every request opens a new port; system resources are exhausted quickly and no new connections can be established. A pool caps how many ports (connections) can be open at once.
My understanding is that the pool size is simply how many TCP connections may exist at the same time. HttpClient maintains two sets, leased (connections currently in use) and available (connections ready for reuse); releasing a connection moves it from the leased set back into the available set.
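For illustration, a minimal sketch (the class name and setup are mine) that prints those two sets through the pool statistics API:

import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
import org.apache.http.pool.PoolStats;

public class PoolStatsDemo {
    public static void main(String[] args) {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        // getTotalStats() reports the two sets described above:
        // leased = connections currently in use, available = idle connections ready for reuse.
        PoolStats stats = cm.getTotalStats();
        System.out.println("leased=" + stats.getLeased()
                + ", available=" + stats.getAvailable()
                + ", pending=" + stats.getPending()   // threads waiting for a free connection
                + ", max=" + stats.getMax());         // pool capacity
    }
}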
What is Keep-Alive?
HTTP/1.1 enables Keep-Alive by default. When a client (such as a browser) talks to an HTTP server (Tomcat/Nginx/Apache), it opens a TCP connection, which costs a three-way handshake to establish and a four-way handshake to close; within that one connection, several HTTP requests can be issued.
Without Keep-Alive, every HTTP request would need its own TCP connection.
The response headers returned by an Apache httpd server show the connection mode Keep-Alive, together with a Keep-Alive header.
timeout=5 means that if no new HTTP request arrives within 5 seconds, the server closes the TCP connection; every new request resets the 5-second countdown.
max=100 means at most 100 HTTP requests may be sent over this TCP connection; after the 100th request the server closes the connection even if a new request arrives within the timeout window.
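For illustration (the original figure is not reproduced here, and the exact values depend on the server), such a response might carry headers like these:

HTTP/1.1 200 OK
Connection: Keep-Alive
Keep-Alive: timeout=5, max=100
Content-Length: 1234
Content-Type: text/html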
After calling the following method repeatedly on the same httpClient, the connections were never released, which eventually caused the problem below:
Solutions:
Option 1: close the connection after every use of the HttpClient.
Option 2: use an HttpClient connection pool, namely PoolingHttpClientConnectionManager.
"pool-3-thread-4@10651" prio=5 tid=0x4f nid=NA waiting
java.lang.Thread.State: WAITING
    at sun.misc.Unsafe.park(Unsafe.java:-1)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:377)
    at org.apache.http.pool.AbstractConnPool.access$200(AbstractConnPool.java:67)
    at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:243)
    - locked <0x2d1c> (a org.apache.http.pool.AbstractConnPool$2)
    at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:191)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:282)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:269)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:191)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
The cause of the error shown below: a connection pool was in use, but there were not enough connections, so threads piled up waiting. Once the wait exceeds the connectionRequestTimeout in RequestConfig, the exception below is thrown.
If connectionRequestTimeout (the maximum time to wait for a connection when the pool is exhausted) is not set, threads simply block when the pool runs out of connections. Always set it; once set, threads no longer block and an exception is thrown instead.
connectionRequestTimeout: the timeout for obtaining a connection from the pool; if no connection becomes available in time, org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool is thrown.
connectTimeout: the time allowed to connect to the server (complete the handshake); exceeding it raises a connect timeout.
socketTimeout: the time allowed for the server to return data (the response); exceeding it raises a read timeout.
Stepping through with a debugger shows that, when we do not specify a connection manager, HttpClients uses the pooling one by default, org.apache.http.impl.conn.PoolingHttpClientConnectionManager.PoolingHttpClientConnectionManager(Registry<ConnectionSocketFactory>),
so we also need to configure the timeout for obtaining a connection from the pool.
If these three timeout parameters are left unconfigured, they default to -1, which means infinite, i.e. block and wait forever!
Source: 簡書, https://www.jianshu.com/p/f38a62efaa96
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:292) ~[httpclient-4.5.3.jar!/:4.5.3]
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:269) ~[httpclient-4.5.3.jar!/:4.5.3]
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:191) ~[httpclient-4.5.3.jar!/:4.5.3]
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[httpclient-4.5.3.jar!/:4.5.3]
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[httpclient-4.5.3.jar!/:4.5.3]
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111) ~[httpclient-4.5.3.jar!/:4.5.3]
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.3.jar!/:4.5.3]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[httpclient-4.5.3.jar!/:4.5.3]
    at com.hujiang.career.databank.spider.download.CustomHttpClientDownloader.download(CustomHttpClientDownloader.java:88) ~[classes!/:1.0.0]
    at us.codecraft.webmagic.Spider.processRequest(Spider.java:404) [webmagic-core-0.7.3.jar!/:na]
    at us.codecraft.webmagic.Spider.access$000(Spider.java:61) [webmagic-core-0.7.3.jar!/:na]
    at us.codecraft.webmagic.Spider$1.run(Spider.java:320) [webmagic-core-0.7.3.jar!/:na]
    at us.codecraft.webmagic.thread.CountableThreadPool$1.run(CountableThreadPool.java:74) [webmagic-core-0.7.3.jar!/:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.8.0_144]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.8.0_144]
    at java.lang.Thread.run(Unknown Source) [na:1.8.0_144]
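For reference, a minimal sketch of my own (the timeout values are only examples) that sets all three timeouts explicitly, so the exception above is thrown instead of blocking forever:

import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class TimeoutConfigDemo {
    public static void main(String[] args) {
        // All three timeouts default to -1 (wait forever), so set them explicitly.
        RequestConfig requestConfig = RequestConfig.custom()
                .setConnectionRequestTimeout(2 * 1000) // max wait for a connection from the pool
                .setConnectTimeout(2 * 1000)           // max wait for the TCP handshake
                .setSocketTimeout(5 * 1000)            // max wait for response data
                .build();
        CloseableHttpClient client = HttpClients.custom()
                .setDefaultRequestConfig(requestConfig)
                .build();
    }
}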
Compared with BasicHttpClientConnectionManager, PoolingHttpClientConnectionManager is a more sophisticated class: it manages a pool of connections and can serve connection requests from many threads at the same time.
Connections are pooled on a per route basis.
When a new connection is requested and the pool already holds an available persistent connection for that route, the connection manager reuses it instead of creating a new one.
《億級流量網站構架核心技術》 P236
import org.apache.http.client.config.RequestConfig;
import org.apache.http.config.Registry;
import org.apache.http.config.RegistryBuilder;
import org.apache.http.config.SocketConfig;
import org.apache.http.conn.DnsResolver;
import org.apache.http.conn.HttpConnectionFactory;
import org.apache.http.conn.ManagedHttpClientConnection;
import org.apache.http.conn.routing.HttpRoute;
import org.apache.http.conn.socket.ConnectionSocketFactory;
import org.apache.http.conn.socket.PlainConnectionSocketFactory;
import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
import org.apache.http.impl.DefaultConnectionReuseStrategy;
import org.apache.http.impl.client.*;
import org.apache.http.impl.conn.DefaultHttpResponseParserFactory;
import org.apache.http.impl.conn.ManagedHttpClientConnectionFactory;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
import org.apache.http.impl.conn.SystemDefaultDnsResolver;
import org.apache.http.impl.io.DefaultHttpRequestWriterFactory;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.ClientHttpRequestFactory;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

import java.io.IOException;
import java.util.concurrent.TimeUnit;

/**
 * Created by tang.cheng on 2016/10/19.
 */
@Configuration
public class RestTemplateConfig {

    private static final Logger LOGGER = LoggerFactory.getLogger(RestTemplateConfig.class);

    @Bean
    public RestTemplate restTemplate() {
        ClientHttpRequestFactory requestFactory = new HttpComponentsClientHttpRequestFactory(getHttpClient());
        return new RestTemplate(requestFactory);
    }

    // Classic connection-pool configuration for HttpClient 4.5.2
    private CloseableHttpClient getHttpClient() {
        // Register the socket factories for the supported protocols
        Registry<ConnectionSocketFactory> socketFactoryRegistry = RegistryBuilder.<ConnectionSocketFactory>create()
                .register("http", PlainConnectionSocketFactory.INSTANCE)
                .register("https", SSLConnectionSocketFactory.getSocketFactory())
                .build();
        // HttpConnectionFactory: request writer / response parser
        HttpConnectionFactory<HttpRoute, ManagedHttpClientConnection> connectionFactory = new ManagedHttpClientConnectionFactory(
                DefaultHttpRequestWriterFactory.INSTANCE,
                DefaultHttpResponseParserFactory.INSTANCE
        );
        // DNS resolver
        DnsResolver dnsResolver = SystemDefaultDnsResolver.INSTANCE;
        // Create the pooling connection manager
        PoolingHttpClientConnectionManager manager = new PoolingHttpClientConnectionManager(socketFactoryRegistry, connectionFactory, dnsResolver);
        // Default socket options
        manager.setDefaultSocketConfig(SocketConfig.custom().setTcpNoDelay(true).build());
        manager.setMaxTotal(300); // maximum total connections; above this, new lease requests block and queue
        // Per-route limits subdivide MaxTotal.
        // The actual per-route maximum defaults to DefaultMaxPerRoute.
        // If MaxPerRoute is too small, high concurrency cannot be supported:
        // ConnectionPoolTimeoutException: Timeout waiting for connection from pool
        manager.setDefaultMaxPerRoute(200); // maximum connections per route
        manager.setValidateAfterInactivity(5 * 1000); // validate connections idle longer than this before reuse; default is 2s
        // Default request parameters
        RequestConfig defaultRequestConfig = RequestConfig.custom()
                .setConnectTimeout(2 * 1000)            // connect timeout: 2s
                .setSocketTimeout(5 * 1000)             // socket (read) timeout: 5s
                .setConnectionRequestTimeout(2 * 1000)  // timeout waiting for a connection from the pool: 2s
                // .setProxy(new HttpHost("192.168.0.2", 1234)) // set a proxy if needed
                .build();
        CloseableHttpClient closeableHttpClient = HttpClients.custom()
                .setConnectionManager(manager)
                .setConnectionManagerShared(false) // the pool is not shared with other HttpClient instances
                .evictIdleConnections(60, TimeUnit.SECONDS)    // periodically evict idle connections
                .evictExpiredConnections()                     // evict expired connections
                .setConnectionTimeToLive(60, TimeUnit.SECONDS) // connection TTL; if unset, the keep-alive response headers decide
                .setDefaultRequestConfig(defaultRequestConfig) // default request parameters
                .setConnectionReuseStrategy(DefaultConnectionReuseStrategy.INSTANCE) // reuse strategy, i.e. whether keep-alive is possible
                .setKeepAliveStrategy(DefaultConnectionKeepAliveStrategy.INSTANCE)   // keep-alive strategy: how long a persistent connection lives
                .setRetryHandler(new DefaultHttpRequestRetryHandler(0, false))       // retry count (default is 3); disabled here
                .build();
        /*
         * Close the pool and release its connections when the JVM stops or restarts.
         */
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() {
                try {
                    LOGGER.info("closing http client");
                    closeableHttpClient.close();
                    LOGGER.info("http client closed");
                } catch (IOException e) {
                    LOGGER.error(e.getMessage(), e);
                }
            }
        });
        return closeableHttpClient;
    }
}
1.1.5. Ensuring release of low level resources
In order to ensure proper release of system resources one must close either the content stream associated with the entity or the response itself
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = httpclient.execute(httpget);
try {
    HttpEntity entity = response.getEntity();
    if (entity != null) {
        InputStream instream = entity.getContent();
        try {
            // do something useful
        } finally {
            instream.close();
        }
    }
} finally {
    response.close();
}
The difference between closing the content stream and closing the response is that the former will attempt to keep the underlying connection alive by consuming the entity content while the latter immediately shuts down and discards the connection.
closing the content stream: closes the content stream and tries to keep the underlying connection alive by consuming the remaining entity content
closing the response: immediately shuts down and discards the connection
Please note that the HttpEntity#writeTo(OutputStream) method is also required to ensure proper release of system resources once the entity has been fully written out. If this method obtains an instance of java.io.InputStream by calling HttpEntity#getContent(), it is also expected to close the stream in a finally clause.
When working with streaming entities, one can use the EntityUtils#consume(HttpEntity) method to ensure that the entity content has been fully consumed and the underlying stream has been closed.
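A minimal sketch of that pattern (the URL is a placeholder):

import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class ConsumeEntityDemo {
    public static void main(String[] args) throws Exception {
        CloseableHttpClient httpclient = HttpClients.createDefault();
        CloseableHttpResponse response = httpclient.execute(new HttpGet("http://localhost/"));
        try {
            HttpEntity entity = response.getEntity();
            // EntityUtils.consume() reads any remaining content and closes the stream,
            // so the underlying connection can stay alive and go back to the pool.
            EntityUtils.consume(entity);
        } finally {
            response.close();
        }
        httpclient.close();
    }
}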
There can be situations, however, when only a small portion of the entire response content needs to be retrieved and the performance penalty for consuming the remaining content and making the connection reusable is too high, in which case one can terminate the content stream by closing the response.
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = httpclient.execute(httpget);
try {
    HttpEntity entity = response.getEntity();
    if (entity != null) {
        InputStream instream = entity.getContent();
        int byteOne = instream.read();
        int byteTwo = instream.read();
        // Do not need the rest
    }
} finally {
    response.close();
}
try (CloseableHttpResponse res = HttpClientUtil.getHttpClient().execute(post)) {
    // parse the response body
    response = JSON.parseObject(EntityUtils.toString(res.getEntity()), UploadMediaResponse.class);
} catch (Exception e) {
    log.error(e.getMessage(), e);
}
The connection will not be reused, but all level resources held by it will be correctly deallocated.
https://hc.apache.org/httpcomponents-client-ga/tutorial/html/fundamentals.html#d5e206
HttpClient parameter configuration: https://segmentfault.com/a/1190000010771138
http://hc.apache.org/httpcomponents-client-ga/examples.html
/*
 * http://hc.apache.org/httpcomponents-client-4.5.x/httpclient/examples/org/apache/http/examples/client/ClientConfiguration.java
 */
package org.apache.http.examples.client;

import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.charset.CodingErrorAction;
import java.util.Arrays;

import javax.net.ssl.SSLContext;

import org.apache.http.Consts;
import org.apache.http.Header;
import org.apache.http.HttpHost;
import org.apache.http.HttpRequest;
import org.apache.http.HttpResponse;
import org.apache.http.ParseException;
import org.apache.http.client.CookieStore;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.client.config.AuthSchemes;
import org.apache.http.client.config.CookieSpecs;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.protocol.HttpClientContext;
import org.apache.http.config.ConnectionConfig;
import org.apache.http.config.MessageConstraints;
import org.apache.http.config.Registry;
import org.apache.http.config.RegistryBuilder;
import org.apache.http.config.SocketConfig;
import org.apache.http.conn.DnsResolver;
import org.apache.http.conn.HttpConnectionFactory;
import org.apache.http.conn.ManagedHttpClientConnection;
import org.apache.http.conn.routing.HttpRoute;
import org.apache.http.conn.socket.ConnectionSocketFactory;
import org.apache.http.conn.socket.PlainConnectionSocketFactory;
import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
import org.apache.http.impl.DefaultHttpResponseFactory;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.DefaultHttpResponseParser;
import org.apache.http.impl.conn.DefaultHttpResponseParserFactory;
import org.apache.http.impl.conn.ManagedHttpClientConnectionFactory;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
import org.apache.http.impl.conn.SystemDefaultDnsResolver;
import org.apache.http.impl.io.DefaultHttpRequestWriterFactory;
import org.apache.http.io.HttpMessageParser;
import org.apache.http.io.HttpMessageParserFactory;
import org.apache.http.io.HttpMessageWriterFactory;
import org.apache.http.io.SessionInputBuffer;
import org.apache.http.message.BasicHeader;
import org.apache.http.message.BasicLineParser;
import org.apache.http.message.LineParser;
import org.apache.http.ssl.SSLContexts;
import org.apache.http.util.CharArrayBuffer;
import org.apache.http.util.EntityUtils;

/**
 * This example demonstrates how to customize and configure the most common aspects
 * of HTTP request execution and connection management.
 */
public class ClientConfiguration {

    public final static void main(String[] args) throws Exception {

        // Use custom message parser / writer to customize the way HTTP
        // messages are parsed from and written out to the data stream.
        HttpMessageParserFactory<HttpResponse> responseParserFactory = new DefaultHttpResponseParserFactory() {

            @Override
            public HttpMessageParser<HttpResponse> create(
                    SessionInputBuffer buffer, MessageConstraints constraints) {
                LineParser lineParser = new BasicLineParser() {

                    @Override
                    public Header parseHeader(final CharArrayBuffer buffer) {
                        try {
                            return super.parseHeader(buffer);
                        } catch (ParseException ex) {
                            return new BasicHeader(buffer.toString(), null);
                        }
                    }

                };
                return new DefaultHttpResponseParser(
                        buffer, lineParser, DefaultHttpResponseFactory.INSTANCE, constraints) {

                    @Override
                    protected boolean reject(final CharArrayBuffer line, int count) {
                        // try to ignore all garbage preceding a status line infinitely
                        return false;
                    }

                };
            }

        };
        HttpMessageWriterFactory<HttpRequest> requestWriterFactory = new DefaultHttpRequestWriterFactory();

        // Use a custom connection factory to customize the process of
        // initialization of outgoing HTTP connections. Beside standard connection
        // configuration parameters HTTP connection factory can define message
        // parser / writer routines to be employed by individual connections.
        HttpConnectionFactory<HttpRoute, ManagedHttpClientConnection> connFactory = new ManagedHttpClientConnectionFactory(
                requestWriterFactory, responseParserFactory);

        // Client HTTP connection objects when fully initialized can be bound to
        // an arbitrary network socket. The process of network socket initialization,
        // its connection to a remote address and binding to a local one is controlled
        // by a connection socket factory.

        // SSL context for secure connections can be created either based on
        // system or application specific properties.
        SSLContext sslcontext = SSLContexts.createSystemDefault();

        // Create a registry of custom connection socket factories for supported
        // protocol schemes.
        Registry<ConnectionSocketFactory> socketFactoryRegistry = RegistryBuilder.<ConnectionSocketFactory>create()
                .register("http", PlainConnectionSocketFactory.INSTANCE)
                .register("https", new SSLConnectionSocketFactory(sslcontext))
                .build();

        // Use custom DNS resolver to override the system DNS resolution.
        DnsResolver dnsResolver = new SystemDefaultDnsResolver() {

            @Override
            public InetAddress[] resolve(final String host) throws UnknownHostException {
                if (host.equalsIgnoreCase("myhost")) {
                    return new InetAddress[] { InetAddress.getByAddress(new byte[] {127, 0, 0, 1}) };
                } else {
                    return super.resolve(host);
                }
            }

        };

        // Create a connection manager with custom configuration.
        PoolingHttpClientConnectionManager connManager = new PoolingHttpClientConnectionManager(
                socketFactoryRegistry, connFactory, dnsResolver);

        // Create socket configuration
        SocketConfig socketConfig = SocketConfig.custom()
                .setTcpNoDelay(true)
                .build();
        // Configure the connection manager to use socket configuration either
        // by default or for a specific host.
        connManager.setDefaultSocketConfig(socketConfig);
        connManager.setSocketConfig(new HttpHost("somehost", 80), socketConfig);
        // Validate connections after 1 sec of inactivity
        connManager.setValidateAfterInactivity(1000);

        // Create message constraints
        MessageConstraints messageConstraints = MessageConstraints.custom()
                .setMaxHeaderCount(200)
                .setMaxLineLength(2000)
                .build();
        // Create connection configuration
        ConnectionConfig connectionConfig = ConnectionConfig.custom()
                .setMalformedInputAction(CodingErrorAction.IGNORE)
                .setUnmappableInputAction(CodingErrorAction.IGNORE)
                .setCharset(Consts.UTF_8)
                .setMessageConstraints(messageConstraints)
                .build();
        // Configure the connection manager to use connection configuration either
        // by default or for a specific host.
        connManager.setDefaultConnectionConfig(connectionConfig);
        connManager.setConnectionConfig(new HttpHost("somehost", 80), ConnectionConfig.DEFAULT);

        // Configure total max or per route limits for persistent connections
        // that can be kept in the pool or leased by the connection manager.
        connManager.setMaxTotal(100);
        connManager.setDefaultMaxPerRoute(10);
        connManager.setMaxPerRoute(new HttpRoute(new HttpHost("somehost", 80)), 20);

        // Use custom cookie store if necessary.
        CookieStore cookieStore = new BasicCookieStore();
        // Use custom credentials provider if necessary.
        CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        // Create global request configuration
        RequestConfig defaultRequestConfig = RequestConfig.custom()
                .setCookieSpec(CookieSpecs.DEFAULT)
                .setExpectContinueEnabled(true)
                .setTargetPreferredAuthSchemes(Arrays.asList(AuthSchemes.NTLM, AuthSchemes.DIGEST))
                .setProxyPreferredAuthSchemes(Arrays.asList(AuthSchemes.BASIC))
                .build();

        // Create an HttpClient with the given custom dependencies and configuration.
        CloseableHttpClient httpclient = HttpClients.custom()
                .setConnectionManager(connManager)
                .setDefaultCookieStore(cookieStore)
                .setDefaultCredentialsProvider(credentialsProvider)
                .setProxy(new HttpHost("myproxy", 8080))
                .setDefaultRequestConfig(defaultRequestConfig)
                .build();

        try {
            HttpGet httpget = new HttpGet("http://httpbin.org/get");
            // Request configuration can be overridden at the request level.
            // They will take precedence over the one set at the client level.
            RequestConfig requestConfig = RequestConfig.copy(defaultRequestConfig)
                    .setSocketTimeout(5000)
                    .setConnectTimeout(5000)
                    .setConnectionRequestTimeout(5000)
                    .setProxy(new HttpHost("myotherproxy", 8080))
                    .build();
            httpget.setConfig(requestConfig);

            // Execution context can be customized locally.
            HttpClientContext context = HttpClientContext.create();
            // Contextual attributes set the local context level will take
            // precedence over those set at the client level.
            context.setCookieStore(cookieStore);
            context.setCredentialsProvider(credentialsProvider);

            System.out.println("executing request " + httpget.getURI());
            CloseableHttpResponse response = httpclient.execute(httpget, context);
            try {
                System.out.println("----------------------------------------");
                System.out.println(response.getStatusLine());
                System.out.println(EntityUtils.toString(response.getEntity()));
                System.out.println("----------------------------------------");

                // Once the request has been executed the local context can
                // be used to examine updated state and various objects affected
                // by the request execution.

                // Last executed request
                context.getRequest();
                // Execution route
                context.getHttpRoute();
                // Target auth state
                context.getTargetAuthState();
                // Proxy auth state
                context.getProxyAuthState();
                // Cookie origin
                context.getCookieOrigin();
                // Cookie spec used
                context.getCookieSpec();
                // User security token
                context.getUserToken();

            } finally {
                response.close();
            }
        } finally {
            httpclient.close();
        }
    }

}
An HttpClient demo:
public class HttpClientTests {

    static PoolingHttpClientConnectionManager cm;

    static {
        cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(20); // maximum total connections
        cm.setDefaultMaxPerRoute(cm.getMaxTotal());
    }

    HttpClient client;

    @Test
    public void testClient() throws Exception {
        CookieStore cookieStore = new BasicCookieStore();
        RequestConfig requestConfig = RequestConfig.custom()
                .setConnectTimeout(3 * 1000)          // connect timeout
                .setSocketTimeout(60 * 1000)          // socket (read) timeout
                .setConnectionRequestTimeout(500)     // timeout waiting for a connection from the pool
                .build();
        client = HttpClients.custom()
                .setConnectionManager(cm)
                .setDefaultRequestConfig(requestConfig)
                .setDefaultCookieStore(cookieStore)
                .setKeepAliveStrategy((response, context) -> {
                    HeaderElementIterator it = new BasicHeaderElementIterator(
                            response.headerIterator(HTTP.CONN_KEEP_ALIVE));
                    while (it.hasNext()) {
                        HeaderElement he = it.nextElement();
                        String param = he.getName();
                        String value = he.getValue();
                        if (value != null && param.equalsIgnoreCase("timeout")) {
                            try {
                                return Long.parseLong(value) * 1000;
                            } catch (NumberFormatException ignore) {
                            }
                        }
                    }
                    return 5 * 1000; // default keep-alive time: 5s
                })
                .build();

        URIBuilder builder = new URIBuilder();
        builder.setScheme("http").setHost("localhost").setPort(8912).setPath("/login");
        builder.addParameter("username", "dahuang");
        builder.addParameter("password", "dahuang123");
        HttpGet get = new HttpGet(builder.build());
        HttpResponse response = client.execute(get);

        Thread.sleep(120 * 1000); // sleep, then reuse the same client
        HttpResponse response1 = client.execute(get);
    }
}
// Keep the login state across calls
CookieStore cookieStore = new BasicCookieStore();
HttpClient httpClient = HttpClientBuilder.create().setDefaultCookieStore(cookieStore).build();
// Call repeatedly
HttpGet get = new HttpGet("http://localhost:8912/hello");
HttpResponse response = httpClient.execute(get);
Using the HttpClient connection pool, and caveats: http://hrps.me/2017/08/14/java-httpclient/
How the HttpClient connection pool keeps connections alive, times them out, and detects staleness
HTTP is a connectionless, transaction-oriented protocol, but it still runs on TCP underneath; what the connection pool reuses are TCP connections, the goal being to send multiple HTTP requests over one TCP connection and so improve performance. At the end of each HTTP exchange, HttpClient decides whether the connection can be kept alive; if so, it hands the connection back to the connection manager for later reuse, otherwise it closes the connection immediately. This raises three questions:
1. How does HttpClient decide whether a connection can be kept alive?
To keep a connection, the client must first tell the server that it wants a persistent connection; this is the so-called Keep-Alive mode (also known as persistent connections or connection reuse).
In HTTP/1.0 it is off by default; the request must carry "Connection: Keep-Alive" to enable it.
In HTTP/1.1 Keep-Alive is on by default; sending "Connection: close" turns it off.
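If you want to opt out of reuse for a single request, one way (a sketch only; the URL is a placeholder, and HttpClient normally manages this header itself) is to send the header explicitly, so that a compliant server answers with Connection: close and the exchange is not reused:

import org.apache.http.HttpHeaders;
import org.apache.http.client.methods.HttpGet;

public class ConnectionCloseDemo {
    public static void main(String[] args) {
        // Ask the server to close the connection after this exchange,
        // overriding HTTP/1.1's default keep-alive behaviour.
        HttpGet get = new HttpGet("http://localhost/");
        get.setHeader(HttpHeaders.CONNECTION, "close");
    }
}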
Even when the client asks for Keep-Alive, there is no guarantee the connection will actually be kept; the situation is more complicated. To run multiple HTTP exchanges over one TCP connection, the key question is how to tell that one HTTP exchange has ended. Without Keep-Alive, EOF (-1) marks the end, but with Keep-Alive the server does not close the connection, so there are two common mechanisms.
Using Content-Length
As the name suggests, Content-Length states the length of the entity body, and the client (or server) can use it to decide whether all data has been received. For a static page or an image the server easily knows the size in advance, but what about dynamically generated content, or a file so large it has to be sent in several pieces?
Using Transfer-Encoding
When the server needs to generate data and send it to the client at the same time, it uses Transfer-Encoding: chunked instead of Content-Length. Chunked encoding sends the data piece by piece: the message consists of a series of chunks and ends with a chunk whose declared length is 0. Each chunk has a header and a body; the header gives the number of bytes in the body (in hexadecimal, the unit is normally omitted), the body is exactly that many bytes, and the two parts are separated by CRLF. The final zero-length chunk may carry so-called footer content, a set of additional header fields.
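For illustration only, a chunked response looks roughly like this on the wire (chunk sizes are hexadecimal byte counts, each line terminated by CRLF):

HTTP/1.1 200 OK
Transfer-Encoding: chunked

1a
abcdefghijklmnopqrstuvwxyz
10
1234567890abcdef
0
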
How the length of a message body is determined is, in practice, even more involved; see this article: https://zhanjindong.com/2015/05/08/http-keep-alive-header
To summarise how HttpClient decides whether to keep a connection:
Check the Transfer-Encoding field of the response headers; if it is present and its value is not chunked, the connection is not kept and is closed immediately.
Check the Content-Length field of the response headers; if it is missing or malformed (multiple lengths, or a value that is not an integer), the connection is not kept and is closed immediately.
Check the Connection field of the response headers (or, if it is absent, the Proxy-Connection field):
If neither field is present, HTTP/1.1 defaults to keeping the connection, while HTTP/1.0 defaults to not keeping it and closing immediately.
If the field is present: a value of close means the connection is not kept and is closed immediately; a value of keep-alive marks the connection as persistent.
2. How long is the connection kept?
The timer starts when the connection is handed back to the pool. The duration is computed as follows: take the timeout value from the response's Keep-Alive header; if it is present, the keep-alive time is timeout * 1000 milliseconds. If it is absent, the keep-alive time is set to -1, meaning forever.
3. How do we make sure a pooled connection has not gone stale?
We hardly can. With the classic blocking I/O model, a socket can only react to I/O events while an I/O operation is in progress. Once a TCP connection has been handed back to the connection manager it may still be in the "kept alive" state, but nothing is watching the socket or reacting to I/O events. If the server closes the connection at that point, the client has no way of noticing the state change and therefore cannot close its own side appropriately.
HttpClient's answer is a background monitor thread that periodically checks whether the pooled connections are still fresh; connections that have expired or been idle for too long are removed from the pool. ClientConnectionManager provides two methods for this: closeExpiredConnections and closeIdleConnections.
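Newer HttpClient 4.x builders can also start such a monitor for you; a minimal sketch (the intervals are only examples; compare the hand-written IdleConnectionMonitorThread in section 2.5 below):

import java.util.concurrent.TimeUnit;

import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class EvictionDemo {
    public static void main(String[] args) {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        // HttpClientBuilder starts a background eviction thread on build()
        // when these two options are enabled.
        CloseableHttpClient client = HttpClients.custom()
                .setConnectionManager(cm)
                .evictExpiredConnections()                  // drop connections past their keep-alive/TTL
                .evictIdleConnections(30, TimeUnit.SECONDS) // drop connections idle for 30 seconds
                .build();
    }
}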
References
Further reading
http://www.cnblogs.com/zhanjindong/p/httpclient-connection-pool.html
2.1. Persistent connections
Establishing a connection between two hosts is a fairly involved process that requires exchanging several packets and takes time. The three-way handshake needed for an HTTP connection is expensive, and the overhead is especially significant for small HTTP messages. Reusing an already established connection is much cheaper and gives higher throughput.
HTTP/1.1 supports connection reuse by default. Endpoints compatible only with HTTP/1.0 can also keep connections open by declaring it explicitly. HTTP proxies may likewise keep a connection open for a while so that later requests to the same host can reuse it. Connections kept open like this are persistent connections, and HttpClient fully supports them.
2.2. HTTP connection routes
HttpClient can reach the target server either directly or through one or more intermediate hops (routes). HttpClient distinguishes three kinds of route: plain, tunneled, and layered. The chain of intermediate proxies used by a tunneled connection is called a proxy chain.
A plain route connects to the target host directly, or through at most a single proxy. A tunneled route is established by connecting to the first proxy and tunnelling through a chain of proxies; a route without an intermediate proxy cannot be tunneled. A layered route is created by layering a protocol on top of an existing connection; a protocol can only be layered over a tunnel to the target host or over a direct connection without proxies.
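A small sketch (host names are placeholders) showing how these route types map onto HttpRoute constructors:

import org.apache.http.HttpHost;
import org.apache.http.conn.routing.HttpRoute;

public class RouteDemo {
    public static void main(String[] args) {
        HttpHost target = new HttpHost("www.yeetrack.com", 443, "https");
        HttpHost proxy = new HttpHost("someproxy", 8080);

        // Plain route: connect to the target directly.
        HttpRoute direct = new HttpRoute(target);

        // Tunneled (and, for https, layered) route: CONNECT through the proxy,
        // then run TLS on top of the tunnel; the last argument marks the route as secure.
        HttpRoute viaProxy = new HttpRoute(target, null, proxy, true);

        System.out.println(direct + " / " + viaProxy);
    }
}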
2.2.1. Route computation
The RouteInfo interface describes the route a request takes to the target host. HttpRoute implements RouteInfo and is immutable. HttpTracker also implements RouteInfo; it is mutable and is used internally by HttpClient to track the remaining hops to the target host. HttpRouteDirector is a helper class that computes the next step of a route; it, too, is used internally by HttpClient.
The HttpRoutePlanner interface represents a strategy for computing the route from client to server for a given HTTP context. HttpClient ships with two implementations. SystemDefaultRoutePlanner is based on java.net.ProxySelector; by default it picks up the JVM's proxy settings, which usually come from system properties or from the browser of the host system. DefaultProxyRoutePlanner uses neither Java settings nor system or browser settings; it always routes requests through a fixed default proxy.
2.2.2. Secure HTTP connections
To keep information transmitted in HTTP messages from being read or intercepted by unauthorised third parties, HTTP can be protected with the SSL/TLS protocol, currently the most widely used transport security mechanism. Other encryption techniques can be used as well, but in practice HTTP traffic is usually carried over such encrypted SSL/TLS connections.
2.3. HTTP connection managers
2.3.1. Managed connections and connection managers
HTTP connections are complex, stateful, non-thread-safe objects and therefore have to be managed carefully. An HTTP connection may only be used by one thread at a time. HttpClient uses a special entity, the HTTP connection manager (an implementation of the HttpClientConnectionManager interface), to manage connections. The connection manager serves as a factory for new HTTP connections, manages the life cycle of persistent connections, and synchronises access to them, guaranteeing that only one thread uses a connection at a time. It works together with instances of ManagedHttpClientConnection, which act as a proxy for a real connection and manage its I/O. When a connection is released or explicitly closed by its consumer, the underlying connection is detached from its proxy and returned to the manager; even if the consumer still holds a reference to the proxy, it can no longer perform I/O or change the state of the real connection.
The following code shows how to obtain a connection from a connection manager:
HttpClientContext context = HttpClientContext.create();
HttpClientConnectionManager connMrg = new BasicHttpClientConnectionManager();
HttpRoute route = new HttpRoute(new HttpHost("www.yeetrack.com", 80));
// Request a new connection; this may take a while.
ConnectionRequest connRequest = connMrg.requestConnection(route, null);
// Wait at most 10 seconds for the connection.
HttpClientConnection conn = connRequest.get(10, TimeUnit.SECONDS);
try {
    // If the connection has not been opened yet
    if (!conn.isOpen()) {
        // establish connection based on its route info
        connMrg.connect(conn, route, 1000, context);
        // and mark it as route complete
        connMrg.routeComplete(conn, route, context);
    }
    // Do useful things with the connection.
} finally {
    connMrg.releaseConnection(conn, null, 1, TimeUnit.MINUTES);
}
To cancel a pending connection request, call ConnectionRequest's cancel() method; it unblocks any thread waiting inside ConnectionRequest#get().
2.3.2. Simple connection manager
BasicHttpClientConnectionManager is a simple connection manager that maintains only one connection at a time. Although the class is thread-safe, it should only be used by one thread of execution at a time. It tries to reuse the held connection for subsequent requests with the same route. If the route of a follow-up request does not match that of the held connection, it closes the current connection and opens a new one for the requested route. If the connection is already in use, java.lang.IllegalStateException is thrown.
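A minimal sketch of my own showing how it is wired into a client:

import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.BasicHttpClientConnectionManager;

public class BasicManagerDemo {
    public static void main(String[] args) {
        // A single-connection manager: suitable for one thread issuing requests
        // to the same route one at a time.
        BasicHttpClientConnectionManager cm = new BasicHttpClientConnectionManager();
        CloseableHttpClient client = HttpClients.custom()
                .setConnectionManager(cm)
                .build();
    }
}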
2.3.3. Pooling connection manager
Compared with BasicHttpClientConnectionManager, PoolingHttpClientConnectionManager is a more sophisticated class: it manages a pool of connections and can serve connection requests from many threads at the same time. Connections are pooled on a per-route basis. When a new connection is requested and the pool already holds an available persistent connection for that route, the manager leases that connection instead of creating a new one.
PoolingHttpClientConnectionManager limits the number of connections both per route and in total. By default it creates no more than 2 concurrent connections per route and no more than 20 connections in total. For many real-world applications these limits are far too low, especially when the servers also use HTTP to exchange data.
The following example shows how to adjust the pool limits:
PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
// Increase the maximum total connections to 200
cm.setMaxTotal(200);
// Increase the default maximum connections per route to 20
cm.setDefaultMaxPerRoute(20);
// Increase the maximum connections for the target host to 50
HttpHost localhost = new HttpHost("www.yeetrack.com", 80);
cm.setMaxPerRoute(new HttpRoute(localhost), 50);

CloseableHttpClient httpClient = HttpClients.custom()
        .setConnectionManager(cm)
        .build();
2.3.4. Connection manager shutdown
When an HttpClient instance is no longer needed and is about to go out of scope, shut down its connection manager so that all connections it keeps alive are closed and the system resources they hold are released.
CloseableHttpClient httpClient = <...>
httpClient.close();
2.4. Multithreaded request execution
When equipped with a pooling connection manager such as PoolingHttpClientConnectionManager, HttpClient can execute requests from multiple threads of execution concurrently.
PoolingHttpClientConnectionManager allocates connections according to its configuration. If all connections for a given route have already been leased, subsequent requests block until one is released back to the pool. To keep requests from blocking forever, set http.conn-manager.timeout (connectionRequestTimeout) to a positive value; if no connection becomes available within that time, a ConnectionPoolTimeoutException is thrown.
PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
CloseableHttpClient httpClient = HttpClients.custom()
        .setConnectionManager(cm)
        .build();

// List of URLs to fetch
String[] urisToGet = {
    "http://www.domain1.com/",
    "http://www.domain2.com/",
    "http://www.domain3.com/",
    "http://www.domain4.com/"
};

// Create one thread per URL; GetThread is a custom class defined below
GetThread[] threads = new GetThread[urisToGet.length];
for (int i = 0; i < threads.length; i++) {
    HttpGet httpget = new HttpGet(urisToGet[i]);
    threads[i] = new GetThread(httpClient, httpget);
}

// Start the threads
for (int j = 0; j < threads.length; j++) {
    threads[j].start();
}

// Join the threads
for (int j = 0; j < threads.length; j++) {
    threads[j].join();
}
Even though HttpClient instances are thread-safe and can be shared between multiple threads, it is still recommended that each thread maintain its own dedicated HttpContext instance.
The GetThread class is defined as follows:
static class GetThread extends Thread {

    private final CloseableHttpClient httpClient;
    private final HttpContext context;
    private final HttpGet httpget;

    public GetThread(CloseableHttpClient httpClient, HttpGet httpget) {
        this.httpClient = httpClient;
        this.context = HttpClientContext.create();
        this.httpget = httpget;
    }

    @Override
    public void run() {
        try {
            CloseableHttpResponse response = httpClient.execute(httpget, context);
            try {
                HttpEntity entity = response.getEntity();
            } finally {
                response.close();
            }
        } catch (ClientProtocolException ex) {
            // Handle protocol errors
        } catch (IOException ex) {
            // Handle I/O errors
        }
    }
}
2.5. Connection eviction policy
One of the major shortcomings of the classic blocking I/O model is that a socket can react to I/O events only while blocked in an I/O operation. When a connection has been returned to the manager it stays alive, but nothing can monitor the socket or react to I/O events. If the server closes the connection on its end, the client cannot detect the change in state (and therefore cannot close its own socket in response).
HttpClient mitigates this by testing whether a connection is stale (no longer valid because the server closed it) before using it. The stale check is not 100% reliable, and it adds 10 to 30 ms of overhead to every request. The only feasible solution that does not involve a one-thread-per-socket model for idle connections is a dedicated monitor thread that evicts connections considered expired because of long inactivity. The monitor thread can periodically call ClientConnectionManager#closeExpiredConnections() to close all expired connections and evict them from the pool, and it can optionally call ClientConnectionManager#closeIdleConnections() to close connections that have been idle for longer than a given period.
public static class IdleConnectionMonitorThread extends Thread {

    private final HttpClientConnectionManager connMgr;
    private volatile boolean shutdown;

    public IdleConnectionMonitorThread(HttpClientConnectionManager connMgr) {
        super();
        this.connMgr = connMgr;
    }

    @Override
    public void run() {
        try {
            while (!shutdown) {
                synchronized (this) {
                    wait(5000);
                    // Close expired connections
                    connMgr.closeExpiredConnections();
                    // Optionally, close connections that have been idle longer than 30 sec
                    connMgr.closeIdleConnections(30, TimeUnit.SECONDS);
                }
            }
        } catch (InterruptedException ex) {
            // terminate
        }
    }

    public void shutdown() {
        shutdown = true;
        synchronized (this) {
            notifyAll();
        }
    }
}
2.6. Connection keep-alive strategy
The HTTP specification does not say how long a persistent connection may or should be kept alive. Some HTTP servers use a non-standard Keep-Alive header to tell the client how many seconds they intend to keep the connection open, and HttpClient honours that hint. If the response contains no Keep-Alive header, HttpClient assumes the connection can be kept alive indefinitely. However, many servers close connections that have been inactive for a while without notifying the client, in order to conserve resources. When the default strategy turns out to be too optimistic, a custom keep-alive strategy may be needed.
ConnectionKeepAliveStrategy myStrategy = new ConnectionKeepAliveStrategy() {

    public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
        // Honor 'keep-alive' header
        HeaderElementIterator it = new BasicHeaderElementIterator(
                response.headerIterator(HTTP.CONN_KEEP_ALIVE));
        while (it.hasNext()) {
            HeaderElement he = it.nextElement();
            String param = he.getName();
            String value = he.getValue();
            if (value != null && param.equalsIgnoreCase("timeout")) {
                try {
                    return Long.parseLong(value) * 1000;
                } catch (NumberFormatException ignore) {
                }
            }
        }
        HttpHost target = (HttpHost) context.getAttribute(
                HttpClientContext.HTTP_TARGET_HOST);
        if ("www.naughty-server.com".equalsIgnoreCase(target.getHostName())) {
            // Keep alive for 5 seconds only
            return 5 * 1000;
        } else {
            // otherwise keep alive for 30 seconds
            return 30 * 1000;
        }
    }

};
CloseableHttpClient client = HttpClients.custom()
        .setKeepAliveStrategy(myStrategy)
        .build();
2.7. Connection socket factories
HTTP connections use java.net.Socket internally to transfer data. They rely on the ConnectionSocketFactory interface to create, initialise, and connect sockets, which lets users of HttpClient supply application-specific socket initialisation code at run time. PlainConnectionSocketFactory is the default factory for plain (unencrypted) sockets.
Creating a socket and connecting it to the target host are two separate steps, so the socket can be closed while the connect operation is still blocked.
HttpClientContext clientContext = HttpClientContext.create();
PlainConnectionSocketFactory sf = PlainConnectionSocketFactory.getSocketFactory();
Socket socket = sf.createSocket(clientContext);
int timeout = 1000; // ms
HttpHost target = new HttpHost("www.yeetrack.com");
InetSocketAddress remoteAddress = new InetSocketAddress(
        InetAddress.getByName("www.yeetrack.com"), 80);
// In the connectSocket source, the plain factory does not actually use the target parameter.
sf.connectSocket(timeout, socket, target, remoteAddress, null, clientContext);
2.7.1. Secure socket layering
LayeredConnectionSocketFactory extends the ConnectionSocketFactory interface. A layered socket factory creates sockets that are layered over an existing plain socket. Socket layering is used primarily for creating secure sockets through proxies. HttpClient ships with SSLConnectionSocketFactory, which implements SSL/TLS layering. Note that HttpClient does not implement any cryptography itself; it relies entirely on the Java Cryptography Extension (JCE) and Java Secure Socket Extension (JSSE).
2.7.2. Integration with connection managers
Custom connection socket factories can be associated with a particular protocol scheme (such as HTTP or HTTPS) and then used to create a custom connection manager.
ConnectionSocketFactory plainsf = <...>
LayeredConnectionSocketFactory sslsf = <...>
Registry<ConnectionSocketFactory> r = RegistryBuilder.<ConnectionSocketFactory>create()
        .register("http", plainsf)
        .register("https", sslsf)
        .build();
HttpClientConnectionManager cm = new PoolingHttpClientConnectionManager(r);
HttpClients.custom()
        .setConnectionManager(cm)
        .build();
2.7.3. SSL/TLS customisation
HttpClient uses SSLConnectionSocketFactory to create SSL connections. It allows a high degree of customisation: it can take an instance of javax.net.ssl.SSLContext as a parameter and use it to create custom SSL connections.
HttpClientContext clientContext = HttpClientContext.create();
KeyStore myTrustStore = <...>
SSLContext sslContext = SSLContexts.custom()
        .useTLS()
        .loadTrustMaterial(myTrustStore)
        .build();
SSLConnectionSocketFactory sslsf = new SSLConnectionSocketFactory(sslContext);
2.7.4. Hostname verification
In addition to the trust verification and client authentication performed at the SSL/TLS protocol level, HttpClient can optionally verify, once the connection has been established, that the target hostname matches the names stored in the server's X.509 certificate. This verification provides an additional guarantee of the server's authenticity. The X509HostnameVerifier interface represents a hostname verification strategy, and HttpClient ships with three implementations. Important: hostname verification should not be confused with SSL trust verification.
StrictHostnameVerifier: strict verification that works the same way as Java 1.4, 1.5, and 1.6, and roughly the same as IE6. It enforces the RFC 2818 rules for wildcards. The hostname must match either the first CN, or any of the subject-alts. A wildcard can occur in the CN, and in any of the subject-alts.
BrowserCompatHostnameVerifier: verifies hostnames the same way as Curl and Firefox. The hostname must match either the first CN, or any of the subject-alts. A wildcard can occur in the CN, and in any of the subject-alts. The only difference from StrictHostnameVerifier is that with BrowserCompatHostnameVerifier a wildcard (such as *.yeetrack.com) matches all subdomains, including a.b.yeetrack.com.
AllowAllHostnameVerifier: does not verify the hostname at all; verification is effectively turned off (a no-op), so it never throws javax.net.ssl.SSLException. HttpClient uses BrowserCompatHostnameVerifier by default. If needed, a verifier can be selected explicitly.
SSLContext sslContext = SSLContexts.createSystemDefault();
SSLConnectionSocketFactory sslsf = new SSLConnectionSocketFactory(
        sslContext,
        SSLConnectionSocketFactory.STRICT_HOSTNAME_VERIFIER);
2.8. HttpClient proxy configuration
Although HttpClient is aware of complex routing schemes and proxy chaining, out of the box it only supports direct connections or connections through a single proxy hop.
The simplest way to make HttpClient use a proxy is to set a default proxy parameter.
HttpHost proxy = new HttpHost("someproxy", 8080);
DefaultProxyRoutePlanner routePlanner = new DefaultProxyRoutePlanner(proxy);
CloseableHttpClient httpclient = HttpClients.custom()
        .setRoutePlanner(routePlanner)
        .build();
HttpClient can also be told to use the standard JRE proxy selector to obtain proxy information.
SystemDefaultRoutePlanner routePlanner = new SystemDefaultRoutePlanner(
        ProxySelector.getDefault());
CloseableHttpClient httpclient = HttpClients.custom()
        .setRoutePlanner(routePlanner)
        .build();
Alternatively, a custom RoutePlanner implementation can be provided to gain complete control over the HTTP route computation process.
HttpRoutePlanner routePlanner = new HttpRoutePlanner() {

    public HttpRoute determineRoute(
            HttpHost target,
            HttpRequest request,
            HttpContext context) throws HttpException {
        return new HttpRoute(target, null, new HttpHost("someproxy", 8080),
                "https".equalsIgnoreCase(target.getSchemeName()));
    }

};
CloseableHttpClient httpclient = HttpClients.custom()
        .setRoutePlanner(routePlanner)
        .build();
HttpClient 4.3 tutorial, Chapter 2, Connection management: http://www.yeetrack.com/?p=782