Using a Proxy IP with HttpClient


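When crawling web pages, some sites apply anti-crawler measures that cause the server to reject your requests. Sending the requests through a proxy IP is one way to get around these rejections.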

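Proxy IPs fall into four categories: transparent proxies, anonymous proxies, distorting proxies, and high-anonymity (elite) proxies.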

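  1. Transparent Proxy: a transparent proxy nominally "hides" your IP address, but the server can still read your real IP from HTTP_X_FORWARDED_FOR.
    REMOTE_ADDR = Proxy IP
    HTTP_VIA = Proxy IP
    HTTP_X_FORWARDED_FOR = Your IP
  2. Anonymous Proxy: one step better than a transparent proxy. The server can tell that you are using a proxy, but cannot tell who you are.
    REMOTE_ADDR = Proxy IP
    HTTP_VIA = Proxy IP
    HTTP_X_FORWARDED_FOR = Proxy IP
  3. Distorting Proxy: the server can still tell that you are using a proxy, but it is given a fake IP address, which makes the disguise more convincing.
    REMOTE_ADDR = Proxy IP
    HTTP_VIA = Proxy IP
    HTTP_X_FORWARDED_FOR = Random IP address
  4. Elite Proxy (or High Anonymity Proxy): no proxy headers are sent, so the server cannot tell from the request that a proxy is in use.
    REMOTE_ADDR = Proxy IP
    HTTP_VIA = not determined
    HTTP_X_FORWARDED_FOR = not determined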
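
The four value patterns above can be read as a small decision rule. The sketch below is only an illustration of that rule, assuming you control a test server and already know your real client IP. The class name ProxyAnonymity, the enum, and the method classify are hypothetical names, not part of HttpClient. It also treats HTTP_X_FORWARDED_FOR as a single address, whereas a real header may carry a comma-separated list.

```java
/** Hypothetical helper: names the proxy category implied by the three values listed above. */
final class ProxyAnonymity {

    enum Level { NO_PROXY, TRANSPARENT, ANONYMOUS, DISTORTING, ELITE }

    /**
     * @param clientIp     the real client address (assumed known, e.g. in a test setup)
     * @param remoteAddr   REMOTE_ADDR as seen by the server
     * @param via          HTTP_VIA, or null if the header is absent
     * @param forwardedFor HTTP_X_FORWARDED_FOR, or null if the header is absent
     */
    static Level classify(String clientIp, String remoteAddr, String via, String forwardedFor) {
        if (via == null && forwardedFor == null) {
            // No proxy headers at all: a direct connection or a high-anonymity proxy.
            return remoteAddr.equals(clientIp) ? Level.NO_PROXY : Level.ELITE;
        }
        if (clientIp.equals(forwardedFor)) {
            // The real client address leaks through the header.
            return Level.TRANSPARENT;
        }
        if (forwardedFor == null || remoteAddr.equals(forwardedFor)) {
            // The proxy is visible, but only its own address is reported.
            return Level.ANONYMOUS;
        }
        // The proxy is visible and reports an address that is neither the client nor the proxy.
        return Level.DISTORTING;
    }

    public static void main(String[] args) {
        // 203.0.113.7 stands in for the client and 198.51.100.9 for the proxy
        // (both taken from the ranges reserved for documentation).
        System.out.println(classify("203.0.113.7", "198.51.100.9", "198.51.100.9", "203.0.113.7"));
        System.out.println(classify("203.0.113.7", "198.51.100.9", null, null));
    }
}
```

Running main prints TRANSPARENT for the first call and ELITE for the second, matching categories 1 and 4 in the list.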

import org.apache.http.HttpEntity;
import org.apache.http.HttpHost;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
import org.junit.Test;
/**
 * @author test
 * @Title: JunitHttpClient
 * @ProjectName JunitHttpClient
 * @Description: TODO
 * @date 2018/12/12 16:07
 */
public class JunitHttpClient {

    @Test
    public void test() throws Exception {
        // Create the HttpGet instance
        HttpGet httpGet = new HttpGet("https://www.****.com");
        // Set the request headers
        httpGet.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36");
        // try-with-resources closes the response and then the client, even if the request throws
        try (CloseableHttpClient client = setProxy(httpGet, "192.168.1.1", 8888);
             // Execute the HTTP GET request (a POST works the same way with HttpPost)
             CloseableHttpResponse response = client.execute(httpGet)) {
            // Read the response entity
            HttpEntity entity = response.getEntity();
            if (entity != null) {
                System.out.println("Page content: " + EntityUtils.toString(entity, "UTF-8"));
            }
        }
    }
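    /**
     * Configures the request to go through a proxy and returns a client to execute it.
     * @param httpGet   the request that should be routed through the proxy
     * @param proxyIp   the proxy host or IP address
     * @param proxyPort the proxy port
     * @return a default CloseableHttpClient; the caller is responsible for closing it
     */
    public CloseableHttpClient setProxy(HttpGet httpGet, String proxyIp, int proxyPort) {
        // Create the HttpClient instance
        CloseableHttpClient httpClient = HttpClients.createDefault();
        // Set the proxy IP and port
        HttpHost proxy = new HttpHost(proxyIp, proxyPort, "http");
        // Timeouts (in milliseconds) can be set here as well:
        // RequestConfig requestConfig = RequestConfig.custom().setProxy(proxy).setConnectTimeout(3000).setSocketTimeout(3000).setConnectionRequestTimeout(3000).build();
        RequestConfig requestConfig = RequestConfig.custom().setProxy(proxy).build();
        // The proxy applies to this request only, not to the client as a whole
        httpGet.setConfig(requestConfig);
        return httpClient;
    }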
}
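
For comparison, here is a rough sketch of the same idea using the JDK's built-in java.net.http.HttpClient (Java 11 and later). This is a different library from the Apache HttpClient used above, shown only because it needs no extra dependency. The class name JdkProxyClient, its two helper methods, the 3-second timeouts, the shortened User-Agent, and the https://example.com/ URL are all assumptions made for illustration. The proxy address 192.168.1.1:8888 is the placeholder from the example above and has to be replaced with a working proxy before main can succeed.

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical helper showing a client-wide proxy with the JDK HTTP client. */
final class JdkProxyClient {

    /** Builds a client that sends every HTTP and HTTPS request through the given proxy. */
    static HttpClient newProxiedClient(String proxyIp, int proxyPort) {
        return HttpClient.newBuilder()
                .proxy(ProxySelector.of(new InetSocketAddress(proxyIp, proxyPort)))
                .connectTimeout(Duration.ofSeconds(3)) // assumed value
                .build();
    }

    /** Builds a GET request with a User-Agent header and a response timeout. */
    static HttpRequest newGet(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(3)) // assumed value
                .header("User-Agent", "Mozilla/5.0")
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder proxy and URL; this call fails unless the proxy is reachable.
        HttpClient client = newProxiedClient("192.168.1.1", 8888);
        HttpResponse<String> response =
                client.send(newGet("https://example.com/"), HttpResponse.BodyHandlers.ofString());
        System.out.println("Status: " + response.statusCode());
    }
}
```

Unlike the per-request RequestConfig above, the proxy here is attached to the client, so every request sent through that client uses it.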

 

