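Using proxy IPs
1. Using a proxy with requests
To use a proxy with requests, build a dictionary that maps each URL scheme to a proxy address and pass it through the proxies parameter.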
import requests

proxy = '60.186.9.233'
# Map each URL scheme to the proxy that should carry it
proxies = {
    'http': 'http://' + proxy,
    'https': 'https://' + proxy
}
try:
    res = requests.get('http://httpbin.org/get', proxies=proxies)
    print(res.text)
except requests.exceptions.ConnectionError as e:
    print('error', e.args)
Output:
{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.18.4"
  },
  "origin": "60.186.9.233",
  "url": "https://httpbin.org/get"
}
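The origin field in the output is the proxy's IP, which shows that the request went through the proxy. If the proxy requires authentication, put the username and password in front of the proxy address: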
proxy = 'username:password@60.186.9.233'
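The two steps above (building the proxies dictionary, then checking origin) can be wrapped in small helpers. This is only a sketch: build_proxies and proxy_in_effect are hypothetical names, not part of requests, and percent-encoding the credentials is an added assumption so that characters such as @ in a password do not break the proxy URL.

```python
import json
from urllib.parse import quote


def build_proxies(host, username=None, password=None):
    """Return a proxies dict for requests; credentials are optional.

    Hypothetical helper. It mirrors the scheme choice of the example above.
    """
    auth = ''
    if username and password:
        # Percent-encode so ':' or '@' inside credentials stay unambiguous
        auth = '{}:{}@'.format(quote(username, safe=''), quote(password, safe=''))
    return {
        'http': 'http://' + auth + host,
        'https': 'https://' + auth + host,
    }


def proxy_in_effect(body, proxy_ip):
    """Check whether the httpbin-style JSON body reports proxy_ip as origin."""
    origin = json.loads(body).get('origin', '')
    # origin may hold several comma-separated addresses
    return proxy_ip in [part.strip() for part in origin.split(',')]
```

A call such as requests.get('http://httpbin.org/get', proxies=build_proxies(proxy, 'username', 'password')) would then use the authenticated proxy, and proxy_in_effect(res.text, proxy) confirms the result.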
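2. Using a proxy with Selenium
Selenium can also be configured to use a proxy. There are two cases: a browser with a visible window, with Chrome as the example, and a headless browser, with PhantomJS as the example.
Chrome settings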
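Set the proxy on a ChromeOptions object and pass that object when creating the Chrome instance, through the options parameter (named chrome_options in older Selenium releases). Running the code opens a Chrome window; once it has loaded the link, the page shows the following result.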
# Chrome proxy settings
from selenium import webdriver

proxy = '60.186.9.233'
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=http://' + proxy)
# Recent Selenium releases take options=; older ones used chrome_options=
browser = webdriver.Chrome(options=chrome_options)
# get() returns None, so its result is not assigned
browser.get('http://httpbin.org/get')

{
  "args": {},
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Host": "httpbin.org",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36"
  },
  "origin": "60.186.9.233",
  "url": "https://httpbin.org/get"
}
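Chrome itself can also run without a window, so the same proxy flag can be combined with a headless flag. The sketch below assumes a recent Chrome that accepts --headless=new and a Selenium release that takes options=; chrome_proxy_arguments and make_chrome are hypothetical helper names, not Selenium APIs.

```python
def chrome_proxy_arguments(proxy, headless=False):
    """Build the Chrome command-line arguments for a given proxy address."""
    args = ['--proxy-server=http://' + proxy]
    if headless:
        # Assumption: the installed Chrome supports the newer headless mode
        args.append('--headless=new')
    return args


def make_chrome(proxy, headless=False):
    """Create a Chrome driver that sends its traffic through the proxy."""
    # Imported lazily so the argument helper works without Selenium installed
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in chrome_proxy_arguments(proxy, headless):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

For example, make_chrome('60.186.9.233', headless=True) would return a driver whose get('http://httpbin.org/get') call can be checked against origin in the same way as above.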
PhantomJS settings
Put the command-line options in a list and pass that list to PhantomJS through the service_args parameter when initializing it.
# PhantomJS proxy settings
# Note: webdriver.PhantomJS is only available in Selenium 3.x and earlier
from selenium import webdriver

service_args = [
    '--proxy=60.186.9.233',
    '--proxy-type=http'
]
browser = webdriver.PhantomJS(service_args=service_args)
browser.get('http://httpbin.org/get')
print(browser.page_source)
Output:
{
  "args": {},
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Host": "httpbin.org",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36"
  },
  "origin": "60.186.9.233",
  "url": "https://httpbin.org/get"
}
If the proxy requires authentication, add the --proxy-auth option to service_args:
service_args = [
    '--proxy=60.186.9.233',
    '--proxy-type=http',
    '--proxy-auth=username:password'
]
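Since the two service_args lists differ only in the optional --proxy-auth entry, they can be produced by one function. This is a minimal sketch; phantomjs_service_args is a hypothetical helper name, and it only assembles the three options shown above.

```python
def phantomjs_service_args(proxy, proxy_type='http', username=None, password=None):
    """Build the service_args list for PhantomJS, with optional proxy auth."""
    args = [
        '--proxy=' + proxy,
        '--proxy-type=' + proxy_type,
    ]
    if username and password:
        # Only added when both credentials are supplied
        args.append('--proxy-auth={}:{}'.format(username, password))
    return args
```

The result can be passed directly as service_args when creating the PhantomJS driver.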