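PhantomJS is a scriptable headless web browser. It runs on Windows, macOS, Linux, and FreeBSD.
With QtWebKit as its back end, it offers fast, native support for various web standards: DOM handling, CSS selectors, JSON, Canvas, and SVG.
Note: PhantomJS performance degrades severely when it is driven from multiple processes.
Download the build for your platform from the PhantomJS website: http://phantomjs.org/download.html
Basic usage: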
from selenium import webdriver
# from time import sleep
# Point executable_path at the downloaded PhantomJS binary.
browser = webdriver.PhantomJS(executable_path='D:/selenium/phantomjs.exe')
browser.get("http://httpbin.org/ip")
print(browser.page_source)
# sleep(10)
browser.quit()
Output:
UserWarning: Selenium support for PhantomJS has been deprecated, please use headless versions of Chrome or Firefox instead
warnings.warn('Selenium support for PhantomJS has been deprecated, please use headless '
<html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">{
"origin": "194.156.230.140, 194.156.230.140"
}
</pre></body></html>
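The page source above is the JSON response wrapped in a <pre> element. As a minimal sketch, assuming the page always has that shape, the JSON can be pulled back out with the standard library alone. The helper name extract_json_from_pre is made up for this illustration and is not part of Selenium.

```python
import html
import json
import re


def extract_json_from_pre(page_source):
    """Return the JSON object found inside the first <pre> element.

    Hypothetical helper: assumes the browser rendered a raw JSON
    response as <pre>...</pre>, as in the output shown above.
    """
    match = re.search(r"<pre[^>]*>(.*?)</pre>", page_source, flags=re.DOTALL)
    if match is None:
        raise ValueError("no <pre> element found in page source")
    # Undo HTML escaping (e.g. &amp;) before parsing the JSON text.
    return json.loads(html.unescape(match.group(1)))


# The page source printed in the example above.
sample = (
    '<html><head></head><body>'
    '<pre style="word-wrap: break-word; white-space: pre-wrap;">{\n'
    '"origin": "194.156.230.140, 194.156.230.140"\n'
    '}\n'
    '</pre></body></html>'
)
print(extract_json_from_pre(sample)["origin"])
```

In a real script the argument would be the driver's page_source rather than the sample string.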
The warning says that Selenium support for PhantomJS has been deprecated and that a headless Chrome or Firefox should be used instead.
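Following the warning's advice, here is a rough sketch of the same request made with headless Chrome. It assumes the Selenium 3.x style API used elsewhere in this post (the executable_path keyword was removed in later Selenium 4 releases) and a chromedriver binary at ./chromedriver. The function names are made up for this illustration.

```python
def headless_chrome_args(width=800, height=600):
    """Command-line switches that ask Chrome to run without a window."""
    return [
        "--headless",
        "--disable-gpu",
        "--window-size=%d,%d" % (width, height),
    ]


def fetch_with_headless_chrome(url, driver_path="./chromedriver"):
    """Load url in headless Chrome and return the page source.

    Assumption: Selenium 3.x, where webdriver.Chrome accepts
    executable_path and options as keyword arguments.
    """
    # Imported here so the helper above works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in headless_chrome_args():
        options.add_argument(arg)

    browser = webdriver.Chrome(executable_path=driver_path, options=options)
    try:
        browser.get(url)
        return browser.page_source
    finally:
        # Always shut the browser down, even if get() raised.
        browser.quit()
```

Calling fetch_with_headless_chrome("http://httpbin.org/ip") would play the role of the PhantomJS snippet above, with no extra display server needed.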
Next, we look at using pyvirtualdisplay:
Using Ubuntu as an example:
root@onefine-virtual-machine:/home/onefine# cat /proc/version
Linux version 4.18.0-13-generic (buildd@lgw01-amd64-048) (gcc version 8.2.0 (Ubuntu 8.2.0-7ubuntu1)) #14-Ubuntu SMP Wed Dec 5 09:04:24 UTC 2018
root@onefine-virtual-machine:/home/onefine#
Install pyvirtualdisplay:
pip install pyvirtualdisplay
Install xvfb:
sudo apt install xvfb
Install the Chrome browser: https://www.google.com/chrome/
Install chromedriver: https://sites.google.com/a/chromium.org/chromedriver/downloads
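Before running the examples, it can help to confirm that the installed tools are actually visible on PATH. The following is only a sketch. The binary names Xvfb and google-chrome are assumptions about a typical Ubuntu install, and chromedriver will show as missing if it lives only in the current directory, as in the examples in this post.

```python
import shutil


def find_tools(names):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


# Assumed binary names; adjust them to match the local installation.
for name, path in find_tools(["Xvfb", "google-chrome", "chromedriver"]).items():
    print(name, "->", path or "NOT FOUND")
```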
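Examples:
With the browser window shown (visible=1 makes pyvirtualdisplay use Xephyr, which has to be installed separately):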
from selenium import webdriver
from pyvirtualdisplay import Display
from time import sleep
xephyr = Display(visible=1, size=(800, 600)).start()
url = "http://www.baidu.com"
browser = webdriver.Chrome(executable_path='./chromedriver')
browser.get(url)
sleep(5)
browser.quit()
xephyr.stop()
With the browser window hidden:
from selenium import webdriver
from pyvirtualdisplay import Display
from time import sleep
# visible=0 starts an invisible virtual display (Xvfb) instead of Xephyr.
display = Display(visible=0, size=(800, 600)).start()
url = "http://www.baidu.com"
browser = webdriver.Chrome(executable_path='./chromedriver')
browser.get(url)
print('browser.page_source', browser.page_source)
sleep(5)
browser.quit()
display.stop()
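If browser.get() raises in the scripts above, neither the browser nor the virtual display is shut down. A minimal sketch of a more defensive variant follows. It relies on pyvirtualdisplay's Display working as a context manager, and on the Selenium 3.x executable_path keyword used throughout this post. fetch_page_source and its factory parameters are hypothetical names chosen for this illustration.

```python
def fetch_page_source(url, make_display, make_driver):
    """Start a display, load url, and clean up even when an error occurs.

    make_display and make_driver are zero-argument callables
    (hypothetical parameters) so that the cleanup logic does not
    depend on a particular display back end or browser.
    """
    with make_display():          # the display is stopped on exit
        driver = make_driver()
        try:
            driver.get(url)
            return driver.page_source
        finally:
            driver.quit()         # runs before the display is stopped


def main():
    # Imported here so fetch_page_source can be used without these packages.
    from pyvirtualdisplay import Display
    from selenium import webdriver

    html = fetch_page_source(
        "http://www.baidu.com",
        lambda: Display(visible=0, size=(800, 600)),
        lambda: webdriver.Chrome(executable_path="./chromedriver"),
    )
    print("browser.page_source", html)
```

Calling main() runs the hidden-window example with this cleanup in place. The browser is quit first and the display is stopped afterwards, the same order as in the original scripts.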
About scrapy-splash
Details: https://github.com/scrapy-plugins/scrapy-splash
About selenium-grid
Details: https://docs.seleniumhq.org/docs/07_selenium_grid.jsp
About splinter
Details: https://github.com/cobrateam/splinter
References:
How to get the HTTP status code while opening a page with RemoteWebDriver https://www.cnblogs.com/lexfu/p/5288299.html
Causes of and fixes for errors when using Selenium+PhantomJS https://blog.csdn.net/u010358168/article/details/79749149
PyVirtualDisplay https://pyvirtualdisplay.readthedocs.io/en/latest/
Simulating a browser with selenium without opening a browser window http://www.leesven.com/2401.html