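I. Identifying asynchronous loading (commonly used JS libraries)
1. jQuery (used on roughly 70% of sites)
# Search the page source for "jquery"; finding it tells you how the page loads its content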
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<script src="/Scripts/jquery-1.11.2.min.js"></script>
2. Google Analytics (used on roughly 50% of sites)
# Search the page source for "Google Analytics"
<!-- Google Analytics -->
<script type="text/javascript">
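
The manual search above can also be scripted. The following is a minimal sketch, assuming the page source is already available as a string; the function name `detect_js_libraries` and the two marker patterns are hypothetical and only cover the kinds of tags shown above, so real pages that load these libraries differently may go undetected.

```python
import re

# Hypothetical marker patterns, modeled on the script tags shown above.
LIBRARY_PATTERNS = {
    # A <script> tag whose src points at a file with "jquery" in its path
    "jQuery": re.compile(
        r"<script[^>]+src=[\"'][^\"']*jquery[^\"']*\.js", re.IGNORECASE
    ),
    # The comment or URL text that usually accompanies the tracking snippet
    "Google Analytics": re.compile(r"google[\s-]?analytics", re.IGNORECASE),
}


def detect_js_libraries(html):
    """Return the sorted names of libraries whose markers appear in the HTML."""
    return sorted(
        name for name, pattern in LIBRARY_PATTERNS.items() if pattern.search(html)
    )


if __name__ == "__main__":
    sample = '<script src="/Scripts/jquery-1.11.2.min.js"></script>'
    print(detect_js_libraries(sample))
```

A match only suggests that the page may build part of its content with JavaScript; it does not prove that the data you want is loaded asynchronously.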
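II. Solution
- Install Selenium with pip: pip install selenium
- Download PhantomJS: http://phantomjs.org/download.html
1. Ajax: Asynchronous JavaScript and XML
Ajax lets a page send data (such as a form) to the server and receive a response without reloading the whole page (e.g. lazy loading, pull-to-refresh, loading more content at the bottom of the page).
2. Dynamic HTML (DHTML)
A collection of techniques for changing page elements with client-side scripts (e.g. showing content on mouse hover, drop-down menus).
Code implementation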
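
To see why a browser engine is needed for such pages, it may help to look at what a static parser sees. The sketch below uses only the standard library; `STATIC_HTML` is a made-up page (not the real demo page), and the class and function names are hypothetical. Because the script is never executed, only the placeholder text is found.

```python
from html.parser import HTMLParser

# Hypothetical HTML as a plain HTTP client would receive it.
# A static parser never runs the <script>, so #content keeps its placeholder.
STATIC_HTML = """
<html><body>
<div id="content">Loading...</div>
<script>/* an Ajax call would replace #content here */</script>
</body></html>
"""


class ContentDivText(HTMLParser):
    """Collect the text found inside <div id="content">."""

    def __init__(self):
        super().__init__()
        self._depth = 0  # number of open <div> tags, counted from the target div
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth:
            self._depth += 1  # nested div inside the target
        elif dict(attrs).get("id") == "content":
            self._depth = 1  # entered the target div

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.text.append(data)


def static_content_text(html):
    """Return the text of div#content as seen without executing JavaScript."""
    parser = ContentDivText()
    parser.feed(html)
    return "".join(parser.text).strip()


if __name__ == "__main__":
    print(static_content_text(STATIC_HTML))
```

Selenium avoids this limitation by driving a real browser engine, so the scripts run and `driver.page_source` reflects the updated DOM.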
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time
# Path to the PhantomJS executable
# (webdriver.PhantomJS needs Selenium 3 or earlier; it was removed in Selenium 4)
driver = webdriver.PhantomJS(executable_path=r'E:\software\phantomjs-2.1.1-windows\bin\phantomjs.exe')
driver.get("http://pythonscraping.com/pages/javascript/ajaxDemo.html")
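# Method 1: wait a fixed 3 seconds for the page to load
time.sleep(3)
# Method 2: have Selenium repeatedly check whether an element exists, and treat
# its presence as the sign that the page has finished loading
# (uses the WebDriverWait, EC and By imports above)
try:
    element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "loadedButton")))
finally:
    print(driver.page_source)
    driver.close()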
# Get the page content
# print(driver.page_source)
#
# driver.close()
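
The difference between the two methods is that method 1 always waits the full time, while method 2 polls and returns as soon as the condition holds. The idea can be sketched without Selenium as follows. This is not Selenium's implementation; `wait_until` and its parameters are hypothetical names, and the 0.5 second default simply mirrors the default polling interval of `WebDriverWait`.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


if __name__ == "__main__":
    start = time.monotonic()
    # The condition becomes true after about one second, so the call
    # returns well before the 10 second timeout.
    wait_until(lambda: time.monotonic() - start > 1.0, timeout=10.0)
    print("condition met")
```

With Selenium, the condition would be something like "the element with ID `loadedButton` is present", which is what `EC.presence_of_element_located` expresses in the code above.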