Using Selenium with Python (Part 1)

Reposted article. 2018-12-04 11:03. Tag: testing

Documentation: https://selenium-python.readthedocs.io/installation.html

The code below was written and run on Windows.

1. Opening Firefox with webdriver
from selenium import webdriver
browser = webdriver.Firefox()
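Older Selenium releases (2.x) shipped with a built-in Firefox driver, so Firefox could be launched directly with no extra setup, and Firefox was the best-supported browser at the time. Since Selenium 3, Firefox also needs a separate driver, geckodriver, on the PATH (Selenium 4.6 and later can fetch it automatically).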

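Firefox front-end tools
Firebug: a set of developer add-ons for Firefox.
Purpose: inspect the elements on a page so that they can be located by their attributes.
Firebug had to be installed separately from the Firefox add-ons page. It was discontinued in 2017, and the developer tools built into Firefox (F12) now cover the same features.
The tool can copy an element's XPath directly, which is much easier than working out each XPath by hand.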

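2. Opening Chrome with webdriver
from selenium import webdriver
browser = webdriver.Chrome()
Note the capital C: the class is webdriver.Chrome.
If only the Chrome browser itself is installed, launching it through webdriver fails with an error.
The Chrome webdriver driver has to be installed as well:
a. Download chromedriver.exe in the version that matches the installed Chrome (search online for it; it no longer seems to be on the original official site).
b. On Windows, add the folder containing chromedriver.exe to the PATH environment variable.

Chrome is the more convenient browser to work with.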
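
Step b can also be done from Python for the current process only, which avoids editing the system settings. The following is a minimal sketch; prepend_to_path and open_chrome are illustrative helper names, not part of Selenium, and the folder passed in is whatever location holds chromedriver.exe on your machine.

```python
import os


def prepend_to_path(directory):
    """Add `directory` to the front of PATH unless it is already listed."""
    current = os.environ.get("PATH", "")
    parts = [p for p in current.split(os.pathsep) if p]
    if directory not in parts:
        parts.insert(0, directory)
    os.environ["PATH"] = os.pathsep.join(parts)
    return os.environ["PATH"]


def open_chrome(driver_dir):
    """Launch Chrome after making chromedriver discoverable on PATH."""
    # Imported here so prepend_to_path can be used without Selenium installed.
    from selenium import webdriver

    prepend_to_path(driver_dir)
    return webdriver.Chrome()
```

A call such as open_chrome(r"C:\tools\chromedriver") would then start Chrome; that path is a hypothetical example. The change to PATH lasts only as long as the Python process.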

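3. Browser window operations

Open a browser, b
b = webdriver.Firefox()

Quit the browser (closes every window and ends the session)

b.quit()

Close the current window

b.close()

Open a web page
url = 'http://www.baidu.com'
b.get(url)

The current URL
b.current_url

The title of the current page
b.title

Go back one page, the same as the browser's Back button

b.back()

Maximize the window

b.maximize_window()

Make the window full screen

b.fullscreen_window()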
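
The calls above are usually combined into one short session. As a rough sketch, the two functions below open a page, read back its URL and title, and make sure the browser is closed afterwards. page_summary and run_once are illustrative names, and make_driver stands for any callable that returns a driver, such as webdriver.Firefox.

```python
def page_summary(driver, url):
    """Open `url`, then return the address and title the browser reports."""
    driver.get(url)
    driver.maximize_window()
    return {"url": driver.current_url, "title": driver.title}


def run_once(make_driver, url):
    """Create a driver, collect the summary, and always quit the browser."""
    driver = make_driver()
    try:
        return page_summary(driver, url)
    finally:
        # quit() runs even if get() raised, so no browser is left open.
        driver.quit()
```

With Selenium installed, run_once(webdriver.Firefox, 'http://www.baidu.com') would return the dictionary for the Baidu home page. Putting quit() in a finally block is a design choice that keeps failed runs from leaving stray browser windows behind.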

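4. Locating page elements and working with them

Inspect the page to see an element's attributes, then decide which method to use to find it. The find_element_by_* methods below were the API when this was written; Selenium 4.3 removed them in favor of find_element(By.ID, 'id1') and the other By strategies.

Find the element ele by id

ele = b.find_element_by_id('id1')

Find an element by its name attribute

ele = b.find_element_by_name('name1')

Find an element by class name

ele1 = b.find_element_by_class_name('classname')

Tag name refers to the name of the HTML tag; find an element by tag name

ele2 = b.find_element_by_tag_name('input')

When the page contains many tags of the same type, the first one is returned

Find by link text: for <a> tags, match on the link's full text ('百度鏈接' means "Baidu link")

ele3 = b.find_element_by_link_text('百度鏈接')

Find by partial link text: any link whose text contains the search string matches

ele4 = b.find_element_by_partial_link_text('百度')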
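
For readers on Selenium 4, the same lookups go through find_element(strategy, value). The sketch below is one possible way to keep the locators in a table. The strategy strings are the values behind By.ID, By.NAME and so on in selenium.webdriver.common.by; LOCATORS and find_all are illustrative names, and the ids and texts are the placeholder values used in this article.

```python
# Strategy strings as Selenium defines them: By.ID == "id",
# By.CLASS_NAME == "class name", By.LINK_TEXT == "link text", and so on.
LOCATORS = {
    "by_id": ("id", "id1"),
    "by_name": ("name", "name1"),
    "by_class": ("class name", "classname"),
    "by_tag": ("tag name", "input"),
    "by_link": ("link text", "百度鏈接"),
    "by_partial_link": ("partial link text", "百度"),
}


def find_all(driver, locators=LOCATORS):
    """Look up each locator with find_element(strategy, value)."""
    return {
        label: driver.find_element(strategy, value)
        for label, (strategy, value) in locators.items()
    }
```

find_element raises NoSuchElementException as soon as one locator matches nothing, so on a real page this table would need to list only elements that actually exist.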

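Find an element with a CSS selector, useful when the element has no id, name, or similar attribute

The CSS path was obtained in Firefox by finding the element in Firebug and copying its CSS path

ele5 = b.find_element_by_css_selector('CSS path')

CSS also has an attribute syntax (no space between the tag name and the bracket)

ele6 = b.find_element_by_css_selector("input[id='search']")

Any attribute can be used ('水果圖片' means "fruit picture")

ele7 = b.find_element_by_css_selector('input[type="text"]')

ele8 = b.find_element_by_css_selector('img[alt="水果圖片"]')

Other CSS selector forms are easy to look up online. Write the simple ones by hand with the relevant syntax, and copy the complicated ones straight from Firebug.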
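
Attribute selectors are easy to break with a stray space or a mismatched quote. As a small sketch, a helper can assemble them so the quoting is always consistent; attr_selector is a hypothetical name, not a Selenium function.

```python
def attr_selector(tag, attribute, value):
    """Build a CSS attribute selector such as input[type="text"]."""
    # Escape backslashes first, then double quotes, so the value stays
    # inside the quoted string of the selector.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{tag}[{attribute}="{escaped}"]'
```

The result can be passed to any CSS lookup, for example attr_selector('img', 'alt', '水果圖片') in place of the hand-written string for ele8.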

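Find an element with XPath

XPath is used to navigate through the elements and attributes of an XML document. It is a W3C standard.

XPath node types:

element, attribute, text, namespace, processing instruction, comment, and document

ele9 = b.find_element_by_xpath('//div')

/html/body/input[1]  the input element at that absolute path; [1] selects the first when there are several sibling inputs

//input  input elements at any depth; matches every input element

ele9 = b.find_element_by_xpath('//input')  returns the first match

//input[2]  an input that is the second input child of its parent

//input/p  p elements that are direct children of an input

//input//p  p elements at any depth below an input

ele10 = b.find_element_by_xpath('//input/..')  ele10 is the parent node of ele9

//input[@id]  input elements that have an id attribute; other attributes can be tested the same way

//input[not(@id)]  input elements without an id attribute

//input[@name='firstname']  input elements whose name attribute is firstname

//input[@id='id1']  input elements whose id equals id1

//*  all elements

//*[count(input)=2]  elements that contain exactly two input children

//*[local-name()="input"]  elements whose tag is input. When several elements match, find_element returns only the first one

//*[starts-with(local-name(), 'i')]  every element whose tag starts with i, such as input and img

//*[starts-with(local-name(), 'i')][last()]  the last such element among its siblings; wrap the path in parentheses, (//*[...])[last()], for the last one in the whole document

//*[starts-with(local-name(), 'i')][last()-1]  the second to last

//title | //input  all title and input elements

Firebug can also locate the element and copy its XPath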
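
Several of these expressions can be tried without a browser. The sketch below uses Python's standard xml.etree.ElementTree, which implements only a subset of XPath: paths must be relative to the root (.//input rather than //input), and not(), count(), local-name() and the | operator are not available. PAGE is a made-up document for illustration.

```python
import xml.etree.ElementTree as ET

PAGE = """
<html>
  <body>
    <form>
      <input id="id1" name="firstname"/>
      <input name="lastname"/>
      <img alt="fruit"/>
    </form>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# .//input : every input element at any depth
all_inputs = root.findall(".//input")

# .//input[@id] : inputs that carry an id attribute
with_id = root.findall(".//input[@id]")

# .//input[@name='firstname'] : inputs whose name attribute is firstname
by_name = root.findall(".//input[@name='firstname']")

# .//input[2] and .//input[last()] : position among sibling inputs
second = root.findall(".//input[2]")
last = root.findall(".//input[last()]")

# .//input/.. : the parent of the matched inputs
parents = root.findall(".//input/..")
```

Here all_inputs holds both inputs, with_id and by_name each hold the firstname input, second and last both hold the lastname input, and parents holds the single form element. A browser driven by Selenium evaluates full XPath 1.0, so the expressions that ElementTree rejects still work there.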

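In practice, use whichever of XPath and CSS selectors you are more comfortable with. XPath is more powerful, while CSS selector syntax is more concise and is generally considered faster.

XPath performs somewhat worse, but browser plug-in support for it is good. Neither CSS selectors nor XPath requires a third-party plug-in to use.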


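Operating on a located element

ele.clear()  clears the text of an input field

Element properties

ele.size  the rendered width and height, as a dict

ele.id  Selenium's internal identifier for the element, not the HTML id attribute

ele.get_attribute('name')  the value of the element's name attribute (WebElement has no .name property, so ele.name raises AttributeError)

ele.tag_name  the element's tag name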
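
As a brief sketch of how these members fit together, the helpers below gather them into one dictionary. describe and reset_field are illustrative names, and ele is assumed to be an element returned by one of the find_element calls.

```python
def describe(ele):
    """Collect the properties listed above into one dictionary."""
    return {
        "tag": ele.tag_name,
        "name": ele.get_attribute("name"),
        "width": ele.size["width"],
        "height": ele.size["height"],
    }


def reset_field(ele):
    """Empty a text field, then report what is known about it."""
    ele.clear()
    return describe(ele)
```

get_attribute returns None when the attribute is absent, so the "name" entry may be None for elements without a name.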
