Selenium Basics 16: Getting the Page Source

Reposted article. 2018-10-12 15:48. Tags: Selenium[Python] / Selenium

Page source: the page_source attribute

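The driver's page_source attribute returns the source of the current page as a string. Once you have the source, you can use a regular expression to match all the links in it. The code is as follows: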

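# coding: utf-8
from selenium import webdriver
import re  # regular expression module

dr = webdriver.Firefox()
try:
    dr.get('https://www.baidu.com')

    source = dr.page_source  # get the page source as a string
    # print(source)

    # Match every <a>...</a> element. \b keeps tags such as <abbr> out of
    # the result, and re.S lets "." match newlines inside an element.
    linklist = re.findall(r'<a\b.*?</a>', source, re.S)

    print("the number of links: %d." % len(linklist))  # number of links

    for link in linklist:  # print every link
        print(link)
finally:
    dr.quit()  # close the browser even if an earlier step fails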
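
Regular expressions are quick to write, but they can miss links when the markup is unusual (for example, attributes split over several lines or upper-case tags). As a rough alternative, the sketch below uses html.parser from the Python standard library to collect the href value of each a tag. This is a swapped-in technique, not the method used in the script above. The names LinkCollector, extract_links and sample_source are made up for illustration, and sample_source is a hypothetical stand-in for dr.page_source so that the example runs without a browser.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attrs is a list of
        # (name, value) pairs, and value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def extract_links(source):
    """Return the href of each <a> element found in an HTML string."""
    collector = LinkCollector()
    collector.feed(source)
    collector.close()
    return collector.hrefs


# Hypothetical stand-in for dr.page_source (assumption, not real Baidu markup)
sample_source = """
<html><body>
  <abbr title="x">not a link</abbr>
  <a href="https://example.com/news">News</a>
  <a class="nav"
     href="/map">Map</a>
</body></html>
"""

if __name__ == "__main__":
    for href in extract_links(sample_source):
        print(href)
```

With a live browser session, the same function could probably be called as extract_links(dr.page_source) before dr.quit().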
