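1. Downward traversal
Downward traversal of the tag tree
.contents  List of child nodes: stores all of the tag's direct children in a list
.children  Iterator over child nodes: like .contents, used to loop over the direct children
.descendants  Iterator over descendant nodes: covers every child, grandchild, and so on, used for looping
Test code: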
    import requests
    from bs4 import BeautifulSoup

    r = requests.get("http://python123.io/ws/demo.html")
    demo = r.text
    soup = BeautifulSoup(demo, "html.parser")

    print(soup.head)                # the whole <head> tag
    print(soup.head.contents)       # list of <head>'s direct children
    print(soup.body.contents)       # list of <body>'s direct children
    print(len(soup.body.contents))  # number of direct children of <body> (text nodes count too)
    print(soup.body.contents[1])    # the child of <body> at index 1
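To make the difference between the three attributes easier to see without a network request, here is a minimal sketch. The inline HTML string and the variable names (direct_children, all_tags, same_children) are made up for illustration and are not part of the demo page. Since text nodes also appear among children and descendants, the sketch keeps only Tag objects.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical inline HTML, used here instead of fetching the demo page.
html = (
    "<html><head><title>Demo</title></head>"
    "<body><p>Hello <b>world</b></p><a href='#'>link</a></body></html>"
)
soup = BeautifulSoup(html, "html.parser")
body = soup.body

# .contents is a real list: it supports len() and indexing.
direct_children = [child.name for child in body.contents if isinstance(child, Tag)]

# .children iterates over the same direct children as .contents.
same_children = list(body.children) == body.contents

# .descendants walks the whole subtree, so the nested <b> shows up as well.
all_tags = [node.name for node in body.descendants if isinstance(node, Tag)]

print(direct_children)
print(same_children)
print(all_tags)
```

Use .contents when you need indexing or a length, and .children or .descendants when you only need to loop.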
2. Upward traversal
.parent  The node's parent tag
.parents  Iterator over the node's ancestors, used to loop over them
Test code:
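    import requests
    from bs4 import BeautifulSoup

    r = requests.get("http://python123.io/ws/demo.html")
    demo = r.text
    soup = BeautifulSoup(demo, "html.parser")

    # print(soup.title.parent)
    # print(soup.html.parent)

    # Walk from the first <a> tag up to the top of the tree.
    # The last ancestor yielded is the BeautifulSoup object itself,
    # whose name is "[document]"; its own .parent is None.
    for parent in soup.a.parents:
        if parent is None:
            print(parent)
        else:
            print(parent.name)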
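The ancestor chain can also be collected into a list, which makes it easier to inspect. The following is only a sketch on a made-up inline HTML string; ancestor_names is a name chosen for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML, used here instead of fetching the demo page.
html = "<html><body><p>See <a href='#'>link</a></p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# .parent gives the immediate parent only.
direct_parent = soup.a.parent.name

# .parents yields every ancestor, ending with the BeautifulSoup object.
ancestor_names = [parent.name for parent in soup.a.parents]

# The BeautifulSoup object is the root, so it has no parent.
root_parent = soup.parent

print(direct_parent)
print(ancestor_names)
print(root_parent)
```

Because iteration stops at the root, the loop variable itself is never None; only the root's .parent is.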
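3. Sibling traversal
Sibling traversal of the tag tree
.next_sibling  Returns the next sibling node in HTML document order
.previous_sibling  Returns the previous sibling node in HTML document order
.next_siblings  Iterator: returns all following sibling nodes in HTML document order
.previous_siblings  Iterator: returns all preceding sibling nodes in HTML document order
Test code: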
    import requests
    from bs4 import BeautifulSoup

    r = requests.get("http://python123.io/ws/demo.html")
    demo = r.text
    soup = BeautifulSoup(demo, "html.parser")

    print(soup.a.next_sibling)                          # node right after the first <a>
    print(soup.a.next_sibling.next_sibling)             # node after that one
    print(soup.a.previous_sibling)                      # node right before the first <a>
    print(soup.a.previous_sibling.previous_sibling)     # node before that one
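One thing worth keeping in mind is that a sibling is not necessarily a tag: the text between two tags is a node too. The sketch below, on a made-up inline HTML string, shows this and filters the siblings down to tags only. The names first_link, next_node and following_ids are illustrative.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical inline HTML: two links separated by plain text.
html = "<p><a id='link1'>Basic</a> and <a id='link2'>Advanced</a>.</p>"
soup = BeautifulSoup(html, "html.parser")

first_link = soup.a

# The node right after the first <a> is the text " and ", not the second <a>.
next_node = first_link.next_sibling

# The first <a> is the first child of <p>, so nothing precedes it.
prev_node = first_link.previous_sibling

# Keep only the siblings that are real tags.
following_ids = [s["id"] for s in first_link.next_siblings if isinstance(s, Tag)]

print(repr(next_node))
print(prev_node)
print(following_ids)
```

Checking isinstance(node, Tag) before reading attributes avoids errors when a sibling turns out to be a text node.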