Python crawler: collecting data from a foul-language website

This article is a repost. 2021-12-23 15:15 70426 Tags: crawler / python

Code:

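import time

import requests
from lxml import etree

headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36 Edg/96.0.1054.62"
}


def get_text(max_count=100, delay=1.0):
    """Request the page up to max_count times and append each sentence found to nihaowua.txt."""
    count = 0  # number of sentences actually written
    # Open the file once, as UTF-8, so Chinese text is saved correctly on any platform.
    with open("nihaowua.txt", "a", encoding="utf-8") as file:
        for _ in range(max_count):
            resp = requests.get("https://www.nihaowua.com/", headers=headers, timeout=10)
            resp.raise_for_status()  # stop on HTTP errors instead of parsing an error page
            html = etree.HTML(resp.text)
            texts = html.xpath("//section/div/*/text()")
            if texts:  # the XPath may match nothing; skip instead of raising IndexError
                file.write(texts[0].strip() + "\n")
                count += 1
            time.sleep(delay)  # pause between requests so the site is not flooded
    return count


get_text()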
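
The XPath expression `//section/div/*/text()` does the actual extraction, so it may help to see what it matches without touching the network. The sketch below runs the same expression against a small inline page. The markup in `SAMPLE_HTML` and the helper name `extract_sentences` are hypothetical, made up for illustration; the real page structure of the site is not shown in this article and may differ.

```python
from lxml import etree

# Hypothetical markup for illustration only; the real page may be structured differently.
SAMPLE_HTML = """
<html><body>
  <section>
    <div><p>first sentence</p><p>second sentence</p></div>
  </section>
  <footer><div><p>not matched</p></div></footer>
</body></html>
"""


def extract_sentences(page_source):
    """Return the non-empty text nodes matched by the article's XPath, stripped."""
    html = etree.HTML(page_source)
    # Text directly inside any element that is a child of a <div> under a <section>.
    texts = html.xpath("//section/div/*/text()")
    return [t.strip() for t in texts if t.strip()]


sentences = extract_sentences(SAMPLE_HTML)
print(sentences)
```

The expression returns a list, which is why the crawler indexes it with `[0]` to take the first match. The list is empty when nothing matches, so checking it before indexing avoids an `IndexError`.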
