Python Web Scraping: beautifulsoup4 Series, Part 3

Reposted article. 2017-06-03 11:08. Tag: beautifulsoup4

Preface

This article walks through, step by step, how to scrape the images on a website and save them to your local machine.

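I. Target website

1. Open any landscape-photo site, for example: http://699pic.com/sousuo-218808-13-1.html

2. Inspect the page with Firebug, and use the CSS locator in FirePath to find the target images.

3. The inspection shows that every image is an img tag whose class attribute is lazy.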
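The CSS locator found in FirePath can also be tried in BeautifulSoup itself, since select() accepts CSS selectors. The following is a minimal sketch, assuming the selector is img.lazy; the HTML snippet is made up for illustration and is not the real page source.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page (not copied from the site).
SAMPLE_HTML = """
<div class="list">
  <img class="lazy" data-original="http://example.com/a.jpg" title="lake">
  <img class="logo" src="http://example.com/logo.png">
  <img class="lazy" data-original="http://example.com/b.jpg" title="hill">
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# "img.lazy" matches img tags whose class list contains "lazy".
lazy_images = soup.select("img.lazy")
lazy_titles = [img["title"] for img in lazy_images]
print(lazy_titles)
```

Only the two tags with class lazy are matched; the logo image is ignored.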

II. Find all the tags with find_all

1. find_all(class_="lazy") returns all of the image tag objects.

2. Extract the jpg URL and the title from each tag.

# coding:utf-8
from bs4 import BeautifulSoup
import requests
import os

r = requests.get("http://699pic.com/sousuo-218808-13-1.html")
fengjing = r.content
soup = BeautifulSoup(fengjing, "html.parser")
# Find all tags whose class is "lazy"
images = soup.find_all(class_="lazy")
# print(images)  # returns a list object

for i in images:
    jpg_rl = i["data-original"]  # get the image URL
    title = i["title"]  # get the title
    print(title)
    print(jpg_rl)
    print("")
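Indexing a tag with i["data-original"] raises KeyError when the attribute is missing. The sketch below shows one possible way to guard against that with the tag's get() method. The function name extract_images and the sample markup are assumptions made for illustration, not part of the original script.

```python
from bs4 import BeautifulSoup

def extract_images(html):
    """Return (title, url) pairs for class="lazy" tags that carry both attributes."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for tag in soup.find_all(class_="lazy"):
        url = tag.get("data-original")  # None if the attribute is absent
        title = tag.get("title")
        if url and title:
            pairs.append((title, url))
    return pairs

# Hypothetical markup: the second tag has no data-original and is skipped.
sample = (
    '<img class="lazy" data-original="http://example.com/a.jpg" title="lake">'
    '<img class="lazy" title="broken">'
)
print(extract_images(sample))
```

Tags that lack either attribute are dropped quietly here; logging them instead would be an equally reasonable choice.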

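III. Save the images

1. Create a subfolder named jpg in the folder that contains the script.

2. Import the os module. os.getcwd() returns the current working directory, which is the script's folder when the script is run from there.

3. Use open to open the local file path for writing, named: os.getcwd()+"\\jpg\\"+title+'.jpg' (if two images share a name, the earlier file is overwritten).

4. Use requests.get to fetch the image URL. The content attribute of the response holds the raw bytes, which can be written straight to a local file.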
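The saving steps above can be sketched without touching the network. This is only an illustration under stated assumptions: the helper names safe_filename and save_bytes are hypothetical, the set of replaced characters is the one Windows forbids in file names, and the demo writes placeholder bytes into a temporary folder rather than a real download.

```python
import os
import re
import tempfile

def safe_filename(title):
    """Replace characters that Windows forbids in file names with underscores."""
    return re.sub(r'[\\/:*?"<>|]', "_", title).strip() or "untitled"

def save_bytes(folder, title, data):
    """Write data to <folder>/<title>.jpg, creating the folder if needed."""
    os.makedirs(folder, exist_ok=True)  # no error if the folder already exists
    path = os.path.join(folder, safe_filename(title) + ".jpg")
    with open(path, "wb") as f:  # "wb" = binary write; an existing file is overwritten
        f.write(data)
    return path

# Demo with placeholder bytes instead of a real image download.
demo_dir = os.path.join(tempfile.mkdtemp(), "jpg")
saved = save_bytes(demo_dir, "lake/sunset", b"\xff\xd8fake-jpeg-bytes")
print(saved)
```

In the scraper itself, the call would look something like save_bytes(os.path.join(os.getcwd(), "jpg"), title, requests.get(jpg_rl).content). Using os.path.join instead of a hard-coded "\\jpg\\" keeps the path valid on non-Windows systems as well.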

IV. Reference code

from bs4 import BeautifulSoup
import requests
import os
r = requests.get("http://699pic.com/sousuo-218808-13-1.html")
fengjing = r.content
soup = BeautifulSoup(fengjing, "html.parser")
# Find all tags whose class is "lazy"
images = soup.find_all(class_="lazy")
# print(images)  # returns a list object

for i in images:
    try:
        jpg_rl = i["data-original"]
        title = i["title"]
        print(title)
        print(jpg_rl)
        print("")
        with open(os.getcwd()+"\\jpg\\"+title+'.jpg', "wb") as f:
            f.write(requests.get(jpg_rl).content)
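    except Exception as e:
        # Skip this image, but report why instead of silently hiding every error
        print("skipped:", e)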
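The loop above passes the data-original value straight to requests.get. If a page were to use protocol-relative or relative image addresses (an assumption; the article does not say what this site's values look like), urljoin from the standard library could resolve them against the page URL first. The helper name absolute_url and the image addresses below are invented for the demonstration.

```python
from urllib.parse import urljoin

PAGE_URL = "http://699pic.com/sousuo-218808-13-1.html"

def absolute_url(src, page_url=PAGE_URL):
    """Resolve src against the page URL; absolute URLs pass through unchanged."""
    return urljoin(page_url, src)

# Invented examples of the three shapes an image address can take.
print(absolute_url("http://example.com/a.jpg"))  # already absolute
print(absolute_url("//example.com/b.jpg"))       # protocol-relative
print(absolute_url("/img/c.jpg"))                # site-relative
```

Calling requests.get(absolute_url(jpg_rl)) would then work for all three forms.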

