This is a sinful crawler.
It crawls the GIF images at http://www.27gif.net/gifcc and saves each one with its 'mystery code' as the file name.
------------------------------------------------------------------------------------------------------
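import requests
from bs4 import BeautifulSoup

# Crawl listing pages 1 through 13
for page in range(1, 14):
    # Request the listing page and collect every GIF post on it into a list
    star_url = 'http://www.27gif.net/gifcc/page/%s/' % str(page)
    star_html = requests.get(star_url).text
    star_soup = BeautifulSoup(star_html, 'lxml')
    gif_list = star_soup.find_all('div', class_='wow fadeInUp')
    # Iterate over the list of posts
    for gif_html in gif_list:
        # Read the img tag's alt attribute and derive the file name from it
        try:
            gif_name = gif_html.find('img')['alt'].split(':')[1]
        except (TypeError, KeyError):
            # No img tag, or no alt attribute: skip this post
            continue
        except IndexError:
            # No ':' in the alt text: use the whole alt text as the name
            gif_name = gif_html.find('img')['alt']
        # Read the img tag's src attribute and extract the GIF's real URL
        try:
            gif_url = gif_html.find('img')['src'].split('src=')[1].split('&w=')[0]
        except (TypeError, KeyError, IndexError):
            # No src attribute, or no 'src=' parameter inside it: skip this post
            continue
        # Request the GIF's URL and save the response body
        gif_content = requests.get(gif_url).content
        with open(gif_name + '.gif', 'wb') as f:
            f.write(gif_content)
        print(gif_name + ' OK!')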
When the run finishes, the GIFs are saved in the current folder.
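
The script uses the img tag's alt text directly as the file name, so a '/' or ':' in it could make open() fail or write somewhere other than the current folder. Below is a minimal sketch of one way to guard against that. safe_filename is a hypothetical helper, not part of the script, and the set of characters it replaces is an assumption chosen to cover common Windows and Unix restrictions.

```python
import re


def safe_filename(name, default='untitled'):
    """Return `name` with characters that are unsafe in file names replaced."""
    # Replace path separators, characters Windows forbids, and control
    # characters with underscores
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', '_', name)
    # Trim surrounding spaces and dots so the result cannot be '.' or '..'
    cleaned = cleaned.strip(' .')
    # Fall back to a placeholder if nothing usable is left
    return cleaned or default
```

In the script this would wrap the name at the point of saving, for example open(safe_filename(gif_name) + '.gif', 'wb').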
Have tissues ready before use, and drink some Nutri-Express (營養快線) promptly afterward.
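
The script recovers the real GIF address by splitting the thumbnail's src on 'src=' and '&w=', which only works while 'w' is the first parameter after 'src'. The sketch below shows an alternative that parses the query string with the standard library. extract_gif_url is a hypothetical helper, and the thumbnail URL shape (a 'src' query parameter holding the image address) is an assumption inferred from the split logic, not something confirmed against the site.

```python
from urllib.parse import urlparse, parse_qs


def extract_gif_url(thumb_src):
    """Return the value of the 'src' query parameter, or None if it is absent."""
    # Take only the query-string part of the thumbnail URL
    query = urlparse(thumb_src).query
    # parse_qs maps each parameter name to a list of percent-decoded values
    values = parse_qs(query).get('src')
    return values[0] if values else None
```

A post whose thumbnail has no 'src' parameter yields None, so the caller can skip it with a simple check instead of catching an exception.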

