Summary: A friend in operations needed the readership rankings for more than ten thousand WeChat official accounts. I implemented the task in Python, cutting the workload from at least three days down to about two hours.
Intro: This post uses Python with the requests library to call the site's API, extracts the required fields by indexing into the decoded response dictionaries, and reads and writes Excel files along the way. OK, enough talk, straight to the code. PS: this crawler was a quick fix for a friend's urgent need; it uses no functions or other code-organization niceties, so please don't treat it as a model.
Program outline:
1. Import the requests module and call the site's API.
2. Read the values of the specified rows and columns from an Excel sheet.
3. Extract the needed fields from the responses and write them into another Excel file.
Code:
# Module imports
import requests
import json
import xlrd, xlwt
from xlutils.copy import copy

# Main program
print("Start".center(40, "*"))
wx = xlrd.open_workbook("123.xls")
table = wx.sheet_by_name(u'test')
nrows = table.nrows  # number of rows
ncols = table.ncols  # number of columns

# Open the output workbook once and write the header row
header = [u'Address/ID', u'Articles', u'Avg. reads', u'Max reads',
          u'Avg. likes', u'Max likes']
rb = xlrd.open_workbook('test02.xls')
wb = copy(rb)
ws = wb.get_sheet(0)
for k in range(len(header)):
    ws.write(0, k, header[k])

for i in range(1, nrows):  # skip the header row of the input sheet
    cell_C2 = table.col(2)[i].value  # account name/keyword in column C
    data1 = {"PageIndex": 1, "PageSize": 10, "Kw": cell_C2}
    try:
        # Step 1: search for the account to get its internal id
        r1 = requests.post('http://top.aiweibang.com/user/getsearch', data=data1)
        dic1 = json.loads(r1.content)
        user_id = dic1['data']['data'][0]['Id']
        # Step 2: fetch the reading statistics for that id
        r2 = requests.post('http://top.aiweibang.com/statistics/readnum',
                           data={"id": user_id})
        s = json.loads(r2.content)['data']
        row = [cell_C2, s[0]['ArticleCount'], s[0]['ReadNumAvg'],
               s[0]['ReadNumMax'], s[0]['LikeNumAvg'], s[0]['LikeNumMax']]
    except Exception:
        # Account not found or response malformed: skip this row
        continue
    for j in range(len(row)):
        ws.write(i, j, str(row[j]))

wb.save('test02.xls')
print("End".center(40, "*"))
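For reference, the dictionary indexing the script relies on can be illustrated with mocked responses. The field names below are the ones the script reads; the JSON bodies and their values are invented for illustration:

```python
import json

# Invented sample of the search response; only the fields the script reads.
search_body = '{"data": {"data": [{"Id": 12345, "Name": "demo"}]}}'
user_id = json.loads(search_body)["data"]["data"][0]["Id"]

# Invented sample of the statistics response.
stats_body = ('{"data": [{"ArticleCount": 30, "ReadNumAvg": 1500,'
              ' "ReadNumMax": 9000, "LikeNumAvg": 12, "LikeNumMax": 88}]}')
stats = json.loads(stats_body)["data"][0]

# One spreadsheet row, in the order the script writes it.
row = [user_id, stats["ArticleCount"], stats["ReadNumAvg"],
       stats["ReadNumMax"], stats["LikeNumAvg"], stats["LikeNumMax"]]
print(row)  # → [12345, 30, 1500, 9000, 12, 88]
```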
Program output:
The data written to Excel after scraping looks like the screenshot below, for reference only.
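The PS above notes that the script deliberately skips functions. As a minimal sketch of what a tidier version could look like, the pure parts (payload building and response parsing) could be pulled into helpers; the function names here are my own, and the network calls and Excel I/O would stay as in the script:

```python
def build_search_payload(kw, page_index=1, page_size=10):
    """Form payload for the getsearch endpoint (field names as in the script)."""
    return {"PageIndex": page_index, "PageSize": page_size, "Kw": kw}

def first_account_id(search_response):
    """Pick the Id of the first hit out of the decoded search response."""
    return search_response["data"]["data"][0]["Id"]

def stats_to_row(name, stats_response):
    """Flatten the first statistics record into one spreadsheet row."""
    s = stats_response["data"][0]
    return [name, s["ArticleCount"], s["ReadNumAvg"], s["ReadNumMax"],
            s["LikeNumAvg"], s["LikeNumMax"]]
```

With helpers like these, each loop iteration reduces to two POSTs plus two small parsing calls, and the parsing can be unit-tested without touching the network.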