Core code:
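import csv

import requests
from bs4 import BeautifulSoup


def ipPools(numPage):
    # randomHeads() builds the request headers; its definition is not shown here.
    headers = randomHeads()
    url = 'http://www.xicidaili.com/nn/'
    # Python 3: open in text mode ('w', not 'wb'); newline='' avoids blank rows on Windows.
    with open('ips.csv', 'w', newline='', encoding='utf-8') as saveCsvFile:
        writer = csv.writer(saveCsvFile)
        for num in range(1, numPage + 1):
            full_url = url + str(num)
            # Named resp rather than re, so the standard re module is not shadowed.
            resp = requests.get(full_url, headers=headers)
            soup = BeautifulSoup(resp.text, 'lxml')
            rows = soup.find(id="ip_list").find_all('tr')
            for item in rows:
                tds = item.find_all('td')
                try:
                    # Column 1 holds the IP, column 2 the port; keep both as str.
                    proxyIp = tds[1].text
                    proxyPort = tds[2].text
                except IndexError:
                    # Rows without enough <td> cells (such as a header row) are skipped.
                    continue
                writer.writerow([proxyIp, proxyPort])
    print('Saved to CSV successfully!')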
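Once ips.csv exists, its rows can be read back and turned into proxy addresses. The following is only a minimal sketch, assuming the two-column (IP, port) layout that the function writes; load_proxies is a hypothetical helper name that does not appear in the original code.

```python
import csv


def load_proxies(path):
    """Read (ip, port) rows from a CSV file and return 'ip:port' strings."""
    proxies = []
    # newline='' is the setting the csv module documentation recommends.
    with open(path, newline='', encoding='utf-8') as f:
        for row in csv.reader(f):
            if len(row) < 2:
                continue  # skip blank or malformed rows
            ip, port = row[0].strip(), row[1].strip()
            proxies.append(f'{ip}:{port}')
    return proxies
```

Each returned string can then be passed to requests, for example as proxies={'http': 'http://' + address}.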
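Notes:
With 'wb', Python 3 raises TypeError: a bytes-like object is required, not 'str'. That message does not mean every cell has to be converted from str to bytes with
str.encode("utf-8"). Encoding only fits a file opened in binary mode and written to directly; csv.writer in Python 3 expects a text-mode file and str values, and bytes values end up in the file as b'...' literals.
In Python 3.6, use open() instead of the file() built-in, which no longer exists in Python 3.
Open the file with 'w' rather than 'wb': open('ips.csv', 'w', newline=''). The 'wb' mode is exactly where my error came from. If you run into the same error, I hope this serves as a reference.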
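The difference between the file modes can be checked without any network access. This is a small sketch under stated assumptions: the IP and port are made-up sample values, and the file is written to a temporary directory rather than to the real ips.csv.

```python
import csv
import os
import tempfile

row = ['192.0.2.1', '8080']  # made-up sample values, not a real proxy

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'ips.csv')

    # Binary mode: csv.writer hands str to a file that only accepts bytes.
    try:
        with open(path, 'wb') as f:
            csv.writer(f).writerow(row)
        binary_error = None
    except TypeError as exc:
        binary_error = str(exc)

    # Text mode with str values: the row is written as plain text.
    with open(path, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerow(row)
    with open(path, newline='', encoding='utf-8') as f:
        text_result = f.read()

    # Text mode with bytes values: the repr of each bytes object is written.
    with open(path, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerow([cell.encode('utf-8') for cell in row])
    with open(path, newline='', encoding='utf-8') as f:
        bytes_result = f.read()

print(binary_error)
print(repr(text_result))
print(repr(bytes_result))
```

The first case fails with a TypeError, the second writes the plain row, and the third writes cells that look like b'192.0.2.1', which is why the cells should stay str once the file is opened with 'w'.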
Recommended links:
https://stackoverflow.com/questions/43582925/python-a-bytes-like-object-is-required-not-str-while-printing
https://blog.csdn.net/csu_vc/article/details/78372932
Both of these are worth a look.