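HeWeather (和風天氣) provides an API that is handy for other developers and for students. I used it some time ago when building a mobile app. Coming back to it now while looking at data scraping, I found that the old endpoint no longer works, which is a pity.
Even though it can no longer be reached, I will still walk through the approach.
1. First, get the IDs of all cities
Download the CSV file of Chinese cities from https://dev.heweather.com/docs/refer/city, then run the code below to extract every city ID (remove the first line of the file beforehand):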
import pandas as pd

df = pd.read_csv('china-city-list.csv')
for item in df['City_ID']:
    print(item)
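If you would rather not edit the file by hand, the extra first line can be skipped in code. The following is only a minimal sketch that uses the standard csv module instead of pandas. The helper name load_city_ids and the sample rows are made up for illustration, and it assumes that the second line of the file is the header row containing City_ID.

```python
import csv
import io


def load_city_ids(lines, id_column="City_ID"):
    """Return the values of id_column, ignoring the extra first line."""
    rows = iter(lines)
    next(rows, None)  # skip the extra line that sits above the header row
    reader = csv.DictReader(rows)
    return [row[id_column] for row in reader if row.get(id_column)]


# Tiny in-memory stand-in for china-city-list.csv (contents are made up).
sample = io.StringIO(
    "China-City-List sample\n"
    "City_ID,City_EN\n"
    "CN101010100,beijing\n"
    "CN101020100,shanghai\n"
)
print(load_city_ids(sample))
```

With the real file, the same helper could be called on an open file object, for example open('china-city-list.csv', newline='', encoding='utf8').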
2. Once the city IDs have been extracted, the next step is to call the API to fetch the data. The code is as follows:
import time

import pandas as pd
import requests

df = pd.read_csv('china-city-list.csv')
for item in df['City_ID']:
    # Build the request URL for the current city ID
    url = 'https://free-api.heweather.net/s6/weather/now?location=' + item + '&key=1f0ed5e276a04402bc761d764d8f7cc8'
    strhtml = requests.get(url)
    strhtml.encoding = 'utf8'
    # Wait one second between requests
    time.sleep(1)
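Because the endpoint may fail for some or all cities, it can help to separate URL construction from the actual request and to survive individual failures. The sketch below is one possible arrangement, not the API's recommended client. The names build_now_url and fetch_all are hypothetical, and the fetch argument is any callable that takes a URL and returns the decoded response.

```python
import time
from urllib.parse import urlencode

BASE_URL = "https://free-api.heweather.net/s6/weather/now"


def build_now_url(city_id, key):
    """Build the request URL with properly encoded query parameters."""
    return BASE_URL + "?" + urlencode({"location": str(city_id), "key": key})


def fetch_all(city_ids, key, fetch, delay=1.0):
    """Call fetch(url) for every city; a failed city is stored as None."""
    results = {}
    for city_id in city_ids:
        url = build_now_url(city_id, key)
        try:
            results[city_id] = fetch(url)
        except Exception as exc:  # keep going when one request fails
            results[city_id] = None
            print(f"skipped {city_id}: {exc}")
        time.sleep(delay)  # pause between requests
    return results
```

With requests installed, fetch could be something like lambda url: requests.get(url, timeout=10).json(), so that a dead endpoint raises quickly instead of hanging.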
3. Store the data in MongoDB
import pandas as pd
import pymongo
import requests

# Open the connection: localhost is the host name and 27017 is the port (the default values)
client = pymongo.MongoClient('localhost', 27017)
# Create a database named 'weather'
book_weather = client['weather']
# Create a collection named 'sheet_weather_3'
sheet_weather = book_weather['sheet_weather_3']

df = pd.read_csv('china-city-list.csv')
for item in df['City_ID']:
    url = 'https://free-api.heweather.net/s6/weather/now?location=' + item + '&key=1f0ed5e276a04402bc761d764d8f7cc8'
    strhtml = requests.get(url)
    strhtml.encoding = 'utf8'
    # The data returned by requests can be decoded as JSON
    dic = strhtml.json()
    # Write the document to the collection
    sheet_weather.insert_one(dic)
In my case nothing ends up in MongoDB, presumably because the API can no longer be reached.
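One way to make an empty or polluted collection easier to diagnose is to check each payload before inserting it. This is only a sketch: is_usable is a hypothetical helper, and it assumes that a successful response is a dict whose 'HeWeather6' list starts with an entry carrying status 'ok'.

```python
def is_usable(payload):
    """True when the payload looks like a successful HeWeather6 response."""
    try:
        entries = payload["HeWeather6"]
        # Assumption: a good response has at least one entry with status 'ok'.
        return bool(entries) and entries[0].get("status") == "ok"
    except (KeyError, TypeError, AttributeError, IndexError):
        return False


good = {"HeWeather6": [{"status": "ok", "basic": {"location": "北京"}}]}
bad = {"HeWeather6": [{"status": "invalid key"}]}
print(is_usable(good), is_usable(bad))
```

In the storage loop, the insert would then become a guarded call along the lines of: if is_usable(dic): sheet_weather.insert_one(dic).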
4. Querying data in MongoDB
4.1 Find the documents whose key HeWeather6.basic.location has the value 北京 (Beijing)
import pymongo

client = pymongo.MongoClient('localhost', 27017)
book_weather = client['weather']
sheet_weather = book_weather['sheet_weather_3']

for item in sheet_weather.find({'HeWeather6.basic.location': "北京"}):
    print(item)
4.2 Find the cities whose minimum temperature is above 5 degrees. The code is as follows:
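import pymongo

client = pymongo.MongoClient('localhost', 27017)
book_weather = client['weather']
sheet_weather = book_weather['sheet_weather_3']

for item in sheet_weather.find():
    # Read the same field that is updated and queried below (now.tmp)
    tmp = item['HeWeather6'][0]['now']['tmp']
    # Convert the temperature stored in the collection to a numeric value
    sheet_weather.update_one({'_id': item['_id']}, {'$set': {'HeWeather6.0.now.tmp': int(tmp)}})

# Print the cities whose temperature is above 5 degrees Celsius
for item in sheet_weather.find({'HeWeather6.0.now.tmp': {'$gt': 5}}):
    # Section 4.1 queries basic.location, so the same key is printed here
    print(item['HeWeather6'][0]['basic']['location'])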
4.3 Find the cities whose minimum temperature is above 5 degrees within the three-day forecast. The code is as follows:
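import pymongo

client = pymongo.MongoClient('localhost', 27017)
book_weather = client['weather']
sheet_weather = book_weather['sheet_weather_3']

# This assumes the stored documents contain a daily_forecast array
for item in sheet_weather.find():
    # The forecast covers 3 days, so loop 3 times
    for i in range(3):
        tmp = item['HeWeather6'][0]['daily_forecast'][i]['tmp']['min']
        # Convert each day's minimum temperature to a numeric value
        sheet_weather.update_one({'_id': item['_id']}, {'$set': {'HeWeather6.0.daily_forecast.{}.tmp.min'.format(i): int(tmp)}})

# The dotted path runs through the daily_forecast array, so a document matches
# when at least one forecast day has a minimum above 5
for item in sheet_weather.find({'HeWeather6.0.daily_forecast.tmp.min': {'$gt': 5}}):
    # Section 4.1 queries basic.location, so the same key is printed here
    print(item['HeWeather6'][0]['basic']['location'])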
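A MongoDB query on a dotted path that passes through an array matches when any element satisfies the condition, so the query above returns cities where at least one of the forecast days is above 5 degrees. If the intent is that every one of the three days is above 5, one option is to check the documents on the client side. The sketch below is plain Python and assumes the same field layout as the code above (daily_forecast entries with tmp.min); the helper names and the sample document are made up.

```python
def min_temps(doc, days=3):
    """Minimum temperatures of the first `days` forecast entries, as ints."""
    forecast = doc["HeWeather6"][0]["daily_forecast"][:days]
    return [int(day["tmp"]["min"]) for day in forecast]


def any_day_above(doc, threshold, days=3):
    """Mirrors the dotted-path query: one qualifying day is enough."""
    return any(t > threshold for t in min_temps(doc, days))


def every_day_above(doc, threshold, days=3):
    """Stricter reading: all `days` forecast days must exceed the threshold."""
    temps = min_temps(doc, days)
    return len(temps) == days and all(t > threshold for t in temps)


# Made-up document with minimum temperatures 6, 3 and 8.
sample_doc = {"HeWeather6": [{"daily_forecast": [
    {"tmp": {"min": "6"}}, {"tmp": {"min": "3"}}, {"tmp": {"min": "8"}},
]}]}
print(any_day_above(sample_doc, 5), every_day_above(sample_doc, 5))
```

For the sample document the first check is true and the second is false, which is exactly the difference between the two readings of the heading.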