Weibo crawler ----- cleaning Weibo publish times


 

 

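from datetime import datetime
from datetime import timedelta

# publish_time holds the raw time label scraped from a post.
# The sample value below is a placeholder so the snippet runs on its own.
publish_time = "5分钟前"

# Weibo renders these labels in Simplified Chinese, so the literals match that.
if "刚刚" in publish_time:
    # "just now": use the current time
    publish_time = datetime.now().strftime('%Y-%m-%d %H:%M')

elif "分钟" in publish_time:
    # "N minutes ago": subtract N minutes from the current time
    minute = publish_time[:publish_time.find("分钟")]
    minute = timedelta(minutes=int(minute))
    publish_time = (
        datetime.now() - minute).strftime(
        "%Y-%m-%d %H:%M")
elif "今天" in publish_time:
    # "today HH:MM": prepend today's date
    today = datetime.now().strftime("%Y-%m-%d")
    clock = publish_time.replace('今天', '').strip()
    publish_time = today + " " + clock

elif "月" in publish_time:
    # "MM月DD日 HH:MM" within the current year; the 月/日 literals were
    # blank in the source and are assumed here from the replace() logic
    year = datetime.now().strftime("%Y")
    publish_time = str(publish_time)
    print(publish_time)

    publish_time = year + "-" + publish_time.replace('月', '-').replace('日', '')
else:
    # already a full timestamp: keep "YYYY-MM-DD HH:MM"
    publish_time = publish_time[:16]

print("Weibo publish time: " + publish_time)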
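The same branching can be wrapped in a function so it can be called once per scraped post and checked against a fixed clock. The following is only a sketch under a few assumptions: the name clean_publish_time and the now parameter are hypothetical, the labels are assumed to arrive in Simplified Chinese (刚刚, N分钟前, 今天 HH:MM, MM月DD日 HH:MM), and the month/day branch assumes the post belongs to the current year.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M"


def clean_publish_time(raw, now=None):
    """Normalize a Weibo time label to 'YYYY-MM-DD HH:MM' (hypothetical helper)."""
    now = now or datetime.now()  # injectable clock, mainly for testing
    raw = raw.strip()
    if "刚刚" in raw:
        # "just now"
        return now.strftime(FMT)
    if "分钟" in raw:
        # "N minutes ago": the digits precede the 分钟 marker
        minutes = int(raw[:raw.find("分钟")])
        return (now - timedelta(minutes=minutes)).strftime(FMT)
    if "今天" in raw:
        # "today HH:MM"
        return now.strftime("%Y-%m-%d") + " " + raw.replace("今天", "").strip()
    if "月" in raw:
        # "MM月DD日 HH:MM", assumed to fall in the current year
        return "{}-{}".format(now.year, raw.replace("月", "-").replace("日", ""))
    # assumed to be a full timestamp already; drop the seconds
    return raw[:16]


fixed = datetime(2024, 3, 5, 12, 30)
print(clean_publish_time("5分钟前", now=fixed))          # 2024-03-05 12:25
print(clean_publish_time("02月14日 09:00", now=fixed))   # 2024-02-14 09:00
```

Passing now explicitly keeps the results reproducible; in a crawler it can be omitted so the current time is used.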

 





 