Weibo Crawler ----- Cleaning Weibo Publish Times


 

 

from datetime import datetime, timedelta

# Example input; in the crawler this is the raw time string scraped from a post.
publish_time = "5分钟前"

if "刚刚" in publish_time:
    # "刚刚" (just now): use the current time
    publish_time = datetime.now().strftime("%Y-%m-%d %H:%M")

elif "分钟" in publish_time:
    # "N分钟前" (N minutes ago): subtract N minutes from the current time
    minute = publish_time[:publish_time.find("分钟")]
    minute = timedelta(minutes=int(minute))
    publish_time = (datetime.now() - minute).strftime("%Y-%m-%d %H:%M")

elif "今天" in publish_time:
    # "今天 HH:MM" (today): prepend today's date
    today = datetime.now().strftime("%Y-%m-%d")
    clock = publish_time.replace("今天", "").strip()
    publish_time = today + " " + clock

elif "月" in publish_time:
    # "MM月DD日 HH:MM" (assumed format, earlier this year): prepend the current year
    year = datetime.now().strftime("%Y")
    publish_time = year + "-" + publish_time.replace("月", "-").replace("日", "")

else:
    # Full timestamp: keep "YYYY-MM-DD HH:MM" (the first 16 characters)
    publish_time = publish_time[:16]

print("Weibo publish time: " + publish_time)
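
The same branches can be wrapped in a function that takes the reference time as an argument, so the cleaning logic can be checked without depending on the clock. What follows is only a minimal sketch, assuming Python 3 and assuming that dates earlier in the year look like `05月20日 10:30`. The name `clean_publish_time`, the `now` parameter, and the `TIME_FORMAT` constant are hypothetical and not part of the original snippet.

```python
import re
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M"  # hypothetical constant: the normalized output format


def clean_publish_time(raw, now=None):
    """Normalize a Weibo time string to 'YYYY-MM-DD HH:MM'.

    `now` is the reference time; it defaults to the current local time.
    """
    now = now or datetime.now()
    raw = raw.strip()

    if "刚刚" in raw:  # "just now"
        return now.strftime(TIME_FORMAT)

    match = re.match(r"(\d+)\s*分钟", raw)  # "N minutes ago"
    if match:
        delta = timedelta(minutes=int(match.group(1)))
        return (now - delta).strftime(TIME_FORMAT)

    if "今天" in raw:  # "today HH:MM"
        return now.strftime("%Y-%m-%d") + " " + raw.replace("今天", "").strip()

    # Assumed format for dates earlier in the current year: "MM月DD日 HH:MM"
    match = re.match(r"(\d{1,2})月(\d{1,2})日\s*(\d{1,2}:\d{2})", raw)
    if match:
        month, day, clock = match.groups()
        return "%d-%02d-%02d %s" % (now.year, int(month), int(day), clock)

    # Anything else is treated as a full timestamp; keep 'YYYY-MM-DD HH:MM'
    return raw[:16]
```

Passing a fixed `now`, for example `clean_publish_time("5分钟前", datetime(2018, 5, 20, 10, 30))`, makes the result reproducible. A crawler would normally omit the argument and let it default to the current time.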

 

