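System: CentOS 7.4
Install scrapyd: pip install scrapyd
Python 2 and Python 3 coexist on my Tencent Cloud server, so the command I actually ran was: pip3 install scrapyd
After installing, create a configuration file: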
sudo mkdir /etc/scrapyd
sudo vim /etc/scrapyd/scrapyd.conf
Write the following content into it (it can be found at https://scrapyd.readthedocs.io/en/stable/config.html):
[scrapyd]
eggs_dir = eggs
logs_dir = logs
items_dir =
jobs_to_keep = 5
dbs_dir = dbs
max_proc = 0
max_proc_per_cpu = 10
finished_to_keep = 100
poll_interval = 5.0
bind_address = 0.0.0.0
http_port = 6800
debug = off
runner = scrapyd.runner
application = scrapyd.app.application
launcher = scrapyd.launcher.Launcher
webroot = scrapyd.website.Root

[services]
schedule.json = scrapyd.webservice.Schedule
cancel.json = scrapyd.webservice.Cancel
addversion.json = scrapyd.webservice.AddVersion
listprojects.json = scrapyd.webservice.ListProjects
listversions.json = scrapyd.webservice.ListVersions
listspiders.json = scrapyd.webservice.ListSpiders
delproject.json = scrapyd.webservice.DeleteProject
delversion.json = scrapyd.webservice.DeleteVersion
listjobs.json = scrapyd.webservice.ListJobs
daemonstatus.json = scrapyd.webservice.DaemonStatus
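The main change is bind_address = 0.0.0.0, which makes scrapyd listen on all network interfaces so it can be reached from outside the server.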
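To double-check which address and port the file above configures, here is a minimal sketch that reads the [scrapyd] section with Python's standard configparser. The inline SAMPLE text and the helper name listen_address are my own illustrative assumptions, as are the fallback values; on a real server you would pass in the contents of /etc/scrapyd/scrapyd.conf instead.

```python
import configparser

# Trimmed copy of the settings shown above; only the keys this sketch reads.
SAMPLE = """\
[scrapyd]
bind_address = 0.0.0.0
http_port = 6800
"""


def listen_address(conf_text):
    """Return (bind_address, http_port) from scrapyd.conf-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    section = parser["scrapyd"]
    # The fallbacks are assumptions of this sketch for missing keys.
    host = section.get("bind_address", fallback="127.0.0.1")
    port = section.getint("http_port", fallback=6800)
    return host, port


if __name__ == "__main__":
    host, port = listen_address(SAMPLE)
    print(f"scrapyd would listen on {host}:{port}")
```

With 0.0.0.0 the port is exposed on every interface, so it is worth checking the cloud security group rules for port 6800 as well.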
After creating the file, start scrapyd in the background with:
(scrapyd > /dev/null &)
If you want to keep the output in a log file, use this instead:
(scrapyd > /root/scrapyd.log &)
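Once scrapyd has been started in the background, one way to confirm it is really up is to request the daemonstatus.json service listed in the configuration. The following is only a sketch: the function names status_url and daemon_status are hypothetical helpers, and the host, port, and timeout defaults are assumptions that match the configuration above.

```python
import http.client
import json
import urllib.request


def status_url(host="127.0.0.1", port=6800):
    """Build the URL of the daemonstatus.json service."""
    return f"http://{host}:{port}/daemonstatus.json"


def daemon_status(host="127.0.0.1", port=6800, timeout=3.0):
    """Return the decoded JSON status, or None if scrapyd is not reachable."""
    try:
        with urllib.request.urlopen(status_url(host, port), timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None


if __name__ == "__main__":
    status = daemon_status()
    if status is None:
        print("scrapyd is not reachable on", status_url())
    else:
        print("scrapyd replied:", status)
```

Run it on the server itself first. If it works there but not from another machine, the firewall or security group is the more likely culprit than scrapyd.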
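Pitfall 1: after running the command, I got an error saying the command could not be found.
That is because Python 2 and Python 3 coexist on my system, so the executable that pip3 installed is not on the PATH. The fix is to create a symbolic link.
My Python 3 installation path: /usr/local/python3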
Create the symbolic link: ln -s /usr/local/python3/bin/scrapyd /usr/bin/scrapyd
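If you are not sure where pip3 put the executable, the sketch below asks Python itself. It first looks on the PATH and then in the scripts directory of the interpreter that runs it, so run it with the same python3 that pip3 belongs to. The helper name find_console_script is hypothetical, and I am assuming pip placed the script in the interpreter's default scripts directory.

```python
import os
import shutil
import sysconfig


def find_console_script(name):
    """Return the path of an installed command, or None if it is not found."""
    on_path = shutil.which(name)
    if on_path:
        return on_path
    # Not on PATH: look in this interpreter's scripts directory.
    candidate = os.path.join(sysconfig.get_path("scripts"), name)
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    print("scripts directory:", sysconfig.get_path("scripts"))
    print("scrapyd executable:", find_console_script("scrapyd"))
```

The path it prints is what you would use as the source of the ln -s command.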
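After creating the symlink, I ran the start command above again and hit another error.
Pitfall 2:
This one seems to be caused by the last line of the configuration file, though I am not sure of the exact reason. I deleted the last line, ran the command again, and scrapyd started up.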
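I never found out what was actually wrong with that line, but a quick way to narrow down this kind of problem is to parse the file before starting the daemon. The sketch below uses Python's standard configparser as a stand-in, on the assumption that scrapyd rejects the same malformed input. The helper name config_error and the GOOD and BAD samples are made up for illustration and are not the contents of my real file.

```python
import configparser


def config_error(conf_text):
    """Return a description of the first parse error, or None if the text parses."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(conf_text)
    except configparser.Error as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


# A well-formed fragment, and the same fragment with a stray pasted line.
GOOD = "[services]\ndaemonstatus.json = scrapyd.webservice.DaemonStatus\n"
BAD = GOOD + "this line has no key or value\n"

if __name__ == "__main__":
    for label, text in (("good", GOOD), ("bad", BAD)):
        print(label, "->", config_error(text))
```

To check the real file, read /etc/scrapyd/scrapyd.conf into a string and pass it to config_error.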