INFO: Ignoring response <503 http://www.xicidaili.com/nn>: HTTP status code is not handled or not allowed (Scrapy crawler)

Reposted article. 2018-04-17 17:10. Tags: crawler / scrapy

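When crawling http://www.xicidaili.com/nt/1 (a list of proxy IPs in mainland China) with Scrapy, the spider reports an error every time it starts. If the URL is changed to Baidu, the request succeeds and the parse callback is reached.

Error: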

2018-04-17 16:55:52 [scrapy.core.engine] DEBUG: Crawled (503) <GET http://www.xicidaili.com/nn> (referer: None)
2018-04-17 16:55:53 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <503 http://www.xicidaili.com/nn>: HTTP status code is not handled or not allowed

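Set the following in settings.py:

HTTPERROR_ALLOWED_CODES = [503]  # pass 503 responses through to the spider (not recommended)

HTTPERROR_ALLOWED_CODES defaults to `[]`. Responses with a non-200 status code contained in this list are passed to the spider instead of being dropped by the HttpError spider middleware.

After restarting, the spider no longer logs the message, but the underlying problem is not solved: the server still answers with 503, so parse only receives the error page rather than the proxy list.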
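The effect of the setting can be illustrated with a small stand-alone model of the filtering decision. This is only a simplified sketch, not Scrapy's actual source: the function name `reaches_spider` is hypothetical, and it deliberately ignores the per-spider `handle_httpstatus_list` attribute and the per-request meta keys that Scrapy also consults.

```python
def reaches_spider(status, allowed_codes=()):
    """Simplified model: does a response with this status reach the spider callback?

    Assumption: successful (2xx) responses always pass, and any other status
    passes only if it is listed in allowed_codes (the role played by
    HTTPERROR_ALLOWED_CODES).
    """
    if 200 <= status < 300:
        return True
    return status in allowed_codes


# With the default empty list, a 503 is filtered out and logged as "Ignoring response".
print(reaches_spider(503))
# With HTTPERROR_ALLOWED_CODES = [503], the same response is handed to parse.
print(reaches_spider(503, [503]))
```

Note that allowing the code only changes who sees the response; it does not change what the server sent.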
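The original post does not identify the root cause. One thing that is often worth trying when a site returns 503 to a crawler but loads in a browser is to send a browser-like User-Agent and slow the request rate. Whether this is what the site objects to is an assumption, and the values below are illustrative only; `USER_AGENT` and `DOWNLOAD_DELAY` are standard Scrapy settings.

```python
# settings.py (sketch; the concrete values are assumptions, not from the original post)

# Identify as a regular browser instead of Scrapy's default User-Agent string.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/65.0.3325.181 Safari/537.36"
)

# Wait between requests to the same site (seconds) to reduce the chance of being throttled.
DOWNLOAD_DELAY = 2
```

If the 503 disappears with these settings, `HTTPERROR_ALLOWED_CODES` is no longer needed and can be left at its default.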
