Scrapy logger 在每個spider實例中提供了一個可以訪問和使用的實例,方法如下:
import scrapy class MySpider(scrapy.Spider): name = 'myspider' start_url = ['https://www.baidu.com'] def parse(self,response): self.logger.info('Parse function called on %s',response.url)
方法二:
該記錄器是使用spider的名稱創建的,當然也可以應用到任意項目中
import logging import scrapy logger = logging.getLogger('mycustomlogger') #創建logger模塊 class MySpider(scrapy.Spider): name = 'myspider' start_url = ['https://www.baidu.com'] #觸發模塊 def parse(self,response): logger.info('Parse function called on %s',response.url)
只需使用logging.getLogger函數獲取其名稱即可使用其記錄器:
import logging logger = logging.getLogger('mycustomlogger') logger.warning('This is a warning')
so anyway:我們也可以使用__name__變量填充當前模塊的路徑,確保正在處理的任何模塊設置自定義記錄器:
import logging logger = logging.getLogger(__name__) logger.warning('This is a warning')
在scrapy項目的settings 文件中配置
LOG_ENABLED = True #是否啟動日志記錄,默認True LOG_ENCODING = 'UTF-8' LOG_FILE = 'TEST1.LOG'#日志輸出文件,如果為NONE,就打印到控制台 LOG_LEVEL = 'INFO'#日志級別,默認debug
LOG_FORMAT #日志格式
LOG_DATEFORMAT#日志日期格式
LOG_STDOUT #日志標准輸出,默認False,如果True所有標准輸出都將寫入日志中,比如代碼中的print輸出也會被寫入到文件
LOG_SHORT_NAMES#短日志名,默認為false,如果為True將不輸出組件名
日志如下

2019-04-26 15:48:33 [scrapy.utils.log] INFO: Scrapy 1.6.0 started (bot: patencent) 2019-04-26 15:48:33 [scrapy.utils.log] INFO: Versions: lxml 4.3.3.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.5.1, w3lib 1.20.0, Twisted 19.2.0, Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 19.0.0 (OpenSSL 1.1.1b 26 Feb 2019), cryptography 2.6.1, Platform Windows-10-10.0.17134-SP0 2019-04-26 15:48:33 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'patencent', 'LOG_ENCODING': 'UTF8', 'LOG_FILE': 'TEST1.LOG', 'LOG_LEVEL': 'INFO', 'NEWSPIDER_MODULE': 'patencent.spiders', 'SPIDER_MODULES': ['patencent.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'} 2019-04-26 15:48:34 [scrapy.extensions.telnet] INFO: Telnet Password: 08683c5af998704b 2019-04-26 15:48:34 [scrapy.middleware] INFO: Enabled extensions: ['scrapy.extensions.corestats.CoreStats', 'scrapy.extensions.telnet.TelnetConsole', 'scrapy.extensions.logstats.LogStats'] 2019-04-26 15:48:34 [scrapy.middleware] INFO: Enabled downloader middlewares: ['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware', 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware', 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware', 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware', 'scrapy.downloadermiddlewares.retry.RetryMiddleware', 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware', 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware', 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware', 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware', 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware', 'scrapy.downloadermiddlewares.stats.DownloaderStats'] 2019-04-26 15:48:34 [scrapy.middleware] INFO: Enabled spider middlewares: ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware', 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware', 'scrapy.spidermiddlewares.referer.RefererMiddleware', 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware', 'scrapy.spidermiddlewares.depth.DepthMiddleware'] 2019-04-26 15:48:34 [scrapy.middleware] INFO: Enabled item pipelines: ['patencent.pipelines.PatencentPipeline'] 2019-04-26 15:48:34 [scrapy.core.engine] INFO: Spider opened 2019-04-26 15:48:34 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) 2019-04-26 15:48:34 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023 2019-04-26 15:49:34 [scrapy.extensions.logstats] INFO: Crawled 1384 pages (at 1384 pages/min), scraped 1255 items (at 1255 items/min) 2019-04-26 15:49:52 [scrapy.utils.log] INFO: Scrapy 1.6.0 started (bot: patencent) 2019-04-26 15:49:52 [scrapy.utils.log] INFO: Versions: lxml 4.3.3.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.5.1, w3lib 1.20.0, Twisted 19.2.0, Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 19.0.0 (OpenSSL 1.1.1b 26 Feb 2019), cryptography 2.6.1, Platform Windows-10-10.0.17134-SP0 2019-04-26 15:49:52 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'patencent', 'CONCURRENT_REQUESTS': 32, 'LOG_ENCODING': 'UTF8', 'LOG_FILE': 'TEST1.LOG', 'LOG_LEVEL': 'INFO', 'NEWSPIDER_MODULE': 'patencent.spiders', 'SPIDER_MODULES': ['patencent.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'} 2019-04-26 15:49:52 [scrapy.extensions.telnet] INFO: Telnet Password: b2f1951d137cf133 2019-04-26 15:49:52 [scrapy.middleware] INFO: Enabled extensions: ['scrapy.extensions.corestats.CoreStats', 'scrapy.extensions.telnet.TelnetConsole', 'scrapy.extensions.logstats.LogStats'] 2019-04-26 15:49:52 [scrapy.middleware] INFO: Enabled downloader middlewares: ['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware', 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware', 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware', 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware', 'scrapy.downloadermiddlewares.retry.RetryMiddleware', 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware', 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware', 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware', 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware', 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware', 'scrapy.downloadermiddlewares.stats.DownloaderStats'] 2019-04-26 15:49:52 [scrapy.middleware] INFO: Enabled spider middlewares: ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware', 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware', 'scrapy.spidermiddlewares.referer.RefererMiddleware', 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware', 'scrapy.spidermiddlewares.depth.DepthMiddleware'] 2019-04-26 15:49:52 [scrapy.middleware] INFO: Enabled item pipelines: ['patencent.pipelines.PatencentPipeline'] 2019-04-26 15:49:52 [scrapy.core.engine] INFO: Spider opened 2019-04-26 15:49:52 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) 2019-04-26 15:49:52 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023 2019-04-26 15:50:43 [scrapy.utils.log] INFO: Scrapy 1.6.0 started (bot: patencent) 2019-04-26 15:50:43 [scrapy.utils.log] INFO: Versions: lxml 4.3.3.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.5.1, w3lib 1.20.0, Twisted 19.2.0, Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 19.0.0 (OpenSSL 1.1.1b 26 Feb 2019), cryptography 2.6.1, Platform Windows-10-10.0.17134-SP0 2019-04-26 15:50:43 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'patencent', 'CONCURRENT_REQUESTS': 32, 'LOG_ENCODING': 'UTF8', 'LOG_FILE': 'TEST1.LOG', 'LOG_LEVEL': 'INFO', 'NEWSPIDER_MODULE': 'patencent.spiders', 'SPIDER_MODULES': ['patencent.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'} 2019-04-26 15:50:43 [scrapy.extensions.telnet] INFO: Telnet Password: 24d0317609676d2e 2019-04-26 15:50:43 [scrapy.middleware] INFO: Enabled extensions: ['scrapy.extensions.corestats.CoreStats', 'scrapy.extensions.telnet.TelnetConsole', 'scrapy.extensions.logstats.LogStats'] 2019-04-26 15:50:44 [scrapy.middleware] INFO: Enabled downloader middlewares: ['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware', 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware', 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware', 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware', 'scrapy.downloadermiddlewares.retry.RetryMiddleware', 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware', 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware', 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware', 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware', 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware', 'scrapy.downloadermiddlewares.stats.DownloaderStats'] 2019-04-26 15:50:44 [scrapy.middleware] INFO: Enabled spider middlewares: ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware', 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware', 'scrapy.spidermiddlewares.referer.RefererMiddleware', 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware', 'scrapy.spidermiddlewares.depth.DepthMiddleware'] 2019-04-26 15:50:44 [scrapy.middleware] INFO: Enabled item pipelines: ['patencent.pipelines.PatencentPipeline'] 2019-04-26 15:50:44 [scrapy.core.engine] INFO: Spider opened 2019-04-26 15:50:44 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) 2019-04-26 15:50:44 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023 2019-04-26 15:51:44 [scrapy.extensions.logstats] INFO: Crawled 1364 pages (at 1364 pages/min), scraped 1238 items (at 1238 items/min)