I've recently been using Python Tornado to build the server side of an app. A production system obviously needs logging, and the natural first thought was Python's built-in logging library.
In logging, actually writing records out is delegated to Handlers, but none of the bundled Handlers meet our needs.
We want logs split by time period. With FileHandler everything goes into a single file, which in time grows so large that both reading and writing become a problem. TimedRotatingFileHandler can split logs, but with multiple processes the rotated files may overwrite one another and log records get lost.
And so I set off in search of a suitable Handler.
First attempt: use FileHandler, plus a script (managed by, say, supervisord) that splits the log on a schedule. The catch is that the log file is held open by the running FileHandler; it either cannot be renamed at all (on Windows) or, where the rename does succeed, the handler keeps writing through its old file handle, so everything still lands in the same file. Attempt failed.
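The problem is easy to demonstrate (a minimal sketch of mine, assuming Linux semantics, where the rename succeeds but the open handle stays bound to the old inode):

```python
import logging
import os

logging.basicConfig(filename="app.log", level=logging.INFO)
logging.info("before rotation")

# Simulate what an external rotation script would do.
os.rename("app.log", "app.log.1")   # on Windows this raises an error instead

# The FileHandler still holds the old file descriptor, so this record
# ends up in app.log.1, not in a fresh app.log.
logging.info("after rotation")
```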
Next attempt: subclass logging's own time-based rotation handler, TimedRotatingFileHandler, and override the method that performs the rollover.
Here I used a class written by someone online:

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright 2012 Ethan Zhang<http://github.com/Ethan-Zhang>
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import time
import os
from logging.handlers import TimedRotatingFileHandler


class MultiProcessTimedRotatingFileHandler(TimedRotatingFileHandler):

    def doRollover(self):
        """
        do a rollover; in this case, a date/time stamp is appended to the filename
        when the rollover happens.  However, you want the file to be named for the
        start of the interval, not the current time.  If there is a backup count,
        then we have to get a list of matching filenames, sort them and remove
        the one with the oldest suffix.
        """
        if self.stream:
            self.stream.close()
        # get the time that this sequence started at and make it a TimeTuple
        t = self.rolloverAt - self.interval
        if self.utc:
            timeTuple = time.gmtime(t)
        else:
            timeTuple = time.localtime(t)
        dfn = self.baseFilename + "." + time.strftime(self.suffix, timeTuple)
        #if os.path.exists(dfn):
        #    os.remove(dfn)
        if not os.path.exists(dfn):
            os.rename(self.baseFilename, dfn)
        if self.backupCount > 0:
            # find the oldest log file and delete it
            #s = glob.glob(self.baseFilename + ".20*")
            #if len(s) > self.backupCount:
            #    s.sort()
            #    os.remove(s[0])
            for s in self.getFilesToDelete():
                os.remove(s)
        #print "%s -> %s" % (self.baseFilename, dfn)
        self.mode = 'a'
        self.stream = self._open()
        currentTime = int(time.time())
        newRolloverAt = self.computeRollover(currentTime)
        while newRolloverAt <= currentTime:
            newRolloverAt = newRolloverAt + self.interval
        # If DST changes and midnight or weekly rollover, adjust for this.
        if (self.when == 'MIDNIGHT' or self.when.startswith('W')) and not self.utc:
            dstNow = time.localtime(currentTime)[-1]
            dstAtRollover = time.localtime(newRolloverAt)[-1]
            if dstNow != dstAtRollover:
                if not dstNow:  # DST kicks in before next rollover, so we need to deduct an hour
                    newRolloverAt = newRolloverAt - 3600
                else:           # DST bows out before next rollover, so we need to add an hour
                    newRolloverAt = newRolloverAt + 3600
        self.rolloverAt = newRolloverAt
```
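It wires up exactly like TimedRotatingFileHandler (an illustrative example; the file name and rotation settings here are my own):

```python
import logging

logger = logging.getLogger("app")
logger.setLevel(logging.INFO)
# Rotate at midnight, keeping the last 7 files.
handler = MultiProcessTimedRotatingFileHandler("app.log", when="midnight",
                                               backupCount=7)
logger.addHandler(handler)
logger.info("hello")
```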
Testing showed that log files could still be overwritten, presumably because the existence check and the rename are not atomic: two processes rolling over at the same moment can still race each other. Another plan self-rejected.
"If file-based Handlers can't do the job, let's try keeping the logs in a database." After the failures above, that thought surfaced.
This approach takes two steps: first, write a custom Handler; second, choose a database to hold the logs.
How do you write a custom Handler? The answer can be traced straight through the source: open the Python logging package, survey its bundled handlers, and find the simplest one, which reveals the minimal contract a custom Handler must satisfy.
In handlers.py the simplest turned out to be SMTPHandler.
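Its skeleton, condensed from the Python 2 standard library source (with the mail-assembly details elided), looks roughly like this:

```python
class SMTPHandler(logging.Handler):
    """A handler which sends an SMTP email for each logging event (condensed)."""

    def __init__(self, mailhost, fromaddr, toaddrs, subject, credentials=None):
        logging.Handler.__init__(self)
        # ... store the SMTP settings on self ...

    def getSubject(self, record):
        """Determine the subject for the email."""
        return self.subject

    def emit(self, record):
        """Emit a record: format it and email it to the specified addresses."""
        try:
            import smtplib
            # ... build the message from self.format(record) and getSubject(),
            #     then send it with smtplib.SMTP ...
        except (KeyboardInterrupt, SystemExit):
            raise
        except:
            self.handleError(record)
```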

getSubject is only called internally by emit, so the recipe for a custom Handler falls out easily: subclass logging.Handler and implement emit, which is invoked each time a record is written.
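In other words, a minimal custom handler can be this small (my own illustrative sketch, not code from the hunt itself):

```python
import logging
import sys

class EchoHandler(logging.Handler):
    """The smallest useful custom handler: just override emit()."""
    def emit(self, record):
        try:
            msg = self.format(record)     # applies whatever Formatter is attached
            sys.stdout.write(msg + "\n")
        except Exception:
            self.handleError(record)      # never let logging crash the app
```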
With step one solved, step two was choosing the database.
My first thought was sqlite, but a little research suggested its performance wouldn't hold up, so I turned to mongodb instead.
Just before knuckling down to write one myself, I couldn't resist a quick web search, and sure enough a mongodb handler already existed (judging by the test paths in its docstring, the code below comes from the log4mongo-python project). Sigh...

```python
import logging

from bson.timestamp import Timestamp
from pymongo import Connection
from pymongo.collection import Collection
from pymongo.errors import OperationFailure, PyMongoError

"""
Example format of generated bson document:
{
    'thread': -1216977216,
    'threadName': 'MainThread',
    'level': 'ERROR',
    'timestamp': Timestamp(1290895671, 63),
    'message': 'test message',
    'module': 'test_module',
    'fileName': '/var/projects/python/log4mongo-python/tests/test_handlers.py',
    'lineNumber': 38,
    'method': 'test_emit_exception',
    'loggerName': 'testLogger',
    'exception': {
        'stackTrace': 'Traceback (most recent call last):
                         File "/var/projects/python/log4mongo-python/tests/test_handlers.py", line 36, in test_emit_exception
                           raise Exception(\'exc1\')
                       Exception: exc1',
        'message': 'exc1',
        'code': 0
    }
}
"""


class MongoFormatter(logging.Formatter):

    DEFAULT_PROPERTIES = logging.LogRecord('', '', '', '', '', '', '', '').__dict__.keys()

    def format(self, record):
        """Formats LogRecord into python dictionary."""
        # Standard document
        document = {
            'timestamp': Timestamp(int(record.created), int(record.msecs)),
            'level': record.levelname,
            'thread': record.thread,
            'threadName': record.threadName,
            'message': record.getMessage(),
            'loggerName': record.name,
            'fileName': record.pathname,
            'module': record.module,
            'method': record.funcName,
            'lineNumber': record.lineno
        }
        # Standard document decorated with exception info
        if record.exc_info is not None:
            document.update({
                'exception': {
                    'message': str(record.exc_info[1]),
                    'code': 0,
                    'stackTrace': self.formatException(record.exc_info)
                }
            })
        # Standard document decorated with extra contextual information
        if len(self.DEFAULT_PROPERTIES) != len(record.__dict__):
            contextual_extra = set(record.__dict__).difference(set(self.DEFAULT_PROPERTIES))
            if contextual_extra:
                for key in contextual_extra:
                    document[key] = record.__dict__[key]
        return document


class MongoHandler(logging.Handler):

    def __init__(self, level=logging.NOTSET, host='localhost', port=27017,
                 database_name='logs', collection='logs',
                 username=None, password=None, fail_silently=False,
                 formatter=None, capped=False,
                 capped_max=1000, capped_size=1000000, **options):
        """Setting up mongo handler, initializing mongo database connection via pymongo."""
        logging.Handler.__init__(self, level)
        self.host = host
        self.port = port
        self.database_name = database_name
        self.collection_name = collection
        self.username = username
        self.password = password
        self.fail_silently = fail_silently
        self.connection = None
        self.db = None
        self.collection = None
        self.authenticated = False
        self.formatter = formatter or MongoFormatter()
        self.capped = capped
        self.capped_max = capped_max
        self.capped_size = capped_size
        self.options = options
        self._connect()

    def _connect(self):
        """Connecting to mongo database."""
        try:
            self.connection = Connection(host=self.host, port=self.port, **self.options)
        except PyMongoError:
            if self.fail_silently:
                return
            else:
                raise

        self.db = self.connection[self.database_name]
        if self.username is not None and self.password is not None:
            self.authenticated = self.db.authenticate(self.username, self.password)

        if self.capped:
            try:
                # We don't want to override the capped collection (and it throws an error anyway)
                self.collection = Collection(self.db, self.collection_name, capped=True,
                                             max=self.capped_max, size=self.capped_size)
            except OperationFailure:
                # Capped collection exists, so get it.
                self.collection = self.db[self.collection_name]
        else:
            self.collection = self.db[self.collection_name]

    def close(self):
        """If authenticated, logging out and closing mongo database connection."""
        if self.authenticated:
            self.db.logout()
        if self.connection is not None:
            self.connection.disconnect()

    def emit(self, record):
        """Inserting new logging record to mongo database."""
        if self.collection is not None:
            try:
                self.collection.save(self.format(record))
            except Exception:
                if not self.fail_silently:
                    self.handleError(record)
```
Luckily I noticed that its connection code doesn't support mongodb's master-slave or replica-set modes. Ha, at last an outlet for my urge to write code (even if it amounted to only two or three lines).
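The post doesn't show the patch itself, but judging from the usage below, it boils down to accepting a connection string through a new url keyword and handing it straight to pymongo, whose Connection understands multi-host mongodb:// URIs. Roughly (my reconstruction; the url attribute is the addition):

```python
def _connect(self):
    """Connect via a mongodb:// URI when one is given, else host/port."""
    try:
        if getattr(self, 'url', None):
            # 'mongodb://h1:p1,h2:p2' covers master-slave and
            # replica-set style multi-host deployments.
            self.connection = Connection(self.url, **self.options)
        else:
            self.connection = Connection(host=self.host, port=self.port,
                                         **self.options)
    except PyMongoError:
        if self.fail_silently:
            return
        raise
```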

```python
mongodb_url = 'mongodb://192.168.10.200:10001,192.168.10.201:10001'
handler = MongoHandler(url=mongodb_url)
logger.addHandler(handler)
```
Add these few lines and it's ready to use.
That should have been the end of the story, but then the scheme's weakness dawned on me: it makes logging depend on a database, and logging is a foundational module of the whole system. So this approach doesn't really fit either.
Back to the original dream, then: write logs straight to files, and since logging's bundled handlers can't split them reliably, write one ourselves.
The result is this fully hand-rolled Handler that splits logs by time period:
```python
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import logging
import os, os.path
import datetime

_filefmt = os.path.join("logs", "%Y-%m-%d", "%H.log")

class MyLoggerHandler(logging.Handler):
    def __init__(self, filefmt=None):
        self.filefmt = filefmt
        if filefmt is None:
            self.filefmt = _filefmt
        logging.Handler.__init__(self)

    def emit(self, record):
        msg = self.format(record)
        _filePath = datetime.datetime.now().strftime(self.filefmt)
        _dir = os.path.dirname(_filePath)
        try:
            if os.path.exists(_dir) is False:
                os.makedirs(_dir)
        except Exception:
            print "can not make dirs"
            print "filepath is " + _filePath
            pass
        try:
            _fobj = open(_filePath, 'a')
            _fobj.write(msg)
            _fobj.write("\n")
            _fobj.flush()
            _fobj.close()
        except Exception:
            print "can not write to file"
            print "filepath is " + _filePath
            pass
```
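Hooking it up is ordinary logging wiring (an illustrative example of mine; the format string is arbitrary):

```python
import logging

logger = logging.getLogger("app")
logger.setLevel(logging.INFO)

handler = MyLoggerHandler()   # defaults to logs/<YYYY-MM-DD>/<HH>.log
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

logger.info("hello")          # opens, appends to and closes the file per record
```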
The idea is to release the file handle as soon as each record is written, so multiple processes no longer conflict. An obvious optimization would be to buffer some records, or a time window's worth, and write them out in one batch; that can wait until performance actually proves insufficient.
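If that day comes, the standard library's logging.handlers.MemoryHandler offers part of the idea for free (a sketch; note it replays buffered records to the target one at a time, so it batches in time rather than into a single write):

```python
import logging
from logging.handlers import MemoryHandler

logger = logging.getLogger("app")
# Hold up to 100 records in memory; flush them to MyLoggerHandler when the
# buffer fills or a record at ERROR level or above arrives.  A true
# single-write batch would need a custom flush().
buffered = MemoryHandler(capacity=100, flushLevel=logging.ERROR,
                         target=MyLoggerHandler())
logger.addHandler(buffered)
```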
Looking back, the hunt for a suitable Handler came down to three steps: first, confirming that Python's bundled handlers wouldn't do; second, working out how a custom Handler is implemented; third, choosing a storage medium for the logs (a step where my thinking wandered rather far from the original plan). The consolation is that the search can finally be declared over.