On Linux, when you do not have root privileges, you may sometimes run into the following problem:
    Building prefix dict from the default dictionary ...
    Loading model from cache /tmp/jieba.cache
    Dumping model to file cache /tmp/jieba.cache
    Dump cache file failed.
    Traceback (most recent call last):
      File "/home/work/anaconda3/envs/py27/lib/python2.7/site-packages/jieba/__init__.py", line 153, in initialize
        _replace_file(fpath, cache_file)
    OSError: [Errno 1] Operation not permitted
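This happens because jieba stores its cache file under /tmp by default, and a non-root user does not have enough permission to replace the file there (typically because an existing /tmp/jieba.cache belongs to another user). The fix is to change the default cache directory so that the cache file lives under the user's own directory. The jieba documentation mentions that tmp_dir and cache_file can be changed, so we took a look at the source file
/home/work/anaconda3/envs/py27/lib/python2.7/site-packages/jieba/__init__.py. Lines 52-66 of that file are as follows: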
    class Tokenizer(object):

        def __init__(self, dictionary=DEFAULT_DICT):
            self.lock = threading.RLock()
            if dictionary == DEFAULT_DICT:
                self.dictionary = dictionary
            else:
                self.dictionary = _get_abs_path(dictionary)
            self.FREQ = {}
            self.total = 0
            self.user_word_tag_tab = {}
            self.initialized = False
            self.tmp_dir = None
            # self.tmp_dir = '/'
            self.cache_file = None
One option is to edit the source: on line 64, set self.tmp_dir to a custom cache path.
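To see why changing tmp_dir is enough, here is a minimal sketch of how the cache path appears to be derived inside Tokenizer.initialize. It is a simplified paraphrase rather than jieba's actual code: the function name resolve_cache_path and the constant DEFAULT_CACHE_NAME are made up for illustration, and it only covers the default dictionary (as far as we can tell, a custom dictionary gets a different, hash-based file name). Check the __init__.py of your installed version before relying on it.

```python
import os
import tempfile

# Cache file name seen in the log above for the default dictionary.
DEFAULT_CACHE_NAME = "jieba.cache"


def resolve_cache_path(tmp_dir=None, cache_file=None):
    """Return the path where the cache would be written (illustrative only).

    tmp_dir and cache_file stand in for the Tokenizer attributes of the
    same names; None means "use the default".
    """
    name = cache_file or DEFAULT_CACHE_NAME
    # With tmp_dir unset, the system temp directory (usually /tmp) is used.
    return os.path.join(tmp_dir or tempfile.gettempdir(), name)
```

With both attributes left at None the result falls under the system temp directory, which matches the /tmp/jieba.cache path in the log. Setting tmp_dir moves the file without changing its name.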
Another option is to set it in your own code. Below is a demo that wraps jieba in a singleton:
    # -*- coding: utf-8 -*-
    from __future__ import print_function

    import os

    import jieba

    # Placeholder paths (not given in the original demo); replace with your own.
    dirpath = os.path.expanduser('~/jieba_cache')  # must exist and be writable
    user_dict_path = os.path.expanduser('~/user_dict.txt')


    class Singleton(object):
        """
        Jieba Utils Class
        """
        _instance = None

        def __new__(cls, *args, **kwargs):
            if not cls._instance:
                # object.__new__ accepts no extra arguments, so pass only cls
                cls._instance = super(Singleton, cls).__new__(cls)
            return cls._instance


    class JiebaUtil(Singleton):
        """
        jiebautil toolkit
        """
        _jieba_instance = None

        def get_instance(self):
            """
            get the global jieba instance
            """
            if self._jieba_instance:
                return self._jieba_instance
            print('initialize...')
            obj = jieba.Tokenizer()
            obj.tmp_dir = dirpath  # custom cache directory
            obj.load_userdict(user_dict_path)
            obj.initialize()
            self._jieba_instance = obj
            return obj


    if __name__ == '__main__':
        one = JiebaUtil()
        two = JiebaUtil()
        print(one == two)

        tkn = one.get_instance()
        tkn2 = one.get_instance()
        print(tkn == tkn2)

        print(id(one), id(two))
        print(id(tkn), id(tkn2))

The custom tmp_dir cache path is set on the `obj.tmp_dir = dirpath` line.
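If your code uses the module-level functions (jieba.cut and so on) instead of its own Tokenizer, the same attribute can be set on the default tokenizer, which jieba exposes as jieba.dt. The sketch below assumes a per-user directory under ~/.cache/jieba. That location and the helper names ensure_user_cache_dir and use_user_cache are our own choices for illustration, not part of jieba.

```python
import os


def ensure_user_cache_dir(path=None):
    """Create (if needed) and return a directory the current user can write to."""
    if path is None:
        # Hypothetical default; any user-writable directory works.
        path = os.path.join(os.path.expanduser("~"), ".cache", "jieba")
    if not os.path.isdir(path):
        os.makedirs(path)
    return path


def use_user_cache(path=None):
    """Point jieba's default tokenizer at a user-writable cache directory."""
    # Imported here so ensure_user_cache_dir can be used without jieba installed.
    import jieba

    # jieba.dt is the default Tokenizer behind jieba.cut and friends.
    jieba.dt.tmp_dir = ensure_user_cache_dir(path)
    return jieba.dt
```

Call use_user_cache() once, before the first jieba.cut or jieba.initialize call, so the cache is built in the new directory rather than in /tmp. The directory is created up front because jieba writes a temporary file next to the cache before renaming it, so a missing directory would most likely cause the dump to fail again.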
References:
http://funhacks.net/2017/01/17/singleton/
https://blog.csdn.net/sijiaqi11/article/details/78601258