Python thread local

Reposted from the original article. 2017-02-20 21:28. Tags: python2.7, gevent

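Because of the GIL, I almost never use Python's multithreading in day-to-day development. When I need concurrency I usually reach for multiple processes, and for IO-bound work coroutines are also a good choice. Still, many Python networking libraries support multithreading, and most of them rely on threading.local. In Python, threading.local holds thread-specific data: each thread sees its own independent set of attributes on the object, and threads do not affect one another. Let's start with the simplest example: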

class Widgt(object):
    pass

import threading
def test():
    local_data = threading.local()
    # local_data = Widgt()
    local_data.x = 1

    def thread_func():
        print('Has x in new thread: %s' % hasattr(local_data, 'x'))
        local_data.x = 2

    t = threading.Thread(target=thread_func)
    t.start()
    t.join()
    print('x in pre thread is %s' % local_data.x)

if __name__ == '__main__':
    test()

Output:

Has x in new thread: False

x in pre thread is 1

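As the output shows, local_data has no x attribute in the new thread, and the assignment made in the new thread does not affect other threads. If you tweak the code by uncommenting line 7 (`# local_data = Widgt()`), local_data becomes an ordinary object shared by all threads: the new thread then sees x, and its assignment changes the value the original thread prints.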

How is local implemented? In Python 2.7, threading.py contains the following:

try:
    from thread import _local as local
except ImportError:
    from _threading_local import local

So local is a built-in class (thread._local), and Python also ships a pure-Python reference implementation in _threading_local.py. Here is that code (abridged; several functions are omitted):

# Abridged from Python 2.7's _threading_local.py. In the real module,
# current_thread and RLock are imported from threading at the end of the file.
class _localbase(object):
    __slots__ = '_local__key', '_local__args', '_local__lock'

    def __new__(cls, *args, **kw):
        self = object.__new__(cls)
        # Build the key. It is derived from id(self), so every thread in the
        # process uses the same key for this instance.
        key = '_local__key', 'thread.local.' + str(id(self))
        object.__setattr__(self, '_local__key', key)
        object.__setattr__(self, '_local__args', (args, kw))
        object.__setattr__(self, '_local__lock', RLock())  # a reentrant lock

        if (args or kw) and (cls.__init__ is object.__init__):
            raise TypeError("Initialization arguments are not supported")

        # We need to create the thread dict in anticipation of
        # __init__ being called, to make sure we don't call it
        # again ourselves.
        dict = object.__getattribute__(self, '__dict__')
        # Store the dict under `key` in the __dict__ of current_thread(),
        # the Thread object that is unique to this thread.
        current_thread().__dict__[key] = dict

        return self

def _patch(self):
    key = object.__getattribute__(self, '_local__key')
    # Note: current_thread() returns a different object in each thread.
    d = current_thread().__dict__.get(key)
    if d is None:  # first access from a new thread
        d = {}     # a fresh, empty dict
        current_thread().__dict__[key] = d
        # Point the instance's __dict__ at this thread's own dict.
        object.__setattr__(self, '__dict__', d)

        # we have a new instance dict, so call out __init__ if we have
        # one
        cls = type(self)
        if cls.__init__ is not object.__init__:
            args, kw = object.__getattribute__(self, '_local__args')
            cls.__init__(self, *args, **kw)
    else:
        object.__setattr__(self, '__dict__', d)

class local(_localbase):

    def __getattribute__(self, name):
        lock = object.__getattribute__(self, '_local__lock')
        lock.acquire()
        try:
            # After this call, self.__dict__ is this thread's own dict.
            _patch(self)
            return object.__getattribute__(self, name)
        finally:
            lock.release()
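
One branch of `_patch` is easy to overlook: when a subclass defines `__init__`, it is called again, with the saved constructor arguments, the first time each new thread touches the instance. The following is a minimal sketch of that behavior using the standard `threading.local`; the `Counter` class, the `seen` list and the `worker` function are names made up for this illustration.

```python
import threading


class Counter(threading.local):
    """Hypothetical subclass: every thread gets its own `value`."""

    def __init__(self, start):
        # Runs once in the creating thread, then once more in each
        # additional thread on that thread's first access.
        self.value = start


counter = Counter(10)
counter.value += 1          # affects only the main thread's copy

seen = []                   # values observed inside the worker thread


def worker():
    seen.append(counter.value)   # first access here re-runs __init__(10)
    counter.value += 5           # affects only the worker's copy
    seen.append(counter.value)


t = threading.Thread(target=worker)
t.start()
t.join()

print('worker saw %s, main thread has %s' % (seen, counter.value))
```

The worker starts from the initial value rather than from the main thread's modified one, because its dict is created fresh and then populated by `__init__`.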

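Comments have been added to the code to make it easier to follow. In short, each thread gets its own dict (stored on current_thread(), an object unique to each thread), and every time an attribute of a local instance is read, set, or deleted, the instance's __dict__ is first swapped for the current thread's dict. Let's look at some test code (Python 2.7):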

import threading
from _threading_local import local
def test():
    local_data = local()
    local_data.x = 1
    print 'id of local_data', id(local_data)

    def thread_func():
        before_keys = threading.current_thread().__dict__.keys()
        local_data.x = 2
        after = threading.current_thread().__dict__
        # print set(after.keys()) - set(before_keys)
        print [(e, v) for (e, v) in after.iteritems() if e not in before_keys]

    t = threading.Thread(target=thread_func)
    t.start()
    t.join()
    print('x in pre thread is %s' % local_data.x)

if __name__ == '__main__':
    test()

Output:

　　id of local_data 40801456
　　[(('_local__key', 'thread.local.40801456'), {'x': 2})]

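The output shows that in this run the id of local_data is 40801456, and it is the same in every thread. Before local_data is accessed in the new thread (the thread_func function), the __dict__ of the object returned by current_thread() has no entry for the key ('_local__key', 'thread.local.40801456'). The access at line 10 (`local_data.x = 2`) adds that entry, inside the _patch function.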

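gevent also has a class named local, which provides data that is independent per coroutine (greenlet). PS: gevent offers data structures that closely mirror Python's native threading primitives, such as Event, Semaphore and local, and gevent's own code and documentation also call greenlets "threads", which is worth keeping in mind. The implementation of gevent.local borrows from the _threading_local.py described above. The difference is that _threading_local.local stores per-thread data on current_thread(), whereas gevent.local stores per-coroutine data on greenlet.getcurrent().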
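
To make that difference concrete, here is a minimal sketch that needs no gevent installation. It is not gevent's actual implementation: `ContextLocal`, `Task`, `running` and `current_task` are hypothetical names invented for this illustration, and `Task` merely stands in for a greenlet. The only thing that changes between the two instances is the function used to find the "current" owner of the data.

```python
import threading


class ContextLocal(object):
    """Attribute storage scoped to whatever object get_current() returns."""

    def __init__(self, get_current):
        # Bypass our own __setattr__ for the two bookkeeping fields.
        object.__setattr__(self, '_get_current', get_current)
        object.__setattr__(self, '_key', 'context_local.%d' % id(self))

    def _store(self):
        # The per-context dict lives on the "current" object itself.
        owner = self._get_current()
        return owner.__dict__.setdefault(self._key, {})

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for user attributes.
        try:
            return self._store()[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._store()[name] = value


class Task(object):
    """Stand-in for a greenlet: a plain object that can hold attributes."""


running = []  # hypothetical scheduler state; running[-1] is the current task


def current_task():
    return running[-1]


thread_scoped = ContextLocal(threading.current_thread)
task_scoped = ContextLocal(current_task)

task_a, task_b = Task(), Task()

running.append(task_a)            # task_a is now "current"
thread_scoped.x = 'set in task_a'
task_scoped.x = 'set in task_a'

running.append(task_b)            # switch tasks inside the same OS thread
seen_thread_scoped = getattr(thread_scoped, 'x', None)
seen_task_scoped = getattr(task_scoped, 'x', None)

print('thread-scoped: %s' % seen_thread_scoped)
print('task-scoped: %s' % seen_task_scoped)
```

After the switch to `task_b`, the thread-scoped value is still visible because both tasks run in one OS thread, while the task-scoped value is not. That leak between coroutines is what a per-greenlet local avoids.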

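Finally, if the code calls gevent.monkey.patch_all(), Python's native threading.local is replaced by gevent.local.local. When I was reading bottle's code, I noticed that it uses threading.local throughout. At the time I did not know exactly which modules monkey patching covers, so I wondered whether bottle would break under gevent. I tested for a long time without finding any problem, and only on rereading the bottle source did I find the reason. The code is as follows: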

# Excerpt from bottle.py; ServerAdapter, threading and os come from that module.
class GeventServer(ServerAdapter):
    """ Untested. Options:

        * See gevent.wsgi.WSGIServer() documentation for more options.
    """

    def run(self, handler):
        from gevent import pywsgi, local
        if not isinstance(threading.local(), local.local):  # note this check
            msg = "Bottle requires gevent.monkey.patch_all() (before import)"
            raise RuntimeError(msg)
        if self.quiet:
            self.options['log'] = None
        address = (self.host, self.port)
        server = pywsgi.WSGIServer(address, handler, **self.options)
        if 'BOTTLE_CHILD' in os.environ:
            import signal
            signal.signal(signal.SIGINT, lambda s, f: server.stop())
        server.serve_forever()

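This small episode also reflects the strengths and weaknesses of monkey patching. The strength is that runtime behavior can be changed, and performance improved, without modifying the source code. The weakness is that, for people who lack experience or do not know the details of the patch, it creates a gap between what the static code says and what actually runs.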
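
That gap can be reproduced without gevent. The sketch below swaps `threading.local` for a stand-in class, which is the basic mechanism of a monkey patch, and then repeats the style of check bottle performs. `PatchedLocal` is a hypothetical placeholder for gevent.local.local, and the other variable names are also made up for this illustration. It also suggests why bottle's error message says "before import": a reference taken before the patch keeps pointing at the original class.

```python
import threading


class PatchedLocal(object):
    """Hypothetical stand-in for gevent.local.local."""


# A module that ran `from threading import local` before the patch
# holds its own reference to the original class.
early_reference = threading.local

original = threading.local
threading.local = PatchedLocal      # the essence of a monkey patch
try:
    # Same style of check as in bottle's GeventServer.run().
    lookup_sees_patch = isinstance(threading.local(), PatchedLocal)
    # The early reference was bound before the patch and is unchanged.
    early_sees_patch = early_reference is PatchedLocal
finally:
    threading.local = original      # restore so the demo has no side effects

print('lookup sees patch: %s' % lookup_sees_patch)
print('early reference sees patch: %s' % early_sees_patch)
```

Code that looks up `threading.local` at call time picks up the replacement, while the early reference does not, even though the source text of both call sites looks identical.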

References:

bottle.py source code

gevent tutorial
