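How does collections.deque work?
Preface: In the Python ecosystem we often use collections.deque to implement stacks and queues, data structures that only need operations at the head and the tail. Its append/pop operations at either end are O(1). A list's pop(0) is O(n), so in this scenario list is less efficient than deque. So how is deque implemented internally? I dug up the source of the second commit of CPython's collections module on GitHub.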
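As a quick reminder of the usage pattern described above, here is a minimal sketch of deque used as a stack and as a queue, next to the list call it replaces. The variable names are only illustrative.

```python
from collections import deque

# Stack: push and pop at the same (right) end.
stack = deque()
stack.append(1)
stack.append(2)
top = stack.pop()            # removes the most recently appended item

# Queue: push on the right, pop from the left.
queue = deque()
queue.append("a")
queue.append("b")
first = queue.popleft()      # removes the oldest item

# The list equivalent of popleft() is pop(0), which has to shift
# every remaining element one slot to the left.
items = ["a", "b"]
first_from_list = items.pop(0)
```

Both `queue.popleft()` and `items.pop(0)` return the same element; the difference is only in how much work each one does as the container grows.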
The dequeobject definition
The comments are written so elegantly that I can't summarize them any more concisely.
    /* The block length may be set to any number over 1.  Larger numbers
     * reduce the number of calls to the memory allocator but take more
     * memory.  Ideally, BLOCKLEN should be set with an eye to the
     * length of a cache line.
     */

    #define BLOCKLEN 62
    #define CENTER ((BLOCKLEN - 1) / 2)

    /* A `dequeobject` is composed of a doubly-linked list of `block` nodes.
     * This list is not circular (the leftmost block has leftlink==NULL,
     * and the rightmost block has rightlink==NULL).  A deque d's first
     * element is at d.leftblock[leftindex] and its last element is at
     * d.rightblock[rightindex]; note that, unlike as for Python slice
     * indices, these indices are inclusive on both ends.  By being inclusive
     * on both ends, algorithms for left and right operations become
     * symmetrical which simplifies the design.
     *
     * The list of blocks is never empty, so d.leftblock and d.rightblock
     * are never equal to NULL.
     *
     * The indices, d.leftindex and d.rightindex are always in the range
     *     0 <= index < BLOCKLEN.
     * Their exact relationship is:
     *     (d.leftindex + d.len - 1) % BLOCKLEN == d.rightindex.
     *
     * Empty deques have d.len == 0; d.leftblock==d.rightblock;
     * d.leftindex == CENTER+1; and d.rightindex == CENTER.
     * Checking for d.len == 0 is the intended way to see whether d is empty.
     *
     * Whenever d.leftblock == d.rightblock,
     *     d.leftindex + d.len - 1 == d.rightindex.
     *
     * However, when d.leftblock != d.rightblock, d.leftindex and d.rightindex
     * become indices into distinct blocks and either may be larger than the
     * other.
     */

    typedef struct BLOCK {
        struct BLOCK *leftlink;
        struct BLOCK *rightlink;
        PyObject *data[BLOCKLEN];
    } block;

    typedef struct {
        PyObject_HEAD
        block *leftblock;
        block *rightblock;
        int leftindex;          /* in range(BLOCKLEN) */
        int rightindex;         /* in range(BLOCKLEN) */
        int len;
        long state;             /* incremented whenever the indices move */
        PyObject *weakreflist;  /* List of weak references */
    } dequeobject;
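To make the index relationship in the comment concrete, here is a small arithmetic sketch in Python. The helper name `expected_rightindex` is made up for illustration. It applies the formula `(leftindex + len - 1) % BLOCKLEN` to a deque that has only ever been appended to on the right, so leftindex stays at CENTER+1.

```python
BLOCKLEN = 62
CENTER = (BLOCKLEN - 1) // 2   # integer division, as in the C macro


def expected_rightindex(leftindex: int, length: int) -> int:
    """Right index implied by the invariant quoted in the comment."""
    return (leftindex + length - 1) % BLOCKLEN


# Empty deque: leftindex == CENTER + 1, len == 0.
empty = expected_rightindex(CENTER + 1, 0)

# 31 right-appends fill slots 31..61 of the first block.
one_block = expected_rightindex(CENTER + 1, 31)

# The 32nd item no longer fits and lands in slot 0 of a new block.
spilled = expected_rightindex(CENTER + 1, 32)

print(CENTER, empty, one_block, spilled)
```

With BLOCKLEN = 62, the empty case gives CENTER, which matches the "empty deques" rule stated in the comment.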
Below is a diagram I drew of the block struct.
               +----------------------------------+
               |         data: 62 objects         |
+----------+   |                                  |   +-----------+
| leftlink |---|  ... | Obj1 | Obj2 | Obj3 | ...  |---| rightlink |
+----------+   |         30     31     32         |   +-----------+
               +----------------------------------+
Creating a block
    static block *
    newblock(block *leftlink, block *rightlink, int len)
    {
        block *b;
        /* To prevent len from overflowing INT_MAX on 64-bit machines, we
         * refuse to allocate new blocks if the current len is dangerously
         * close.  There is some extra margin to prevent spurious arithmetic
         * overflows at various places.  The following check ensures that
         * the blocks allocated to the deque, in the worst case, can only
         * have INT_MAX-2 entries in total.
         */
        if (len >= INT_MAX - 2*BLOCKLEN) {
            PyErr_SetString(PyExc_OverflowError,
                            "cannot add more blocks to the deque");
            return NULL;
        }
        b = PyMem_Malloc(sizeof(block));
        if (b == NULL) {
            PyErr_NoMemory();
            return NULL;
        }
        b->leftlink = leftlink;
        b->rightlink = rightlink;
        return b;
    }
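Creating a dequeobject
- Instantiate a dequeobject Python object via type->tp_alloc (I don't fully understand the internals of this part yet)
- Create a block
- Point both the leftblock and rightblock pointers at this block
- Set leftindex to CENTER+1 and rightindex to CENTER
- Initialize the remaining attributes, such as len and state
The second and fourth steps are interesting. The second step creates a block, which means a deque preallocates a chunk of memory as soon as it is created. The fourth step hints that incoming elements are first placed in the middle of the block and then gradually spread out toward the head and the tail.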
    static PyObject *
    deque_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
    {
        dequeobject *deque;
        block *b;

        if (type == &deque_type && !_PyArg_NoKeywords("deque()", kwds))
            return NULL;

        /* create dequeobject structure */
        deque = (dequeobject *)type->tp_alloc(type, 0);
        if (deque == NULL)
            return NULL;

        b = newblock(NULL, NULL, 0);
        if (b == NULL) {
            Py_DECREF(deque);
            return NULL;
        }

        assert(BLOCKLEN >= 2);
        deque->leftblock = b;
        deque->rightblock = b;
        deque->leftindex = CENTER + 1;
        deque->rightindex = CENTER;
        deque->len = 0;
        deque->state = 0;
        deque->weakreflist = NULL;

        return (PyObject *)deque;
    }
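One rough way to see this preallocation from Python is to compare the reported size of an empty deque with that of an empty list. This sketch assumes a CPython build in which deque's `__sizeof__` counts its allocated blocks. As far as I know that holds for recent CPython releases, but the exact numbers differ between versions and other interpreters may behave differently.

```python
import sys
from collections import deque

# An empty deque already owns one block of slots; an empty list owns none.
empty_deque_size = sys.getsizeof(deque())
empty_list_size = sys.getsizeof([])

print("empty deque:", empty_deque_size, "bytes")
print("empty list: ", empty_list_size, "bytes")
```

On the CPython builds I am aware of, the empty deque is several hundred bytes larger, which is consistent with a whole block being allocated up front.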
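The deque.append implementation
Steps:
- If rightblock can hold more elements (rightindex has not reached BLOCKLEN-1), put the item in rightblock
- If it cannot, create a new block, update a few pointers, and put the item in the new rightblock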
    static PyObject *
    deque_append(dequeobject *deque, PyObject *item)
    {
        deque->state++;
        if (deque->rightindex == BLOCKLEN-1) {
            block *b = newblock(deque->rightblock, NULL, deque->len);
            if (b == NULL)
                return NULL;
            assert(deque->rightblock->rightlink == NULL);
            deque->rightblock->rightlink = b;
            deque->rightblock = b;
            deque->rightindex = -1;
        }
        Py_INCREF(item);
        deque->len++;
        deque->rightindex++;
        deque->rightblock->data[deque->rightindex] = item;
        Py_RETURN_NONE;
    }
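To see how often append actually has to allocate, the index bookkeeping of deque_append can be replayed in Python without storing any items. This is only a simulation of the counters under the constants quoted above. The function name `blocks_after_appends` is hypothetical.

```python
BLOCKLEN = 62
CENTER = (BLOCKLEN - 1) // 2


def blocks_after_appends(n: int) -> int:
    """Replay rightindex for n right-appends on a fresh deque and
    return how many blocks would be allocated in total."""
    blocks = 1                  # deque_new already created one block
    rightindex = CENTER         # state of an empty deque
    for _ in range(n):
        if rightindex == BLOCKLEN - 1:
            blocks += 1         # corresponds to the newblock() call
            rightindex = -1
        rightindex += 1
    return blocks


for n in (0, 31, 32, 93, 94):
    print(n, blocks_after_appends(n))
```

Because a fresh deque starts in the middle, the first block only takes 31 right-appends (slots 31 to 61). After that a new block is needed once every 62 appends, which is why the allocator cost is small when averaged over many appends.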
Having read the append implementation, we can work out for ourselves how pop and popleft are implemented.
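As one way of doing that exercise, here is a toy Python model of the structure. The class names `Block` and `BlockDeque` are hypothetical. `append` mirrors the C code quoted in this article, while `pop` and `popleft` are my own guess derived by symmetry, not a transcription of CPython. In particular, re-centering the indices when the deque empties at a block boundary is an assumption.

```python
BLOCKLEN = 62
CENTER = (BLOCKLEN - 1) // 2


class Block:
    def __init__(self, leftlink=None, rightlink=None):
        self.leftlink = leftlink
        self.rightlink = rightlink
        self.data = [None] * BLOCKLEN


class BlockDeque:
    def __init__(self):
        b = Block()
        self.leftblock = self.rightblock = b
        self.leftindex = CENTER + 1
        self.rightindex = CENTER
        self.len = 0

    def append(self, item):
        if self.rightindex == BLOCKLEN - 1:       # right block is full
            b = Block(leftlink=self.rightblock)
            self.rightblock.rightlink = b
            self.rightblock = b
            self.rightindex = -1
        self.len += 1
        self.rightindex += 1
        self.rightblock.data[self.rightindex] = item

    def pop(self):
        if self.len == 0:
            raise IndexError("pop from an empty deque")
        item = self.rightblock.data[self.rightindex]
        self.rightblock.data[self.rightindex] = None
        self.rightindex -= 1
        self.len -= 1
        if self.rightindex == -1:                 # right block is now empty
            if self.len == 0:                     # assumed: re-center, keep the block
                self.leftindex, self.rightindex = CENTER + 1, CENTER
            else:                                 # drop the empty right block
                prev = self.rightblock.leftlink
                prev.rightlink = None
                self.rightblock = prev
                self.rightindex = BLOCKLEN - 1
        return item

    def popleft(self):
        if self.len == 0:
            raise IndexError("pop from an empty deque")
        item = self.leftblock.data[self.leftindex]
        self.leftblock.data[self.leftindex] = None
        self.leftindex += 1
        self.len -= 1
        if self.leftindex == BLOCKLEN:            # left block is now empty
            if self.len == 0:                     # assumed: re-center, keep the block
                self.leftindex, self.rightindex = CENTER + 1, CENTER
            else:                                 # drop the empty left block
                nxt = self.leftblock.rightlink
                nxt.leftlink = None
                self.leftblock = nxt
                self.leftindex = 0
        return item


d = BlockDeque()
for i in range(100):
    d.append(i)
print(d.popleft(), d.pop(), d.len)
```

Each operation touches only one slot and, at most, one neighbouring block, which is where the O(1) cost at both ends comes from.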
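Summary
Internally, deque organizes a set of memory blocks into a doubly-linked list. Each block can be viewed as an array of Python objects. Unlike an ordinary array, this one is filled from the middle outward toward both ends, whereas the arrays we usually see are filled from the front to the back. Thanks to this structure, deque's four operations pop/popleft/append/appendleft are all O(1), so it is convenient and efficient for implementing queues and stacks. But precisely because of this design, it cannot access or remove an element by index in constant time the way an array can: d[i] is supported, but it is O(1) only at the two ends and slows to O(n) in the middle. Designs that combine a linked list with arrays, or a linked list with a dict, are widely used in practice, for example in LRUCache and LFUCache. Interested readers can explore further.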
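For readers who want a starting point on the linked list + dict idea, here is a minimal LRU cache sketch. It leans on `collections.OrderedDict`, which keeps its own ordering internally, instead of hand-writing the linked list. The class and method names are illustrative, and this is only one of several reasonable ways to build such a cache.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache with a fixed capacity."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)          # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # evict the least recently used


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # touching "a" leaves "b" as the oldest entry
cache.put("c", 3)     # over capacity, so "b" is evicted
```

The dict gives O(1) lookup by key and the ordering gives O(1) access to the oldest entry, the same division of labour that deque gets from its linked list of arrays.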
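- PS1: LRUCache comes up in interviews all the time
- PS2: Interviewers who ask LFUCache questions are sadists
- PS3: The header image is from Quora; it belongs to the "image barely related to the text" series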