memcached source code analysis series: the memory storage mechanism (part 3)

This article is a repost. Posted 2012-05-21 16:09, tag: memcached

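The first two articles in this series on memcached's memory storage mechanism covered how the memory manager is initialized and how slabs are managed and allocated. This article looks at how item objects are allocated and managed, and at the LRU mechanism.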

1 Key item data structures

(1) The item struct definition

typedef struct _stritem {

    struct _stritem *next;

    struct _stritem *prev;

    struct _stritem *h_next;    /* hash chain next */

    rel_time_t      time;       /* least recent access */

    rel_time_t      exptime;    /* expire time */

    int             nbytes;     /* size of data */

    unsigned short  refcount;

    uint8_t         nsuffix;    /* length of flags-and-length string */

    uint8_t         it_flags;   /* ITEM_* above */

    uint8_t         slabs_clsid;/* which slab class we're in */

    uint8_t         nkey;       /* key length, w/terminating null and padding */

    /* this odd type prevents type-punning issues when we do

     * the little shuffle to save space when not using CAS. */

    union {

        uint64_t cas;

        char end;

    } data[];

    /* if it_flags & ITEM_CAS we have 8 bytes CAS */

    /* then null-terminated key */

    /* then " flags length\r\n" (no terminating null) */

    /* then data with terminating \r\n (no terminating null; it's binary!) */

} item;
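The trailing comments in the struct describe what is stored after the fixed header: an optional 8-byte CAS value, the null-terminated key, the " flags length\r\n" suffix, and finally the data. The sketch below works out those offsets for a simplified header. All mini_* names and the MINI_ITEM_CAS value are hypothetical and are not memcached's own ITEM_* macros. It also assumes nkey counts the key bytes without the terminating null, which is what do_item_alloc() suggests by passing nkey + 1 to item_make_header().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MINI_ITEM_CAS 2  /* hypothetical flag value standing in for ITEM_CAS */

/* Simplified stand-in for the item header; only the fields needed to
 * compute the layout are kept. */
typedef struct mini_item {
    int     nbytes;    /* size of data, including the trailing \r\n */
    uint8_t nsuffix;   /* length of the " flags length\r\n" string */
    uint8_t it_flags;  /* MINI_ITEM_CAS if a CAS value is stored */
    uint8_t nkey;      /* key length, assumed to exclude the terminating null */
    union {
        uint64_t cas;
        char end;
    } data[];          /* flexible array member: the payload starts here */
} mini_item;

/* Bytes taken by the optional CAS value at the start of data[]. */
static size_t mini_cas_size(const mini_item *it) {
    return (it->it_flags & MINI_ITEM_CAS) ? sizeof(uint64_t) : 0;
}

/* Offset of the key, measured from the start of data[]. */
static size_t mini_key_offset(const mini_item *it) {
    return mini_cas_size(it);
}

/* Offset of the suffix: after the key and its terminating null. */
static size_t mini_suffix_offset(const mini_item *it) {
    return mini_key_offset(it) + it->nkey + 1;
}

/* Offset of the value bytes: right after the suffix. */
static size_t mini_data_offset(const mini_item *it) {
    return mini_suffix_offset(it) + it->nsuffix;
}

/* Total bytes needed for the header plus everything stored after it. */
static size_t mini_total_size(const mini_item *it) {
    assert(it->nbytes >= 0);
    return sizeof(mini_item) + mini_data_offset(it) + (size_t)it->nbytes;
}
```

For a 3-byte key, a 6-byte suffix and 7 bytes of data with CAS enabled, the key starts at offset 8, the suffix at 12 and the data at 18. Because the payload sits directly behind the header, one slab chunk can hold the whole item without a second allocation.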

(2) Global arrays

static item *heads[LARGEST_ID];

Holds the head of the item list for each slab class.

static item *tails[LARGEST_ID];

Holds the tail of the item list for each slab class.

static unsigned int sizes[LARGEST_ID];

Holds the number of items in the list of each slab class.
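To make the role of the three arrays concrete, here is a minimal sketch of linking an item at the head of its slab class list and unlinking it again. The demo_* names and DEMO_LARGEST_ID are hypothetical stand-ins. The sketch assumes new items go to the head, so the tail always holds the oldest item.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_LARGEST_ID 4   /* hypothetical; stands in for LARGEST_ID */

typedef struct demo_item {
    struct demo_item *next;      /* towards the tail (older items) */
    struct demo_item *prev;      /* towards the head (newer items) */
    unsigned char slabs_clsid;   /* which slab class the item belongs to */
} demo_item;

static demo_item *demo_heads[DEMO_LARGEST_ID];
static demo_item *demo_tails[DEMO_LARGEST_ID];
static unsigned int demo_sizes[DEMO_LARGEST_ID];

/* Insert at the head: the item becomes the newest of its slab class. */
static void demo_link(demo_item *it) {
    assert(it->slabs_clsid < DEMO_LARGEST_ID);
    demo_item **head = &demo_heads[it->slabs_clsid];
    demo_item **tail = &demo_tails[it->slabs_clsid];
    it->prev = NULL;
    it->next = *head;
    if (it->next) it->next->prev = it;
    *head = it;
    if (*tail == NULL) *tail = it;   /* first item is also the tail */
    demo_sizes[it->slabs_clsid]++;
}

/* Remove an item from anywhere in the list and fix up head and tail. */
static void demo_unlink(demo_item *it) {
    assert(it->slabs_clsid < DEMO_LARGEST_ID);
    demo_item **head = &demo_heads[it->slabs_clsid];
    demo_item **tail = &demo_tails[it->slabs_clsid];
    if (*head == it) *head = it->next;
    if (*tail == it) *tail = it->prev;
    if (it->next) it->next->prev = it->prev;
    if (it->prev) it->prev->next = it->next;
    it->next = it->prev = NULL;
    demo_sizes[it->slabs_clsid]--;
}
```

With this arrangement, a scan for reclaimable items can start at demo_tails[id] and follow prev pointers, visiting the oldest items first.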

2 Function implementation of the item allocation mechanism

(1) The LRU mechanism

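As noted earlier in this series, memcached does not release memory once it has been allocated. After a record expires, clients can no longer see it (it becomes invisible), and its storage can be reused. memcached uses lazy expiration: it does not monitor records for expiry internally, but checks a record's timestamp when the record is fetched with get. As a result, memcached spends no CPU time watching for expired records.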
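Lazy expiration can be sketched as a check made at lookup time. The lazy_* names and demo_time_t are hypothetical. The comparison mirrors the one used in the do_item_alloc() excerpt (exptime != 0 && exptime < current_time), and the exact test in memcached's get path may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int demo_time_t;   /* stands in for rel_time_t */

typedef struct lazy_item {
    demo_time_t exptime;   /* 0 means "never expires" */
    const char *value;
} lazy_item;

/* True if the item carries an expiry time that has already passed. */
static bool lazy_is_expired(const lazy_item *it, demo_time_t now) {
    return it->exptime != 0 && it->exptime < now;
}

/* Lookup-time check: an expired item is reported as a miss.
 * Nothing scans for expired items in the background. */
static const char *lazy_get(const lazy_item *it, demo_time_t now) {
    assert(it != NULL);
    if (lazy_is_expired(it, now)) {
        return NULL;   /* invisible to the client; its space can be reused */
    }
    return it->value;
}
```

The cost of expiry is paid only by requests that touch the item, which is why no CPU time goes into monitoring.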

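memcached prefers to reuse the space of expired records, but even so it can run out of space when a new record is added. In that case it allocates space using the Least Recently Used (LRU) mechanism, that is, it deletes the record that was least recently used.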

(2) Function implementation

Item allocation is implemented in do_item_alloc(), whose prototype is:

item *do_item_alloc(char *key, const size_t nkey, const int flags, const rel_time_t exptime, const int nbytes);

Parameters:

* key - the key

* nkey - the length of the key

* flags - key flags

* exptime - item expiration time

* nbytes - number of bytes to hold the value and the additional CRLF terminator

The implementation follows. Because do_item_alloc() is long, only the key parts are shown here:

item *do_item_alloc(char *key, const size_t nkey, const int flags, const rel_time_t exptime, const int nbytes) {

    uint8_t nsuffix;

    item *it = NULL;

    char suffix[40];

    size_t ntotal = item_make_header(nkey + 1, flags, nbytes, suffix, &nsuffix);

    // settings.use_cas: "cas" (check-and-set) is a store operation that checks for stale data before writing.

    if (settings.use_cas) {

        ntotal += sizeof(uint64_t);

    }

    unsigned int id = slabs_clsid(ntotal); // get the slab class index

    if (id == 0)

        return 0;

    /* do a quick check if we have any expired items in the tail.. */

    int tries = 50;

    item *search;

    // scan the item list from the tail, looking for expired items

    for (search = tails[id];

         tries > 0 && search != NULL;

         tries--, search=search->prev) {

        if (search->refcount == 0 &&

            (search->exptime != 0 && search->exptime < current_time)) {

           …….

        }

    }

    // no expired item found: try slabs_alloc(), and fall back to LRU eviction of old items if it fails

    if (it == NULL && (it = slabs_alloc(ntotal, id)) == NULL) {

        /*

        ** Could not find an expired item at the tail, and memory allocation

        ** failed. Try to evict some items!

        */

        tries = 50;

        /* If requested to not push old items out of cache when memory runs out,

         * we're out of luck at this point...

         */

        // Whether to evict old items when memory is full. Enabled by default; the -M option disables it, in which case inserts fail once memory is exhausted.

        　　……

        it = slabs_alloc(ntotal, id); // retry the allocation after the eviction attempt

        // allocation still failed: make a last-ditch effort

        if (it == 0) {

            itemstats[id].outofmemory++;

            /* Last ditch effort. There is a very rare bug which causes

             * refcount leaks. We've fixed most of them, but it still happens,

             * and it may happen in the future.

             * We can reasonably assume no item can stay locked for more than

             * three hours, so if we find one in the tail which is that old,

             * free it anyway.

             */

            tries = 50;

            for (search = tails[id]; tries > 0 && search != NULL; tries--, search=search->prev) {

                // search->time: time of the most recent access

                if (search->refcount != 0 && search->time + TAIL_REPAIR_TIME < current_time) {

　　　　　　　　　　　　　　　　　　　　……

            }

            it = slabs_alloc(ntotal, id);

            if (it == 0) {

                return NULL;

            }

        }

    }

　　　　…….

    it->next = it->prev = it->h_next = 0;

    it->refcount = 1;     /* the caller will have a reference */

    DEBUG_REFCNT(it, '*');

    it->it_flags = settings.use_cas ? ITEM_CAS : 0;

    it->nkey = nkey;

    it->nbytes = nbytes;

    // zero-length (flexible) array: the key is copied into the space that follows the header

    memcpy(ITEM_key(it), key, nkey);

    it->exptime = exptime;

    memcpy(ITEM_suffix(it), suffix, (size_t)nsuffix);

    it->nsuffix = nsuffix;

    return it;

}

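The function first calls item_make_header() to compute the total length of the item. If CAS is enabled (settings.use_cas), it adds sizeof(uint64_t) to that length. The total is then passed to slabs_clsid(), which returns the index of the slab class that fits it. Next, the function walks the item list from tail to head; the global arrays heads[LARGEST_ID] and tails[LARGEST_ID] hold the list head and tail for each slab class id.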

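The source contains three scan loops, each bounded at 50 iterations by the tries counter. The first loop looks for expired items: if an item has an expiry time set, that time has passed, and nothing holds a reference to it (refcount == 0), the item is reclaimed by calling do_item_unlink() to remove it from the list.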

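If no expired item is found within 50 steps, slabs_alloc() is called to allocate memory. If that fails, the function scans from the tail again for an item that nobody is using (refcount == 0), calls do_item_unlink() on it, and retries slabs_alloc(). If that also fails, the only remaining option is to remove items that are still referenced but whose last access time is earlier than current_time - TAIL_REPAIR_TIME; this scan also runs at most 50 steps from the tail. One final slabs_alloc() follows, and if it fails too, the function gives up and the allocation fails.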
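The fallback order described above can be tried out on a toy model. This is only a sketch under simplifying assumptions: the LRU list is an array with slots[0] as the tail, the slab allocator is reduced to a counter of free chunks, the -M (no eviction) switch is left out, and every toy_* name is hypothetical. TOY_REPAIR_TIME is set to three hours because the source comment mentions that figure; the real TAIL_REPAIR_TIME value is not shown in the excerpt.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TRIES 50
#define TOY_REPAIR_TIME (3 * 3600)   /* assumed value, see the lead-in */

typedef struct toy_item {
    unsigned int time;        /* last access */
    unsigned int exptime;     /* 0 = never expires */
    unsigned short refcount;
    int linked;               /* 1 while the item is in the LRU list */
} toy_item;

typedef struct toy_cache {
    toy_item *slots;          /* slots[0] is the LRU tail */
    size_t n;
    int free_chunks;          /* chunks the allocator can still hand out */
} toy_cache;

typedef enum { TOY_EXPIRED, TOY_FRESH, TOY_EVICTED, TOY_REPAIRED, TOY_FAILED } toy_source;
typedef int (*toy_pred)(const toy_item *it, unsigned int now);

static int toy_expired(const toy_item *it, unsigned int now) {
    return it->refcount == 0 && it->exptime != 0 && it->exptime < now;
}
static int toy_unreferenced(const toy_item *it, unsigned int now) {
    (void)now;
    return it->refcount == 0;
}
static int toy_stuck(const toy_item *it, unsigned int now) {
    return it->refcount != 0 && it->time + TOY_REPAIR_TIME < now;
}

/* Look at up to TOY_TRIES linked items from the tail; return the first match. */
static toy_item *toy_scan(toy_cache *c, toy_pred match, unsigned int now) {
    int tries = TOY_TRIES;
    for (size_t i = 0; i < c->n && tries > 0; i++) {
        toy_item *it = &c->slots[i];
        if (!it->linked) continue;      /* already removed from the list */
        if (match(it, now)) return it;
        tries--;
    }
    return NULL;
}

/* Stand-in for slabs_alloc(): succeeds while free chunks remain. */
static int toy_slabs_alloc(toy_cache *c) {
    if (c->free_chunks <= 0) return 0;
    c->free_chunks--;
    return 1;
}

/* Unlink an item and hand its chunk back to the allocator. */
static void toy_release(toy_cache *c, toy_item *it) {
    it->linked = 0;
    it->refcount = 0;
    c->free_chunks++;
}

/* Returns which stage satisfied the request, or TOY_FAILED. */
static toy_source toy_item_alloc(toy_cache *c, unsigned int now) {
    assert(c != NULL);
    toy_item *it = toy_scan(c, toy_expired, now);
    if (it != NULL) {                           /* reuse an expired item */
        it->linked = 0;
        return TOY_EXPIRED;
    }
    if (toy_slabs_alloc(c)) return TOY_FRESH;   /* a free chunk exists */
    it = toy_scan(c, toy_unreferenced, now);
    if (it != NULL) toy_release(c, it);         /* LRU eviction */
    if (toy_slabs_alloc(c)) return TOY_EVICTED;
    it = toy_scan(c, toy_stuck, now);
    if (it != NULL) toy_release(c, it);         /* tail repair */
    if (toy_slabs_alloc(c)) return TOY_REPAIRED;
    return TOY_FAILED;
}
```

Calling toy_item_alloc() repeatedly on a small cache walks through the stages in order: expired items are reused first, then free chunks, then unreferenced items near the tail, and only at the very end items whose reference appears to have leaked.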

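memcached's memory management is elaborate and efficient. It greatly reduces the number of direct allocations from the system, which lowers call overhead and the likelihood of memory fragmentation. The scheme does waste some memory, but the author considers that waste negligible in large systems.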
