Netty Memory Pool and Cache-Hit Allocation

Reposted article. 2019-08-02 12:33. Netty

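Memory size classes in the pool:

From the earlier source walkthrough, the size classes should already look familiar. Netty's memory pool defines four of them: tiny for sizes below 512 bytes, small for 512 bytes up to (but not including) 8 KB, normal for 8 KB up to 16 MB, and huge for anything above 16 MB. Why these boundaries? Underneath, Netty wraps memory in its own units, and to manage memory efficiently and avoid waste it subdivides each range further. By default, Netty divides the size classes into these four parts.

All memory is requested from the system in units of a Chunk, 16 MB in size, and every later allocation is carved out of a Chunk. 8 KB corresponds to one Page: a Chunk is split into Pages, so a 16 MB Chunk holds 2048 Pages of 8 KB each. Requests smaller than 8 KB are served from a SubPage. For example, if a request needs only 1 KB and were given a whole Page, the other 7 KB would be wasted, so the Page is subdivided again to save space, as the figure below shows:

At this point the layout of Netty's pooled memory should be reasonably clear.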
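The boundaries above can be written down as a small sketch. The class and method names here are hypothetical and the thresholds are simply the ones quoted in the text (512 bytes, 8 KB page, 16 MB chunk); nothing is read from Netty at runtime.

```java
// Illustrative sketch only: thresholds are taken from the text, not from Netty's classes.
final class SizeClassSketch {
    static final int PAGE_SIZE = 8 * 1024;           // one Page: 8 KB
    static final int CHUNK_SIZE = 16 * 1024 * 1024;  // one Chunk: 16 MB

    /** Returns the size class name for a capacity in bytes. */
    static String classify(int capacity) {
        if (capacity < 512) {
            return "tiny";
        }
        if (capacity < PAGE_SIZE) {
            return "small";
        }
        if (capacity <= CHUNK_SIZE) {
            return "normal";
        }
        return "huge";
    }

    /** Number of Pages that fit in one Chunk. */
    static int pagesPerChunk() {
        return CHUNK_SIZE / PAGE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(classify(100));       // tiny
        System.out.println(classify(1024));      // small
        System.out.println(classify(8192));      // normal
        System.out.println(pagesPerChunk());     // 2048
    }
}
```

Note that in the real allocator the classification is applied to the normalized capacity, not to the raw request.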

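Allocating from the cache on a hit:

Earlier we walked through the rough flow of a directArena allocation: it first tries the cache and, on a miss, goes on to allocate a contiguous block of memory. We now look at the cache-hit logic. As mentioned, PoolThreadCache maintains three cache arrays (six in fact; we use the Direct ones as the example, and the Heap ones work the same way): tinySubPageDirectCaches, smallSubPageDirectCaches, and normalDirectCaches, for the tiny, small, and normal size classes respectively. They are member fields of PoolThreadCache and are initialized in its constructor: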

final class PoolThreadCache {

    final PoolArena<byte[]> heapArena;
    final PoolArena<ByteBuffer> directArena;
    static final int numTinySubpagePools = 512 >>> 4; // 32 (declared in PoolArena; shown here for reference)
    private final MemoryRegionCache<byte[]>[] tinySubPageHeapCaches;
    private final MemoryRegionCache<byte[]>[] smallSubPageHeapCaches;
    private final MemoryRegionCache<ByteBuffer>[] tinySubPageDirectCaches;
    private final MemoryRegionCache<ByteBuffer>[] smallSubPageDirectCaches;
    private final MemoryRegionCache<byte[]>[] normalHeapCaches;
    private final MemoryRegionCache<ByteBuffer>[] normalDirectCaches;
    // ...
    // The constructor arguments come from PooledByteBufAllocator's properties
    PoolThreadCache(PoolArena<byte[]> heapArena, PoolArena<ByteBuffer> directArena,
                    int tinyCacheSize, int smallCacheSize, int normalCacheSize,
                    int maxCachedBufferCapacity, int freeSweepAllocationThreshold) {
        // ...
        if (directArena != null) {
            tinySubPageDirectCaches = createSubPageCaches(
                    tinyCacheSize, PoolArena.numTinySubpagePools, SizeClass.Tiny);
            smallSubPageDirectCaches = createSubPageCaches(
                    smallCacheSize, directArena.numSmallSubpagePools, SizeClass.Small);

            numShiftsNormalDirect = log2(directArena.pageSize);
            normalDirectCaches = createNormalCaches(
                    normalCacheSize, maxCachedBufferCapacity, directArena);

            directArena.numThreadCaches.getAndIncrement();
        } else {
            // No directArea is configured so just null out all caches
            tinySubPageDirectCaches = null;
            smallSubPageDirectCaches = null;
            normalDirectCaches = null;
            numShiftsNormalDirect = -1;
        }
        if (heapArena != null) {
            // Create the caches for the heap allocations
            tinySubPageHeapCaches = createSubPageCaches(
                    tinyCacheSize, PoolArena.numTinySubpagePools, SizeClass.Tiny);
            smallSubPageHeapCaches = createSubPageCaches(
                    smallCacheSize, heapArena.numSmallSubpagePools, SizeClass.Small);

            numShiftsNormalHeap = log2(heapArena.pageSize);
            normalHeapCaches = createNormalCaches(
                    normalCacheSize, maxCachedBufferCapacity, heapArena);

            heapArena.numThreadCaches.getAndIncrement();
        } else {
            // No heapArea is configured so just null out all caches
            tinySubPageHeapCaches = null;
            smallSubPageHeapCaches = null;
            normalHeapCaches = null;
            numShiftsNormalHeap = -1;
        }

        // The thread-local cache will keep a list of pooled buffers which must be returned to
        // the pool when the thread is not alive anymore.
        ThreadDeathWatcher.watch(thread, freeTask);
    }
}

Taking the tiny class as the example, step into createSubPageCaches():

private static <T> MemoryRegionCache<T>[] createSubPageCaches(
        int cacheSize, int numCaches, SizeClass sizeClass) {
    if (cacheSize > 0) {
        @SuppressWarnings("unchecked")
        MemoryRegionCache<T>[] cache = new MemoryRegionCache[numCaches];
        for (int i = 0; i < cache.length; i++) {
            // TODO: maybe use cacheSize / cache.length
            cache[i] = new SubPageMemoryRegionCache<T>(cacheSize, sizeClass);
        }
        return cache;
    } else {
        return null;
    }
}

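As the code shows, this simply creates a cache array whose length is numCaches. The length differs by size class: 32 for tiny, 4 for small, and 3 for normal. Each slot of the array is a cache object that holds a queue, and the queue's capacity is set by the tinyCacheSize, smallCacheSize, and normalCacheSize properties of PooledByteBufAllocator. Within one cache object, every ByteBuf in the queue has the same fixed size: Netty divides each size class into a set of fixed sizes, each cache object's queue holds buffers of exactly one of those sizes, and the length of the cache array equals the number of sizes in the class.

For example, Netty divides the tiny class into 32 sizes, each a multiple of 16: 0 bytes, 16 bytes, 32 bytes, 48 bytes, 64 bytes, 80 bytes, 96 bytes ... 496 bytes. In the tinySubPageDirectCaches array, each of these sizes is the size of the ByteBufs held by one cache object. Take tinySubPageDirectCaches[1] (index 0 stands for the 0-byte size, effectively an empty cache, so we skip it): the ByteBufs cached there are 16 bytes long, those in tinySubPageDirectCaches[2] are 32 bytes long, and so on up to tinySubPageDirectCaches[31], which caches 496-byte ByteBufs. The sizes per class are as follows (you can verify them in a debugger):

tiny: 32 sizes, all multiples of 16: 0 bytes, 16 bytes, 32 bytes, 48 bytes, 64 bytes, 80 bytes, 96 bytes ... 496 bytes;
small: 4 sizes: 512 bytes, 1 KB, 2 KB, 4 KB;
normal: 3 sizes: 8 KB, 16 KB, 32 KB.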
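The three array lengths follow from the sizes listed above. The sketch below recomputes them. The formulas and the default values (8 KB page, 16 MB chunk, 32 KB maximum cached buffer capacity) are assumptions made for illustration, chosen to match the listed sizes, and the class and method names are hypothetical.

```java
// Illustrative sketch: derives the cache array lengths 32, 4 and 3 from assumed defaults.
final class CacheArrayLengths {
    /** Floor of log base 2 for a positive int. */
    static int log2(int val) {
        return 31 - Integer.numberOfLeadingZeros(val);
    }

    /** Tiny sizes are multiples of 16 below 512: 0, 16, ..., 496. */
    static int tinyCaches() {
        return 512 >>> 4;
    }

    /** Small sizes are the powers of two from 512 up to, but excluding, the page size. */
    static int smallCaches(int pageSize) {
        return log2(pageSize) - log2(512);
    }

    /** Normal sizes are the powers of two from the page size up to the cached maximum. */
    static int normalCaches(int pageSize, int chunkSize, int maxCachedBufferCapacity) {
        int max = Math.min(chunkSize, maxCachedBufferCapacity);
        return Math.max(1, log2(max / pageSize) + 1);
    }

    public static void main(String[] args) {
        int pageSize = 8 * 1024;            // assumed default
        int chunkSize = 16 * 1024 * 1024;   // assumed default
        int maxCached = 32 * 1024;          // assumed default
        System.out.println(tinyCaches());                                // 32
        System.out.println(smallCaches(pageSize));                       // 4
        System.out.println(normalCaches(pageSize, chunkSize, maxCached)); // 3
    }
}
```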

This gives the data structure of the cache arrays in PoolThreadCache, shown in the figure below:

With the structure of the cache arrays in mind, we continue with the logic for allocating from the cache. Back in PoolArena's allocate() method:

private void allocate(PoolThreadCache cache, PooledByteBuf<T> buf, final int reqCapacity) {
        // Normalize the requested size (for example, reqCapacity = 256)
        final int normCapacity = normalizeCapacity(reqCapacity);
        if (isTinyOrSmall(normCapacity)) { // capacity < pageSize
            int tableIdx;
            PoolSubpage<T>[] table;
            // Check whether the size is tiny
            boolean tiny = isTiny(normCapacity);
            if (tiny) { // < 512
                // Try to allocate from the cache first
                if (cache.allocateTiny(this, buf, reqCapacity, normCapacity)) {
                    // was able to allocate out of the cache so move on
                    return;
                }
                // Cache miss: use tinyIdx to get tableIdx
                tableIdx = tinyIdx(normCapacity);
                // The subpage array
                table = tinySubpagePools;
            } else {
                if (cache.allocateSmall(this, buf, reqCapacity, normCapacity)) {
                    // was able to allocate out of the cache so move on
                    return;
                }
                tableIdx = smallIdx(normCapacity);
                table = smallSubpagePools;
            }
            // Get the corresponding head node
            final PoolSubpage<T> head = table[tableIdx];
            synchronized (head) {
                final PoolSubpage<T> s = head.next;
                // By default, head.next is head itself
                if (s != head) {
                    assert s.doNotDestroy && s.elemSize == normCapacity;
                    long handle = s.allocate();
                    assert handle >= 0;
                    s.chunk.initBufWithSubpage(buf, handle, reqCapacity);

                    if (tiny) {
                        allocationsTiny.increment();
                    } else {
                        allocationsSmall.increment();
                    }
                    return;
                }
            }
            allocateNormal(buf, reqCapacity, normCapacity);
            return;
        }
        if (normCapacity <= chunkSize) {
            // First try to allocate from the cache
            if (cache.allocateNormal(this, buf, reqCapacity, normCapacity)) {
                // was able to allocate out of the cache so move on
                return;
            }
            // Cache miss: do the actual memory allocation
            allocateNormal(buf, reqCapacity, normCapacity);
        } else {
            // Larger than chunkSize
            // Huge allocations are never served via the cache so just call allocateHuge
            allocateHuge(buf, reqCapacity);
        }
    }

The first step is to normalize the requested size with normalizeCapacity(). Step into it:

int normalizeCapacity(int reqCapacity) {
        // e.g. reqCapacity = 256
        if (reqCapacity < 0) {
            throw new IllegalArgumentException("capacity: " + reqCapacity + " (expected: 0+)");
        }
        if (reqCapacity >= chunkSize) {
            return reqCapacity;
        }
        // Not tiny
        if (!isTiny(reqCapacity)) { // >= 512
            // Doubled
            // Find a power of two that is greater than or equal to reqCapacity
            int normalizedCapacity = reqCapacity;
            normalizedCapacity --;
            normalizedCapacity |= normalizedCapacity >>>  1;
            normalizedCapacity |= normalizedCapacity >>>  2;
            normalizedCapacity |= normalizedCapacity >>>  4;
            normalizedCapacity |= normalizedCapacity >>>  8;
            normalizedCapacity |= normalizedCapacity >>> 16;
            normalizedCapacity ++;
            if (normalizedCapacity < 0) {
                normalizedCapacity >>>= 1;
            }
            return normalizedCapacity;
        }
        // Quantum-spaced: already a multiple of 16
        if ((reqCapacity & 15) == 0) {
            return reqCapacity;
        }
        // Not a multiple of 16: round down to a multiple of 16, then add 16
        return (reqCapacity & ~15) + 16;
    }
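To see the rounding behavior of normalizeCapacity() on concrete values, the sketch below copies its logic into a standalone class. The class name is hypothetical and the 16 MB chunk size is an assumed default, hard-coded here so the example runs without Netty.

```java
// Standalone copy of the normalization logic, for experimenting with inputs.
final class NormalizeSketch {
    static final int CHUNK_SIZE = 16 * 1024 * 1024; // assumed default chunk size

    static boolean isTiny(int capacity) {
        return (capacity & 0xFFFFFE00) == 0; // true for 0..511
    }

    static int normalize(int reqCapacity) {
        if (reqCapacity < 0) {
            throw new IllegalArgumentException("capacity: " + reqCapacity + " (expected: 0+)");
        }
        if (reqCapacity >= CHUNK_SIZE) {
            return reqCapacity;
        }
        if (!isTiny(reqCapacity)) {
            // Smear the highest set bit of (reqCapacity - 1) to the right, then add 1:
            // the result is the next power of two >= reqCapacity.
            int n = reqCapacity - 1;
            n |= n >>> 1;
            n |= n >>> 2;
            n |= n >>> 4;
            n |= n >>> 8;
            n |= n >>> 16;
            n++;
            if (n < 0) {
                n >>>= 1;
            }
            return n;
        }
        if ((reqCapacity & 15) == 0) {
            return reqCapacity; // already a multiple of 16
        }
        return (reqCapacity & ~15) + 16; // next multiple of 16
    }

    public static void main(String[] args) {
        System.out.println(normalize(100));  // 112
        System.out.println(normalize(256));  // 256
        System.out.println(normalize(513));  // 1024
        System.out.println(normalize(5000)); // 8192
    }
}
```

A 100-byte request is therefore served from the 112-byte tiny size, and a 513-byte request from the 1 KB small size.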

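In normalizeCapacity(), if (!isTiny(reqCapacity)) covers requests of 512 bytes or more, that is, anything beyond the tiny class: the method finds a power of two that is greater than or equal to reqCapacity. For a tiny request, execution continues to if ((reqCapacity & 15) == 0), which returns the value unchanged when it is already a multiple of 16. Otherwise it returns (reqCapacity & ~15) + 16, the smallest multiple of 16 greater than the request. Normalization thus rounds every request to one of a fixed set of sizes, so all ByteBufs held by one cache object have the same capacity. Back in allocate(), if (isTinyOrSmall(normCapacity)) uses the normalized size to decide whether the request is tiny or small. Step into it: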

// capacity < pageSize
boolean isTinyOrSmall(int normCapacity) {
    return (normCapacity & subpageOverflowMask) == 0;
}

isTinyOrSmall() checks whether normCapacity is smaller than one page (8 KB); if so, the request is tiny or small. Continuing in allocate(): for a tiny or small size, isTiny(normCapacity) decides whether it is tiny. Step into it:

// normCapacity < 512
static boolean isTiny(int normCapacity) {
    return (normCapacity & 0xFFFFFE00) == 0;
}
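Both checks are "less than a power of two" comparisons done with a mask: a value is below 2^k exactly when none of its bits from position k upward are set. The sketch below tries this with the two masks. It assumes subpageOverflowMask is ~(pageSize - 1) with an 8 KB page, which the excerpt does not show, and the class name is hypothetical.

```java
// Illustrative sketch of the two mask checks, assuming an 8 KB page.
final class MaskSketch {
    static final int PAGE_SIZE = 8192;
    // Assumption: all bits at and above the page-size bit, i.e. 0xFFFFE000.
    static final int SUBPAGE_OVERFLOW_MASK = ~(PAGE_SIZE - 1);

    /** True when capacity < PAGE_SIZE (for non-negative capacity). */
    static boolean isTinyOrSmall(int capacity) {
        return (capacity & SUBPAGE_OVERFLOW_MASK) == 0;
    }

    /** True when capacity < 512; 0xFFFFFE00 is ~511. */
    static boolean isTiny(int capacity) {
        return (capacity & 0xFFFFFE00) == 0;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(SUBPAGE_OVERFLOW_MASK)); // ffffe000
        System.out.println(isTiny(496));          // true
        System.out.println(isTiny(512));          // false
        System.out.println(isTinyOrSmall(4096));  // true
        System.out.println(isTinyOrSmall(8192));  // false
    }
}
```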

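isTiny() treats any size below 512 as tiny. Back in allocate() again: for a tiny size, cache.allocateTiny(this, buf, reqCapacity, normCapacity) tries to allocate from the cache. Using tiny as the example, we now trace how a ByteBuf is allocated from the cache. allocateTiny() is the entry point; stepping in takes us to PoolThreadCache's allocateTiny() method: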

/**
 * Try to allocate a tiny buffer out of the cache. Returns {@code true} if successful {@code false} otherwise
 */
boolean allocateTiny(PoolArena<?> area, PooledByteBuf<?> buf, int reqCapacity, int normCapacity) {
    return allocate(cacheForTiny(area, normCapacity), buf, reqCapacity);
}

The call cacheForTiny(area, normCapacity) uses normCapacity to find one cache object in the tiny cache array. Step into cacheForTiny():

private MemoryRegionCache<?> cacheForTiny(PoolArena<?> area, int normCapacity) {
    int idx = PoolArena.tinyIdx(normCapacity);
    if (area.isDirect()) {
        return cache(tinySubPageDirectCaches, idx);
    }
    return cache(tinySubPageHeapCaches, idx);
}

PoolArena.tinyIdx(normCapacity) computes the index into the tiny cache array. Step into tinyIdx():

static int tinyIdx(int normCapacity) {
    return normCapacity >>> 4;
}

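This is simply normCapacity divided by 16. As shown earlier, every size in the tiny cache array is a multiple of 16, so the division yields the index directly (see figure 5-2): 16 bytes maps to the element at index 1, and 32 bytes to the element at index 2.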
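The mapping between tiny sizes and array indices can be checked with a few values. In this sketch tinyIdx() is copied from the excerpt, while sizeForIndex() is a hypothetical inverse added only for illustration.

```java
// Illustrative sketch of the size-to-index mapping for the tiny cache array.
final class TinyIdxSketch {
    /** Same computation as PoolArena.tinyIdx(): divide by 16. */
    static int tinyIdx(int normCapacity) {
        return normCapacity >>> 4;
    }

    /** Hypothetical inverse: the buffer size cached at a given index. */
    static int sizeForIndex(int idx) {
        return idx << 4;
    }

    public static void main(String[] args) {
        System.out.println(tinyIdx(16));       // 1
        System.out.println(tinyIdx(32));       // 2
        System.out.println(tinyIdx(496));      // 31
        System.out.println(sizeForIndex(31));  // 496
    }
}
```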

Back in cacheForTiny(), if (area.isDirect()) checks whether off-heap memory is being allocated. Our example uses off-heap memory, so this is true. Next, step into cache(tinySubPageDirectCaches, idx):

private static <T> MemoryRegionCache<T> cache(MemoryRegionCache<T>[] cache, int idx) {
    if (cache == null || idx > cache.length - 1) {
        return null;
    }
    return cache[idx];
}

Here the cache object is fetched from the array directly by index. Back in PoolThreadCache's allocateTiny(), the result is passed to allocate():

private boolean allocate(MemoryRegionCache<?> cache, PooledByteBuf buf, int reqCapacity) {
    if (cache == null) {
        // no cache found so just return false here
        return false;
    }
    boolean allocated = cache.allocate(buf, reqCapacity);
    if (++ allocations >= freeSweepAllocationThreshold) {
        allocations = 0;
        trim();
    }
    return allocated;
}

cache.allocate(buf, reqCapacity) carries on with the allocation. Stepping in again leads to allocate(PooledByteBuf<T> buf, int reqCapacity) in the inner class MemoryRegionCache:

public final boolean allocate(PooledByteBuf<T> buf, int reqCapacity) {
    Entry<T> entry = queue.poll();
    if (entry == null) {
        return false;
    }
    initBuf(entry.chunk, entry.handle, buf, reqCapacity);
    entry.recycle();

    // allocations is not thread-safe which is fine as this is only called from the same thread all time.
    ++ allocations;
    return true;
}

This method first pops an entry with queue.poll(). As analyzed in an earlier section, MemoryRegionCache maintains a queue, and each element of the queue is an entry. A quick look at the Entry class:

static final class Entry<T> {
      final Handle<Entry<?>> recyclerHandle;
      PoolChunk<T> chunk;
      long handle = -1;

      Entry(Handle<Entry<?>> recyclerHandle) {
          this.recyclerHandle = recyclerHandle;
      }

      void recycle() {
          chunk = null;
          handle = -1;
          recyclerHandle.recycle(this);
      }
}

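The fields to focus on are chunk and handle. chunk represents a contiguous block of memory; as mentioned before, Netty allocates memory in units of chunks, which we will analyze in detail later. handle works like a pointer that uniquely identifies one contiguous region inside the chunk, also covered in detail later. Together, chunk and handle pin down the contiguous region that backs a ByteBuf, and all reads and writes on that ByteBuf happen inside this region.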
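The idea of a per-size cache that hands back remembered (chunk, handle) pairs can be modeled in a few lines. This is a simplified, single-threaded sketch, not Netty's implementation: it swaps in an ArrayDeque for the queue, uses a String as a stand-in for the chunk, and the add() method and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Simplified model of one cache object: a bounded queue of (chunk, handle) entries.
final class RegionCacheSketch {
    /** Stand-in for an entry: which chunk, and which region inside it. */
    static final class Entry {
        final String chunkName; // hypothetical stand-in for a chunk reference
        final long handle;

        Entry(String chunkName, long handle) {
            this.chunkName = chunkName;
            this.handle = handle;
        }
    }

    private final Queue<Entry> queue = new ArrayDeque<>();
    private final int capacity;

    RegionCacheSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Remembers a freed region for reuse; returns false when the queue is full. */
    boolean add(String chunkName, long handle) {
        if (queue.size() >= capacity) {
            return false;
        }
        return queue.offer(new Entry(chunkName, handle));
    }

    /** Pops a cached region; null means a cache miss. */
    Entry allocate() {
        return queue.poll();
    }

    public static void main(String[] args) {
        RegionCacheSketch cache = new RegionCacheSketch(2);
        System.out.println(cache.allocate() == null); // true: nothing cached yet
        cache.add("chunk-0", 5L);
        Entry hit = cache.allocate();
        System.out.println(hit.chunkName + " / " + hit.handle); // chunk-0 / 5
    }
}
```

The first allocate() call misses because nothing has been returned to the cache yet; only after a region is added does a later request hit.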

After popping the entry, initBuf(entry.chunk, entry.handle, buf, reqCapacity) initializes the ByteBuf with the entry's chunk and handle. Since the cache objects in the tiny array were created as SubPageMemoryRegionCache, we step into SubPageMemoryRegionCache's initBuf(entry.chunk, entry.handle, buf, reqCapacity):

@Override
protected void initBuf(
       PoolChunk<T> chunk, long handle, PooledByteBuf<T> buf, int reqCapacity) {
    chunk.initBufWithSubpage(buf, handle, reqCapacity);
}
// Method in the PoolChunk class
void initBufWithSubpage(PooledByteBuf<T> buf, long handle, int reqCapacity) {
   initBufWithSubpage(buf, handle, bitmapIdx(handle), reqCapacity);
}

The code above calls bitmapIdx(); the logic behind bitmapIdx(handle) is covered in a later section. Continuing inward, here is initBufWithSubpage():

private void initBufWithSubpage(PooledByteBuf<T> buf, long handle, int bitmapIdx, int reqCapacity) {
    assert bitmapIdx != 0;

    int memoryMapIdx = memoryMapIdx(handle);

    PoolSubpage<T> subpage = subpages[subpageIdx(memoryMapIdx)];
    assert subpage.doNotDestroy;
    assert reqCapacity <= subpage.elemSize;

    buf.init(
        this, handle,
        runOffset(memoryMapIdx) + (bitmapIdx & 0x3FFFFFFF) * subpage.elemSize, reqCapacity, subpage.elemSize,
        arena.parent.threadCache());
}
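The offset expression runOffset(memoryMapIdx) + (bitmapIdx & 0x3FFFFFFF) * subpage.elemSize can be worked through with numbers. The sketch below rests on assumptions the excerpt does not show: that the handle keeps memoryMapIdx in its low 32 bits and a flagged bitmapIdx in its high 32 bits, that the chunk tree has depth 11 so leaf pages have ids 2048 to 4095, and that pages are 8 KB. All names are hypothetical.

```java
// Illustrative arithmetic only: the handle layout and tree depth are assumptions.
final class SubpageOffsetSketch {
    static final int PAGE_SIZE = 8192;
    static final int MAX_ORDER = 11; // assumed: leaf ids are 2048..4095

    /** Assumed packing: flag bit, then bitmapIdx in the high half, memoryMapIdx in the low half. */
    static long toHandle(int bitmapIdx, int memoryMapIdx) {
        return 0x4000000000000000L | (long) bitmapIdx << 32 | memoryMapIdx;
    }

    static int memoryMapIdx(long handle) {
        return (int) handle;
    }

    static int bitmapIdx(long handle) {
        return (int) (handle >>> 32); // still carries the flag bit, so never 0 for a subpage
    }

    /** Byte offset of a leaf page inside the chunk. */
    static int runOffset(int leafId) {
        return (leafId ^ (1 << MAX_ORDER)) * PAGE_SIZE;
    }

    static int offset(long handle, int elemSize) {
        return runOffset(memoryMapIdx(handle)) + (bitmapIdx(handle) & 0x3FFFFFFF) * elemSize;
    }

    public static void main(String[] args) {
        // Element 3 of the page with leaf id 2049, split into 16-byte elements.
        long handle = toHandle(3, 2049);
        System.out.println(offset(handle, 16)); // 8240 = 1 * 8192 + 3 * 16
    }
}
```

Under these assumptions the mask 0x3FFFFFFF strips the flag bit, leaving the element's position within the page.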

We focus on the buf.init() call first. Our example uses PooledUnsafeDirectByteBuf, so this is PooledUnsafeDirectByteBuf's init() method. Step into it:

void init(PoolChunk<ByteBuffer> chunk, long handle, int offset, int length, int maxLength,
          PoolThreadCache cache) {
    super.init(chunk, handle, offset, length, maxLength, cache);
    initMemoryAddress();
}

It first calls the superclass init() method. Step into that:

void init(PoolChunk<T> chunk, long handle, int offset, int length, int maxLength, PoolThreadCache cache) {
        // Initialization
        assert handle >= 0;
        assert chunk != null;
        // The block of memory this buffer was allocated from
        this.chunk = chunk;
        // The contiguous region within that block
        this.handle = handle;
        memory = chunk.memory;
        this.offset = offset;
        this.length = length;
        this.maxLength = maxLength;
        tmpNioBuf = null;
        this.cache = cache;
    }

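The superclass init() initializes the fields of PooledUnsafeDirectByteBuf. this.chunk = chunk records which block of memory the ByteBuf was allocated from, and this.handle = handle records which contiguous region of that block it occupies. offset and length are analyzed later; for now it is enough to know that when a ByteBuf is allocated from the cache, a chunk and a handle are all that is needed to identify its memory. That is the whole process of allocating a ByteBuf from the cache. Now back to MemoryRegionCache's allocate(PooledByteBuf<T> buf, int reqCapacity):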

public final boolean allocate(PooledByteBuf<T> buf, int reqCapacity) {
    Entry<T> entry = queue.poll();
    if (entry == null) {
        return false;
    }
    initBuf(entry.chunk, entry.handle, buf, reqCapacity);
    entry.recycle();

    // allocations is not thread-safe which is fine as this is only called from the same thread all time.
    ++ allocations;
    return true;
}

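Next comes entry.recycle(), which recycles the entry object. Once popped, the entry is no longer referenced and could be collected by the GC; to reuse the object instead, Netty returns it to its object recycler. Step into recycle():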

void recycle() {
    chunk = null;
    handle = -1;
    recyclerHandle.recycle(this);
}
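The reuse pattern behind recycle() can be shown with a minimal object pool. This is a stand-in written for illustration, not Netty's Recycler: it is single-threaded, backed by an ArrayDeque, and every name in it is hypothetical.

```java
import java.util.ArrayDeque;

// Minimal object pool: recycled entries are cleared and handed out again instead of reallocated.
final class EntryPoolSketch {
    static final class Entry {
        Object chunk;
        long handle = -1;
    }

    private final ArrayDeque<Entry> free = new ArrayDeque<>();

    /** Reuses a recycled entry when one is available, otherwise creates a new one. */
    Entry obtain(Object chunk, long handle) {
        Entry e = free.poll();
        if (e == null) {
            e = new Entry();
        }
        e.chunk = chunk;
        e.handle = handle;
        return e;
    }

    /** Clears the entry so it points at no memory, then keeps it for reuse. */
    void recycle(Entry e) {
        e.chunk = null;
        e.handle = -1;
        free.push(e);
    }

    public static void main(String[] args) {
        EntryPoolSketch pool = new EntryPoolSketch();
        Entry first = pool.obtain("chunk-0", 7L);
        pool.recycle(first);
        Entry second = pool.obtain("chunk-1", 9L);
        System.out.println(first == second); // true: the same object was reused
    }
}
```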

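In recycle(), chunk = null and handle = -1 mean the Entry no longer points at any memory, and recyclerHandle.recycle(this) returns the entry to the recycler. That completes the cache-hit path. We assumed the cache already held entries; on a first allocation the cache is empty, so the lookup misses. Finally, a brief summary of the basic structure of a MemoryRegionCache object, shown in the figure below: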
