LRU (Least Recently Used) is a common idea in caching. As the name suggests, it weighs two dimensions: time (how recently an entry was used) and frequency (how little it was used). If the K-V entries in a cache are to be ranked by priority, both dimensions have to be considered; in LRU, the entry used most recently ranks highest, or put simply, the most recently accessed entries come first. That is the general idea of LRU.
In operating systems, LRU is a page-replacement algorithm used for memory management: blocks (pages) that are resident in memory but have not been used recently are the LRU candidates, and the OS evicts them to free space for loading other data.
Wikipedia's description of LRU:
In computing, cache algorithms (also frequently called cache replacement algorithms or cache replacement policies) are optimizing instructions—or algorithms—that a computer program or a hardware-maintained structure can follow in order to manage a cache of information stored on the computer. When the cache is full, the algorithm must choose which items to discard to make room for the new ones.
Least Recently Used (LRU)
Discards the least recently used items first. This algorithm requires keeping track of what was used when, which is expensive if one wants to make sure the algorithm always discards the least recently used item. General implementations of this technique require keeping "age bits" for cache-lines and track the "Least Recently Used" cache-line based on age-bits. In such an implementation, every time a cache-line is used, the age of all other cache-lines changes. LRU is actually a family of caching algorithms with members including 2Q by Theodore Johnson and Dennis Shasha,[3] and LRU/K by Pat O'Neil, Betty O'Neil and Gerhard Weikum.[4]
Analysis and implementation of an LRU cache
1. First we can implement a FIFO version. It determines priority purely by insertion order and ignores access order, so it is not yet a complete LRU cache.
It is very easy to build with Java's LinkedHashMap.
private int capacity;

private java.util.LinkedHashMap<Integer, Integer> cache =
        new java.util.LinkedHashMap<Integer, Integer>() {
    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
        return size() > capacity;
    }
};
The code overrides removeEldestEntry(): once the size exceeds the configured capacity, the lowest-priority element is removed. In this FIFO version the lowest-priority element is simply the one that was inserted earliest.
2. If you know LinkedHashMap well enough, implementing an LRU cache with it is also very simple. LinkedHashMap provides a constructor that takes the capacity, the load factor and the ordering mode. To get LRU behavior, set the ordering parameter to true, which means access order, instead of the default FIFO insertion order. Here the load factor is left at the default 0.75. We still override removeEldestEntry() to keep the size within the capacity. This gives two ways to build the LinkedHashMap version of LRUCache: inheritance and composition.
Inheritance:
package lrucache.one;

import java.util.LinkedHashMap;
import java.util.Map;

/**
 * LRU Cache implemented with LinkedHashMap, via inheritance.
 * @author wxisme
 * @time 2015-10-18 10:27:37
 */
public class LRUCache extends LinkedHashMap<Integer, Integer> {

    private int initialCapacity;

    public LRUCache(int initialCapacity) {
        super(initialCapacity, 0.75f, true);
        this.initialCapacity = initialCapacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
        return size() > initialCapacity;
    }

    @Override
    public String toString() {
        StringBuilder cacheStr = new StringBuilder();
        cacheStr.append("{");
        for (Map.Entry<Integer, Integer> entry : this.entrySet()) {
            cacheStr.append("[" + entry.getKey() + "," + entry.getValue() + "]");
        }
        cacheStr.append("}");
        return cacheStr.toString();
    }
}
Composition:
package lrucache.three;

import java.util.LinkedHashMap;
import java.util.Map;

/**
 * LRU Cache implemented with LinkedHashMap, via composition.
 * @author wxisme
 * @time 2015-10-18 11:07:01
 */
public class LRUCache {

    private final int initialCapacity;

    private Map<Integer, Integer> cache;

    public LRUCache(final int initialCapacity) {
        this.initialCapacity = initialCapacity;
        cache = new LinkedHashMap<Integer, Integer>(initialCapacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > initialCapacity;
            }
        };
    }

    public void put(int key, int value) {
        cache.put(key, value);
    }

    public int get(int key) {
        return cache.get(key);
    }

    public void remove(int key) {
        cache.remove(key);
    }

    @Override
    public String toString() {
        StringBuilder cacheStr = new StringBuilder();
        cacheStr.append("{");
        for (Map.Entry<Integer, Integer> entry : cache.entrySet()) {
            cacheStr.append("[" + entry.getKey() + "," + entry.getValue() + "]");
        }
        cacheStr.append("}");
        return cacheStr.toString();
    }
}
Test code:
public static void main(String[] args) {
    LRUCache cache = new LRUCache(5);
    cache.put(5, 5);
    cache.put(4, 4);
    cache.put(3, 3);
    cache.put(2, 2);
    cache.put(1, 1);
    System.out.println(cache.toString());
    cache.put(0, 0);
    System.out.println(cache.toString());
}
Output:
{[5,5][4,4][3,3][2,2][1,1]}
{[4,4][3,3][2,2][1,1][0,0]}
As you can see, the basic LRUCache behavior is already there.
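The test above only calls put(), so it does not actually distinguish access order from insertion order. A minimal extra check (a sketch that works against either of the two versions above, since both expose get()) touches an old key before overflowing the cache and confirms that it survives eviction:

public static void main(String[] args) {
    LRUCache cache = new LRUCache(5);
    cache.put(5, 5);
    cache.put(4, 4);
    cache.put(3, 3);
    cache.put(2, 2);
    cache.put(1, 1);

    cache.get(5);       // touch key 5 so it becomes the most recently used entry
    cache.put(0, 0);    // overflow: the eldest entry (key 4) is evicted, not key 5

    System.out.println(cache.toString());
    // expected (eldest first): {[3,3][2,2][1,1][5,5][0,0]}
}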
3. What if we do not use the LinkedHashMap provided by the Java API? First we have to pin down the operations: an LRU cache needs insertion, deletion and lookup, and it must maintain an order. That leaves many choices: an array, a linked list, a stack, a queue, a Map, or a combination of them. Consider a stack or a queue first: they do give a definite FIFO or FILO order, but LRU needs to work on both ends, deleting the tail element while moving elements to the head, so the efficiency would clearly be poor. We should also be clear about one fact: for an array (and for lookups in a Map) the read-only operations cost O(1) while arbitrary insertions and deletions cost O(n); for a linked structure it is the other way round. So if we use only one of these structures we will inevitably spend too much time either on the read-only or on the modifying operations. We may as well combine a linked list with a Map. A singly linked list would still cost O(n) when operating on both ends, because removing the tail requires finding its predecessor. All things considered, a doubly linked list plus a Map is the best structure.
In this implementation the doubly linked list maintains the priority, i.e. the access order, and handles the modifying operations, while the Map stores the K-V pairs and handles the lookups. Access order means: the most recently accessed node (an insert also counts as an access) is moved to the head of the list, and when the capacity limit is reached the node at the tail of the list is removed.
package lrucache.tow;

import java.util.HashMap;
import java.util.Map;

/**
 * LRUCache implemented with a doubly linked list + HashMap.
 * @author wxisme
 * @time 2015-10-18 12:34:36
 */
public class LRUCache<K, V> {

    private final int initialCapacity; // capacity

    private Node head; // head of the doubly linked list
    private Node tail; // tail of the doubly linked list

    private Map<K, Node<K, V>> map;

    public LRUCache(int initialCapacity) {
        this.initialCapacity = initialCapacity;
        map = new HashMap<K, Node<K, V>>();
    }

    /**
     * Node of the doubly linked list.
     */
    private class Node<K, V> {
        public Node pre;
        public Node next;
        public K key;
        public V value;

        public Node() {}

        public Node(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Put a K,V pair into the cache.
     */
    public void put(K key, V value) {
        Node<K, V> node = map.get(key);

        // the key is not in the cache yet
        if (node == null) {
            // the cache is already full
            if (map.size() >= this.initialCapacity) {
                map.remove(tail.key); // drop the least recently used K,V from the map
                removeTailNode();
            }
            node = new Node();
            node.key = key;
        }
        node.value = value;
        moveToHead(node);
        map.put(key, node);
    }

    /**
     * Get the value for a key from the cache.
     */
    public V get(K key) {
        Node<K, V> node = map.get(key);
        if (node == null) {
            return null;
        }
        // just accessed: move it to the head
        moveToHead(node);
        return node.value;
    }

    /**
     * Remove a K,V pair from the cache.
     */
    public void remove(K key) {
        Node<K, V> node = map.get(key);

        map.remove(key); // remove from the HashMap

        // unlink from the doubly linked list
        if (node != null) {
            if (node.pre != null) {
                node.pre.next = node.next;
            }
            if (node.next != null) {
                node.next.pre = node.pre;
            }
            if (node == head) {
                head = head.next;
            }
            if (node == tail) {
                tail = tail.pre;
            }

            // clear the node's references
            node.pre = null;
            node.next = null;
            node = null;
        }
    }

    /**
     * Move a node to the head of the list.
     */
    private void moveToHead(Node node) {

        if (node == head) return;

        // unlink the node
        if (node.pre != null) {
            node.pre.next = node.next;
        }
        if (node.next != null) {
            node.next.pre = node.pre;
        }
        if (node == tail) {
            tail = tail.pre;
        }

        if (tail == null || head == null) {
            tail = head = node;
            return;
        }

        // link the node in at the head
        node.next = head;
        head.pre = node;
        head = node;
        node.pre = null;
    }

    /**
     * Remove the tail node of the list.
     */
    private void removeTailNode() {
        if (tail != null) {
            tail = tail.pre;
            if (tail != null) {
                tail.next = null;
            } else {
                head = null; // the list is now empty
            }
        }
    }

    @Override
    public String toString() {
        StringBuilder cacheStr = new StringBuilder();
        cacheStr.append("{");
        // the access order lives in the linked list, so walk the list
        Node<K, V> node = head;
        while (node != null) {
            cacheStr.append("[" + node.key + "," + node.value + "]");
            node = node.next;
        }
        cacheStr.append("}");
        return cacheStr.toString();
    }
}
Test code:
public static void main(String[] args) {
    LRUCache<Integer, Integer> cache = new LRUCache<Integer, Integer>(5);
    cache.put(5, 5);
    cache.put(4, 4);
    cache.put(3, 3);
    cache.put(2, 2);
    cache.put(1, 1);
    System.out.println(cache.toString());
    cache.put(0, 0);
    System.out.println(cache.toString());
}
Output:
{[1,1][2,2][3,3][4,4][5,5]}
{[0,0][1,1][2,2][3,3][4,4]}
This also implements the basic LRUCache operations.
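As with the LinkedHashMap version, a get() before overflowing shows the move-to-head behavior (a quick sketch; the expected output follows from the put/get logic above, with the most recently used entry printed first):

public static void main(String[] args) {
    LRUCache<Integer, Integer> cache = new LRUCache<Integer, Integer>(5);
    cache.put(5, 5);
    cache.put(4, 4);
    cache.put(3, 3);
    cache.put(2, 2);
    cache.put(1, 1);

    cache.get(5);       // key 5 is moved to the head of the list
    cache.put(0, 0);    // the tail node (key 4) is evicted

    System.out.println(cache.toString());
    // expected (most recently used first): {[0,0][5,5][1,1][2,2][3,3]}
}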
Wait a moment! Why does the same test data give a different result from the LinkedHashMap implementation above?
Looking closely, you may notice that although both implement LRU, the doubly-linked-list + HashMap version clearly prints in access order, whereas the LinkedHashMap version still seems to be in insertion order?
Let's dig into the source code:
private static final long serialVersionUID = 3801124242820219131L;

/**
 * The head of the doubly linked list.
 */
private transient Entry<K,V> header;

/**
 * The iteration ordering method for this linked hash map: <tt>true</tt>
 * for access-order, <tt>false</tt> for insertion-order.
 *
 * @serial
 */
private final boolean accessOrder;

/**
 * LinkedHashMap entry.
 */
private static class Entry<K,V> extends HashMap.Entry<K,V> {
    // These fields comprise the doubly linked list used for iteration.
    Entry<K,V> before, after;

    Entry(int hash, K key, V value, HashMap.Entry<K,V> next) {
        super(hash, key, value, next);
    }
    ……
}
From these snippets we can see that LinkedHashMap also uses a doubly linked list internally, on top of the hashing it inherits from HashMap: LinkedHashMap extends HashMap and implements Map.
/**
 * Constructs an empty <tt>LinkedHashMap</tt> instance with the
 * specified initial capacity, load factor and ordering mode.
 *
 * @param  initialCapacity the initial capacity
 * @param  loadFactor      the load factor
 * @param  accessOrder     the ordering mode - <tt>true</tt> for
 *         access-order, <tt>false</tt> for insertion-order
 * @throws IllegalArgumentException if the initial capacity is negative
 *         or the load factor is nonpositive
 */
public LinkedHashMap(int initialCapacity,
                     float loadFactor,
                     boolean accessOrder) {
    super(initialCapacity, loadFactor);
    this.accessOrder = accessOrder;
}
The code above is the constructor we used.
public V get(Object key) {
    Entry<K,V> e = (Entry<K,V>)getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
void recordAccess(HashMap<K,V> m) {
    LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m;
    if (lm.accessOrder) {
        lm.modCount++;
        remove();
        addBefore(lm.header);
    }
}

void recordRemoval(HashMap<K,V> m) {
    remove();
}
This is the key code that implements access ordering: when accessOrder is true, every access unlinks the entry from the list and re-inserts it with addBefore(lm.header), i.e. at the most-recently-used end of the list.
/**
 * Inserts this entry before the specified existing entry in the list.
 */
private void addBefore(Entry<K,V> existingEntry) {
    after  = existingEntry;
    before = existingEntry.before;
    before.after = this;
    after.before = this;
}
void addEntry(int hash, K key, V value, int bucketIndex) {
    createEntry(hash, key, value, bucketIndex);

    // Remove eldest entry if instructed, else grow capacity if appropriate
    Entry<K,V> eldest = header.after;
    if (removeEldestEntry(eldest)) {
        removeEntryForKey(eldest.key);
    } else {
        if (size >= threshold)
            resize(2 * table.length);
    }
}

/**
 * This override differs from addEntry in that it doesn't resize the
 * table or remove the eldest entry.
 */
void createEntry(int hash, K key, V value, int bucketIndex) {
    HashMap.Entry<K,V> old = table[bucketIndex];
    Entry<K,V> e = new Entry<K,V>(hash, key, value, old);
    table[bucketIndex] = e;
    e.addBefore(header);
    size++;
}
These two pieces of code explain the difference we saw above: both structures keep access order, just in opposite directions. In the linked-list + HashMap version the most recently used entry sits at the front of the list, whereas LinkedHashMap re-links every accessed entry just before header, i.e. at the end, so its iteration goes from the eldest entry to the most recent one.
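To make the direction concrete, here is a small standalone sketch using only the JDK API (the class name OrderDemo is just for the demo): an access-order LinkedHashMap iterates from the least recently used entry to the most recently used one, so walking its entries in reverse reproduces the MRU-first ordering printed by the hand-written doubly-linked-list version.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OrderDemo {
    public static void main(String[] args) {
        // access-order LinkedHashMap: iteration goes eldest -> most recently used
        Map<Integer, Integer> cache = new LinkedHashMap<Integer, Integer>(5, 0.75f, true);
        for (int i = 5; i >= 1; i--) {
            cache.put(i, i);
        }
        cache.get(3); // key 3 becomes the most recently used entry

        System.out.println(cache); // {5=5, 4=4, 2=2, 1=1, 3=3}  (eldest first)

        // reversing the iteration gives the MRU-first view of the hand-written version
        List<Map.Entry<Integer, Integer>> entries =
                new ArrayList<Map.Entry<Integer, Integer>>(cache.entrySet());
        StringBuilder sb = new StringBuilder("{");
        for (int i = entries.size() - 1; i >= 0; i--) {
            sb.append("[" + entries.get(i).getKey() + "," + entries.get(i).getValue() + "]");
        }
        sb.append("}");
        System.out.println(sb); // {[3,3][1,1][2,2][4,4][5,5]}
    }
}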
A quick digression:
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);

    // Find a power of 2 >= initialCapacity
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;

    this.loadFactor = loadFactor;
    threshold = (int)(capacity * loadFactor);
    table = new Entry[capacity];
    init();
}
The code above is HashMap's initialization. We can see that the table capacity starts at 1 and is doubled until it is no smaller than the requested initial capacity, so the actual capacity is always a power of two; this saves storage. The load factor determines the resize threshold: threshold = capacity * loadFactor, and when the number of entries reaches that threshold during later puts, the table is expanded.
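As a quick worked example of that rounding (just the two relevant lines from the constructor lifted into a standalone snippet): requesting an initial capacity of 5 with the default load factor of 0.75 yields a table of 8 buckets and a resize threshold of 6.

public static void main(String[] args) {
    int initialCapacity = 5;
    float loadFactor = 0.75f;

    // Round the requested capacity up to a power of two, as the constructor above does.
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;                            // 1 -> 2 -> 4 -> 8

    // The resize threshold is capacity * loadFactor, not a new capacity.
    int threshold = (int) (capacity * loadFactor); // 8 * 0.75 = 6

    System.out.println(capacity + ", " + threshold); // prints: 8, 6
}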
All of the implementations above are single-threaded and not suitable for concurrent use. For a concurrent setting they can be adapted with the utilities in the java.util.concurrent package or with the Collections utility class.
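For example, the coarsest-grained adaptation is to wrap the map with Collections.synchronizedMap (a minimal sketch, reusing the same capacity-5 access-order LinkedHashMap as above; note that iteration still has to be synchronized by hand):

import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class SynchronizedLRUCacheDemo {
    public static void main(String[] args) {
        final int capacity = 5;

        // Wrap the access-order LinkedHashMap so that every put/get is synchronized.
        final Map<Integer, Integer> cache = Collections.synchronizedMap(
                new LinkedHashMap<Integer, Integer>(capacity, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                        return size() > capacity;
                    }
                });

        cache.put(1, 1);
        cache.get(1);

        // Iteration is NOT atomic: callers must synchronize on the wrapper themselves.
        synchronized (cache) {
            for (Map.Entry<Integer, Integer> entry : cache.entrySet()) {
                System.out.println(entry.getKey() + "=" + entry.getValue());
            }
        }
    }
}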
The JDK's LinkedHashMap implementation is quite efficient. For an application of it in a LeetCode problem, see: http://www.cnblogs.com/wxisme/p/4888648.html
References:
http://www.cnblogs.com/lzrabbit/p/3734850.html#f1
https://en.wikipedia.org/wiki/Cache_algorithms#LRU
http://zhangshixi.iteye.com/blog/673789
If you find any mistakes, corrections are welcome.
If you repost this article, please credit the source: http://www.cnblogs.com/wxisme/p/4889846.html