A simple LRU cache implemented in Python

  • 2020-04-02 14:10:46
  • OfStack

Background: a colleague needed a fixed-size cache. If a record is in the cache, it is read directly from the cache; otherwise it is read from the database. Python's dict already makes a very simple cache, but with a large amount of data memory can grow too large, so the number of records must be capped and the oldest records discarded with the LRU algorithm. Each key is an integer, and each value is a Python object of about 10 KB.

Analysis:

1) As you can imagine, the cache must maintain a key -> value mapping.

2) To implement LRU, the records must also be kept in recency order: conceptually a queue ordered by timestamp, holding (key, value) pairs.

3) When the number of records in the cache reaches the upper bound maxsize, the record with the smallest timestamp (the least recently used one) must be removed from the queue.

4) When a (key, value) is hit, it must be removed from its current position in the queue and re-inserted at the tail.

From this analysis, the cache must support all four operations efficiently. For quick removal and insertion at arbitrary positions in the queue, a doubly linked list is the natural choice; paired with a dict mapping each key to its list node, every operation runs in O(1).
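Before hand-rolling the linked list, it is worth noting that the standard library's collections.OrderedDict already provides the same ingredients: move_to_end() refreshes a record's recency and popitem(last=False) evicts the oldest entry. A minimal sketch (the class name is my own, not part of the article's implementation):

```python
from collections import OrderedDict

class OrderedDictLRU:
    """Minimal LRU cache sketch built on OrderedDict."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()

    def set(self, key, value):
        if key in self.data:
            # refresh recency of an existing key
            self.data.move_to_end(key)
        elif len(self.data) >= self.maxsize:
            # evict the least recently used record
            self.data.popitem(last=False)
        self.data[key] = value

    def get(self, key):
        if key not in self.data:
            return None
        # a hit makes this record the most recently used
        self.data.move_to_end(key)
        return self.data[key]
```

The hand-written version below does the same work explicitly, which makes the pointer manipulation of the LRU queue visible.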

The following is a Python implementation:


class LRUCache(object):
    def __init__(self, maxsize):
        # maximum number of records in the cache
        self.maxsize = maxsize
        # key -> Node, the real storage, for O(1) lookup
        self.inner_dd = {}
        # head of the doubly linked list (least recently used)
        self.head = None
        # tail of the doubly linked list (most recently used)
        self.tail = None

    def set(self, key, value):
        # an existing key: refresh its value and move it to the tail
        if key in self.inner_dd:
            node = self.inner_dd[key]
            node.data = (key, value)
            self.move_to_tail(node)
            return
        # cap the cache at maxsize records
        if len(self.inner_dd) >= self.maxsize:
            self.remove_head_node()
        node = Node()
        node.data = (key, value)
        self.insert_to_tail(node)
        self.inner_dd[key] = node

    def insert_to_tail(self, node):
        if self.tail is None:
            self.tail = node
            self.head = node
        else:
            self.tail.next = node
            node.pre = self.tail
            self.tail = node

    def remove_head_node(self):
        node = self.head
        if node is None:
            return
        del self.inner_dd[node.data[0]]
        self.head = node.next
        if self.head is None:
            # the list is now empty
            self.tail = None
        else:
            self.head.pre = None

    def get(self, key):
        if key in self.inner_dd:
            # on a hit, the node must be moved to the tail of the queue
            node = self.inner_dd[key]
            self.move_to_tail(node)
            return node.data[1]
        return None

    def move_to_tail(self, node):
        # only the head and middle cases need work; a tail node is already in place
        if node is self.tail:
            return
        if node is self.head:
            self.head = node.next
            self.head.pre = None
        else:
            node.pre.next = node.next
            node.next.pre = node.pre
        self.tail.next = node
        node.pre = self.tail
        node.next = None
        self.tail = node


class Node(object):
    def __init__(self):
        self.pre = None
        self.next = None
        # (key, value)
        self.data = None

    def __str__(self):
        return str(self.data)


if __name__ == '__main__':
    cache = LRUCache(10)
    for i in range(1000):
        cache.set(i, i + 1)
        cache.get(2)  # keep key 2 "hot" so it is never evicted
    for key in cache.inner_dd:
        print(key, cache.inner_dd[key])
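For the original use case (caching database reads behind an integer key), the standard library's functools.lru_cache decorator can often replace a hand-written cache entirely; it evicts least-recently-used entries once maxsize is reached. The read_record function here is a hypothetical stand-in for the real database read:

```python
from functools import lru_cache

@lru_cache(maxsize=10)
def read_record(key):
    # hypothetical stand-in for the real database read
    return key + 1
```

The first call with a given key "hits the database"; repeated calls are served from the cache, as read_record.cache_info() confirms.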

