Saturday, November 28, 2015

LRU Cache in Java - Part 2

Introduction

In the previous post I described LRU cache notion and showed how it can be easily implemented using LinkedHashMap data structure. But sometimes there is a requirement to implement LRU cache in Java without using such a convenient thing as LinkedHashMap. A good example of this is an interview question. So let's take a look at how this can be done.

Implementation

This time I'm going to use plain old HashMap to guarantee constant-time access/modification performance and build doubly linked list out of the cache elements manually to speed-up least recently used element tracking.

To begin with let's define class Node which represents an element in a doubly linked list.
class Node<K, V> {
    K key;
    V value;

    Node next;
    Node prev;

    Node(final K key, final V value) {
        this.key = key;
        this.value = value;
    }
}
Since this class is an implementation detail of our LRU cache abstraction, let's make it a private nested class. Should it be a static nested class or non-static one (a.k.a. inner class)? It's a good question. Let's take a look at pros and cons of either approach.

Node as Inner Class

The following code snippet shows how class Node can be implemented as an inner class of enclosing LRUCache class:
public class LRUCache<K, V> {
    // ...
    private Node head;
    private Node tail;
    // ...
    private class Node {
        K key;
        V value;

        Node next;
        Node prev;

        Node(final K key, final V value) {
            this.key = key;
            this.value = value;
        }
    }
    // ...
}
Class Node is non-static, hence generic type parameters of class LRUCache (K and V) are accessible from it. So there's no need to implement class Node as a generic one. This makes code a bit more compact, but introduces additional memory cost per Node instance for tracking a reference to enclosing LRUCache instance (usually 4 bytes for 32-bit machines and 8 bytes for 64-bit machines). For futher information about inner classes and enclosing instances I suggest reading this section of JLS.

Obviously, additional memory costs introduced by the usage of inner classes are not suitable for designing a high-performance data structure like cache. So let's take a look at the other approach.

Node as Static Nested Class

The following code snippet shows how class Node can be implemented as a static nested class of LRUCache class:
public class LRUCache<K, V> {
    // ...
    private Node<K, V> head;
    private Node<K, V> tail;
    // ...
    private static class Node<K1, V1> {
        K1 key;
        V1 value;

        Node<K1, V1> next;
        Node<K1, V1> prev;

        Node(final K1 key, final V1 value) {
            this.key = key;
            this.value = value;
        }
    }
    // ...
}
Note that class Node is made generic with its own set of type parameters (K1 and V1) because type parameters of class LRUCache (K and V) are not accessible from static context. Strictly speaking,  type parameters of class Node could also be named K and V. Compiler always knows which one is meant from the context. I decided to name them differently to increase code readability.

This approach is a bit more verbose but much more efficient when it comes to memory usage. So let's stick to it in our LRUCache implementation.

Operations on Doubly Linked List

In order to implement LRUCache we need to be able to do the following operations on doubly linked list:
  • insert new node into the head position of the list (head of the list represents the most recently used element in the cache);
  • remove a node from any position in the list (when a node is accessed it should be removed from its current position in the list and placed in the head position as the most recently used one).
Here's how the aforementioned operations can be implemented. Implementations are pretty self-explanatory, so I won't describe them in details:
/**
 * Inserts a node to the beginning of the list (in head position).
 * @param node new node to be inserted
 */
private void insertNode(final Node<K, V> node) {
    node.next = head;
    node.prev = null;

    if (head != null) {
        head.prev = node;
    }

    head = node;

    if (tail == null) {
        tail = head;
    }
}

/**
 * Removes a node from the list.
 * @param node a node to be removed
 */
private void removeNode(final Node<K, V> node) {
    if (node.prev != null) {
        node.prev.next = node.next;
    } else {
        head = node.next;
    }

    if (node.next != null) {
        node.next.prev = node.prev;
    } else {
        tail = node.prev;
    }
}

Operations on Cache

Cache abstraction needs two operations in the simplest case:
  • get value from the cache corresponding to a certain key;
  • insert/update value in the cache corresponding to a certain key.
Here they are:
/**
 * Returns a value corresponding to the provided key.
 * @param key key to be searched
 * @return value corresponding to the key, or null if the key is missing
 */
public V get(final K key) {
    if (map.containsKey(key)) {
        final Node<K, V> node = map.get(key);

        // remove node from its current position
        // and insert it to the beginning of the list
        removeNode(node);
        insertNode(node);

        return node.value;
    }

    return null;
}

/**
 * Inserts key-value pair in the cache. If cache size exceeds capacity
 * then the least recently used element is evicted from cache.
 * @param key key to be inserted
 * @param value value corresponding to the key
 */
public void set(final K key, final V value) {
    if (map.containsKey(key)) {
        // replace old value with the new one
        final Node<K, V> node = map.get(key);
        node.value = value;

        // remove node from its current position
        // and insert it to the beginning of the list
        removeNode(node);
        insertNode(node);
    } else {
        final Node<K, V> node = new Node<>(key, value);

        // remove least recently used node
        if (map.size() >= capacity) {
            map.remove(tail.key);
            removeNode(tail);
        }

        // insert new node in the beginning of the list
        insertNode(node);
        map.put(key, node);
    }
}

All Together

Now it is time to take all the pieces together. Let's add missing instance fields (capacity and map), create a constructor and build a complete LRUCache implementation from start to end:
/**
 * This class is an implementation of LRU (least recently used) cache.
 * It is NOT thread-safe and uses HashMap under the hood.
 * @param <K> key type
 * @param <V> value type
 */
public class LRUCache<K, V> {
    private final int capacity;
    private final Map<K, Node<K, V>> map = new HashMap<>();

    private Node<K, V> head;
    private Node<K, V> tail;

    /**
     * Creates new LRUCache instance of the specified capacity.
     * @param capacity cache size
     */
    public LRUCache(final int capacity) {
        this.capacity = capacity;
    }

    /**
     * Returns a value corresponding to the provided key.
     * @param key key to be searched
     * @return value corresponding to the key, or null if the key is missing
     */
    public V get(final K key) {
        if (map.containsKey(key)) {
            final Node<K, V> node = map.get(key);

            // remove node from its current position
            // and insert it to the beginning of the list
            removeNode(node);
            insertNode(node);

            return node.value;
        }

        return null;
    }

    /**
     * Inserts key-value pair in the cache. If cache size exceeds capacity
     * then the least recently used element is evicted from cache.
     * @param key key to be inserted
     * @param value value corresponding to the key
     */
    public void set(final K key, final V value) {
        if (map.containsKey(key)) {
            // replace old value with the new one
            final Node<K, V> node = map.get(key);
            node.value = value;

            // remove node from its current position
            // and insert it to the beginning of the list
            removeNode(node);
            insertNode(node);
        } else {
            final Node<K, V> node = new Node<>(key, value);

            // remove least recently used node
            if (map.size() >= capacity) {
                map.remove(tail.key);
                removeNode(tail);
            }

            // insert new node in the beginning of the list
            insertNode(node);
            map.put(key, node);
        }
    }

    /**
     * Represents a node in a doubly linked list which is used to track
     * least recently used element in the cache.
     * @param <K1> key type
     * @param <V1> value type
     */
    private static class Node<K1, V1> {
        K1 key;
        V1 value;

        Node<K1, V1> next;
        Node<K1, V1> prev;

        Node(final K1 key, final V1 value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Inserts a node to the beginning of the list (in head position).
     * @param node new node to be inserted
     */
    private void insertNode(final Node<K, V> node) {
        node.next = head;
        node.prev = null;

        if (head != null) {
            head.prev = node;
        }

        head = node;

        if (tail == null) {
            tail = head;
        }
    }

    /**
     * Removes a node from the list.
     * @param node a node to be removed
     */
    private void removeNode(final Node<K, V> node) {
        if (node.prev != null) {
            node.prev.next = node.next;
        } else {
            head = node.next;
        }

        if (node.next != null) {
            node.next.prev = node.prev;
        } else {
            tail = node.prev;
        }
    }
}
Thank you for reading!
I hope you enjoyed the article :)

1 comment:

  1. Very informative article.Thank you author for posting this kind of article .

    http://www.wikitechy.com/view-article/nested-classes-in-java-with-example



    Both are really good,.
    Cheers,
    Venkat

    ReplyDelete