A detailed look at Caffeine, a high-performance Java cache library

  • 2021-01-18 06:24:49
  • OfStack

1. Introduction

In this article, we'll take a look at Caffeine, a high-performance caching library for Java.

One fundamental difference between a cache and a Map is that a cache evicts stored items.

The eviction policy decides which objects are removed at any given time. This policy directly affects the cache's hit ratio, a crucial characteristic of any caching library.

Caffeine provides a near-optimal hit ratio thanks to its Window TinyLFU eviction policy.

2. Dependencies

We need to add the caffeine dependency to our pom.xml:


<dependency>
  <groupId>com.github.ben-manes.caffeine</groupId>
  <artifactId>caffeine</artifactId>
  <version>2.5.5</version>
</dependency>

You can find the latest version of caffeine on Maven Central.

3. Populating the cache

Let's take a look at three cache population strategies for Caffeine: manual, synchronous, and asynchronous loading.

First, we write a class for the type of value we want to store in the cache:


class DataObject {
  private final String data;
 
  private static int objectCounter = 0;
  // standard constructors/getters
   
  public static DataObject get(String data) {
    objectCounter++;
    return new DataObject(data);
  }
}

3.1. Manual population

In this strategy, we manually place values in the cache before retrieving them.

Let's initialize the cache:


Cache<String, DataObject> cache = Caffeine.newBuilder()
 .expireAfterWrite(1, TimeUnit.MINUTES)
 .maximumSize(100)
 .build();

Now, we can get a value from the cache using the getIfPresent method. This method returns null if the value is not present in the cache:


String key = "A";
DataObject dataObject = cache.getIfPresent(key);
 
assertNull(dataObject);

We can manually populate the cache using the put method:


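// the cache rejects null values, so we create an object to store first
dataObject = DataObject.get("Data for A");
cache.put(key, dataObject);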
dataObject = cache.getIfPresent(key);
 
assertNotNull(dataObject);

We can also retrieve the value with the get method, which takes the key and a Function as arguments. If the key is not present in the cache, the function is called with the key to compute a fallback value, which is then inserted into the cache:


dataObject = cache
 .get(key, k -> DataObject.get("Data for A"));
 
assertNotNull(dataObject);
assertEquals("Data for A", dataObject.getData());

The get method performs the computation atomically. This means that the computation is made only once, even if several threads ask for the value at the same time. That's why using get is preferable to getIfPresent.
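The compute-once guarantee can be observed in isolation with the JDK's ConcurrentHashMap.computeIfAbsent, which makes the same promise for a plain map. The sketch below uses only the JDK and is not Caffeine's implementation; ComputeOnceDemo and loaderCalls are hypothetical names chosen for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ComputeOnceDemo {

    // Returns how many times the loader ran when `threads` threads
    // request the same missing key at the same moment.
    static int loaderCalls(int threads) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        AtomicInteger calls = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    start.await(); // hold every thread until all are ready
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // The mapping function is applied at most once per key.
                map.computeIfAbsent("A", k -> {
                    calls.incrementAndGet();
                    return "Data for " + k;
                });
            });
        }

        start.countDown(); // release all threads together
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("loader calls: " + loaderCalls(8)); // loader calls: 1
    }
}
```

A getIfPresent followed by put offers no such guarantee: two threads can both see a miss and both compute the value.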

Sometimes we need to manually invalidate some cached values:


cache.invalidate(key);
dataObject = cache.getIfPresent(key);
 
assertNull(dataObject);

3.2. Synchronous loading

This method of loading the cache takes a Function that is used to initialize values, similar to the get method of the manual strategy. Let's see how to use it.

First, we need to initialize the cache:


LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
 .maximumSize(100)
 .expireAfterWrite(1, TimeUnit.MINUTES)
 .build(k -> DataObject.get("Data for " + k));

Now we can retrieve the value using the get method:


DataObject dataObject = cache.get(key);
 
assertNotNull(dataObject);
assertEquals("Data for " + key, dataObject.getData());

We can also use the getAll method to get a set of values:


Map<String, DataObject> dataObjectMap 
 = cache.getAll(Arrays.asList("A", "B", "C"));
 
assertEquals(3, dataObjectMap.size());

Values are retrieved from the underlying back-end initialization Function that was passed to the build method. This makes it possible to use the cache as the main facade for accessing values.

3.3. Asynchronous loading

This strategy works the same as the previous one, but performs operations asynchronously and returns a CompletableFuture holding the actual value:


AsyncLoadingCache<String, DataObject> cache = Caffeine.newBuilder()
 .maximumSize(100)
 .expireAfterWrite(1, TimeUnit.MINUTES)
 .buildAsync(k -> DataObject.get("Data for " + k));

We can use the get and getAll methods in the same manner, taking into account that they return a CompletableFuture:


String key = "A";

cache.get(key).thenAccept(dataObject -> {
  assertNotNull(dataObject);
  assertEquals("Data for " + key, dataObject.getData());
});

cache.getAll(Arrays.asList("A", "B", "C"))
 .thenAccept(dataObjectMap -> assertEquals(3, dataObjectMap.size()));

CompletableFuture has a rich and useful API, which you can read more about in this article.
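As a small taste of that API, here is a JDK-only sketch that transforms one asynchronous result and combines two of them. The loadAsync helper is a hypothetical stand-in for an asynchronous lookup; it does not involve Caffeine at all.

```java
import java.util.concurrent.CompletableFuture;

class FutureCompositionDemo {

    // Hypothetical stand-in for an asynchronous load of a value.
    static CompletableFuture<String> loadAsync(String key) {
        return CompletableFuture.supplyAsync(() -> "Data for " + key);
    }

    // Transforms one result and combines two results without blocking in between.
    static String describe(String first, String second) {
        CompletableFuture<Integer> length =
            loadAsync(first).thenApply(String::length);
        CompletableFuture<String> combined =
            loadAsync(first).thenCombine(loadAsync(second), (a, b) -> a + " | " + b);
        // join() blocks until the value is ready; used here only to build the result.
        return combined.join() + " (first has " + length.join() + " chars)";
    }

    public static void main(String[] args) {
        // prints: Data for A | Data for B (first has 10 chars)
        System.out.println(describe("A", "B"));
    }
}
```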

4. Eviction of values

Caffeine has three strategies for value eviction: size-based, time-based, and reference-based.

4.1. Size-based eviction

With this type of eviction, entries are evicted when the configured size limit of the cache is exceeded. There are two ways of measuring the size: counting the objects in the cache, or summing their weights.

Let's see how to count the objects in the cache. When the cache is initialized, its size is equal to zero:


LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
 .maximumSize(1)
 .build(k -> DataObject.get("Data for " + k));

assertEquals(0, cache.estimatedSize());

When we add a value, the size increases accordingly:


cache.get("A");
 
assertEquals(1, cache.estimatedSize());

We can add a second value to the cache, which leads to the removal of the first value:


cache.get("B");
cache.cleanUp();

assertEquals(1, cache.estimatedSize());

It's worth mentioning that we call the cleanUp method before getting the cache size. This is because cache eviction is executed asynchronously, and this method helps to await the completion of the eviction.

We can also pass a weigher Function to measure the size of the cache by weight:


LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
 .maximumWeight(10)
 .weigher((k, v) -> 5)
 .build(k -> DataObject.get("Data for " + k));

assertEquals(0, cache.estimatedSize());

cache.get("A");
assertEquals(1, cache.estimatedSize());

cache.get("B");
assertEquals(2, cache.estimatedSize());

Values are removed from the cache when the total weight exceeds 10:


cache.get("C");
cache.cleanUp();

assertEquals(2, cache.estimatedSize());
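To make the weight limit concrete, here is a rough JDK-only sketch of a map that evicts entries once the sum of their weights exceeds a maximum. WeightBoundedCache is a hypothetical class written for this illustration. It evicts in plain least-recently-used order via LinkedHashMap rather than with Caffeine's Window TinyLFU policy, it evicts synchronously on put, and it is not thread-safe.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToIntBiFunction;

class WeightBoundedCache<K, V> {
    private final long maximumWeight;
    private final ToIntBiFunction<K, V> weigher;
    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);
    private long totalWeight;

    WeightBoundedCache(long maximumWeight, ToIntBiFunction<K, V> weigher) {
        this.maximumWeight = maximumWeight;
        this.weigher = weigher;
    }

    void put(K key, V value) {
        V previous = entries.put(key, value);
        if (previous != null) {
            // Replacing a value: drop the weight of the old one first.
            totalWeight -= weigher.applyAsInt(key, previous);
        }
        totalWeight += weigher.applyAsInt(key, value);
        evict();
    }

    V getIfPresent(K key) {
        return entries.get(key); // also marks the entry as recently used
    }

    int size() {
        return entries.size();
    }

    // Removes least recently used entries until the total weight fits the limit.
    private void evict() {
        Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
        while (totalWeight > maximumWeight && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            totalWeight -= weigher.applyAsInt(eldest.getKey(), eldest.getValue());
            it.remove();
        }
    }

    public static void main(String[] args) {
        WeightBoundedCache<String, String> cache =
            new WeightBoundedCache<>(10, (k, v) -> 5);
        cache.put("A", "Data for A");
        cache.put("B", "Data for B");
        cache.put("C", "Data for C"); // total weight would be 15, so "A" is evicted
        System.out.println(cache.size());            // 2
        System.out.println(cache.getIfPresent("A")); // null
    }
}
```

With every entry weighing 5 and a maximum weight of 10, the third insertion pushes the total to 15, so one entry has to go, which mirrors the arithmetic of the weigher example in this section.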

4.2. Time-based eviction

This eviction strategy is based on the expiration time of an entry and comes in three types:

  • Expire after access: the entry expires once a set period has passed since the last read or write
  • Expire after write: the entry expires once a set period has passed since the last write
  • Custom policy: the expiration time is calculated for each entry individually by an Expiry implementation

Let's configure the expire-after-access strategy using the expireAfterAccess method:


LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
 .expireAfterAccess(5, TimeUnit.MINUTES)
 .build(k -> DataObject.get("Data for " + k));

To configure the expire-after-write strategy, we use the expireAfterWrite method:


cache = Caffeine.newBuilder()
 .expireAfterWrite(10, TimeUnit.SECONDS)
 .weakKeys()
 .weakValues()
 .build(k -> DataObject.get("Data for " + k));

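To initialize a custom policy, we need to implement the Expiry interface:


cache = Caffeine.newBuilder().expireAfter(new Expiry<String, DataObject>() {
  @Override
  public long expireAfterCreate(
   String key, DataObject value, long currentTime) {
    // durations are returned in nanoseconds
    return value.getData().length() * 1000;
  }
  @Override
  public long expireAfterUpdate(
   String key, DataObject value, long currentTime, long currentDuration) {
    return currentDuration;
  }
  @Override
  public long expireAfterRead(
   String key, DataObject value, long currentTime, long currentDuration) {
    return currentDuration;
  }
}).build(k -> DataObject.get("Data for " + k));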

4.3. Reference-based eviction

We can configure our cache to allow garbage collection of cache keys and values. To do this, we configure weak references for keys and values, or soft references for values only.

Using WeakReference allows an object to be garbage collected once there are no strong references to it. SoftReference allows objects to be garbage collected according to the JVM's global least-recently-used strategy. For more details about Java references, see here.

We enable each option using Caffeine.weakKeys(), Caffeine.weakValues(), and Caffeine.softValues():


LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
 .expireAfterWrite(10, TimeUnit.SECONDS)
 .weakKeys()
 .weakValues()
 .build(k -> DataObject.get("Data for " + k));
 
cache = Caffeine.newBuilder()
 .expireAfterWrite(10, TimeUnit.SECONDS)
 .softValues()
 .build(k -> DataObject.get("Data for " + k));
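The weak-reference behavior that weakKeys and weakValues rely on can be seen in isolation with a few lines of plain JDK code. This is only a sketch of the reference semantics, not of the cache itself; WeakReferenceDemo is a hypothetical name, and since System.gc() is merely a request to the JVM, the second printed result may vary between runs.

```java
import java.lang.ref.WeakReference;

class WeakReferenceDemo {

    // While a strong reference exists, the weak reference still resolves to the object.
    static boolean reachableWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // a hint only; the object is still strongly reachable here
        return weak.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println("held: " + reachableWhileStronglyHeld()); // held: true

        // No strong reference is kept, so the collector is free to clear the referent.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc(); // a request, not a guarantee
        System.out.println("cleared after GC hint: " + (weak.get() == null));
    }
}
```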

5. Refreshing

Caches can be configured to automatically refresh entries after a defined period of time. Let's see how to use the refreshAfterWrite method:


Caffeine.newBuilder()
 .refreshAfterWrite(1, TimeUnit.MINUTES)
 .build(k -> DataObject.get("Data for " + k));

Here we should understand the difference between expireAfter and refreshAfter. When an expired entry is requested, execution blocks until the new value has been computed by the build Function.

However, if the entry is only eligible for refreshing, the cache returns the old value and reloads the value asynchronously.
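The refresh behavior can be sketched for a single value using only the JDK. RefreshingValue is a hypothetical class written for this illustration and says nothing about how Caffeine implements refreshing. The demo uses a hand-driven clock and a same-thread executor (Runnable::run) to keep the output deterministic; a real executor would run the reload in the background.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class RefreshingValue<V> {
    private final Supplier<V> loader;
    private final long refreshAfterNanos;
    private final LongSupplier ticker;
    private final Executor executor;
    private final AtomicBoolean reloading = new AtomicBoolean();
    private volatile V value;
    private volatile long writeTime;

    RefreshingValue(Supplier<V> loader, long refreshAfterNanos,
                    LongSupplier ticker, Executor executor) {
        this.loader = loader;
        this.refreshAfterNanos = refreshAfterNanos;
        this.ticker = ticker;
        this.executor = executor;
        this.value = loader.get(); // the first load blocks the caller
        this.writeTime = ticker.getAsLong();
    }

    V get() {
        V current = value; // the caller gets whatever is present right now
        boolean stale = ticker.getAsLong() - writeTime >= refreshAfterNanos;
        // Start at most one reload at a time; it never blocks the return below.
        if (stale && reloading.compareAndSet(false, true)) {
            CompletableFuture.supplyAsync(loader, executor)
                .whenComplete((fresh, error) -> {
                    if (error == null) {
                        value = fresh;
                        writeTime = ticker.getAsLong();
                    }
                    reloading.set(false);
                });
        }
        return current;
    }

    public static void main(String[] args) {
        AtomicLong clock = new AtomicLong();
        AtomicInteger version = new AtomicInteger();
        RefreshingValue<String> cached = new RefreshingValue<>(
            () -> "v" + version.incrementAndGet(), 60, clock::get, Runnable::run);

        System.out.println(cached.get()); // v1
        clock.set(61);                    // the value is now due for a refresh
        System.out.println(cached.get()); // v1 (old value returned, reload triggered)
        System.out.println(cached.get()); // v2
    }
}
```

An expiring value would behave differently at the second call: the caller would have to wait for the loader before getting any value back.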

6. Statistics

Caffeine has a means of recording statistics about cache usage:


LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
 .maximumSize(100)
 .recordStats()
 .build(k -> DataObject.get("Data for " + k));
cache.get("A");
cache.get("A");
 
assertEquals(1, cache.stats().hitCount());
assertEquals(1, cache.stats().missCount());

We may also pass a supplier to recordStats that creates an implementation of StatsCounter. This object is notified of every statistics-related change.

7. Conclusion

In this article, we became familiar with the Caffeine cache library for Java. We saw how to configure and populate the cache, and how to choose the appropriate expiration or refresh policy depending on our needs.

The source code for the examples in this article can be found on GitHub.
