An in-depth analysis of the LFU algorithm in Redis

  • 2020-06-15 10:29:31
  • OfStack

preface

In the earlier discussion of Redis's LRU algorithm, it was pointed out that LRU falls short in cases like the following:


~~~~~A~~~~~A~~~~~A~~~~A~~~~~A~~~~~A~~|
~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~|
~~~~~~~~~~C~~~~~~~~~C~~~~~~~~~C~~~~~~|
~~~~~D~~~~~~~~~~D~~~~~~~~~D~~~~~~~~~D|


Data D, merely because it happens to be the most recently accessed, is mistaken for the data most likely to be accessed in the future.

The author of Redis once wanted to improve the LRU algorithm, but found that Redis's LRU is bounded by the random sampling count maxmemory_samples: with maxmemory_samples set to 10 it already comes very close to the behavior of an ideal LRU, which means there is little room left to improve the LRU algorithm itself.

So the thinking goes back to square one. The purpose of an eviction algorithm is to keep the data most likely to be accessed again in the future, and LRU merely predicts that the most recently accessed data is the most likely candidate. We can change the approach and adopt an LFU (Least Frequently Used) algorithm instead, i.e. assume that the most frequently accessed data is the most likely to be accessed in the future. In the case above, the retention priority determined by access frequency would be: B > A > C = D.

LFU idea in Redis

In an LFU algorithm, a counter is maintained for each key. The counter is incremented every time the key is accessed; the larger the counter, the more frequently the key has been accessed.

There are two problems with the above simple algorithm:

1. In the LRU algorithm a doubly linked list can be maintained and an accessed node simply moved to the head of the list, but this does not work for LFU: nodes would have to be kept strictly sorted by counter, and inserting a node or updating its position could cost O(N).
2. Simply incrementing the counter is not enough either. Access patterns change over time: a key1 that is accessed heavily during one period may hardly be accessed at all later, and a counter that only ever grows cannot reflect that trend.

The first problem is easy to solve: following the LRU implementation, a pool of eviction candidates can be maintained. The solution to the second problem is to record when a key was last accessed and then lower the counter as time passes.

The structure of the Redis object is as follows:


typedef struct redisObject {
  unsigned type:4;
  unsigned encoding:4;
  unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
              * LFU data (least significant 8 bits frequency
              * and most significant 16 bits access time). */
  int refcount;
  void *ptr;
} robj;

In the LRU algorithm, the 24-bit lru field records the LRU time. LFU reuses the same field, but splits it into 16 bits and 8 bits:


           16 bits       8 bits
      +----------------+--------+
      | Last decr time | LOG_C  |
      +----------------+--------+

The high 16 bits record the time of the last counter decay, in minutes, and the low 8 bits hold the counter value itself.
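
As a quick illustration of this layout (a standalone sketch, not Redis code; the names simply mirror the diagram above), the two parts can be recovered with a shift and a mask:


#include <stdio.h>

/* Standalone sketch: unpack a 24-bit lru value as used by the LFU policy. */
static void unpack_lfu(unsigned int lru) {
    unsigned int ldt     = lru >> 8;    /* high 16 bits: last decrement time, in minutes */
    unsigned int counter = lru & 255;   /* low 8 bits: logarithmic access counter */
    printf("ldt = %u, counter = %u\n", ldt, counter);
}

int main(void) {
    /* e.g. minute timestamp 0x1234 packed together with counter 5 */
    unpack_lfu((0x1234u << 8) | 5);
    return 0;
}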

LFU configuration

Starting with Redis 4.0, two LFU modes were added to the maxmemory_policy eviction strategies:

volatile-lfu: use the LFU eviction algorithm only for keys that have an expiration time
allkeys-lfu: use the LFU eviction algorithm for all keys

There are also two configurations to adjust the LFU algorithm:


lfu-log-factor 10
lfu-decay-time 1

lfu-log-factor adjusts the growth rate of the counter: the larger lfu-log-factor is, the more slowly the counter grows.

lfu-decay-time is a value in minutes that controls how fast the counter decays.
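
As a quick way to experiment (a sketch assuming a Redis 4.0+ instance; mykey is just an illustrative key name), the policy and both parameters can be changed at runtime with CONFIG SET, and OBJECT FREQ reports a key's current counter while an LFU policy is active:


127.0.0.1:6379> CONFIG SET maxmemory-policy allkeys-lfu
OK
127.0.0.1:6379> CONFIG SET lfu-log-factor 10
OK
127.0.0.1:6379> CONFIG SET lfu-decay-time 1
OK
127.0.0.1:6379> SET mykey hello
OK
127.0.0.1:6379> OBJECT FREQ mykey
(integer) 5

A freshly created key typically reports 5 here because new objects start from LFU_INIT_VAL, as described in the new key strategy section below.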

Source code implementation

In lookupKey:


robj *lookupKey(redisDb *db, robj *key, int flags) {
  dictEntry *de = dictFind(db->dict,key->ptr);
  if (de) {
    robj *val = dictGetVal(de);

    /* Update the access time for the ageing algorithm.
     * Don't do it if we have a saving child, as this will trigger
     * a copy on write madness. */
    if (server.rdb_child_pid == -1 &&
      server.aof_child_pid == -1 &&
      !(flags & LOOKUP_NOTOUCH))
    {
      if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        updateLFU(val);
      } else {
        val->lru = LRU_CLOCK();
      }
    }
    return val;
  } else {
    return NULL;
  }
}

When the LFU policy is in use, updateLFU updates the lru field:


/* Update LFU when an object is accessed.
 * Firstly, decrement the counter if the decrement time is reached.
 * Then logarithmically increment the counter, and update the access time. */
void updateLFU(robj *val) {
  unsigned long counter = LFUDecrAndReturn(val);
  counter = LFULogIncr(counter);
  val->lru = (LFUGetTimeInMinutes()<<8) | counter;
}

Counter decay: LFUDecrAndReturn

First, LFUDecrAndReturn decays the counter:


/* If the object decrement time is reached decrement the LFU counter but
 * do not update LFU fields of the object, we update the access time
 * and counter in an explicit way when the object is really accessed.
 * And we will times halve the counter according to the times of
 * elapsed time than server.lfu_decay_time.
 * Return the object frequency counter.
 *
 * This function is used in order to scan the dataset for the best object
 * to fit: as we check for the candidate, we incrementally decrement the
 * counter of the scanned objects if needed. */
unsigned long LFUDecrAndReturn(robj *o) {
  unsigned long ldt = o->lru >> 8;
  unsigned long counter = o->lru & 255;
  unsigned long num_periods = server.lfu_decay_time ? LFUTimeElapsed(ldt) / server.lfu_decay_time : 0;
  if (num_periods)
    counter = (num_periods > counter) ? 0 : counter - num_periods;
  return counter;
}

The function first extracts the last decay time ldt from the high 16 bits and the counter from the low 8 bits, then works out how much the counter should be decremented based on the configured lfu_decay_time.

LFUTimeElapsed is used to calculate the difference between the current time and ldt:


/* Return the current time in minutes, just taking the least significant
 * 16 bits. The returned time is suitable to be stored as LDT (last decrement
 * time) for the LFU implementation. */
unsigned long LFUGetTimeInMinutes(void) {
  return (server.unixtime/60) & 65535;
}

/* Given an object last access time, compute the minimum number of minutes
 * that elapsed since the last access. Handle overflow (ldt greater than
 * the current 16 bits minutes time) considering the time as wrapping
 * exactly once. */
unsigned long LFUTimeElapsed(unsigned long ldt) {
  unsigned long now = LFUGetTimeInMinutes();
  if (now >= ldt) return now-ldt;
  return 65535-ldt+now;
}

Specifically, the current time is converted to minutes and only the low 16 bits are kept; the elapsed time is then now - ldt. When ldt > now, the clock is assumed to have wrapped around exactly once (16 bits, maximum 65535), so the elapsed time is 65535 - ldt + now.

The elapsed time is then divided by the configured lfu_decay_time, i.e. LFUTimeElapsed(ldt) / server.lfu_decay_time. If n whole lfu_decay_time periods have passed, the counter is decremented by n; in other words, the counter goes down by 1 for every lfu_decay_time minutes without an access.
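
A quick worked example: with lfu_decay_time set to 1 and a key that has not been accessed for 20 minutes, num_periods = 20 / 1 = 20, so a counter of 30 drops to 10; with lfu_decay_time set to 2 the same gap gives num_periods = 10 and the counter drops to 20. If num_periods exceeds the counter, the counter simply bottoms out at 0.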

Counter growth: LFULogIncr

The growth function LFULogIncr is as follows:


/* Logarithmically increment a counter. The greater the current counter value,
 * the less likely it is to actually be incremented. Saturate it at 255. */
uint8_t LFULogIncr(uint8_t counter) {
  if (counter == 255) return 255;
  double r = (double)rand()/RAND_MAX;
  double baseval = counter - LFU_INIT_VAL;
  if (baseval < 0) baseval = 0;
  double p = 1.0/(baseval*server.lfu_log_factor+1);
  if (r < p) counter++;
  return counter;
}

counter is not simply incremented by 1 on each access; instead a probability p between 0 and 1 controls the growth, and counter saturates at 255. A random number r between 0 and 1 is drawn and compared with p; counter is incremented only when r < p (somewhat like the strategy used to control output in Bitcoin). p depends on the current counter value and on lfu_log_factor: the larger they are, the smaller p becomes, and the less likely counter is to grow. For example, with counter = 10, lfu_log_factor = 10 and LFU_INIT_VAL = 5, baseval = 5 and p = 1 / (5 * 10 + 1) ≈ 0.02, so roughly one access in fifty increments the counter. The resulting growth looks like this:

+--------+----------+-----------+-----------+---------+----------+
| factor | 100 hits | 1000 hits | 100K hits | 1M hits | 10M hits |
+--------+----------+-----------+-----------+---------+----------+
| 0      | 104      | 255       | 255       | 255     | 255      |
+--------+----------+-----------+-----------+---------+----------+
| 1      | 18       | 49        | 255       | 255     | 255      |
+--------+----------+-----------+-----------+---------+----------+
| 10     | 10       | 18        | 142       | 255     | 255      |
+--------+----------+-----------+-----------+---------+----------+
| 100    | 8        | 11        | 49        | 143     | 255      |
+--------+----------+-----------+-----------+---------+----------+

As the table shows, counter grows roughly logarithmically with the number of accesses: the more accesses accumulate, the more slowly counter rises.
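
To get a feel for where these numbers come from, here is a minimal standalone simulation (not taken from the Redis source; it inlines the LFULogIncr logic and assumes LFU_INIT_VAL = 5 and lfu-log-factor = 10) that repeatedly "accesses" a key and prints the final counter:


#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#define LFU_INIT_VAL   5   /* default initial counter in Redis */
#define LFU_LOG_FACTOR 10  /* corresponds to lfu-log-factor */

/* Same probabilistic increment as LFULogIncr, with the server config inlined. */
static uint8_t simulate_incr(uint8_t counter) {
    if (counter == 255) return 255;
    double r = (double)rand() / RAND_MAX;
    double baseval = counter - LFU_INIT_VAL;
    if (baseval < 0) baseval = 0;
    double p = 1.0 / (baseval * LFU_LOG_FACTOR + 1);
    if (r < p) counter++;
    return counter;
}

int main(void) {
    long hits[] = {100, 1000, 100000, 1000000};
    srand((unsigned)time(NULL));
    for (int i = 0; i < 4; i++) {
        uint8_t counter = LFU_INIT_VAL;
        for (long j = 0; j < hits[i]; j++)
            counter = simulate_incr(counter);
        printf("%8ld hits -> counter = %d\n", hits[i], counter);
    }
    return 0;
}

The output fluctuates a little from run to run because the increments are probabilistic, but it should land close to the factor = 10 row of the table above.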

New key strategy

One more problem remains: if a newly created object started with a counter of 0, it would be evicted very easily, so new keys need an initial counter value. createObject handles this:


robj *createObject(int type, void *ptr) {
  robj *o = zmalloc(sizeof(*o));
  o->type = type;
  o->encoding = OBJ_ENCODING_RAW;
  o->ptr = ptr;
  o->refcount = 1;

  /* Set the LRU to the current lruclock (minutes resolution), or
   * alternatively the LFU counter. */
  if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
    o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
  } else {
    o->lru = LRU_CLOCK();
  }
  return o;
}

counter is initialized to LFU_INIT_VAL, which is 5 by default. Note that LFULogIncr subtracts LFU_INIT_VAL when computing baseval, so for a brand-new key baseval is 0 and p is 1, meaning its first access is guaranteed to increment the counter.

Eviction pool

The eviction pool works the same way as in the LRU algorithm:


    if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
      server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)

What differs is the calculation of idle: under LFU, 255 - LFUDecrAndReturn(o) is used as the sorting key, so the less frequently a key has been accessed, the larger its idle value and the earlier it gets evicted.
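
For reference, the relevant branch in evictionPoolPopulate looks roughly like this (paraphrased from the Redis 4.0 sources; check evict.c in your version for the exact code):


if (server.maxmemory_policy & MAXMEMORY_FLAG_LRU) {
    idle = estimateObjectIdleTime(o);
} else if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
    /* Under LFU we want to evict the key with the lowest frequency first,
     * so the pool stores the inverted frequency: rarely accessed keys get
     * a large "idle" value and are chosen for eviction earlier. */
    idle = 255 - LFUDecrAndReturn(o);
} else {
    /* ... volatile-ttl handling omitted ... */
}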

References

Random notes on improving the Redis LRU algorithm
Using Redis as an LRU cache

conclusion

Since Redis 4.0, the volatile-lfu and allkeys-lfu policies reuse the 24-bit lru field of redisObject to store a minute-resolution decay timestamp in the high 16 bits and a logarithmic counter in the low 8 bits. The counter grows probabilistically under the control of lfu-log-factor, decays over time under the control of lfu-decay-time, and starts at LFU_INIT_VAL for new keys; 255 minus the counter serves as the eviction pool's sorting key, so the least frequently accessed keys are evicted first.