Redis's new lazy delete (Lazy Free) feature in detail


preface

Redis 4.0 adds a very practical feature called lazy free, which addresses the risks of deleting big keys. The author has also run into several availability and performance failures caused by big key deletion while operating and maintaining redis.

This paper is divided into the following sections to illustrate redis lazy free:

Definition of lazy free
Why do we need lazy free
Use of lazy free
Monitoring of lazy free
Simple analysis of the lazy free implementation

Definition of lazy free

lazy free can be translated as lazy delete or delayed release. When deleting a key, redis provides the ability to release the key's memory asynchronously: the release operation is placed in a separate bio (Background I/O) sub-thread, which reduces how long the redis main thread is blocked by deleting a big key and effectively avoids the performance and availability problems that come with it.

Why do we need lazy free

Redis is a single-threaded program (apart from a small number of bio tasks). When it runs a time-consuming request, all other requests have to queue up behind it; redis cannot respond to them, which causes performance problems and can even trigger a cluster failover.
Deleting a large collection key is exactly this kind of time-consuming request: according to tests, deleting a collection key with 1 million elements takes about 1000ms.

In the following test, deleting a hash key with 1 million fields takes 1360ms. While this DEL request is being processed, all other requests are completely blocked.


// delete a hash key with 1 million fields
127.0.0.1:6379> HLEN hlazykey
(integer) 1000000
127.0.0.1:6379> del hlazykey
(integer) 1
(1.36s)
127.0.0.1:6379> SLOWLOG get
1) 1) (integer) 0
   2) (integer) 1501314385
   3) (integer) 1360908
   4) 1) "del"
      2) "hlazykey"
   5) "127.0.0.1:35595"
   6) ""

The figures below are rough estimates for reference only; actual times depend on the hardware environment, Redis version, and load.

Key type      Number of elements    Time taken
Hash          ~1 million            ~1000ms
List          ~1 million            ~1000ms
Set           ~1 million            ~1000ms
Sorted Set    ~1 million            ~1000ms

Before redis 4.0 there was no lazy free. To delete a big key, a DBA could only work around the problem by scanning through the big key and deleting a small batch of elements (say, 100) at a time. But when the key is deleted passively, this workaround no longer helps; see the sketch below.
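
A rough sketch of that pre-4.0 workaround, assuming a hypothetical big hash key hbigkey (field names, counts, and the cursor value are purely illustrative):


127.0.0.1:6379> HSCAN hbigkey 0 COUNT 100   // fetch roughly 100 fields per iteration
1) "3932160"                                // cursor to pass to the next HSCAN
2) 1) "field1"
   2) "value1"
   3) "field2"
   4) "value2"
   ...
127.0.0.1:6379> HDEL hbigkey field1 field2   // delete the fields just returned
(integer) 2
// repeat HSCAN/HDEL with the returned cursor until it comes back as 0,
// then the almost-empty key can be removed with a cheap DEL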

For example, on one of our large production Redis Cluster deployments, the business gradually wrote a Hash key with more than 20 million fields and a TTL. When this key expired, redis started to clean it up passively, which blocked redis for more than 20 seconds; the master of that shard could not serve requests during that time, and a failover of the master was triggered.

With lazy free in redis 4.0, actively or passively deleting a big key takes about as long as an O(1) command and returns at the sub-millisecond level; the time-consuming work of actually releasing the elements is performed by a bio background task.

Use of lazy free

The use of lazy free falls into two categories: the first is active deletion, corresponding to the DEL command; the second is passive deletion, i.e. expired-key cleanup and maxmemory eviction.

Active key deletion with lazy free

UNLINK command

The UNLINK command is the lazy free implementation of key deletion; it removes keys just like DEL.

The only difference is that when UNLINK deletes a collection-type key whose number of elements is greater than 64 (more on this threshold later), the actual memory release is handed off to a separate bio thread.

Here's an example: using the UNLINK command to remove a big key, mylist, which contains 2 million elements, takes only about 0.03 milliseconds.


127.0.0.1:7000> LLEN mylist
(integer) 2000000
127.0.0.1:7000> UNLINK mylist
(integer) 1
127.0.0.1:7000> SLOWLOG get
1) 1) (integer) 1
   2) (integer) 1505465188
   3) (integer) 30
   4) 1) "UNLINK"
      2) "mylist"
   5) "127.0.0.1:17015"
   6) ""

Note: the DEL command remains a synchronous, blocking delete operation

FLUSHALL/FLUSHDB ASYNC

By adding the ASYNC option to FLUSHALL/FLUSHDB, redis cleans up the entire instance or a single DB asynchronously.


127.0.0.1:7000> DBSIZE
(integer) 1812295
127.0.0.1:7000> flushall  // synchronously flush the instance; ~1.8 million keys took 1020 ms
OK
(1.02s)
127.0.0.1:7000> DBSIZE
(integer) 1812637
127.0.0.1:7000> flushall async  // asynchronously flush the instance; ~1.8 million keys took about 9 ms
OK
127.0.0.1:7000> SLOWLOG get
1) 1) (integer) 2996109
   2) (integer) 1505465989
   3) (integer) 9274  // the command ran for 9.2 ms
   4) 1) "flushall"
      2) "async"
   5) "127.0.0.1:20110"
   6) ""

Passive key deletion with lazy free

lazy free also applies to passive deletion. There are currently four scenarios, each controlled by its own configuration parameter; all of them are off by default:


lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
slave-lazy-flush no
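
These parameters can also be changed at runtime with CONFIG SET, without a restart; for example (illustrative session):


127.0.0.1:6379> CONFIG SET lazyfree-lazy-expire yes
OK
127.0.0.1:6379> CONFIG GET lazyfree-lazy-expire
1) "lazyfree-lazy-expire"
2) "yes"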

Note: tests show that lazy free reclaims memory quite efficiently. Still, in production, enable passive-deletion lazy free only after considering your actual workload, and keep observing redis memory usage afterwards.

lazyfree-lazy-eviction

Controls whether the lazy free mechanism is used when redis reaches maxmemory and passively evicts keys according to the configured eviction policy.
Because with lazy free enabled the memory of an evicted key may not be released immediately, redis memory may temporarily exceed the maxmemory limit. Test this scenario against your workload before enabling it.
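
A minimal illustrative redis.conf fragment for this scenario (the maxmemory value and eviction policy are only examples, not recommendations):


maxmemory 4gb
maxmemory-policy allkeys-lru
lazyfree-lazy-eviction yes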

lazyfree-lazy-expire (author's TODO: verify whether such deletions are propagated to the slave as DEL or UNLINK)

Controls whether the lazy free mechanism is used when a key with a TTL expires and is cleaned up and deleted by redis.
Enabling this scenario is recommended, because expired-key cleanup itself adjusts its pace adaptively.

lazyfree-lazy-server-del

Some commands perform an implicit DEL of an existing key as part of their processing. For example, when the target key of a RENAME already exists, redis deletes it first; if that target key is a big key, this introduces the blocking-deletion performance problem. This parameter addresses exactly that case and is recommended to be turned on.
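
For example, in a session like the following (key names are hypothetical), the old value of destkey must be released as part of RENAME; with lazyfree-lazy-server-del yes that release is handed to the bio thread instead of blocking the main thread:


127.0.0.1:6379> LPUSH destkey a b c        // destkey already exists (imagine it is a big list)
(integer) 3
127.0.0.1:6379> SET srckey v1
OK
127.0.0.1:6379> RENAME srckey destkey      // implicitly deletes the old destkey value first
OK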

slave-lazy-flush

Used during full resynchronization of a slave: before loading the master's RDB file, the slave flushes its own data.
This parameter decides whether that flush uses the asynchronous mechanism. If the resulting memory fluctuation is acceptable, enabling it is recommended: it shortens the full-sync time and thus reduces the memory growth on the master caused by its output buffer ballooning.

lazy free monitoring

lazy free exposes only one monitoring metric: lazyfree_pending_objects, the number of keys that redis has put through lazy free and that are waiting for their memory to actually be reclaimed. It does not reflect the number of elements in a single key, nor the amount of memory waiting to be reclaimed.

So this value has some reference value: it can be used to watch the efficiency of redis lazy free and how many keys are piling up; for example, in the flushall async scenario there will be a small backlog.
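
The metric appears in the Memory section of the INFO command output; a quick illustrative check:


127.0.0.1:6379> INFO memory
# Memory
...
lazyfree_pending_objects:0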

Simple analysis of lazy free implementation

To implement lazy free, antirez modified many underlying structures and key functions. This section only walks through the functional logic of lazy free; the code lives mainly in the source files lazyfree.c and bio.c.

UNLINK command

The entry function of the unlink command, unlinkCommand(), calls the same delGenericCommand() as del to delete the KEY, with a lazy flag indicating whether the call is lazyfree. If it is lazyfree, the dbAsyncDelete() function is called.

However, unlink does not always go through lazy free: redis first estimates the cost of releasing the KEY, and only uses lazy free when the cost is greater than LAZYFREE_THRESHOLD.

The release cost of a key is computed by lazyfreeGetFreeEffort(): for a collection-type key with the corresponding encoding, the cost is the number of elements in the key; otherwise the cost is 1.
For example:

A list key containing 100 elements has a free cost of 100
A 512MB string key has a free cost of 1

Therefore, the cost computed for lazy free is mainly related to the time complexity of freeing the key, not to the amount of memory it occupies.

lazyfreeGetFreeEffort() function code


size_t lazyfreeGetFreeEffort(robj *obj) {
    if (obj->type == OBJ_LIST) {
        quicklist *ql = obj->ptr;
        return ql->len;
    } else if (obj->type == OBJ_SET && obj->encoding == OBJ_ENCODING_HT) {
        dict *ht = obj->ptr;
        return dictSize(ht);
    } else if (obj->type == OBJ_ZSET && obj->encoding == OBJ_ENCODING_SKIPLIST){
        zset *zs = obj->ptr;
        return zs->zsl->length;
    } else if (obj->type == OBJ_HASH && obj->encoding == OBJ_ENCODING_HT) {
        dict *ht = obj->ptr;
        return dictSize(ht);
    } else {
        return 1; /* Everything else is a single allocation. */
    }
}

Part of the code for dbAsyncDelete() function


#define LAZYFREE_THRESHOLD 64 // lazy free is only used when the cost of freeing a key exceeds 64
int dbAsyncDelete(redisDb *db, robj *key) {
    /* Deleting an entry from the expires dict will not free the sds of
     * the key, because it is shared with the main dictionary. */
    if (dictSize(db->expires) > 0) dictDelete(db->expires,key->ptr); // remove the key from expires directly
    dictEntry *de = dictUnlink(db->dict,key->ptr); // unlink the entry, but do not free it yet
    if (de) {
        robj *val = dictGetVal(de);
        size_t free_effort = lazyfreeGetFreeEffort(val); // estimate the cost of freeing this key
        /* If releasing the object is too much work, let's put it into the
         * lazy free list. */
        if (free_effort > LAZYFREE_THRESHOLD) { // cost > 64: put the value on the lazy free list so a bio sub-thread does the actual free instead of the main thread
            atomicIncr(lazyfree_objects,1); // pending lazyfree object count + 1 (visible via the info command)
            bioCreateBackgroundJob(BIO_LAZY_FREE,val,NULL,NULL);
            dictSetVal(db->dict,de,NULL);
        }
    }
}

The bio thread then calls lazyfreeFreeObjectFromBioThread() to actually free the key's value:


void lazyfreeFreeObjectFromBioThread(robj *o) {
    decrRefCount(o); // drop the reference; the type-specific free function is called accordingly
    atomicDecr(lazyfree_objects,1); // the key has been freed: decrement the pending lazyfree count
}

flushall/flushdb async command

When flushall/flushdb is called with async, emptyDb() calls emptyDbAsync() to handle the lazy free logic for the whole instance or for a single DB.
The processing logic of emptyDbAsync() is as follows:


/* Empty a Redis DB asynchronously. What the function does actually is to
 * create a new empty set of hash tables and scheduling the old ones for
 * lazy freeing. */
void emptyDbAsync(redisDb *db) {
    dict *oldht1 = db->dict, *oldht2 = db->expires; // keep references to the db's two old hash tables
    db->dict = dictCreate(&dbDictType,NULL); // create two new empty hash tables for the db
    db->expires = dictCreate(&keyptrDictType,NULL);
    atomicIncr(lazyfree_objects,dictSize(oldht1)); // add the db's key count to the pending lazyfree count
    bioCreateBackgroundJob(BIO_LAZY_FREE,NULL,oldht1,oldht2); // queue the old tables as a bio job
}

The bio thread then calls lazyfreeFreeDatabaseFromBioThread() to actually free the db:


void lazyfreeFreeDatabaseFromBioThread(dict *ht1, dict *ht2) {
    size_t numkeys = dictSize(ht1);
    dictRelease(ht1);
    dictRelease(ht2);
    atomicDecr(lazyfree_objects,numkeys); // the whole DB has been freed: subtract its keys from the pending lazyfree count
}

Passive key deletion with lazy free

There are four passive deletion scenarios. In each of them redis checks whether the corresponding parameter is enabled; if it is, the corresponding lazy free function described above is called.
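
As a rough illustration, the dispatch on the expired-key path looks like the following simplified fragment, based on expireIfNeeded() in db.c of the 4.0 source; the other three scenarios check their own option and choose between the synchronous and asynchronous delete in the same way:


/* Simplified sketch: when lazyfree-lazy-expire is enabled, the expired key
 * goes through dbAsyncDelete() shown above; otherwise the blocking
 * dbSyncDelete() is used. */
propagateExpire(db,key,server.lazyfree_lazy_expire);
return server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) :
                                     dbSyncDelete(db,key);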

conclusion

Since Redis processes commands in a single main thread, antirez puts it bluntly: "Lazy Redis is better Redis".

The essence of lazy free is to move costly deletions (costly mainly in terms of time complexity, i.e. the CPU time they occupy on the main thread) off the redis main thread and let the bio sub-thread handle them, which greatly reduces the main thread's blocking time and thus the performance and stability problems caused by deletions.

