Redis's two persistence schemes in detail

  • 2020-06-19 12:00:34
  • OfStack

Preface

Redis is an advanced key-value database. It is similar to memcached, but its data can be persisted and it supports a rich set of data types: strings, lists, sets, and sorted sets. It can compute the union, intersection, and difference of sets on the server side, and it supports a variety of sorting functions. For this reason Redis can also be seen as a data structure server.

All Redis data is held in memory and saved to disk asynchronously from time to time (this is called "semi-persistent mode"). You can also write every data change to an append-only file (AOF) (this is called "full persistence mode").

Life is a series of experiments, and after the network and multithreading series this blogger will keep experimenting and not give up. This is a new series. There is more to caching than create, read, update, and delete, and I think it is worth studying thoroughly. This series records my learning process and understanding.

RDB

What is RDB

RDB persists the data in Redis periodically, according to the check intervals and backup triggers set in the configuration file.

The advantages of RDB persistence

  • RDB generates multiple data files, each representing the Redis data at a given moment. This makes it well suited to cold backups.
  • RDB has very little impact on the read and write services that Redis provides, so Redis keeps its high performance: the main Redis process only needs to fork a child process, and the child performs the disk IO for RDB persistence.
  • Compared with AOF, restarting and restoring the Redis process directly from an RDB data file is faster.
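The fork-based snapshot described above can be illustrated with a minimal sketch, assuming a Unix-like system where os.fork is available. The function name bgsave_sketch and the file name toy-dump.json are hypothetical, and JSON stands in for the real RDB binary format. The child sees memory as it was at fork time, so the parent can keep accepting writes while the child does the disk IO.

```python
import json
import os
import tempfile


def bgsave_sketch(data, path):
    """Fork a child that writes a point-in-time snapshot of `data` to `path`.

    Hypothetical helper for illustration only; this is not Redis code.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: its copy of `data` is frozen at the moment of the fork.
        status = 1
        try:
            tmp_path = f"{path}.tmp-{os.getpid()}"
            with open(tmp_path, "w", encoding="utf-8") as f:
                json.dump(data, f)
                f.flush()
                os.fsync(f.fileno())
            # Atomically move the finished snapshot into place.
            os.replace(tmp_path, path)
            status = 0
        finally:
            # Always exit the child here so it never runs the parent's code.
            os._exit(status)
    return pid  # Parent process: returns immediately with the child's pid.


data = {"counter": 1}
path = os.path.join(tempfile.mkdtemp(), "toy-dump.json")

child = bgsave_sketch(data, path)
data["counter"] = 2  # the parent keeps serving writes during the snapshot
os.waitpid(child, 0)

with open(path, encoding="utf-8") as f:
    snapshot = json.load(f)
```

The snapshot holds the value from the moment of the fork, while the parent's in-memory data has already moved on. That gap is exactly the data that would be lost if the process died before the next snapshot.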

Disadvantages of RDB persistence

  • If you want to lose as little data as possible when Redis fails, RDB is not as good as AOF. RDB snapshot files are generally generated every 5 minutes or longer, so you have to accept that if the Redis process goes down, the data written since the last snapshot (the last 5 minutes or more) is lost. This is the biggest disadvantage of RDB and the reason it is not suitable as the first-priority recovery scheme: relying on it alone means more data loss.
  • Every time RDB forks a child process to generate a snapshot file, a very large data set may cause the service to clients to pause for several milliseconds, or even several seconds. A very large RDB file may also affect the performance of Redis itself.

How do I configure RDB persistence for redis

Persistence is configured in the redis.conf file:

save 60 1000

Every 60 seconds, if at least 1000 keys have changed, a new dump.rdb file is generated, which is a complete snapshot of the data currently in Redis memory. This operation is also called snapshotting.

You can also invoke the save or bgsave command manually to generate an RDB snapshot synchronously or asynchronously. (save blocks the Redis main thread while the dump.rdb file is generated; bgsave does not.)

Multiple save lines can be set, giving multiple snapshotting checkpoints. At each checkpoint Redis checks whether the specified number of keys has changed, and if so, a new dump.rdb file is generated.
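As a rough illustration of how several save checkpoints combine, here is a minimal sketch. The names SavePoint, parse_save_points, and should_snapshot are hypothetical helpers, not part of Redis, and the three save lines are only sample values. The sketch assumes a snapshot is triggered as soon as any one checkpoint has both enough elapsed time and enough changed keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SavePoint:
    seconds: int  # time that must have passed since the last snapshot
    changes: int  # minimum number of key changes in that time


def parse_save_points(conf_text):
    """Collect every `save <seconds> <changes>` line from redis.conf-style text."""
    points = []
    for line in conf_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "save":
            points.append(SavePoint(int(parts[1]), int(parts[2])))
    return points


def should_snapshot(points, seconds_since_last_save, changes_since_last_save):
    """Return True if at least one checkpoint is satisfied (assumed rule)."""
    return any(
        seconds_since_last_save > p.seconds
        and changes_since_last_save >= p.changes
        for p in points
    )


# Sample configuration with three checkpoints (illustrative values).
conf = """
save 900 1
save 300 10
save 60 1000
"""
points = parse_save_points(conf)

busy = should_snapshot(points, seconds_since_last_save=61, changes_since_last_save=1000)
quiet = should_snapshot(points, seconds_since_last_save=61, changes_since_last_save=999)
```

With these sample values, a busy instance snapshots after about a minute, while an instance with only a handful of changes waits for one of the longer checkpoints.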

AOF

What is AOF

The AOF mechanism logs every write command, appending it to a log file in append-only mode. When Redis restarts, the entire data set can be rebuilt by replaying the write commands in the AOF log.
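The append-and-replay idea can be sketched in a few lines. This is a toy model under stated assumptions: the tab-separated line format, the file name toy.aof, and the helpers append_command and replay are invented for illustration, and only SET, DEL, and FLUSHALL are handled. The real AOF file uses the Redis protocol encoding, not this format.

```python
import os
import tempfile


def append_command(log_path, *args):
    """Append one write command to the end of the log (toy tab-separated format)."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write("\t".join(args) + "\n")


def replay(log_path):
    """Rebuild the data set by re-executing every logged command in order."""
    data = {}
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            cmd, *rest = line.rstrip("\n").split("\t")
            if cmd == "SET":
                key, value = rest
                data[key] = value
            elif cmd == "DEL":
                data.pop(rest[0], None)
            elif cmd == "FLUSHALL":
                data.clear()
    return data


log_path = os.path.join(tempfile.mkdtemp(), "toy.aof")
append_command(log_path, "SET", "user:1", "alice")
append_command(log_path, "SET", "user:2", "bob")
append_command(log_path, "DEL", "user:1")

# Simulate a restart: nothing is in memory, everything comes from the log.
restored = replay(log_path)
```

Recovery time grows with the number of logged commands, not with the size of the final data set, which is why replaying a long log is slower than loading a snapshot.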

The advantages of AOF persistence

  • AOF provides better protection against data loss. AOF typically performs an fsync every 1 second from a background thread. (fsync makes sure that all changes are properly synchronized to the hard disk; the call blocks until the device reports that the IO is complete.) Because the data in the OS cache is written to disk every second, at most 1 second of data is lost if the Redis process dies.
  • The AOF log is written in append-only mode, so there is no disk seek overhead and write performance is very high. The file is also hard to corrupt, and even if the tail of the file is damaged it is easy to repair.
  • Even when the AOF log grows too large, the background rewrite does not affect client reads and writes. During the rewrite, the commands are compacted into the minimum log needed to recover the data. While the new log file is being created, the old log file is still written as usual. Once the new file is ready, the old and new log files are swapped.
  • Commands are recorded in a very readable form, which is ideal for emergency recovery from a catastrophic deletion. For example, if someone accidentally runs the flushall command and clears all the data, then as long as a background rewrite has not happened yet, you can immediately copy the AOF file, delete the final flushall command, put the AOF file back, and let the recovery mechanism restore all the data automatically.
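Two of the points above, log compaction during a rewrite and recovery from an accidental flushall, can be sketched on an in-memory command list. This is a simplified model: commands are hypothetical (name, key, value) tuples and the helpers apply, rewrite, and drop_trailing_flushall are invented for illustration. A real AOF file would have to be edited in its own on-disk format.

```python
def apply(commands):
    """Replay (name, key, value) tuples into a dict, as a restart would."""
    data = {}
    for name, key, value in commands:
        if name == "SET":
            data[key] = value
        elif name == "DEL":
            data.pop(key, None)
        elif name == "FLUSHALL":
            data.clear()
    return data


def rewrite(state):
    """Build a minimal log from the current in-memory state, not from the old log."""
    return [("SET", key, value) for key, value in state.items()]


def drop_trailing_flushall(commands):
    """Return the log without a final FLUSHALL, if there is one."""
    if commands and commands[-1][0] == "FLUSHALL":
        return commands[:-1]
    return commands


old_log = [
    ("SET", "counter", "1"),
    ("SET", "counter", "2"),
    ("SET", "tmp", "x"),
    ("DEL", "tmp", None),
    ("SET", "counter", "3"),
]

# Rewrite: five commands collapse into the single one needed to rebuild the data.
new_log = rewrite(apply(old_log))

# Accident: someone runs FLUSHALL, and a restart would now load an empty data set.
damaged_log = old_log + [("FLUSHALL", None, None)]
recovered = apply(drop_trailing_flushall(damaged_log))
```

The recovery trick only works while the flushall is still the last entry in the log. After a rewrite the log is rebuilt from the (now empty) memory, and the earlier commands are gone.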

Disadvantages of the AOF persistence mechanism

  • For the same data, the AOF log file is usually larger than the RDB snapshot file.
  • With AOF turned on, the write QPS Redis can support is lower than with RDB, because AOF is normally configured to fsync the log once per second. Performance with fsync every second is still very high, but if you want to guarantee that not a single piece of data is lost you can set AOF to fsync on every write, and then Redis QPS drops much further.
  • AOF has had bugs in the past where the data recovered from the AOF log was not exactly the same as the original data. A command-based log/merge/replay approach like AOF is therefore more complex and more fragile than the RDB approach of persisting one complete snapshot at a time. To avoid bugs caused by the rewrite process, however, each AOF rewrite is not based on the old command log but rebuilds the commands from the data in memory at that time, which is much more robust.
  • The one big disadvantage is that data recovery is slow, and cold backups and regular backups are inconvenient: you may have to write complex scripts yourself, which is not appropriate. An RDB file is a data file that is loaded straight into memory on recovery. AOF, on the other hand, recovers by replaying and executing every logged command to rebuild the data in memory.

How do I configure AOF persistence for redis

AOF persistence is turned off by default, while RDB persistence is turned on by default.

Setting appendonly yes turns on the AOF persistence mechanism. In production, AOF is generally turned on unless losing a few minutes of data does not matter. When AOF persistence is enabled, every write Redis receives is written to the log file, first into the OS cache and then fsynced to disk at regular intervals.

If both AOF and RDB are turned on, Redis prefers AOF for data recovery when it restarts, because the AOF data is more complete.

The fsync policy of AOF can be configured. There are three policies to choose from:

  • always: every time a piece of data is written, the corresponding log entry is fsynced to disk immediately. Performance is very poor and throughput is very low. Use it only if you must make sure not a single piece of data in Redis is lost.
  • everysec: fsync the data from the OS cache to disk once per second. This is the most common configuration in production.
  • no: Redis only writes the data to the OS cache, and the OS flushes it to disk from time to time according to its own strategy.
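In redis.conf the policy is selected with the appendfsync directive. The difference between the three policies can be made concrete with a small sketch. The AofWriter class and count_fsyncs helper are hypothetical, and the sketch makes one simplifying assumption: the everysec fsync runs inline on the next write once a second has passed, whereas Redis performs it from a background thread. A fake clock is injected so the behaviour does not depend on real time.

```python
import os
import tempfile
import time


class AofWriter:
    """Toy append-only writer with the three fsync policies (not Redis code)."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown fsync policy: {policy}")
        self.policy = policy
        self.clock = clock
        self.file = open(path, "ab")
        self.last_fsync = clock()
        self.fsync_calls = 0  # counted only so the behaviour can be observed

    def append(self, command):
        self.file.write(command + b"\n")
        self.file.flush()  # hand the bytes to the OS cache
        if self.policy == "always":
            self._fsync()
        elif self.policy == "everysec" and self.clock() - self.last_fsync >= 1.0:
            self._fsync()
        # policy "no": leave flushing to the operating system

    def _fsync(self):
        os.fsync(self.file.fileno())  # block until the OS reports the write is on disk
        self.last_fsync = self.clock()
        self.fsync_calls += 1

    def close(self):
        self.file.close()


def count_fsyncs(policy, writes=5, seconds_per_write=0.5):
    """Run `writes` appends against a fake clock and report how many fsyncs happened."""
    now = [0.0]
    path = os.path.join(tempfile.mkdtemp(), "toy.aof")
    writer = AofWriter(path, policy, clock=lambda: now[0])
    for i in range(writes):
        now[0] += seconds_per_write
        writer.append(b"SET key %d" % i)
    writer.close()
    return writer.fsync_calls


results = {policy: count_fsyncs(policy) for policy in ("always", "everysec", "no")}
```

For five writes spread over 2.5 simulated seconds, always pays for an fsync on every write, everysec pays for two, and no pays for none. This is the trade-off between throughput and the amount of data that can be lost.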

Conclusion
