A few tips for getting Redis to play a bigger role in your system


Redis differs from other database solutions in many ways: it uses memory as its primary storage and the hard disk only for persistence; its data model is unusual; and it is single-threaded. Another big difference is that you can take advantage of Redis's capabilities from your existing development environment without actually switching to it.

Switching to Redis outright is certainly an option; many developers use Redis as their primary database from the start. But if your environment is already set up and your application is already running on it, swapping out the database layer is not that easy. There is also the fact that Redis is not suitable for applications with very large data sets, because a Redis data set cannot exceed the available memory of the system. So if you have a big-data application that is mostly read-heavy, Redis is not the right choice.

What I like about Redis, however, is that you can integrate it into an existing system to solve problems your current database handles slowly, and use it to optimize your application or build new features. In this article I want to explore how to add Redis to an existing environment and use its primitive commands to solve problems that are awkward in a traditional setup. In none of these examples is Redis the primary database.

Showing the latest items

The following statement is often used to show the latest items, and as the data grows the query inevitably gets slower and slower.


SELECT * FROM foo WHERE ... ORDER BY time DESC LIMIT 10

Queries like "list the latest replies" are common in web applications, and they often lead to scalability problems. This is frustrating, because the items were created in exactly that order, yet you still have to sort them to get them back out.

Similar problems can be solved with Redis. For example, suppose one of our web apps wants to list the latest 20 comments posted by users, with a "show all" link for getting the older ones.

We assume that each comment in the database has a unique, incrementing ID field.

We can serve the home page and the paginated comment pages with the following Redis pattern:

- every time a new comment is published, we add its ID to a Redis list:

      LPUSH latest.comments <ID>

- we then trim the list to a fixed length, so Redis only keeps the IDs of the latest 5000 comments:

      LTRIM latest.comments 0 5000

- every time we need to fetch a range of the latest comments, we call a function that does it (in pseudocode):


    FUNCTION get_latest_comments(start,num_items):
       id_list = redis.lrange("latest.comments",start,start+num_items-1)
       IF id_list.length < num_items
           id_list = SQL_DB("SELECT ... ORDER BY time LIMIT ...")
       END
       RETURN id_list
    END

What we are doing here is very simple. The latest IDs live in a resident cache in Redis that is always kept up to date. But we set a limit: it never holds more than 5000 IDs, so our function asks Redis first and only needs to hit the database when the start/count parameters fall outside that range.

Our system does not "refresh" this cache in the traditional way; the information in the Redis instance is always consistent. The SQL database (or any other on-disk database) is only touched when the user asks for data "far back" in the list; the home page and the first pages of comments never bother the on-disk database.
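
For illustration, here is a minimal Python sketch of the same pattern using the redis-py client; fetch_ids_from_sql is a hypothetical helper standing in for your existing database query, and the key name follows the article.

    import redis

    r = redis.Redis()

    def publish_comment(comment_id):
        # Push the new ID and keep only the most recent entries.
        r.lpush("latest.comments", comment_id)
        r.ltrim("latest.comments", 0, 5000)

    def get_latest_comments(start, num_items):
        # Serve the range from Redis while it falls inside the cached IDs.
        ids = r.lrange("latest.comments", start, start + num_items - 1)
        if len(ids) < num_items:
            # Out of the cached range: fall back to the on-disk database.
            ids = fetch_ids_from_sql(start, num_items)  # hypothetical helper
        return ids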

Deleting and filtering

If a comment is deleted, we can remove its ID with LREM. If deletions are rare, another option is to simply skip the entry when rendering and report that the comment no longer exists.

Sometimes you want to attach a different list to each of several filters. If the number of filters is small, you can simply keep a separate Redis list per filter. After all, each list holds only 5000 items, and Redis can hold millions of items in very little memory.

Leaderboards and related problems

Another common requirement is hard for databases that do not keep their data set in memory: keeping items sorted by a score, updated in real time, with scores that change many times per second.

The typical example is a leaderboard for an online game, say a Facebook game, where based on the scores you usually want to:

- list the top 100 players by score

- show a given user's current global ranking

These operations are trivial for Redis, even with millions of users producing millions of new scores every minute.

The pattern is simple: every time a new score arrives, we run something like this:

      ZADD leaderboard <score> <username>

Depending on how you design it, you may use userID instead of username.

Getting the top 100 is easy: ZREVRANGE leaderboard 0 99.

Getting a user's global ranking is just as simple: ZRANK leaderboard <username>.
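
As a minimal sketch, the same commands through redis-py; key and function names are just for illustration:

    import redis

    r = redis.Redis()

    def submit_score(username, score):
        # ZADD keeps the sorted set ordered by score at all times.
        r.zadd("leaderboard", {username: score})

    def top_100():
        # Highest scores first, scores included.
        return r.zrevrange("leaderboard", 0, 99, withscores=True)

    def global_rank(username):
        # ZRANK, as in the article, counts from the lowest score;
        # use zrevrank() instead if rank 0 should be the top player.
        return r.zrank("leaderboard", username)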

Ordering by user votes and time

A common variant of the ranking pattern, used by sites like Reddit or Hacker News, ranks news items by a score computed with a formula along these lines:

      score = points / time^alpha

That is, users' votes push a story up, while time buries it according to the exponent alpha. This is just one model; the actual algorithm is up to you.

The pattern starts from the observation that only the most recent items, say the latest 1000 news posts, are realistic candidates for the front page, so we can ignore everything else. That makes the implementation easy:

- each time a news item is posted, we add its ID to a list, using LPUSH + LTRIM to make sure only the latest 1000 items are kept.

- a background task fetches this list and continuously recomputes the score of each of those 1000 stories, writing the results into a sorted set with ZADD in the new order and clearing out old news. The key idea is that the sorting is done by the background task, as sketched below.
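
A rough Python sketch of such a background worker, assuming redis-py, an assumed decay exponent ALPHA, and hypothetical helpers get_points() and get_post_time() that read from your primary database:

    import time
    import redis

    r = redis.Redis()
    ALPHA = 1.5  # assumed decay exponent; tune to taste

    def rerank_front_page():
        # Only the latest 1000 news IDs (maintained with LPUSH + LTRIM) are candidates.
        for news_id in r.lrange("latest.news", 0, 999):
            points = get_points(news_id)                          # hypothetical DB lookup
            age = max(time.time() - get_post_time(news_id), 1)    # hypothetical DB lookup
            r.zadd("frontpage", {news_id: points / (age ** ALPHA)})
        # Drop everything below the top 1000 so old news falls out of the set.
        r.zremrangebyrank("frontpage", 0, -1001)
        # ZREVRANGE frontpage 0 29 now yields the 30 highest-scored stories.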

Handling expiring items

Another common way to order items is by time. We use the Unix timestamp as the score.

The pattern is as follows:

- every time a new item is added to our non-Redis database, we also add it to a sorted set. As the score we use the time at which the item should expire, that is, current_time + time_to_live.

- a background task periodically queries the sorted set with ZRANGE ... WITHSCORES to fetch the 10 items closest to expiring. Whenever an item's Unix time is already in the past, the entry is deleted from the database. (See the sketch after this list.)
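
Here is a hedged Python sketch of both halves of that pattern with redis-py; delete_from_primary_db is a hypothetical stand-in for your real database call:

    import time
    import redis

    r = redis.Redis()

    def add_item(item_id, time_to_live):
        # The score is the absolute Unix time at which the item expires.
        r.zadd("expiring.items", {item_id: time.time() + time_to_live})

    def expire_worker():
        while True:
            # The 10 lowest-scored entries are the ones expiring soonest.
            for item_id, expires_at in r.zrange("expiring.items", 0, 9, withscores=True):
                if expires_at <= time.time():
                    delete_from_primary_db(item_id)   # hypothetical helper
                    r.zrem("expiring.items", item_id)
            time.sleep(1)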

Counting

Redis is a good counter, thanks to INCRBY and other similar commands.

I'm sure you've tried more than once to add counters to your database for statistics or to display some new piece of information, only to give up because of the write load.

Well, with Redis you no longer need to worry. Thanks to atomic increments you can safely add all kinds of counters, reset them with GETSET, or have them expire.

For example, do this:


    INCR user:<id>
    EXPIRE user:<id> 60

This way you can count how many page views a user has made, with pauses of no more than 60 seconds between pages; when the counter reaches, say, 20, you can show a banner or whatever else you like.
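
The same idea as a small redis-py sketch; the 20-view threshold and show_banner are illustrative only:

    import redis

    r = redis.Redis()

    def record_page_view(user_id):
        key = f"user:{user_id}"
        views = r.incr(key)   # atomic increment
        r.expire(key, 60)     # the counter disappears after 60 seconds of inactivity
        if views >= 20:
            show_banner(user_id)   # hypothetical: whatever you want to display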

Unique items in a given amount of time

Another thing that is hard for other databases but easy with Redis is counting how many distinct users accessed a particular resource during a particular period of time. For example, I want to know how many unique registered users or unique IP addresses actually viewed a given article.

Every time I get a new page view I just do this:

      SADD page:day1:<page_id> <user_id>

Of course you might want to replace day1 with the Unix time truncated to the day, such as time() - (time() % (3600*24)).

Want to know how many unique users there were? Just run SCARD page:day1:<page_id>.

Need to test whether a particular user visited the page? SISMEMBER page:day1:<page_id> <user_id>.
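
A small redis-py sketch of the same set-based commands; the key naming follows the article, with day1 replaced by a Unix timestamp truncated to the day:

    import time
    import redis

    r = redis.Redis()

    def day_bucket():
        # Truncate the current Unix time to the start of the day.
        now = int(time.time())
        return now - (now % (3600 * 24))

    def record_visit(page_id, user_id):
        # Sets ignore duplicates, so repeat visits by the same user count once.
        r.sadd(f"page:{day_bucket()}:{page_id}", user_id)

    def unique_visitors(page_id):
        return r.scard(f"page:{day_bucket()}:{page_id}")

    def has_visited(page_id, user_id):
        return r.sismember(f"page:{day_bucket()}:{page_id}", user_id)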

Real-time analysis of what is happening, for statistics, anti-spam and more

These are only a few examples, but if you study Redis's command set and combine the commands, you can perform a great deal of real-time analysis very efficiently and with very little effort. Using Redis primitives makes it much easier to implement spam-filtering systems and other real-time tracking systems.

Pub/Sub

Redis's Pub/Sub is very simple, stable and fast. It supports pattern matching, and channels can be subscribed to and unsubscribed from in real time.
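
A minimal redis-py example of a pattern subscription and a publish; the channel names are made up:

    import redis

    r = redis.Redis()

    # Subscriber side: a pattern subscription receives every channel matching "news.*".
    p = r.pubsub()
    p.psubscribe("news.*")
    for message in p.listen():
        if message["type"] == "pmessage":
            print(message["channel"], message["data"])

    # Publisher side (run from another process or connection):
    #     redis.Redis().publish("news.sports", "hello subscribers")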

Queues

You have probably already noticed that Redis commands like list push and list pop are handy for implementing queues, but Redis can do more than that: for example, it offers a blocking variant of list pop that waits when the list is empty.
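
A tiny producer/consumer sketch with redis-py; BRPOP is the blocking pop the paragraph refers to, and handle_job is a hypothetical worker function:

    import redis

    r = redis.Redis()

    def enqueue(job):
        r.lpush("jobs", job)

    def worker():
        while True:
            # BRPOP blocks until an item is available instead of busy-polling.
            _key, job = r.brpop("jobs")
            handle_job(job)   # hypothetical helper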

Caching

Using Redis as a cache deserves an article of its own, so I'll just say this much: compared with memcached, Redis turns your cache from data that can only be stored and regenerated into data that can be updated in place, so you don't have to regenerate it every time something changes.
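
One way to read that, as a hypothetical redis-py sketch: keep the cached value in a structure you can patch in place (a hash here) instead of throwing it away and rebuilding it:

    import redis

    r = redis.Redis()

    def on_new_comment(article_id):
        # Update just the field that changed; no need to rebuild the cached object.
        r.hincrby(f"article:{article_id}:stats", "comments", 1)

    def cached_stats(article_id):
        return r.hgetall(f"article:{article_id}:stats")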

Redis can solve your problems!

You can use Redis right away to make things easier for your users, make your system less complex, and make your site more responsive. You don't need to change your existing database setup: add Redis alongside it and it will bring new capabilities to your stack, accomplishing tasks that previously looked impossible, too hard, or too expensive.

Original article: http://antirez.com/post/take-advantage-of-redis-adding-it-to-your-stack.html

