Redis distributed locking and problem solving

  • 2020-10-07 18:56:04
  • OfStack

1. Main principles of implementing a distributed lock with Redis:

1. Locking

The easiest way is to use the setnx command. The key is the unique identifier of the lock and is named after the business it protects. For example, to lock the flash-sale ("seckill") activity of a product, you could name the key "lock_sale_<product ID>". And what should the value be? For now we can simply set it to 1. The pseudocode for locking is as follows:
setnx (key, 1)

If setnx returns 1, the key did not exist and the thread has successfully obtained the lock. If setnx returns 0, the key already exists and the thread has failed to grab the lock.

2. Unlocking

Where there is locking, there must be unlocking. When the thread that acquired the lock has finished executing its task, the lock needs to be released so that other threads can enter. The simplest way to release the lock is to execute the del command, with the following pseudocode:

del (key)

After the lock is released, other threads can continue to execute the setnx command to acquire the lock.
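
The lock and unlock cycle described above can be imitated in plain Java. The following is only a sketch: a ConcurrentHashMap stands in for Redis, and the class name InMemoryLockDemo and the key "lock_sale_1001" are made up for illustration. It shows the return values the text describes, not a real client.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for Redis, for illustration only (not a real client).
final class InMemoryLockDemo {
    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    // Mirrors setnx: returns 1 if the key was absent and is now set, 0 otherwise.
    long setnx(String key, String value) {
        return store.putIfAbsent(key, value) == null ? 1L : 0L;
    }

    // Mirrors del for a single key: returns the number of keys removed.
    long del(String key) {
        return store.remove(key) == null ? 0L : 1L;
    }

    public static void main(String[] args) {
        InMemoryLockDemo redis = new InMemoryLockDemo();
        String key = "lock_sale_1001"; // hypothetical product ID
        System.out.println(redis.setnx(key, "1")); // first caller obtains the lock
        System.out.println(redis.setnx(key, "1")); // second caller is refused
        System.out.println(redis.del(key));        // holder releases the lock
        System.out.println(redis.setnx(key, "1")); // the lock can be taken again
    }
}
```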

3. Lock timeouts

What does a lock timeout mean? If a thread that acquired the lock dies while executing its task, before it can explicitly release the lock, the resource stays locked forever and no other thread can ever enter.

Therefore, the key written by setnx must be given a timeout, so that the lock is released automatically after a fixed time even if it is never explicitly released. setnx does not accept a timeout parameter, so an additional command is required. The pseudocode is as follows:

expire (key, 30)

2. Lock code


/**
 * Tries to obtain a distributed lock (flawed example).
 * @param jedis the Redis client
 * @param lockKey the lock key
 * @param requestId the request identifier
 * @param expireTime the expiration time, in seconds
 */
public static void wrongGetLock1(Jedis jedis, String lockKey, String requestId, int expireTime) {

 Long result = jedis.setnx(lockKey, requestId);
 if (result == 1) {
  // If the program crashes here, the expiration time is never set and a deadlock occurs
  jedis.expire(lockKey, expireTime);
 }

}

The fatal problem with the above code is that acquiring the lock and setting the expiration time are two separate commands, not one atomic operation.

Note that setnx by itself is atomic: Redis executes commands one at a time, so two threads that call setnx on the same key at the same moment cannot both receive 1. The danger lies in the gap between the two commands: if the program crashes after setnx succeeds but before expire runs, the key has no expiration time and the lock is never released, resulting in a deadlock, as noted in the code comment.

Some people write it this way because older versions of Jedis do not support the multi-parameter set() method. The correct code is as follows:


/**
 * Tries to obtain a distributed lock.
 * @param jedis the Redis client
 * @param lockKey the lock key
 * @param requestId the request identifier
 * @param expireTime the expiration time, in milliseconds (PX)
 * @return true if the lock was acquired
 */
public static boolean tryGetDistributedLock(Jedis jedis, String lockKey, String requestId, int expireTime) {

    // SET key value NX PX expireTime: set the key and its expiration in one atomic command
    String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
    return "OK".equals(result);

}

This set() method has five parameters in total:

The first is key. We use key as the lock because keys are unique.

The second is value, for which we pass requestId. Some readers may wonder: the key alone is enough to act as a lock, so why is a value needed? The reason is that by setting value to requestId, we know which request added the lock, which gives us a basis for verifying ownership when unlocking. requestId can be generated with UUID.randomUUID().toString().

The third is nxxx. We pass NX for this parameter, which means SET IF NOT EXIST: the set is performed only when key does not exist, and nothing is done if key already exists.

The fourth is expx. We pass PX for this parameter, which means an expiration time is set on the key, in milliseconds. The actual duration is given by the fifth parameter.

The fifth is time, which goes with the fourth parameter and is the expiration time of the key.

In summary, executing the set() call above has only two possible outcomes: 1. If no lock currently exists (key does not exist), the lock is taken and given an expiration time, with value identifying the client that holds it. 2. If the lock already exists, nothing is done.
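
As a small illustration of the requestId mentioned above, the value could be produced as follows. The helper name RequestIds is hypothetical; the only point is that each lock attempt gets its own identifier, and that the same string is later passed to the unlock call.

```java
import java.util.UUID;

// Hypothetical helper, for illustration only.
final class RequestIds {
    // Returns a random identifier, intended to be generated once per lock attempt.
    static String newRequestId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Prints a different value on every run
        System.out.println(newRequestId());
    }
}
```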

2. Code to unlock


public static void wrongReleaseLock1(Jedis jedis, String lockKey) {
  jedis.del(lockKey);
}

The problem with this code is that it can easily delete a lock held by someone else. Suppose thread A successfully acquires the lock with a timeout of 30 seconds. If for some reason thread A executes very slowly, then after 30 seconds the lock expires and is released automatically, and thread B acquires the lock.

Thread A then completes its task and executes the del command to release the lock. But thread B has not finished yet, so thread A actually removes the lock that thread B added.

How do you avoid this? Before calling del to release the lock, check that the current lock is the one you added yourself.

As for the specific implementation, you can store the current thread ID (or the requestId used above) as the value when locking, and verify before deleting that the value stored under the key is your own ID.


public static void wrongReleaseLock2(Jedis jedis, String lockKey, String requestId) {

  // Check that the lock was added by the same client that is now unlocking
  if (requestId.equals(jedis.get(lockKey))) {
    // If the lock expired and was taken by another client at this point, del removes that client's lock
    jedis.del(lockKey);
  }

}

However, this introduces a new problem: the check and the delete are two independent operations, so together they are not atomic.

The solution is to use a Lua script, which turns them into a single atomic operation, as follows:


import java.util.Collections;
import redis.clients.jedis.Jedis;

public class RedisTool {

  private static final Long RELEASE_SUCCESS = 1L;

  /**
   * Releases a distributed lock.
   * @param jedis the Redis client
   * @param lockKey the lock key
   * @param requestId the request identifier
   * @return true if the lock was released
   */
  public static boolean releaseDistributedLock(Jedis jedis, String lockKey, String requestId) {

    // Delete the key only if its value still equals requestId; the script runs atomically in Redis
    String script = "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end";
    Object result = jedis.eval(script, Collections.singletonList(lockKey), Collections.singletonList(requestId));

    return RELEASE_SUCCESS.equals(result);

  }

}
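
To see why the owner check matters, the scenario with threads A and B can be replayed without a Redis server. This is a rough analogy rather than the real thing: a ConcurrentHashMap stands in for Redis, its remove(key, value) method plays the role of the Lua script (compare and delete in one atomic step), and the names OwnerCheckedRelease and simulateExpiry are invented for the sketch.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for Redis, for illustration only.
final class OwnerCheckedRelease {
    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    // Analog of SET key value NX (the expiration is left out of this sketch).
    boolean tryLock(String key, String requestId) {
        return store.putIfAbsent(key, requestId) == null;
    }

    // Simulates the key expiring on the server.
    void simulateExpiry(String key) {
        store.remove(key);
    }

    // Analog of the Lua script: delete only if the stored value equals requestId.
    // remove(key, value) does the comparison and the removal atomically.
    boolean release(String key, String requestId) {
        return store.remove(key, requestId);
    }

    String owner(String key) {
        return store.get(key);
    }

    public static void main(String[] args) {
        OwnerCheckedRelease redis = new OwnerCheckedRelease();
        redis.tryLock("lock", "A");      // thread A takes the lock
        redis.simulateExpiry("lock");    // A is slow, the lock expires
        redis.tryLock("lock", "B");      // thread B takes the lock
        System.out.println(redis.release("lock", "A")); // A's late release is rejected
        System.out.println(redis.owner("lock"));        // B still owns the lock
    }
}
```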

3. Renewal

Is the final locking code above perfect? Imagine a scenario where the expiration time is 30 s and thread A has been running for more than 30 s without finishing, so the lock expires automatically. At this point thread B acquires the lock, and two threads now hold the lock at the same time. This can be called the "renewal" problem: the lock should not expire before A has finished, so its expiration needs to be extended until A completes and releases it. What to do? We can have the thread that acquired the lock start a daemon thread that "renews" a lock that is about to expire.

In fact, the earlier problem of deleting a lock that is not your own when unlocking is also a consequence of the "renewal" problem.
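
The daemon-thread idea could be sketched roughly as follows. Everything here is an assumption made for illustration: the class name LockWatchdog, the choice of a ScheduledExecutorService, and the renew action being passed in as a Runnable. How the action actually extends the expiration in Redis is deliberately left open; one option would be a script that resets the expiration only while the value still equals requestId.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical watchdog: periodically runs a caller-supplied renew action.
final class LockWatchdog {
    private final ScheduledExecutorService scheduler;
    private final AtomicInteger renewals = new AtomicInteger();

    private LockWatchdog(Runnable renewAction, long periodMillis) {
        // Daemon thread, so the watchdog never keeps the JVM alive on its own.
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "lock-watchdog");
            t.setDaemon(true);
            return t;
        });
        // Note: if renewAction throws, scheduleAtFixedRate cancels all further runs.
        scheduler.scheduleAtFixedRate(() -> {
            renewAction.run();
            renewals.incrementAndGet();
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Starts renewing every periodMillis until stop() is called.
    static LockWatchdog start(Runnable renewAction, long periodMillis) {
        return new LockWatchdog(renewAction, periodMillis);
    }

    // Number of renewals performed so far.
    int renewCount() {
        return renewals.get();
    }

    // Call when the task has finished, before releasing the lock.
    void stop() {
        scheduler.shutdownNow();
    }
}
```

The period should be noticeably shorter than the lock's expiration time (a fraction such as one third is a common choice), so that a renewal always lands before the key expires.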

4. Cluster synchronization delay problem

A Redis service certainly cannot be deployed as a single instance, because a single instance is not highly available: once it goes down, the entire distributed lock becomes unusable.

In a cluster, suppose A acquires the lock on the master, and the master fails before the data has been synchronized to the slave. The slave takes over, B tries to acquire the lock there, and succeeds. Once again two threads hold the lock at the same time.

With this in mind, the author of Redis has also considered the problem and proposed the RedLock algorithm.

The algorithm goes roughly like this. Suppose Redis is deployed as five independent master nodes, with no replication between them.

A lock is obtained with the following steps:

1. Get the current timestamp in milliseconds.
2. Try to create the lock on each master node in turn, using a short timeout for each attempt (on the order of milliseconds) so that an unreachable node does not hold up the client.
3. Try to establish the lock on a majority of the nodes. For example, with 5 nodes a majority is 3 nodes (n / 2 + 1).
4. The client calculates how long it took to establish the lock. If that time is less than the lock's timeout, the lock is considered acquired.
5. If acquiring the lock fails, delete the lock on each node in turn.
6. As long as someone else holds the distributed lock, keep polling to try to acquire it.
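
The success condition in the steps above boils down to two checks, which can be written out as a small sketch. The class and method names are illustrative, and the sketch covers only the decision, not the network calls to the nodes.

```java
// Sketch of the success check described in the steps; names are illustrative.
final class RedLockCheck {
    // Majority of the nodes: n / 2 + 1, using integer division.
    static int quorum(int totalNodes) {
        return totalNodes / 2 + 1;
    }

    // The lock counts as acquired only if a majority of nodes granted it
    // and the time spent acquiring it is less than the lock's timeout.
    static boolean acquired(int grantedNodes, int totalNodes, long elapsedMillis, long timeoutMillis) {
        return grantedNodes >= quorum(totalNodes) && elapsedMillis < timeoutMillis;
    }

    public static void main(String[] args) {
        System.out.println(quorum(5));                  // majority of 5 nodes
        System.out.println(acquired(3, 5, 40, 30_000)); // majority reached in time
        System.out.println(acquired(2, 5, 40, 30_000)); // no majority
    }
}
```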

However, this algorithm is still controversial and may have a number of problems; there is no guarantee that the locking process is always correct.

The root cause is that a Redis cluster is an AP system, while a distributed lock requires CP. A CP guarantee cannot be built on top of an AP system.

5. Redisson

Redisson is a Java in-memory data grid (In-Memory Data Grid) built on Redis. Modeled on the interfaces commonly used in the Java utility toolkit, it provides users with a series of commonly used utility classes with distributed features.

Redisson solves the atomicity problem above with Lua scripts and the renewal problem with a watchdog, but it does not appear to solve the synchronization delay problem in a cluster.

Conclusion

No matter how it is implemented, a Redis distributed lock is subject to the renewal problem and the cluster synchronization delay problem. On the whole, it is not a fully reliable solution. If strict correctness is required, this scheme should not be adopted.

However, it has the advantage of being relatively simple, and it can be used in less strict scenarios such as social systems. It is not recommended for systems such as trading systems, where duplicate transactions must never occur.

