Implementing distributed locking with the Redis SETNX command

2020-05-14 05:22:35
OfStack

Distributed locking can be implemented using the SETNX command of Redis, which is described below.

Introduction to the SETNX command

The command format


SETNX key value

Set the value of key to value if and only if key does not exist.

If a given key already exists, SETNX does nothing.

SETNX is short for SET if Not eXists.

The return value

Returns an integer:

- 1 if the value of key was set

- 0 if key was not set

Example


redis> SETNX mykey "hello"
(integer) 1
redis> SETNX mykey "hello"
(integer) 0
redis> GET mykey
"hello"
redis>
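The same behaviour can be modelled in a few lines of Python. This is only a sketch: FakeRedis is a hypothetical in-memory stand-in (not part of any Redis client library) that imitates the SETNX and GET semantics described above, so the example runs without a server.

```python
class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis server (illustration only)."""

    def __init__(self):
        self.data = {}

    def setnx(self, key, value):
        # Set the key only if it does not exist; report whether it was set.
        if key in self.data:
            return 0
        self.data[key] = value
        return 1

    def get(self, key):
        return self.data.get(key)


r = FakeRedis()
first = r.setnx("mykey", "hello")   # key absent: the value is stored
second = r.setnx("mykey", "hello")  # key present: nothing changes
print(first, second, r.get("mykey"))
```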

Implement distributed locking using SETNX

Multiple processes execute the following Redis command:


SETNX lock.foo <current Unix time + lock timeout + 1>

If SETNX returns 1, the process has acquired the lock, and the value of the key lock.foo is now the lock's expiry time (current time + lock timeout + 1).

If SETNX returns 0, another process already holds the lock, so this process cannot enter the critical section. It can keep retrying SETNX in a loop until it gets the lock.
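As a rough sketch of this retry loop, the snippet below polls SETNX with a bounded number of attempts. The FakeRedis class, the acquire helper, and its max_attempts and delay parameters are all assumptions made for illustration; a real client would talk to a Redis server instead.

```python
import time


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (illustration only)."""

    def __init__(self):
        self.data = {}

    def setnx(self, key, value):
        if key in self.data:
            return 0
        self.data[key] = value
        return 1


def acquire(client, key, lock_timeout, max_attempts=1000, delay=0.001):
    """Retry SETNX until it returns 1; give up after max_attempts tries."""
    for _ in range(max_attempts):
        expires_at = int(time.time()) + lock_timeout + 1
        if client.setnx(key, expires_at) == 1:
            return expires_at  # lock acquired; this is the value stored in the key
        time.sleep(delay)      # lock is held by someone else: wait, then retry
    return None


client = FakeRedis()
first = acquire(client, "lock.foo", lock_timeout=3)                   # succeeds
second = acquire(client, "lock.foo", lock_timeout=3, max_attempts=3)  # gives up
print(first is not None, second)
```

Bounding the number of attempts is a design choice of this sketch, not something the article prescribes; an unbounded loop matches the text more literally.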

Solving the deadlock problem

Consider what happens if a process loses its connection to Redis after acquiring the lock (perhaps the process has died, or the network has been interrupted). Without an effective mechanism for releasing the lock, all other processes will wait forever; that is, a "deadlock" occurs.

Above, when acquiring the lock with SETNX, we set the value of the key lock.foo to the lock's expiry time. While one process holds the lock, the other processes keep checking whether the lock has timed out. Once it has, a waiting process gets the chance to acquire it.

However, when the lock times out, a waiting process cannot simply delete the key lock.foo with the DEL command to release the lock. Consider the case where process P1 has acquired the lock lock.foo and then hangs. Processes P2 and P3 keep checking whether the lock has been released or has timed out. Events unfold as follows:

1. P2 and P3 read the value of the key lock.foo and check whether the lock has timed out (by comparing the current time with the value of the key lock.foo)

2. P2 and P3 both find that the lock lock.foo has timed out

3. P2 executes DEL lock.foo

4. P2 executes SETNX lock.foo, which returns 1, so P2 gets the lock

5. P3 executes DEL lock.foo, deleting the key lock.foo that P2 has just set (because P3 had already detected the lock timeout)

6. P3 executes SETNX lock.foo, which returns 1, so P3 gets the lock

7. Both P2 and P3 now hold the lock

As you can see from the above, after detecting a lock timeout, a process cannot simply DEL the key and then acquire the lock.
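The interleaving above can be replayed deterministically. The sketch below is a single-threaded simulation, not real concurrent code: a plain dict stands in for Redis, and the timestamps 100 and 200 are made-up values chosen so that P1's lock is already expired.

```python
store = {"lock.foo": 100}  # P1's stale lock, with expiry timestamp 100
now = 200                  # current time: the lock expired long ago


def setnx(key, value):
    """Set the key only if it is absent; return 1 if set, else 0."""
    if key in store:
        return 0
    store[key] = value
    return 1


# P2 and P3 both read the key and conclude that the lock has timed out.
p2_sees_timeout = now > store["lock.foo"]
p3_sees_timeout = now > store["lock.foo"]

# P2 deletes the stale key and takes the lock.
store.pop("lock.foo", None)
p2_got_lock = setnx("lock.foo", now + 4)

# P3 acts on its earlier check and deletes the key P2 has just set.
store.pop("lock.foo", None)
p3_got_lock = setnx("lock.foo", now + 4)

print(p2_got_lock, p3_got_lock)  # both SETNX calls report success
```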

In order to solve the problem that multiple processes may acquire locks at the same time, let's look at the following algorithm.

We also assume that process P1 has first acquired the lock lock.foo, and then process P1 has died. Here's what happens:

1. Process P4 executes SETNX lock.foo to try to acquire the lock

2. Because process P1 already holds the lock, the SETNX lock.foo executed by P4 returns 0, which means the lock acquisition failed

3. P4 executes GET lock.foo to check whether the lock has timed out

4. If P4 detects that the lock has timed out, that is, the current time is greater than the value of the key lock.foo, P4 will do the following


GETSET lock.foo <current Unix timestamp + lock timeout + 1>

5. Because GETSET sets the value of the key and returns its old value, P4 can tell whether it has acquired the lock by checking whether the old value of the key lock.foo is less than the current time

6. If another process, P5, also detected that the lock had timed out and executed GETSET before P4 did, then P4's GETSET returns a timestamp greater than the current time, so P4 does not get the lock and keeps waiting. Note that P4's GETSET has still overwritten the value P5 stored in the key lock.foo, possibly with a slightly later expiry time, but this does not matter.
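To make steps 3 to 6 concrete, here is a single-threaded replay of P4 and P5 racing with GETSET. As before, a dict stands in for Redis, and the timestamps and the getset helper are illustrative assumptions rather than a real client API.

```python
LOCK_TIMEOUT = 3
store = {"lock.foo": 100}  # P1's stale lock, with expiry timestamp 100
now = 200                  # current time: the lock has expired


def getset(key, value):
    """Store the new value and return the previous one (None if absent)."""
    old = store.get(key)
    store[key] = value
    return old


# P4 and P5 both GET the key and see that the lock has timed out.
p4_sees_timeout = now > store["lock.foo"]
p5_sees_timeout = now > store["lock.foo"]

# P5 runs GETSET first: the old value (100) is in the past, so P5 wins.
p5_old = getset("lock.foo", now + LOCK_TIMEOUT + 1)
p5_got_lock = p5_old < now

# P4 runs GETSET next: the old value is P5's fresh expiry, so P4 loses.
p4_old = getset("lock.foo", now + LOCK_TIMEOUT + 1)
p4_got_lock = p4_old < now

print(p5_got_lock, p4_got_lock)
```

Unlike the DEL-then-SETNX sequence, only one of the two processes sees an expired old value, so only one of them concludes that it holds the lock.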

Also note that before releasing the lock with DEL lock.foo, a process must check whether its own lock has timed out. If it has, another process may already have acquired the lock, and a direct DEL lock.foo would release the lock held by that other process.
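A minimal sketch of this release check follows. The release function, its parameters, and the dict used in place of Redis are assumptions made for illustration; my_expiry is the expiry timestamp the process stored when it acquired the lock.

```python
import time


def release(store, key, my_expiry, now=None):
    """Delete the lock key only if our own lock has not timed out yet."""
    now = int(time.time()) if now is None else now
    if now < my_expiry:
        store.pop(key, None)  # stands in for DEL lock.foo
        return True
    return False              # lock timed out: it may belong to someone else now


# Case 1: we finish in time, so the key is ours to delete.
store = {"lock.foo": 104}
released_in_time = release(store, "lock.foo", my_expiry=104, now=102)

# Case 2: our lock expired and another process has since taken it over.
other = {"lock.foo": 210}
released_late = release(other, "lock.foo", my_expiry=104, now=105)

print(released_in_time, released_late)
```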

The program code

The Python code below implements the above algorithm, using the SETNX command for distributed locking.


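import time

import redis

LOCK_TIMEOUT = 3
lock_key = 'lock.foo'
# Assumption: a Redis server reachable with the client's default settings
redis_client = redis.Redis()

#  Acquiring a lock
while True:
    now = int(time.time())
    lock_timeout = now + LOCK_TIMEOUT + 1
    # SETNX succeeds only if nobody currently holds the lock
    if redis_client.setnx(lock_key, lock_timeout):
        break
    # Otherwise check whether the current holder's lock has timed out
    current = redis_client.get(lock_key)  # None if the key was just deleted
    if current is not None and now > int(current):
        # GETSET returns the old value; we hold the lock only if it was still expired
        old = redis_client.getset(lock_key, lock_timeout)
        if old is None or now > int(old):
            break
    time.sleep(0.001)

#  Has been locked
do_job()  # the caller's critical-section work, defined elsewhere

#  Release the lock, but only if it has not timed out in the meantime
now = int(time.time())
if now < lock_timeout:
    redis_client.delete(lock_key)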