Achieving Lock-Free Database Operations Under High Concurrency in Java

  • 2020-04-01 02:25:40
  • OfStack

1. How to handle locking under concurrency.
A very simple idea is to turn concurrency into single-threaded processing; Java's Disruptor is a good example of this. To do it with Java's built-in concurrency classes, the idea is: start one thread and run a Queue. Under concurrency, tasks are pushed into the Queue; the thread keeps reading from the Queue and executes the tasks sequentially.
With this design, any concurrent operation becomes a single-threaded one, and a very fast one. Today's node.js, and ARPG game servers more generally, are built on this design: the "big loop" architecture.
This way, our original system has two environments: the concurrent environment + the "big loop" environment.
The concurrent environment is our traditional locked environment, with poor performance.
The "big loop" environment is the powerful, single-threaded, lock-free environment we build with Disruptor.
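The single-thread-plus-Queue idea above can be sketched with plain JDK classes (a minimal illustration, not Disruptor itself; the class name `BigLoop` is made up here):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Concurrent producers offer tasks to a queue; one dedicated thread drains
// and runs them, so the task bodies themselves never need locks.
public class BigLoop {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread loop = new Thread(() -> {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                queue.take().run();   // tasks execute strictly one at a time
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }, "big-loop");

    public void start()            { loop.start(); }
    public void submit(Runnable r) { queue.offer(r); }  // safe from any thread
}
```

Disruptor follows the same single-consumer shape but replaces the blocking queue with a pre-allocated ring buffer to avoid allocation and contention on the hot path.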

2. How to improve processing performance in the "big loop" environment.
Once concurrency has been funneled into a single thread, any performance problem in that thread inevitably slows down the whole process, so no operation on the single thread may involve IO. What about database operations, then?
Add a cache. The idea is simple: reads go straight to memory, which is much faster. For writes and updates, take a similar approach: commit the operation to a Queue, and run a separate thread that drains the Queue and writes to the database one entry at a time. This keeps IO out of the big loop.
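The read-from-memory, write-behind scheme just described can be sketched as follows (a simplified illustration; the names `WriteBehindCache` and `persistToDb` are placeholders, and `persistToDb` stands in for the real JDBC/ORM call):

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Reads hit memory only; writes update the cache and enqueue a persistence
// task that a dedicated thread flushes to the database, keeping IO out of
// the big loop.
public class WriteBehindCache {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private final BlockingQueue<String> dirtyKeys = new LinkedBlockingQueue<>();

    public WriteBehindCache() {
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    String key = dirtyKeys.take();
                    persistToDb(key, cache.get(key)); // only this thread does IO
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "db-writer");
        writer.setDaemon(true);
        writer.start();
    }

    public Object read(String key)              { return cache.get(key); }
    public void write(String key, Object value) { cache.put(key, value); dirtyKeys.offer(key); }

    protected void persistToDb(String key, Object value) { /* real database write here */ }
}
```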

The problem arises again:
If our game had only one big loop, this would be easy to solve, because the loop provides perfect synchronization without locking.
But the real game environment is concurrency and the "big loop" coexisting, i.e. the two environments above. So no matter how we design it, we end up needing a lock on the cache.

3. How can concurrency and the "big loop" coexist without locks?
We know that to avoid lock operations inside the "big loop", we use "asynchrony": hand the operation off to another thread. Combining these two characteristics, I slightly changed the database architecture.
In the original cache layer, locking is unavoidable, for example:


public class TableCache
{
  // ConcurrentHashMap keeps the cache correct under concurrent access.
  private final Map<String, Object> caches = new ConcurrentHashMap<String, Object>();
}

This structure is necessary to ensure the cache stays correct under concurrent operations. But the "big loop" cannot modify the cache directly, so a thread must be started to apply the updates asynchronously, for example:


private static final ExecutorService EXECUTOR = Executors.newSingleThreadExecutor();
EXECUTOR.execute(new LatencyProcessor(logs));

class LatencyProcessor implements Runnable
{
  private final List<ChangeLog> logs; // ChangeLog: placeholder type for the queued changes

  LatencyProcessor(List<ChangeLog> logs) { this.logs = logs; }

  public void run()
  {
    // The memory data can be modified freely here; updates are applied asynchronously.
  }
}

OK, that looks nice. But another problem appears: under high-speed access, it is quite possible that another request reads the cache before the update has been applied, and gets stale data.

4. How to ensure the cache data is unique and correct in a concurrent environment?
We know that if there are only reads and no writes, no locking is needed at all.
I used this trick: add another cache layer on top of the existing one, making it the "level-1 cache" and the original the "level-2 cache". A bit like a CPU, right?
The level-1 cache can only be modified by the "big loop", but can be read by the concurrent environment and the "big loop" at the same time, so it needs no lock.
When data changes, there are two cases:
1) For changes made in the concurrent environment, locks are allowed, so we operate on the level-2 cache directly; no problem.
2) For changes made in the "big loop" environment, we first store the changed data in the level-1 cache, then asynchronously apply the correction to the level-2 cache, after which the level-1 entry is deleted.
Thus, no matter which environment reads the data, it checks the level-1 cache first and falls back to the level-2 cache on a miss.
This architecture guarantees that the in-memory data is absolutely accurate.
More importantly, it gives us an efficient lock-free space in which to implement arbitrary business logic.
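The two-level scheme can be sketched as below (a simplified illustration; the class name `TwoLevelCache` is made up, the async level-2 correction is shown inline rather than on a separate thread, and `ConcurrentHashMap` is used for the level-1 map so that entries written by the big-loop thread are safely visible to reader threads):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TwoLevelCache {
    // L1: written only by the big-loop thread, read by everyone.
    private final Map<String, Object> level1 = new ConcurrentHashMap<>();
    // L2: the original shared cache; concurrent writers are allowed here.
    private final Map<String, Object> level2 = new ConcurrentHashMap<>();

    // Read path, identical in both environments: L1 first, then L2 on a miss.
    public Object get(String key) {
        Object v = level1.get(key);
        return v != null ? v : level2.get(key);
    }

    // Big-loop write: stage in L1, correct L2, then drop the L1 entry.
    // In the real system the L2 correction runs on the async update thread.
    public void bigLoopPut(String key, Object value) {
        level1.put(key, value);
        level2.put(key, value);
        level1.remove(key);
    }

    // Concurrent-environment write: operate on L2 directly.
    public void concurrentPut(String key, Object value) {
        level2.put(key, value);
    }
}
```

Because readers consult L1 before L2, a big-loop write is visible immediately, even while the L2 correction is still pending.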

Finally, some tips to improve performance.
1. Now that our database operations are asynchronous, a lot of data may need to be written to the database at a given moment. By sorting the operations by table, primary key, and operation type, we can drop some that are no longer needed. For example:
A) For multiple updates to the same table and primary key, keep only the last one.
B) For the same table and primary key, once there is a Delete, all earlier operations are invalid.
2. Since we want to sort the operations, they must be ordered by time. How do we guarantee that without a lock? Use
private static final AtomicLong _seq = new AtomicLong(0);
_seq.incrementAndGet() is lock-free and yields globally unique, monotonically increasing values, which serve as the time sequence.
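The merging tips above can be sketched as follows (an illustrative simplification; the names `OpMerger`, `Op`, and `merge` are made up). Each queued operation is stamped with a lock-free AtomicLong sequence number; before flushing, operations on the same (table, primary key) are sorted by sequence and only the latest is kept, which covers both rules: the last update wins, and a trailing delete discards everything queued before it for that row.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class OpMerger {
    private static final AtomicLong SEQ = new AtomicLong(0);

    public enum Type { UPDATE, DELETE }

    public static final class Op {
        final long seq = SEQ.incrementAndGet(); // lock-free global ordering
        final String table, pk;
        final Type type;
        public Op(String table, String pk, Type type) {
            this.table = table; this.pk = pk; this.type = type;
        }
    }

    // Collapse the batch: for each (table, pk), only the latest op survives.
    public static List<Op> merge(List<Op> ops) {
        List<Op> sorted = new ArrayList<>(ops);
        sorted.sort((a, b) -> Long.compare(a.seq, b.seq));
        Map<String, Op> last = new HashMap<>();
        for (Op op : sorted) {
            last.put(op.table + "|" + op.pk, op); // later op replaces earlier
        }
        return new ArrayList<>(last.values());
    }
}
```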

