Introduction to Mysql Isolation Levels Locks and MVCC

  • 2021-10-11 19:50:55
  • OfStack

The purpose of this paper is to clarify the relationship between these concepts and their functions. Find out the locking operation and MVCC version control when each sql executes when Mysql opens a transaction. To keep the discussion simple, GAP locks (gap locks, range locks) are ignored in this paper.

We are often highly concurrent and highly available. It is to evaluate from quality and quantity, and everything can be analyzed from these two angles. In the Mysql database, transactions are used to guarantee quality, and MVCC is used to guarantee quantity.

Affairs

We use transactions to ensure that the result execution of every 1 SQL statement meets our expectations. We say that transactions must have ACID features. The three in ACID: atomicity, uniformity and persistence all describe the same thing, which ensures the reliability of SQL execution results. Isolation is more complicated. Isolation describes the performance of the database in concurrent scenarios, but the concurrency is not fixed, and different services may have different requirements. In order to make the database adapt to different concurrent scenarios, great people have defined four isolation levels: Read Uncommited, Read Committed (RC), Repeatable Read (RR) and Serializable. With the improvement of database isolation level, the concurrency of data also decreases.

Isolation level

How the database will behave under the standard isolation level can be referred to https://www.ofstack.com/article/116477. htm. We only discuss the concepts of shared lock and exclusive lock, read plus shared lock and write plus exclusive lock:

Under the RC isolation level, the modified data will be locked exclusively, and the transaction will be released at the end, and other transactions will not be read, thus solving the dirty read problem. (Shared lock released on the spot)
Under RR isolation level, read data with shared lock, transaction end release, other transactions are not allowed to be modified, and unrepeatable reading is solved. (Shared Lock Transaction End Release)

Actually serializes the operation. Mysql optimizes it. When one transaction reads, other transactions cannot write, and when one transaction writes, other transactions cannot read? If I don't do this, I can still solve the problem of dirty reading and unrepeatable reading. MVCC appeared. (This also makes the problem more and more complicated, and different places begin to appear under the isolation level of RR. It happens that the default isolation level of Mysql is RR.)

MVCC

MVCC is multi-version concurrency control, which uses double version numbers to solve the problem of data isolation. ("create" has one version number, "delete" has one version number, and the modification operation is split into "delete" and "create") Each transaction will generate one version number when adding, deleting, modifying and checking each table, and each transaction can only find data with "create" smaller than this version number and "delete" larger than this version number. In this way, the addition, deletion and search operations can be carried out concurrently, and only the modification operation is 1 must be queued. In this way, even without a shared lock, the problem of unrepeatable reading is solved, because after other transactions are modified, the version number of the data is larger than mine, so I won't read it.

Concurrency of MVCC at RR Isolation Level

After the introduction of MVCC, it seems very beautiful. However, have you ever thought that two transactions update one piece of data one after another, and then two transactions read that piece of data, and what do they read respectively? Haha, this is simply impossible, because the modification operation is serial, and the other transaction must be modified by commit this transaction first. Ok, to change the question, two transactions do +1 operation on one piece of data one after another, and after the other transaction commits, this transaction then +1 and reads that piece of data again. Does this transaction read +1 or +2? If you read +2, isn't it breaking isolation and reading data committed by other transactions?

However, it is true that other transactions have been committed, and this transaction has modified that data, and then of course it has to read +2. Although it was originally 0, this transaction obviously only added 1, but it became 2 after reading, which is a bit uncomfortable. Indeed, under the standard RR isolation level, because all operations are serial, after this transaction reads 1 row of data, other transactions cannot modify this data, and this data is always operated by this transaction, so the isolation is strictly satisfied. However, RR of Mysql enhances the concurrency of reading and writing. Only when two transactions modify one piece of data at the same time, all other operations can be parallel. So this result is caused, as if there is unrepeatable reading. However, this non-repeatable reading is actually in line with our intuitive feelings. After the data is modified in this transaction, of course, the latest data should be read.

To analyze the process:

Data create version number is 0

Transaction 1 version number is 1, read data value=0

Transaction 2 version number is 2, modified data value+1=1, original data delete version number is 2, new data create version number is updated to 2, commit

Transaction 1 modifies the data value+1=2, (since the modification is the current read, the data with the largest version number is always read, so the value is read as 1) The modified delete version number is 1,

The new data create version number is 1

This transaction reads data value=2

In-depth analysis:

In fact, the above description is also vulnerable. What if there is a third transaction version number of 3? Because the version number is 3, can you directly read the uncommitted data of transactions 1 and 2? In fact, in MVCC, each transaction also has one minimum visible version low_limit_id (transaction number > = low_limit_id records are not visible to current transactions), filtering out transactions that are currently executing without commit. For example, Transaction 3, although the version number is 3, low_limit_id=1, so the modifications of Transaction 1 and Transaction 2 are not visible to 3.

Summarize

In order to solve the isolation problem, the stupid method of using full copy data is not used. Traditional databases use shared locks and exclusive locks to make read and write operations serial; Mysql uses MVCC and exclusive locks, and reads and writes can be parallel. Mysql is below the isolation level of RR, which is 1% different from the traditional way. At the isolation level of RR, it is different from the traditional way, which is reflected in the fact that after this transaction updates a piece of data, it can read the committed modifications of other transactions to this piece of data.


Related articles: