Instructions on mysql's handling of binlog

2020-05-10 23:03:47
OfStack
                    However, instead of describing the implementation details of one storage engine or the pros and cons of the various storage engines, I'm going to describe how mysql handles binlog and clarify a few confusing issues.

Binlog is important to mysql, mainly in terms of its functionality. The official Mysql documentation clearly states that the startup of binlog will probably increase the load on mysql by 1%, so binlog will not be a performance bottleneck for mysql in most cases.

Binlog is a log that mysql prints in base 2. It is unencrypted and uncompressed by default. Each normal binlog file header has a four-byte tag with a value of 0xfe 0x62 0x69 0x6e. LOG_EVENT is a unit in binlog, that is, binlog normally grows by LOG_EVENT. Without the header tag, binlog is a sequence of LOG_EVENT. Each LOG_EVENT is a separate unit with no reference to each other. It also has its own base 2 header, which mainly records time stamps, type markers and other descriptive information.

Mysql encapsulates the implementation of disk operation in the IO_CACHE structure, which also facilitates our study and description of binlog. Read/write binlog has the same meaning as read/write IO_CACHE unless otherwise specified.

To understand how mysql writes to binlog, trace the processing of an sql statement. In the case of update, for example, in the simplest case, mysql first calls the interface ha_update_row that is open for the storage engine, while binlog_query writes to binlog. The reason for this is that in the scenario of master-slave backup, if the master library writes binlog successfully first and crash successfully during the execution of update, the slave library may execute update successfully. At this time, after the master library is restarted, it will be different from the data of the slave library. If the update operation occurs on a transactional table, the open interface ha_autocommit_or_rollback is executed after writing binlog, and the storage engine determines the result of the operation.

In the scenario of master-slave backup, the master library is equivalent to server, and the slave library is equivalent to client. The two sides use tcp short connection. The master library receives the request, reads the local binlog, and sends it to the slave library. Logs are received from the library, and after simple validation, local logs are written, known as relay log. The flow from the slave library here is handled exclusively by one thread, called the io synchronization thread. There is also one thread from the slave library, called the synchronous sql thread. Its behavior is to periodically read relay log, parse and execute sql statements synchronously.

Here are some questions to answer:

1. The format of binlog?

Binary sequential storage, unencrypted, uncompressed

2. Does binlog use WAL?

No 

3. When the main library sends binlog, does it use copy in memory?

Unable to determine, most likely read 1 copy from disk and then send.

4. Does relaylog use WAL?

Yes. When the log is received from the library, relay log is written first

5. Is SQL 1 for binlog and relaylog?

On the premise of network transmission correctness and reliability, yes

One question:

Since binlog does not use WAL, in the master-slave scenario, after the mysql exception, will the master and slave libraries be different?

I didn't understand the first question:

Since mysql does the data operation first and then writes binlog, what happens to the data if it fails to write binlog and mysql and crash?

The answer is that the storage engine determines the data.

You can separate mysql from its storage engine, because mysql is just a framework, not an implementation.

binlog is mysql's own log, and transactions are guaranteed by the storage engine itself.

Taking update as an example, the things mysql does are simply divided into:

1. Modify data update

2. Write binlog

3. If the table currently being processed is a transactional table, commit or rollback

Note that both update and commit/rollback are implemented by storage engines, and mysql only understands these operations at a logical level.

For the transactional engine innodb, it has its own log to guarantee the 1-uniqueness of the data. In the implementation of innodb, before update modifies the data,

A new transaction is created and a rollback point is established. The commit/rollback interface provided by innodb commits/rolls back transactions.

So for innodb, each transaction for an SQL statement actually contains a write operation for binlog. Even so, however, innodb is not guaranteed

binlog and the 1 uniqueness of the data, because innodb crash after writing commit successfully, the rollback operation does not roll back binlog. According to the manual,

If -- innodb-support-xa is set to 1, and sync_binlog=1, then binlog of innodb is guaranteed to be equal to data 1.

For non-transactional engines myisam, there is no chance of commit/rollback, so in exceptional cases, the data will not correspond to binlog.

So a new question arises: what does myisam do with this non-1?