Backup the MySQL database using mydumper multithreading

  • 2020-06-01 11:11:17
  • OfStack

mysqldump: one of its characteristics is that the list needs to be locked during processing, so if we need to perform a backup during working hours, DML will block. However, the current MySQL has master and slave, and the backup is mostly carried out from the slave, so the problem of locking can be ignored. In this way, mydumper can perform backup tasks better.
Mydumper main features: Mydumper is a high performance multithreaded backup and recovery tool for MySQL and Drizzle. The developers are mainly from MySQL, Facebook, SkySQL.


1 : lightweight C Written language 
    2 : execution speed ratio mysqldump fast 10 times 
    3 : transactional and non-transactional tables 1 Send a snapshot of the ( Apply to 0.2.2 The above version )
    4 : fast file compression 
    5 : support export binlog
    6 : multithreaded recovery ( Apply to 0.2.1 The above version )
    7 : to work as a daemon, timed snapshot and sequential 2 Log into the system ( Apply to 0.5.0 The above version )
    8 : open source  (GNU GPLv3)

Download and install: environment: Ubuntu 12.04


wget https://launchpad.net/mydumper/0.5/0.5.2/+download/mydumper-0.5.2.tar.gz

Installation: after decompression, there are installation instructions


apt-get install libglib2.0-dev libmysqlclient15-dev zlib1g-dev libpcre3-dev g++
cmake .
make


root@dd:~/mydumper-0.5.2# cmake .
-- The CXX compiler identification is GNU
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Using mysql-config: /usr/bin/mysql_config
-- Found MySQL: /usr/include/mysql, /usr/lib/x86_64-linux-gnu/libmysqlclient_r.so;/usr/lib/x86_64-linux-gnu/libpthread.so;/usr/lib/x86_64-linux-gnu/libz.so;/usr/lib/x86_64-linux-gnu/libm.so;/usr/lib/x86_64-linux-gnu/librt.so;/usr/lib/x86_64-linux-gnu/libdl.so
CMake Warning at docs/CMakeLists.txt:9 (message):
  Unable to find Sphinx documentation generator

-- ------------------------------------------------
-- MYSQL_CONFIG = /usr/bin/mysql_config
-- CMAKE_INSTALL_PREFIX = /usr/local
-- BUILD_DOCS = ON
-- RUN_CPPCHECK = OFF
-- Change a values with: cmake -D<Variable>=<Value>
-- ------------------------------------------------
-- 
-- Configuring done
-- Generating done
-- Build files have been written to: /root/mydumper-0.5.2
root@dd:~/mydumper-0.5.2# make
Scanning dependencies of target mydumper
[ 20%] Building C object CMakeFiles/mydumper.dir/mydumper.c.o
[ 40%] Building C object CMakeFiles/mydumper.dir/binlog.c.o
[ 60%] Building C object CMakeFiles/mydumper.dir/server_detect.c.o
[ 80%] Building C object CMakeFiles/mydumper.dir/g_unix_signal.c.o
Linking C executable mydumper
[ 80%] Built target mydumper
Scanning dependencies of target myloader
[100%] Building C object CMakeFiles/myloader.dir/myloader.c.o
Linking C executable myloader
[100%] Built target myloader

Generate 2 tools: mydumper (backup), myloader (import)

Parameters:
mydumper:


root@dd:~/mydumper-0.5.2# ./mydumper --help
Usage:
  mydumper [OPTION...] multi-threaded MySQL dumping
Help Options:
  -?, --help                  Show help options
Application Options:
  -B, --database               Libraries that need to be backed up 
  -T, --tables-list            The tables that need to be backed up, separated by commas 
  -o, --outputdir              Directory of output files 
  -s, --statement-size         The number of bytes to generate the insert statement ,  The default  1000000
  -r, --rows                   Split into many row block tables 
  -c, --compress               Compressed output file 
  -e, --build-empty-files      Even if the table has no data, it is still generated 1 An empty file 
  -x, --regex                  Regular expression : 'db.table'
  -i, --ignore-engines         Ignore the storage engine, separated by commas 
  -m, --no-schemas             Table structures are not exported 
  -k, --no-locks               A Shared read lock is not executed   Warning: this will lead to no 1 To backup 
  -l, --long-query-guard       Set long query times , The default 60 seconds 
  --kill-long-queries         kill Missing long-running queries 
  -b, --binlogs                export binlog
  -D, --daemon                 Enable daemon mode 
  -I, --snapshot-interval     dump Snapshot interval time, default 60s That need to be in daemon mode 
  -L, --logfile                The log file 
  -h, --host                  The host to connect to
  -u, --user                  Username with privileges to run the dump
  -p, --password              User password
  -P, --port                  TCP/IP port to connect to
  -S, --socket                UNIX domain socket file to use for connection
  -t, --threads                Number of threads used by default 4
  -C, --compress-protocol      in mysql The compression protocol is used on the connection 
  -V, --version               Show the program version and exit
  -v, --verbose                More output , 0 = silent, 1 = errors, 2 = warnings, 3 = info, default 2

myloader:


root@dd:~/mydumper-0.5.2# ./myloader --help
Usage:
  myloader [OPTION...] multi-threaded MySQL loader
Help Options:
  -?, --help                        Show help options
Application Options:
  -d, --directory                    The directory in which the backup files are located 
  -q, --queries-per-transaction      Number of queries executed at a time ,  The default 1000
  -o, --overwrite-tables             If the table exists, drop it first. Note here that with this parameter, you need to back up the table structure when you need to, otherwise you will have a problem 
  -B, --database                     The database that needs to be restored 
  -e, --enable-binlog                To enable the 2 Base restore data 
  -h, --host                        The host to connect to
  -u, --user                        Username with privileges to run the dump
  -p, --password                    User password
  -P, --port                        TCP/IP port to connect to
  -S, --socket                      UNIX domain socket file to use for connection
  -t, --threads                      Number of threads used by default 4
  -C, --compress-protocol            The compression protocol is used on the connection 
  -V, --version                     Show the program version and exit
  -v, --verbose                      More output , 0 = silent, 1 = errors, 2 = warnings, 3 = info, default 2

Testing: testing basic usage

1: backup


./mydumper -u zjy -p ##### -h 192.168.220.245 -P 3306 -B chushihua -o /home/zhoujy/bak/

Backup analyzedxy database to /home/zhoujy/bak/ directory to see if there are multiple threads:


| 4937639 | zjy  | 192.168.200.25:34781  | NULL| Query   |      0 | NULL              | show processlist |
| 4937677 | zjy  | 192.168.200.25:34791  | NULL| Query   |     10 | Writing to net    | SELECT /*!40001 SQL_NO_CACHE */|
| 4937678 | zjy  | 192.168.200.25:34792  | NULL| Query   |      5 | Writing to net    | SELECT /*!40001 SQL_NO_CACHE */ |
| 4937679 | zjy  | 192.168.200.25:34793  | NULL| Query   |     10 | Writing to net    | SELECT /*!40001 SQL_NO_CACHE */ |
| 4937680 | zjy  | 192.168.200.25:34794  | NULL| Query   |     10 | Writing to net    | SELECT /*!40001 SQL_NO_CACHE */ |

The above shows that there are indeed 4 threads (by default) in the backup. View the backup file:


root@zhoujy:/home/zhoujy/bak# ls -lh
-rw-r--r-- 1 root root  322 2013-11-14 17:59 chushihua.dba_hospital_all_name-schema.sql
-rw-r--r-- 1 root root  16M 2013-11-14 17:59 chushihua.dba_hospital_all_name.sql
-rw-r--r-- 1 root root  221 2013-11-14 17:59 chushihua.dba_hospital-schema.sql
-rw-r--r-- 1 root root  658 2013-11-14 17:59 chushihua.dba_hospital.sql
-rw-r--r-- 1 root root  198 2013-11-14 17:59 chushihua.dba_jobTitle-schema.sql
-rw-r--r-- 1 root root  300 2013-11-14 17:59 chushihua.dba_jobTitle.sql
-rw-r--r-- 1 root root  261 2013-11-14 17:59 chushihua.dba_locatedCity-schema.sql
-rw-r--r-- 1 root root 202K 2013-11-14 17:59 chushihua.dba_locatedCity.sql

Analysis: mydumper backs up the data and table structure separately, and backs up the binary log into a separate file.


metadata: metadata   Record the backup start and end times, as well binlog Log file location. 
table data: Each table 1 A file 
table schemas: Table structure file 
binary logs:  To enable the --binlogs Options, 2 Base files are stored binlog_snapshot directory 
daemon mode: In this mode, yes 5 A directory 0 . 1 . binlogs . binlog_snapshot . last_dump . 
 The backup directory is 0 and 1 , interval backup if mydumper Fail for some reason and still have 1 A good snapshot, and when the snapshot is done, last_dump Point to the backup. 

2: restore: restore to another server, first set up the database to be restored (chushihua)


./myloader -u root -p 123456 -h 192.168.200.25 -P 3307 -B chushihua -d /home/zhoujy/bak/

And backup 1 like to see if multithreading:


| 19 | root        |      | NULL      | Query   |     0 | init  | show   processlist|
| 30 | root        |      | chushihua | Query   |     5 | update| INSERT INTO       |
| 31 | root        |      | chushihua | Query   |     5 | update| INSERT INTO       |
| 32 | root        |      | chushihua | Query   |     5 | update| INSERT INTO       |
| 33 | root        |      | chushihua | Query   |     5 | update| INSERT INTO       |

The above shows that there are indeed four threads (by default) in the restore.
Step 1 test: test 1 common parameters
1) : backup the specified table (-T), and do not export the table structure (-m)


./mydumper -u root-p 123456 -h 192.168.220.252 -P 3306 -m -B test -T b,a,c,d,e,g,f,h,i -o /home/zhoujy/bak/
zhoujy@zhoujy:~/bak$ ls -lh
-rw-rw-r-- 1 zhoujy zhoujy 3.4K 2013-11-14 20:57 test.a.sql
-rw-rw-r-- 1 zhoujy zhoujy 1.6M 2013-11-14 20:57 test.b.sql
-rw-rw-r-- 1 zhoujy zhoujy 7.8M 2013-11-14 20:57 test.c.sql
-rw-rw-r-- 1 zhoujy zhoujy 1.7M 2013-11-14 20:57 test.d.sql
-rw-rw-r-- 1 zhoujy zhoujy 303K 2013-11-14 20:57 test.e.sql
-rw-rw-r-- 1 zhoujy zhoujy 517K 2013-11-14 20:57 test.f.sql
-rw-rw-r-- 1 zhoujy zhoujy 646K 2013-11-14 20:57 test.g.sql
-rw-rw-r-- 1 zhoujy zhoujy 394K 2013-11-14 20:57 test.h.sql
-rw-rw-r-- 1 zhoujy zhoujy  34K 2013-11-14 20:57 test.i.sql
-rw-rw-r-- 1 zhoujy zhoujy   75 2013-11-14 20:57 metadata

2) compress backup file (-c), backup binlog (-b), regular expression backup table (-x)


 ./mydumper -u root -p 123456 -h 192.168.200.25 -P 3306 -m -c -b --regex=tmp.* -B test  -o /home/zhoujy/bak/
drwx------ 2 zhoujy zhoujy 4.0K 2013-11-14 21:16 binlog_snapshot
-rw-rw-r-- 1 zhoujy zhoujy  133 2013-11-14 21:16 metadata
-rw-rw-r-- 1 zhoujy zhoujy  94K 2013-11-14 21:16 test.tmp_0808.sql.gz
-rw-rw-r-- 1 zhoujy zhoujy  75K 2013-11-14 21:16 test.tmp_0809.sql.gz
-rw-rw-r-- 1 zhoujy zhoujy  25K 2013-11-14 21:16 test.tmp_0813.sql.gz
-rw-rw-r-- 1 zhoujy zhoujy 208K 2013-11-14 21:16 test.tmp_0826.sql.gz
-rw-rw-r-- 1 zhoujy zhoujy  915 2013-11-14 21:16 test.tmp_0827.sql.gz
-rw-rw-r-- 1 zhoujy zhoujy  901 2013-11-14 21:16 test.tmp_0912.sql.gz
-rw-rw-r-- 1 zhoujy zhoujy 2.1K 2013-11-14 21:16 test.tmp_0916.sql.gz
-rw-rw-r-- 1 zhoujy zhoujy 622K 2013-11-14 21:16 test.tmp_0918_a.sql.gz
-rw-rw-r-- 1 zhoujy zhoujy  28M 2013-11-14 21:16 test.tmp_0918_ff.sql.gz
 

As shown above, the backup file is already compressed (uncompressed with gzip-d), and all the tables that tmp.* match are backed up, and the binary log is backed up to binlog_snapshot file, and is also compressed.
3) restore, delete the table before it exists (-o) : it should be noted here that, with this parameter, backup files with table structure are required in the backup directory.


./myloader -u root -p 123456 -h 192.168.200.25 -P 3306 -o -B test -d /home/zhoujy/bak/

For more parameter effects, please test yourself.
Final test: compare mysqldump and mydumper.


#!/usr/bin/env python
#coding=utf-8
import MySQLdb
import os
import sys
import time
backup = os.system('')
def mysqldump_data():
    t1 = time.time()
    backup = os.system('mysqldump --no-defaults -uroot -p123456 -h192.168.200.25 --default-character-set=utf8 test > /home/zhoujy/test.bak')
    t2 = time.time()
    t = round(t2-t1)
    print "mysqldump Cost Time %s" %t
def mydumper_data():
    t1 = time.time()
    backup = os.system('mydumper -u root -p 123456 -h 192.168.200.25 -P 3306 -B test -o /home/zhoujy/bak/')
    t2 = time.time()
    t = round(t2-t1)
    print "mydumper Cost Time %s" %t
if __name__ =='__main__':
    mysqldump_data()
    mydumper_data()

Two databases were tested:

1:

mysqldump Cost Time: 162s
mydumper Cost Time: 61s

2:

mysqldump Cost Time: 483s
mydumper Cost Time: 337s

Judging from the time above, mydumper does promote backup data, as does restore.

Conclusion:
From the above test analysis, it can be seen that mydumper can improve the efficiency of backup and restore, and how much improvement is limited to the IO capability of the disk. Please evaluate IO before using it, and you can try to use this tool.


Related articles: