MySQL performance optimization

  • 2020-05-15 02:18:15
  • OfStack

1. Introduction

In the Web application architecture, the data persistence layer (typically a relational database) is a key core component that has a significant impact on the performance of the system. MySQL is currently the most used open source database, but the default Settings of MySQL database are very poor, only 1 toy database. Therefore, the necessary optimizations must be made to use the MySQL database in production.
Optimization is a complex task, this article describes MySQL related database design and query optimization, server-side optimization, storage engine optimization.

2. Database design and query optimization
In the performance tuning of MySQL Server, the first consideration is the Database Schema design, which is very important. A bad Schema design will show poor performance even if it runs on a performance-tuned MySQL Server; Similar to Schema, the design of query statements can affect MySQL's performance, and you should avoid writing inefficient SQL queries. This section discusses these two optimizations in detail.

2.1 Schema Design
The optimization of Schema depends on what query is going to run, and different query will have different Schema optimization schemes. Section 2.2 describes the optimization of Query Design. The Schema design is also influenced by the expected data set size. Schema was designed with standardization, data types, and indexing in mind.

2.1.1 standardization

Standardization is the process of organizing data in a database. These include creating tables based on design rules and establishing relationships between them; By eliminating the correlation between redundancy and non-1, this design rule can simultaneously protect data and improve the flexibility of data. Usually, database standardization is to make the database design conform to a level 1 paradigm, which usually satisfies the level 3 paradigm. There is also a fourth form (also known as Boyce Codd form, BCNF) and a fifth form, but little consideration is given to the actual design. Ignoring these rules may make the design of the database less than perfect, but this should not affect functionality.
Features of standardization:

1) all "objects" are in its own table, without redundancy.
2) the database is usually generated by the E-R diagram.
3) brevity, updating properties usually requires updating only a few records.
4) operation of Join is time-consuming.
5) there are few optimization measures for Select and sort.
6) suitable for OLTP applications.

Non-standardized features:

1) a lot of data is stored in a table, and the data is redundant.
2) it is expensive to update data. Updating one property may update many tables and records.
3) it is possible to lose data when deleting data.
4) Select and order have many optimized choices.
5) suitable for DSS applications.


Standardized and non-standardized have their own advantages and disadvantages, which can be used in a mixed manner in a database design. 1 part of the table is standardized, and 1 part of the table retains some redundant data:

1) use standardization for OLTP and non-standardization for DSS
2) use materialized views. MySQL does not support this database feature directly, but can be replaced by MyISAM tables.
3) redundancy 1 some data are in the table, for example, ref_id and name are stored in the same table. But be aware of updates.
4) for some simple objects, value is directly used as the building. For example, IP address, etc
5) Reference by PRIMARY/UNIQUE KEY. MySQL can optimize this operation, for example:

java code
select city_name
from city,state
where state_id= state.id and state.code = 'CA' "converted to" select city_name city state_id=12


2.1.2 data types
One of the most basic optimizations is to make the table take up as little space as possible on disk. This results in a significant performance improvement because the data is small, the disk reads in faster, and the table contents are processed in less memory during the query. At the same time, indexing on smaller columns takes up less resources.
You can use the following techniques to improve table performance and minimize storage space:

1) use the right type and do not store Numbers as strings.
2) use the most efficient (minimal) data type possible. MySQL has a number of specialized types that save disk space and memory.
3) use smaller integer types to make the table smaller if possible. For example, MEDIUMINT is often better than INT because the MEDIUMINT column USES 25 percent less space.
4) if possible, declare as NOT NULL. It makes everything faster and saves 1 bit per column. Note that if you do need NULL in your application, you should use it without a doubt, just avoid having it on all columns by default.
5) for the MyISAM table, if there are no variable-length columns (VARCHAR, TEXT, or BLOB columns), use a fixed-size record format. This is faster but unfortunately may waste some space. Even if you have the CREATE option VARCHAR column ROW_FORMAT=fixed, you can be prompted to use fixed-length rows.
6) use sample character set, e.g. latin1. Use utf-8 as little as possible, because utf-8 takes up three times as much space as latin1. You can use latin1 on fields that do not require utf-8, such as mail, url, etc.


2.1.3 index
All MySQL column types can be indexed. Using indexes on related columns is the best way to improve the performance of SELECT operations. The following points should be noted when using an index:

1) MySQL only USES prefixes, such as key(a, b)... where b=5 will not use the index.
2) use indexes selectively. It is not good to use indexes on columns that change little, such as gender columns.
3) define Unique index on the Unique column.
4) avoid building unused indexes.
5) in Btree index (InnoDB USES Btree), indexes can be built on the columns that need to be sorted.
6) avoid duplicate indexes.
7) avoid indexing on prefixes of existing indexes. For example, if index (a, b) exists, leave out index (a).
8) control the length of a single index. Index the first few characters of the data using key (name (8)).
9) the shorter the key, the better. integer is better.
10) to use the index in the query (see explain), you can reduce the number of disk reads and speed up data reading.
11) similar key values are better than random ones. Auto_increment is better than uuid.
12) Optimize table can compress and sort index, be careful not to run it frequently.
13) Analyze table can update data.

2.2 Designing queries
Optimization of query statement is a problem of Case by case. Different sql have different optimization schemes. Here I will only list some general techniques.

1) in the case of index, try to ensure that the correct index is used in the query. You can use EXPLAIN select... View the results and analyze the query.
2) use the matching type when querying. For example, select * from a where id=5. If id is a character type and index is present, this query will not use index and will do a full table scan, which will be slow. The correct answer is... where id= "5", with quotation marks to indicate that the type is a character.
3) use -- log-slow-queries, long-query-time =2 to view the slower queries. The query is then analyzed and optimized using explain.

3. Server-side optimization

3.1 MySQL installation
There are many distributions of MySQL, but it is best to use the base 2 version released by MySQL AB. You can also download the source code for compilation and installation, but the compiler and some of the class libraries bug may cause potential problems with the compiled MySQL.
If the MySQL server USES an Intel processor, you can use the intel c + + compiled version. As mentioned in Linux World2005, the MySQL query compiled with the intel C + + compiler is about 30% faster than the normal version. The Intel c + + build can be downloaded from the MySQL website.

3.2 server setup optimization
The default Settings for MySQL are very poor, so make some parameter adjustments. This section describes some general parameter adjustments, not specific storage engines (MyISAM, InnoDB, related optimizations are described in section 4).

-- character-set: use simple character set for a single language, such as latin1. Use Utf-8 as little as possible. utf-8 takes up more space.
--memlock: lock MySQL to run only in memory to avoid swapping, but errors may occur if memory is insufficient.
--max_allowed_packet: be large enough to accommodate large SQL queries without much impact on performance, mainly to avoid packet errors.
--max_connections: maximum connection allowed by server. If it's too big, you get out of memory.
--table_cache: the number of table that MySQL keeps open at the same time. Opening table is expensive. 1 is normally set to 512.
--query_cache_size: memory size used to cache queries.
--datadir: the root directory where mysql stores data, on separate disks from the installation files, can improve performance by 1 point.

4. Storage engine optimization

MySQL supports different storage engines, mainly MyISAM and InnoDB.

4.1 MyISAM
MyISAM manages non-transaction tables. It provides high-speed storage and retrieval, as well as full-text search capabilities. MyISAM is supported in all MySQL configurations and is the default storage engine unless MySQL is configured to use another engine by default.

4.4.1 MyISAM features
4.1.1.1 MyISAM Properties

1) the transaction is not supported and the downtime will destroy the table
2) use less memory and disk space
3) based on table locking, concurrent updating of data will cause serious performance problems
4) MySQL only caches Index, and the data is cached by OS

4.1.1.2 Typical MyISAM usages

1) logging system
2) read-only or mostly read-only applications
3) full table scanning
4) import data in batches
5) no low concurrent read/write of transactions

4.1.2 key points of MyISAM optimization

1) declare NOT NULL to reduce disk storage.
2) defragment using optimize table to recycle free space. Note that it only runs after very large data changes.
3) Deleting/updating/adding is not allowed to use index when there is a large amount of data. Use ALTER TABLE t DISABLE KEYS.
4) setting myisam_max_[extra]_sort_file_size is large enough to significantly improve the speed of repair table.

4.1.3 MyISAM Table Locks

1) avoid concurrent insert, update.
2) it is possible to use insert delayed, but data may be lost.
3) optimize the query statement.
4) horizontal partitioning.
5) vertical partition.
6) if all else fails, use InnoDB.

4.1.4 MyISAM Key Cache

1) set key_buffer_size variable. The primary cache setting of MyISAN is used to cache the index data of the MyISAM table. This parameter only affects MyISAM. Typically, 25-33% of the memory size is set in Server, which only USES MyISAM.
2) several different Key Caches can be used (for some hot data).

a) SET GLOBAL test.key_buffer_size=512*1024;
b) CACHE INDEX t1.i1, t2.i1, t3 IN test;

2) query speed can be improved in Preload index to Cache. Because preloading index is sequential, it's very fast.

a) LOAD INDEX INTO CACHE t1, t2 IGNORE LEAVES;

4.2 InnoDB
InnoDB provides the transaction security (ACID compatible) storage engine with commit, rollback, and crash recovery capabilities for MySQL. InnoDB provides row level lock and also provides a non-locked read to Oracle style 1 in SELECT statements. These features increase multi-user deployment and performance. There is no need to extend locking in InnoDB, because row level lock is suitable for very small Spaces in InnoDB. InnoDB also supports the FOREIGN KEY constraint. In SQL queries, you are free to mix tables of type InnoDB with other types of MySQL tables, even in the same query.
InnoDB is designed to achieve maximum performance when handling large amounts of data. Its CPU is very efficient.
The InnoDB storage engine is fully integrated with the MySQL server, and the InnoDB storage engine maintains its own buffer pool for caching data and indexes in memory. InnoDB stores its table & index in one table space, which can contain several files (or raw disk partitions). This is different from MyISAM tables, where each table exists in a separate file, for example. The InnoDB table can be any size, even on operating systems where the file size is limited to 2GB.
The InnoDB engine is used on many large database sites that require high performance. The famous Internet news site Slashdot.org runs on InnoDB. Mytrix, Inc. Stores data in excess of 1TB on InnoDB, and a number of other sites handle an average load of 800 inserts/updates per second on InnoDB.
2 InnoDB features
4.2.1.1 InnoDB Properties

1) support transactions, ACID, foreign keys.
2) Row level locks
3) support different isolation levels.
4) it requires more memory and disk space than MyISAM.
5) no key compression.
6) the data and indexes are cached in the hash table.

4.2.1.2 InnoDB Good For

1) the application of transactions is required.
2) high concurrency applications.
3) automatic recovery.
4) fast operation based on primary key.

4.2.2 optimization points of InnoDB

1) try to use short, integer's primary key.
2) when using Load/Insert data, press the primary key order. If the data is not sorted by primary key, sort it first and then do the database operation.
3) the data in Load is set to SET UNIQUE_CHECKS=0, SET FOREIGN_KEY_CHECKS=0, which can avoid the overhead of foreign key and exclusive constraint checking.
4) use prefix keys. Because InnoDB does not have key compression.

4.2.3 InnoDB server-side Settings

innodb_buffer_pool_size: this is the most important setting for InnoDB and has a decisive impact on the performance of InnoDB. The default setting is only 8M, so the performance of InnoDB below the default database setting is poor. On a database server with only the InnoDB storage engine, you can set 60-80% of the memory. To be more precise, set the memory size 10% larger than InnoDB tablespaces if memory capacity allows.

innodb_data_file_path: specifies the storage space for table data and indexes, which can be one or more files. The last data file must be automatically expanded, and only the last file is allowed to be automatically expanded. This way, when the space is used up, the auto-expansion data file will automatically grow (in units of 8MB) to accommodate the additional data. For example: innodb_data_file_path = / disk1 / ibdata1:900 M; / disk2 ibdata2:50 M: autoextend two data files on a different disk. The data is first placed in ibdata1, and when it reaches 900M, the data is placed in ibdata2. When 1 denier reaches 50MB, ibdata2 will automatically grow in units of 8MB. If the disk is full, you need to add a data file to the other disk.
innodb_autoextend_increment: the default is 8M. If the amount of insert data is large once, it can be increased appropriately.

innodb_data_home_dir: the directory where the tablespace data is placed, the default is in the mysql data directory, and setting it to a different partition from the MySQL installation file can improve performance.

innodb_log_file_size: this parameter determines recovery speed. If it is too large, recovery will be slow and too small, which will affect the query performance. If it is 256M, both the performance and the speed of recovery can be considered
.
innodb_log_buffer_size: the disk speed is very slow. Writing log directly to disk will affect the performance of InnoDB. This parameter sets the size of log buffer, which is 4M. If you have a large blob operation, you can increase it appropriately.

innodb_flush_logs_at_trx_commit=2: this parameter sets the processing of log information in memory at transaction commit time.

When 1) =1, when each transaction commits, the log buffer is written to the log file, and the log file is refreshed by disk operation. Truly ACID. Speed is slow.
2) =2, when each transaction commits, the log buffer is written to the file, but the log file is not flushed for disk operation. Only an operating system crash or power failure will delete the last second of the transaction, otherwise the transaction will not be lost.
3) when =0, the log buffer is written to the log file once per second, and the log file is refreshed by disk operation. The crash of any mysqld process removes the last second of the transaction before the crash

innodb_file_per_table: each InnoDB table and its index can be stored in its own file.

transaction-isolation = READ-COMITTED: if the application can run at the READ-COMMITED isolation level, there will be a definite performance improvement.

innodb_flush_method: how to set InnoDB to synchronize IO:

1) Default, using fsync ().
2) O_SYNC opens the file in sync mode, which is usually slow.
3) O_DIRECT, Direct IO is used on Linux. Can significantly improve speed, especially on the RAID system. Avoid extra data duplication and double buffering (mysql buffering and OS buffering).

innodb_thread_concurrency: InnoDB kernel maximum number of threads.

1) at least set as (num_disks+num_cpus)*2.
2) you can disable this restriction by setting it to 1000

5. The cache

There are many types of caching, and adding the appropriate caching strategy to your application can significantly improve your application's performance. Since application caching is a big topic, this part needs to be further investigated.

6. Reference
1) http://www.mysqlperformanceblog.com/
2) Advanced MySQL Performance Optimization, Peter Zaitsev, Tobias Asplund, MySQL Users Conference 2005
3) Improving MySQL Performance with Intel C++ Compiler, Peter Zaitsev, Linux World 2005
4) MySQL Performance Optimization, Peter Zaitsev, Percona Ltd, OPEN SOURCE DATABASE CONFERENCE 2006
5) MySQL Server Settings Tuning, Peter Zaitsev, co-founder, Percona Ltd, 2007
6) MySQL Reference Manual


Related articles: