In depth analysis of explain for MySQL query optimization
- 2020-05-17 06:48:20
- OfStack
It is also useful to consider the EXPLAIN keyword when analyzing query performance. The EXPLAIN keyword 1 is generally placed before the SELECT query statement to describe how MySQL performs the query operation and the number of rows that MySQL needs to execute to successfully return the result set. explain can help us analyze select statements, let us know the reasons for the inefficiency of the query, so as to improve our query, so that the query optimizer can work better.
1. How does the MySQL query optimizer work
The MySQL query optimizer has several goals, but the main one is to use indexes as much as possible and to eliminate as many rows of data as possible with the strictest indexes. The ultimate goal is to submit the SELECT statement to find the rows, not to exclude them. The reason the optimizer tries to exclude rows is that the faster it can exclude rows, the faster it can find rows that match the condition. Queries can be executed faster if the most rigorous testing can be done first.
Each output row of EXPLAIN provides information about one table, and each row includes the following columns:
The presence of the following two items in extra means that MYSQL cannot use the index at all, and efficiency will be significantly affected. This should be optimized as much as possible.
extra item instructions Using filesort Means that MySQL will use an external index to sort the results, rather than reading them in index order from the table. It may be sorted in memory or on disk. The sort operation in MySQL that cannot be done with an index is called "file sort" Using temporary Means that MySQL USES temporary tables when sorting query results. Common in sorting order by and grouping queries group by.
Here is an example to illustrate the use of explain.
Let's start with a table:
CREATE TABLE IF NOT EXISTS `article` (`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`author_id` int(10) unsigned NOT NULL,
`category_id` int(10) unsigned NOT NULL,
`views` int(10) unsigned NOT NULL,
`comments` int(10) unsigned NOT NULL,
`title` varbinary(255) NOT NULL,
`content` text NOT NULL,
PRIMARY KEY (`id`)
);
A few more data:
INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
Requirements:
If category_id is 1 and comments is greater than 1, article_id has the most views.
Check it out first:
EXPLAIN
SELECT author_id
FROM `article`
WHERE category_id = 1 AND comments > 1
ORDER BY views DESC
LIMIT 1\G
Look at some of the output:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: article
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 3
Extra: Using where; Using filesort
1 row in set (0.00 sec)
Obviously,type is ALL, which is the worst case. Using filesort also appears in Extra, which is also the worst case scenario. Optimization is a must.
Well, then the easiest solution is to index. Ok, let's try 1. In the condition of the query, category_id,comments and views are used after where. So it's easiest to have a joint index.
ALTER TABLE `article` ADD INDEX x ( `category_id` , `comments`, `views` );
The result was a definite improvement, but it was still bad:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: article
type: range
possible_keys: x
key: x
key_len: 8
ref: NULL
rows: 1
Extra: Using where; Using filesort
1 row in set (0.00 sec)
type becomes range, which is tolerable. But using Using filesort in extra is still unacceptable. But we already have an index, so why not? This is because, according to the way the BTree index works, category_id is ordered first, comments is ordered again if the same category_id is encountered, views is ordered again if the same comments is encountered. When the comments field is in the middle of the union index, comments is in the middle > The first condition is that there is a range value (so-called range), and MySQL cannot use the index to retrieve the views part after, that is, the index after the range type query field is invalid.
So we need to discard comments and delete the old index:
DROP INDEX x ON article;
Then create a new index:
ALTER TABLE `article` ADD INDEX y ( `category_id` , `views` ) ;
Then run the query:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: article
type: ref
possible_keys: y
key: y
key_len: 4
ref: const
rows: 1
Extra: Using where
1 row in set (0.00 sec)
As you can see,type becomes ref, Using filesort in Extra also disappears, and the result is very good.
Consider another example of a multi-table query.
First, define three tables, class and room.
CREATE TABLE IF NOT EXISTS `class` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`card` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE IF NOT EXISTS `book` (
`bookid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`card` int(10) unsigned NOT NULL,
PRIMARY KEY (`bookid`)
);
CREATE TABLE IF NOT EXISTS `phone` (
`phoneid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`card` int(10) unsigned NOT NULL,
PRIMARY KEY (`phoneid`)
) engine = innodb;
They then insert large amounts of data separately. php script to insert data:
INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
0
Then look at a left join query:
INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
1
The analysis results are as follows:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: class
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: book
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
2 rows in set (0.00 sec)
Obviously, we need to optimize the second ALL.
Try creating an index:
ALTER TABLE `book` ADD INDEX y ( `card`);
INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
4
You can see that type in line 2 has been changed to ref, and rows has also been changed to 1741*18, which is an obvious optimization. This is determined by the left join feature. The LEFT JOIN condition is used to determine how to search for rows from the right table. There must be 1 on the left, so the right is our key point. 1 must be indexed.
Delete old index:
INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
5
Create a new index.
INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
6
The results of
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: class
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: book
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
2 rows in set (0.00 sec)
Basically no change.
Then look at a right join query:
INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
8
The analysis results are as follows:
INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
9
The optimization is obvious. This is because the RIGHT JOIN condition is used to determine how to search for rows from the left table. There must be 1 on the right, so the left is our key point. 1 must be indexed.
Delete old index:
EXPLAIN
SELECT author_id
FROM `article`
WHERE category_id = 1 AND comments > 1
ORDER BY views DESC
LIMIT 1\G
0
Create a new index.
ALTER TABLE `book` ADD INDEX y ( `card`);
The results of
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: class
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: book
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
2 rows in set (0.00 sec)
Basically no change.
Finally, let's look at inner join:
explain select * from class inner join book on class.card = book.card;
Results:
INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
9
Delete old index:
INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
5
The results of
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: class
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: book
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
2 rows in set (0.00 sec)
Create a new index.
INSERT INTO `article`
(`author_id`, `category_id`, `views`, `comments`, `title`, `content`) VALUES
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
6
The results of
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: class
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: book
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20000
Extra:
2 rows in set (0.00 sec)
To sum up,inner join and left join are similar, and both need to optimize the right table. While right join needs to optimize the left table.
Let's look at the 3 table query example again
Add 1 new index:
ALTER TABLE `phone` ADD INDEX z ( `card`);
ALTER TABLE `book` ADD INDEX y ( `card`);
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: article
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 3
Extra: Using where; Using filesort
1 row in set (0.00 sec)
0
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: article
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 3
Extra: Using where; Using filesort
1 row in set (0.00 sec)
1
The type of the last two lines is ref and the total rows is well optimized, which has a good effect.
The explain syntax in MySql helps us to rewrite queries, optimize table structure and index Settings, and maximize query efficiency. Of course, indexes are expensive to build and maintain on a large scale, often taking a long time and a large amount of space, which can be more expensive if indexes are built on different column combinations. Therefore, the index is best set in fields that require frequent queries.