Detailed Explanation of MySQL Combined Index and Leftmost Matching Principle

  • 2021-11-14 07:21:49
  • OfStack

Preface

I had read many articles online about leftmost prefix matching for MySQL composite indexes and thought I understood the principle. A recent conversation with an interviewer showed me that my understanding had gaps, so I am sorting the topic out here.

When should you create a composite index?

When a WHERE clause filters on several columns at once, create a composite index on the columns used in those conditions.
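As a minimal sketch of what this looks like in practice, the statements below build a composite index for a two-condition query. The orders table, its columns, and the index name are hypothetical and are not part of the test later in this article.

```sql
-- Hypothetical table and column names, used only for illustration
CREATE TABLE orders (
  id      INT NOT NULL AUTO_INCREMENT,
  user_id INT NOT NULL,
  status  VARCHAR(20) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- One composite index on both columns that appear in the WHERE clause
ALTER TABLE orders ADD INDEX idx_user_status (user_id, status);

-- This query filters on both indexed columns
EXPLAIN SELECT * FROM orders WHERE user_id = 42 AND status = 'paid';
```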

Why not create a separate index on each column?

A composite index has three advantages: lower overhead, covering-index reads, and higher efficiency.

Lower overhead: a composite index on (col1, col2, col3) does the work of three indexes, (col1), (col1, col2), and (col1, col2, col3), while only one index has to be stored and maintained.
Covering index: for a query such as SELECT col1, col2, col3 FROM table_name, every selected column is already stored in the index, so MySQL can return the result from the index alone without going back to the table.

High efficiency: if col1, col2, and col3 each have their own index, MySQL usually picks only the single most selective one. Suppose the table holds 1,000,000 rows and each condition keeps 10% of them: a single-column index narrows the search to 100,000 rows, which still have to be checked against the other conditions. A composite index applies all three conditions inside the index: 1,000,000 × 10% × 10% × 10% = 1,000 rows.
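The covering-index case can be checked with EXPLAIN. The sketch below assumes a hypothetical table t with an extra column note that is not part of the index. Once the table holds some data, the first query should show "Using index" in the Extra column and the second should not.

```sql
-- Hypothetical table, used only for illustration
CREATE TABLE t (
  id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  col1 INT NOT NULL,
  col2 INT NOT NULL,
  col3 INT NOT NULL,
  note VARCHAR(100) NOT NULL DEFAULT '',
  KEY idx_col1_col2_col3 (col1, col2, col3)
) ENGINE=InnoDB;

-- Every selected column is stored in the index, so no table lookup is needed
EXPLAIN SELECT col1, col2, col3 FROM t WHERE col1 = 1;

-- note is not in the index, so MySQL must also read the table row
EXPLAIN SELECT col1, col2, col3, note FROM t WHERE col1 = 1;
```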

Leftmost matching principle

Suppose we create a composite index (col1, col2, col3). Its entries are sorted by col1 first; col2 is ordered only among rows that share the same col1, and col3 only among rows that share the same col1 and col2. The index can therefore be used for a lookup as long as the query condition includes the leftmost column, col1.
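To make the ordering concrete, here is a small sketch with a hypothetical table t_order_demo and five made-up rows. Sorting by (col1, col2, col3) gives the logical order in which the index keeps its entries.

```sql
-- Hypothetical table and data, used only for illustration
CREATE TABLE t_order_demo (
  col1 INT NOT NULL,
  col2 INT NOT NULL,
  col3 INT NOT NULL,
  KEY idx_c1_c2_c3 (col1, col2, col3)
) ENGINE=InnoDB;

INSERT INTO t_order_demo VALUES (2,1,9), (1,2,5), (1,1,7), (2,1,3), (1,2,1);

-- Logical order of the index entries
SELECT col1, col2, col3 FROM t_order_demo ORDER BY col1, col2, col3;
-- (1,1,7) (1,2,1) (1,2,5) (2,1,3) (2,1,9)
```

Read down the col2 values in that order: 1, 2, 2, 1, 1. They are sorted only within each col1 group, which is why a condition on col2 alone cannot be located by searching the index.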

Create a test table


CREATE TABLE `student` (
 `id` int(11) NOT NULL,
 `name` varchar(10) NOT NULL,
 `age` int(11) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `idx_id_name_age` (`id`,`name`,`age`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Fill in about 5 million rows of test data


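-- IF EXISTS avoids an error when the procedure has not been created yet
DROP PROCEDURE IF EXISTS pro10;
-- DELIMITER is needed in the mysql command-line client so that the
-- semicolons inside the procedure body do not end the statement early
DELIMITER $$
CREATE PROCEDURE pro10()
BEGIN
	DECLARE i INT;
	-- 62 candidate characters for the random name
	DECLARE char_str varchar(100) DEFAULT 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
	DECLARE return_str varchar(255) DEFAULT '';
	DECLARE age INT;
	SET i = 1;
	-- i is incremented before each insert, so ids run from 2 to 5000000
	WHILE i < 5000000 DO
		-- up to 8 consecutive characters starting at a random position
		SET return_str = substring(char_str, FLOOR(1 + RAND()*62), 8);
		SET i = i+1;
		-- random age from 0 to 99
		SET age = FLOOR(RAND() * 100);
		INSERT INTO student(id, name, age) values(i, return_str, age);
	END WHILE;
END$$
DELIMITER ;

CALL pro10();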
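Before running the tests, it may help to confirm that the data was loaded and the index exists. These checks are my own addition, not part of the original test, and use only standard MySQL statements.

```sql
-- Row count after the procedure finishes (roughly 5 million expected)
SELECT COUNT(*) FROM student;

-- Lists PRIMARY and idx_id_name_age with their column order (Seq_in_index)
SHOW INDEX FROM student;

-- Refreshes index statistics so that the EXPLAIN row estimates are up to date
ANALYZE TABLE student;
```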

Scenario testing


EXPLAIN SELECT * FROM student WHERE id = 2;

You can see that the query uses the index


EXPLAIN SELECT * FROM student WHERE id = 2 AND name = 'defghijk';

You can see that the query uses the index


EXPLAIN SELECT * FROM student WHERE id = 2 AND name = 'defghijk' and age = 8;

You can see that the query uses the index


EXPLAIN SELECT * FROM student WHERE id = 2 AND age = 8;

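You can see that the query uses the index. Because name is skipped, only the leading id column can narrow the lookup; the age condition is checked on the rows that id finds.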
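One way to see how many index columns actually took part in a lookup is the key_len column of EXPLAIN, or the used_key_parts field of the JSON format. This is a sketch assuming MySQL 5.7 or later; I have not reproduced the output here because it depends on the server version and on which index the optimizer picks.

```sql
-- key_len in the tabular output grows with the number of index columns used
EXPLAIN SELECT * FROM student WHERE id = 2 AND age = 8;

-- The JSON format lists the columns by name under used_key_parts
EXPLAIN FORMAT=JSON SELECT * FROM student WHERE id = 2 AND age = 8;
```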


EXPLAIN SELECT * FROM student WHERE name = 'defghijk' AND age = 8;

This query cannot use the index for a lookup. The type is index, which means MySQL scans the entire index, and the estimated row count is 4989449, almost the whole table. Because the composite index is ordered by id first, rows with a given name and age are scattered throughout it, so every entry has to be checked.
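If a query like this matters in practice, one option is an index that starts with name. The index idx_name_age below is an assumption of mine and is not part of the original test; with it in place, EXPLAIN should report a lookup on idx_name_age instead of a full scan.

```sql
-- Assumption: extra index that is not in the original test setup
ALTER TABLE student ADD INDEX idx_name_age (name, age);

-- name is now the leftmost column of an index, so a lookup is possible
EXPLAIN SELECT * FROM student WHERE name = 'defghijk' AND age = 8;

-- Drop it again so the remaining tests match the original setup
ALTER TABLE student DROP INDEX idx_name_age;
```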


EXPLAIN SELECT * FROM student WHERE name = 'defghijk' AND id = 2;

EXPLAIN SELECT * FROM student WHERE age = 8 AND id = 2;

EXPLAIN SELECT * FROM student WHERE name = 'defghijk' and age = 8 AND id = 2;

The queries above also use the index, and the result is the same whether id appears first or last in the WHERE clause. The order in which the conditions are written does not matter: the MySQL optimizer chooses the most efficient plan, which here means looking up by id first.
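To check this yourself, run EXPLAIN on both orderings and compare the plans. On MySQL 5.7 or later, SHOW WARNINGS right after an EXPLAIN also prints the statement as the optimizer rewrote it (older versions need EXPLAIN EXTENDED). A minimal sketch:

```sql
-- Same conditions, written in opposite orders
EXPLAIN SELECT * FROM student WHERE id = 2 AND age = 8 AND name = 'defghijk';
EXPLAIN SELECT * FROM student WHERE name = 'defghijk' AND age = 8 AND id = 2;

-- Shows the rewritten form of the most recent EXPLAIN statement
SHOW WARNINGS;
```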

Summary

As the tests above show, whenever the query conditions include the leftmost column of the composite index, the index is used for the query, no matter where that column appears in the WHERE clause.

