An in-depth analysis of the leftmost prefix matching principle of MySQL joint indexes

  • 2021-10-27 09:24:05
  • OfStack

Preface

I had read many articles online about leftmost prefix matching for MySQL joint indexes, and I thought I understood the principle. Recently, while talking with an interviewer, I found gaps in my understanding, so here I sort the topic out for myself.

Leftmost prefix matching principle

When MySQL uses a joint index, it follows the leftmost prefix matching principle: the leftmost column takes priority, and matching starts from the leftmost column of the joint index when retrieving data. Example:
Create a joint index on columns col1, col2, and col3:


CREATE INDEX test_col1_col2_col3 ON test (col1, col2, col3);

Although only one index is physically built, the joint index test_col1_col2_col3 can serve queries as if three indexes existed: (col1), (col1, col2), and (col1, col2, col3).


SELECT * FROM test WHERE col1 = '1' AND col2 = '2' AND col4 = '4';

The query above follows the leftmost prefix matching principle: the (col1, col2) prefix of the index is used to locate rows, and the col4 condition, which is not part of the index, is checked against the rows found.
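To check this yourself, here is a minimal sketch. The table definition is an assumption (the id column and the INT NOT NULL column types do not come from the original example); only the table name, column names, and index name are taken from the text above.

```sql
-- Hypothetical table: the id column and the column types are assumptions.
CREATE TABLE test (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  col1 INT NOT NULL,
  col2 INT NOT NULL,
  col3 INT NOT NULL,
  col4 INT NOT NULL,
  KEY test_col1_col2_col3 (col1, col2, col3)
);

-- Inspect which index, and how much of it, the optimizer chooses.
EXPLAIN SELECT * FROM test WHERE col1 = 1 AND col2 = 2 AND col4 = 4;
```

On a populated table, the key column of the EXPLAIN output would be expected to show test_col1_col2_col3, and key_len should cover only col1 and col2 (likely 8 bytes for two INT NOT NULL columns). The exact plan depends on the MySQL version and the data.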

Attention

The indexed columns can appear in any order in the WHERE clause, for example:


SELECT * FROM test WHERE col1 = '1' AND col2 = '2';
SELECT * FROM test WHERE col2 = '2' AND col1 = '1';

Both queries use the (col1, col2) prefix of the index. MySQL builds a joint index by sorting the entries on the leftmost column, col1, and then, among entries with equal col1, on the second column, col2. In effect, the index is ordered like ORDER BY col1, col2.

Some people may wonder whether the second query violates the leftmost prefix rule. Both queries supply conditions on col1 and col2, the two columns of the (col1, col2) prefix; only the order in which the conditions are written differs. The conditions are the same, so the final result is certainly the same. Since the result is the same, which order is best? That decision is made by the MySQL query optimizer: before generating the actual execution plan, it rearranges the conditions into the order that can be executed most efficiently. You can inspect the chosen plan with EXPLAIN.
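A quick way to confirm that the written order of the conditions does not matter is to compare the two plans. This sketch assumes the test table and the test_col1_col2_col3 index from the example above, with integer columns (an assumption).

```sql
-- Assumes table test with the joint index (col1, col2, col3) already exists.
EXPLAIN SELECT * FROM test WHERE col1 = 1 AND col2 = 2;
EXPLAIN SELECT * FROM test WHERE col2 = 2 AND col1 = 1;
```

Both statements should report the same key and key_len, since the optimizer treats the two WHERE clauses as equivalent.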

Why use joint indexes

Reduced overhead. Creating one joint index (col1, col2, col3) is roughly equivalent to creating three indexes: (col1), (col1, col2), and (col1, col2, col3). Every additional index increases the cost of write operations and the disk space used. For tables with large amounts of data, a joint index can greatly reduce this overhead.
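As a rough illustration, the sketch below contrasts the redundant alternative with the single joint index. The idx_* names are hypothetical, and the test table with its joint index is assumed to exist already.

```sql
-- Redundant alternative (hypothetical index names), shown only for comparison:
-- CREATE INDEX idx_col1           ON test (col1);
-- CREATE INDEX idx_col1_col2      ON test (col1, col2);
-- CREATE INDEX idx_col1_col2_col3 ON test (col1, col2, col3);

-- With only the joint index in place, all three prefixes can be served by it:
EXPLAIN SELECT * FROM test WHERE col1 = 1;
EXPLAIN SELECT * FROM test WHERE col1 = 1 AND col2 = 2;
EXPLAIN SELECT * FROM test WHERE col1 = 1 AND col2 = 2 AND col3 = 3;

-- List the indexes that actually exist on the table.
SHOW INDEX FROM test;
```

Each EXPLAIN would be expected to name the same index in its key column, with key_len growing as more leading columns are matched.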

Covering index. Given the joint index (col1, col2, col3) and the query select col1, col2, col3 from test where col1=1 and col2=2, MySQL can get the data directly from the index without going back to the table, which avoids many random I/O operations. Reducing I/O, especially random I/O, is one of a DBA's main optimization strategies. In practice, therefore, covering indexes are one of the main ways to improve performance.
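Whether a query is covered can be checked in the Extra column of EXPLAIN. The sketch below assumes the test table has at least one column outside the joint index (col4 here, as in the earlier query), so that SELECT * is not covered.

```sql
-- Only indexed columns are selected: Extra is expected to show "Using index".
EXPLAIN SELECT col1, col2, col3 FROM test WHERE col1 = 1 AND col2 = 2;

-- SELECT * also needs col4 (assumed to exist), so MySQL must go back to the table.
EXPLAIN SELECT * FROM test WHERE col1 = 1 AND col2 = 2;
```

If the table contained nothing but the indexed columns and the primary key, the second query might be covered as well, so treat the difference as illustrative.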

High efficiency. The more columns in the index, the fewer rows remain after filtering through it. Take a table with 10 million rows and the query select * from table where col1=1 and col2=2 and col3=3, and assume each condition matches 10% of the rows. With only a single-column index, 10,000,000 * 10% = 1,000,000 rows are selected through the index. MySQL must then go back to the table, find the rows matching col2=2 and col3=3 among those 1,000,000 rows, and then sort and paginate. With a joint index, 10,000,000 * 10% * 10% * 10% = 10,000 rows are selected through the index, so the efficiency gain is easy to see.
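The arithmetic can be reproduced directly in SQL. This is only a back-of-the-envelope sketch: it assumes the three conditions are independent and that each matches exactly 10% of the rows, and the column aliases are made up for illustration.

```sql
-- 10 million rows, each condition assumed to match 10% independently.
SELECT
  10000000 * 0.1             AS rows_after_single_column_index,
  10000000 * 0.1 * 0.1 * 0.1 AS rows_after_joint_index;
```

The two values work out to 1,000,000 and 10,000 rows, a hundredfold reduction in the rows that have to be fetched from the table.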

Extension

For the joint index (col1, col2, col3), can the query SELECT * FROM test WHERE col2=2; use the index?
Most people would say no, but the answer can actually be yes.

Reason:


EXPLAIN SELECT * FROM test WHERE col2=2;
EXPLAIN SELECT * FROM test WHERE col1=1;

Look at the type field in the two EXPLAIN results above. For the two queries it is, respectively:

type: index
type: ref

index: This type means that MySQL scans the entire index. There is no special requirement on the index for this access type: as long as the column is an index or part of a joint index, MySQL may scan it this way. The drawback is low efficiency: MySQL walks the index from the first entry to the last, checking each entry against the condition. In that sense, the statement above does use the index. Note that the optimizer generally picks this access type only when the index contains every column the query needs; otherwise a full table scan (type: ALL) is more likely.
ref: This type means that MySQL quickly locates the matching index entries using a specific lookup algorithm, instead of scanning every entry in the index one by one. This is what people usually mean when they say that a query uses an index to fetch data faster. Such a fast lookup places a requirement on the index: it must have a suitable data structure. Put simply, the values of the indexed column must be stored in order for this kind of lookup to be possible.
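The two access types can be compared side by side. In this sketch, only the indexed columns are selected so that a full index scan is a plausible choice for the optimizer; the test table and its joint index are assumed to exist, and the actual plan depends on the MySQL version and the data.

```sql
-- col2 is not a leftmost prefix: a full index scan (type: index) is likely.
EXPLAIN SELECT col1, col2, col3 FROM test WHERE col2 = 2;

-- col1 is the leftmost column: an ordered lookup (type: ref) is likely.
EXPLAIN SELECT col1, col2, col3 FROM test WHERE col1 = 1;
```

On MySQL 8.0.13 and later, the first query may instead be reported as a range access with "Using index for skip scan" in Extra, because the optimizer can sometimes skip over the leading column.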

Summary

