MySQL InnoDB Secondary Index Sorting Example Detailed Explanation

  • 2021-11-10 11:07:45
  • OfStack

Scheduling problem

Recently, I read "45 Lectures on MySQL Actual Combat" in geek time, which corrected one understanding of InnoDB Level 2 index in the past, and just summarized the relevant contents.

PS: All tests in this article are based on MySQL 8.0. 13.

Let's throw the question out first. The table created by SQL below has two query statements. Which index is not necessary?


CREATE TABLE `geek` (
 `a` int(11) NOT NULL,
 `b` int(11) NOT NULL,
 `c` int(11) NOT NULL,
 `d` int(11) NOT NULL,
 PRIMARY KEY (`a`,`b`),
 KEY `c` (`c`),
 KEY `ca` (`c`,`a`),
 KEY `cb` (`c`,`b`)
) ENGINE=InnoDB;

select * from geek where c=N order by a limit 1;
select * from geek where c=N order by b limit 1;

The answer given by the author is that the data models for indexing c and ca are identical, so ca is redundant. Why? ?

We know that the position of the primary key is not stored in the level 2 index, but the value of the primary key, and we also know that the index is ordered.

If c and ca have the same data model 1, then the leaf nodes of the level 2 index are required to be sorted not only by the index column, but also by the associated primary key value.

My previous understanding is that level 2 indexes are only sorted by index columns, and primary key values are not sorted.

I asked the columnist and got the answer: Index c is sorted according to cab, (level 2 index)) is guaranteed to count the primary key, or is it orderly. (PS: non-original words, obtained by asking three times before and after).

In line with the idea of asking whether it is first and then why, make a survey.

Isn't it?

If you can look directly at the data file of InnoDB, you can directly see whether this collation rule is followed. Unfortunately, it is a binary document, and there is no convenient tool to view and give up.

Later, I found the handler statement for MySQL, which supports tables for both MyISAM/InnoDB engines. The handler statement provides direct access to the table storage engine.

The following syntax means reading the first/first/next/last record of the specified index of the specified table.


handler table_name/table_name_alias read index_name first/pre/next/last;

Just use handler statement to verify, first build a simple table and insert several pieces of data:


create table t_simple (
 id int primary key,
 v int,
 key k_v (v)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

insert into t_simple values (1, 5);
insert into t_simple values (10, 5);
insert into t_simple values (4, 5);

In the above insertion statement, the values of the 2-level index columns are all 1-like, and the primary keys are not in order, so you can see whether the iteration is stored in order of primary keys.


mysql> handler t_simple open as ts;
Query OK, 0 rows affected (0.00 sec)

mysql> handler ts read k_v next;
+----+------+
| id | v |
+----+------+
| 1 | 5 |
+----+------+
1 row in set (0.00 sec)

mysql> handler ts read k_v next;
+----+------+
| id | v |
+----+------+
| 4 | 5 |
+----+------+
1 row in set (0.00 sec)

mysql> handler ts read k_v next;
+----+------+
| id | v |
+----+------+
| 10 | 5 |
+----+------+
1 row in set (0.00 sec)

It can be seen from the results that when the values of the traversed level 2 indexes are equal, traversing according to the order of primary keys can basically determine that the level 2 indexes are sorted not only by index columns, but also by primary key values.

Why?

I didn't see that MySQL had such a mechanism before, and I didn't know this when I asked the former company and the former company's DBA.

Finally, DBA colleagues found the index extension, Index Extensions, which has this description to explain:

InnoDB automatically extends each secondary index by appending the primary key columns to it. Consider this table definition:


CREATE TABLE t1 (
 i1 INT NOT NULL DEFAULT 0,
 i2 INT NOT NULL DEFAULT 0,
 d DATE DEFAULT NULL,
 PRIMARY KEY (i1, i2),
 INDEX k_d (d)
) ENGINE = InnoDB;

InnoDB automatically expands each level 2 index, appending the primary key value to the index column, and making the expanded combined column the index column of the index. For the k_v index of the t_simple table above, the expanded column is (v, id).

The optimizer determines how and whether to use the extended Level 2 index based on its primary key column. The optimizer can use extended level 2 indexes for index access of ref, range, index_merge type, loose index scanning, join and sort optimization, and min ()/max () optimization.

Can be used show variables like '%optimizer_switch%'; Check whether the index extension is turned on; Use SET optimizer_switch = 'use_index_extensions=on/off'; Turn it on or off, which only affects the current session.

After testing, even if the index extension of the current session is turned off, it still has the effect of sorting by primary key when accessed with handler.

Summarize


Related articles: