Tips for optimizing queries with straight_join in MySQL

2020-09-28 09:12:34
OfStack

Oracle offers several hints that control how tables are joined: the ordered hint tells Oracle to join the tables in the order in which they appear after the from keyword; the leading hint tells the query optimizer to use the specified table as the first table in the join, that is, as the driving table; the use_nl hint tells the query optimizer to join the specified table to the other row sources with nested loops and forces the specified table to be the inner table.
MySQL has a counterpart, straight_join. Because MySQL before version 8.0.18 joins tables only with nested loops, straight_join plays a role similar to Oracle's use_nl hint (and, because it fixes the join order, to ordered). When a statement joins several tables, the MySQL optimizer can pick the wrong driving table, which increases the number of lookups and makes the statement very slow. In that situation an experienced DBA has to decide which table should drive the join, and straight_join can enforce that choice. The following case shows an optimization that uses straight_join:
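
As a minimal sketch of the syntax, straight_join can be written in two ways. The tables small_t and big_t and their columns are hypothetical; they only illustrate the two forms and are not part of the case below.

```sql
-- Form 1: STRAIGHT_JOIN as a join operator.
-- The left table (small_t) is always read before the right table (big_t).
SELECT s.id, b.payload
FROM small_t s
STRAIGHT_JOIN big_t b ON b.small_id = s.id;

-- Form 2: STRAIGHT_JOIN as a SELECT modifier.
-- The tables are joined in the order in which they are listed in FROM.
SELECT STRAIGHT_JOIN s.id, b.payload
FROM small_t s, big_t b
WHERE b.small_id = s.id;
```

In both forms only the join order is fixed; the optimizer still chooses which index to use on each table.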

1. A SQL statement on user instance spxxxxxx runs very slowly. The process list entry and the statement are as follows:


73871 | root      | 127.0.0.1:49665   | user_app_test  | Query    |   500 | Sorting result      |
SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rows
FROM test_log a,USER b
WHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4
GROUP BY DATE(practicetime)
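
The first line of the listing above looks like a row from the process list (thread id, user, host, database, command, time in seconds, state). As a sketch of how such a row might be found, assuming access to information_schema and a 60-second threshold chosen here purely for illustration:

```sql
-- List statements that have been running for more than 60 seconds
-- (the threshold is an illustrative assumption).
SELECT id, user, host, db, command, time, state, info
FROM information_schema.PROCESSLIST
WHERE command = 'Query'
  AND time > 60
ORDER BY time DESC;

-- Alternatively, show every thread together with its full statement text.
SHOW FULL PROCESSLIST;
```

A state of Sorting result combined with a time of 500 seconds is an early hint that the sort is expensive.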

2. Review the execution plan:


mysql> explain SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rows
-> FROM test_log a,USER b
-> WHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4
-> GROUP BY DATE(practicetime)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: ALL
possible_keys: ix_test_log_userid
key: NULL
key_len: NULL
ref: NULL
rows: 416782
Extra: Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: b
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 96
ref: user_app_testnew.a.userid
rows: 1
Extra: Using where
2 rows in set (0.00 sec)

3. Check the index:


mysql> show index from test_log;
+----------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table    | Non_unique | Key_name            | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| test_log |          0 | ix_test_log_unique_ |            1 | unitid      | A         |          20 |     NULL | NULL   |      | BTREE      |         |
| test_log |          0 | ix_test_log_unique_ |            2 | paperid     | A         |          20 |     NULL | NULL   |      | BTREE      |         |
| test_log |          0 | ix_test_log_unique_ |            3 | qtid        | A         |          20 |     NULL | NULL   |      | BTREE      |         |
| test_log |          0 | ix_test_log_unique_ |            4 | userid      | A         |      400670 |     NULL | NULL   |      | BTREE      |         |
| test_log |          0 | ix_test_log_unique_ |            5 | serial      | A         |      400670 |     NULL | NULL   |      | BTREE      |         |
| test_log |          1 | ix_test_log_unit    |            1 | unitid      | A         |         519 |     NULL | NULL   |      | BTREE      |         |
| test_log |          1 | ix_test_log_unit    |            2 | paperid     | A         |        2023 |     NULL | NULL   |      | BTREE      |         |
| test_log |          1 | ix_test_log_unit    |            3 | qtid        | A         |       16694 |     NULL | NULL   |      | BTREE      |         |
| test_log |          1 | ix_test_log_serial  |            1 | serial      | A         |      133556 |     NULL | NULL   |      | BTREE      |         |
| test_log |          1 | ix_test_log_userid  |            1 | userid      | A         |        5892 |     NULL | NULL   |      | BTREE      |         |
+----------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

4. Adjust the indexes. Table a (test_log) is optimized with a covering index:


mysql> alter table test_log drop index ix_test_log_userid, add index ix_test_log_userid(userid,practicetime);
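
The idea behind this change: the query needs only userid and practicetime from test_log, so an index that contains both columns lets MySQL answer from the index without reading the table rows. One hedged way to check this in isolation is a single-table statement; the simplified query below is an illustration and not part of the original case.

```sql
-- Reads only userid and practicetime, both stored in ix_test_log_userid.
-- If the index covers the statement, the Extra column of the plan
-- is expected to contain "Using index".
EXPLAIN SELECT userid, practicetime
FROM test_log\G
```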

5. Review the execution plan:


mysql> explain SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rows
FROM test_log a,USER b
WHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4
GROUP BY DATE(practicetime)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: index
possible_keys: ix_test_log_userid
key: ix_test_log_userid
key_len: 105
ref: NULL
rows: 388451
Extra: Using index; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: b
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 96
ref: user_app_test.a.userid
rows: 1
Extra: Using where
2 rows in set (0.00 sec)

After the adjustment the statement runs slightly faster, but the improvement is small; the real bottleneck has not been found yet:


SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rows
FROM test_log a,USER b
WHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4
GROUP BY DATE(practicetime);
 ........................... .
143 rows in set (1 min 12.62 sec)
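
To see which stage consumes the time, one option is session profiling. This is a sketch and not necessarily what was run in the original case; SHOW PROFILE is deprecated in later MySQL versions in favor of the Performance Schema, and the query number used below is an assumption that has to be taken from the SHOW PROFILES output.

```sql
-- Enable profiling for the current session.
SET profiling = 1;

-- Run the slow statement.
SELECT DATE(practicetime) date_time, COUNT(DISTINCT a.userid) people_rows
FROM test_log a, USER b
WHERE a.userid = b.userid AND b.isfree = 0 AND LENGTH(b.username) > 4
GROUP BY DATE(practicetime);

-- List the profiled statements with their Query_ID and total duration.
SHOW PROFILES;

-- Show the time spent in each stage; replace 1 with the Query_ID
-- reported by SHOW PROFILES for the statement above.
SHOW PROFILE FOR QUERY 1;

-- Switch profiling off again.
SET profiling = 0;
```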

6. The execution time is still very long. Most of the time is spent in Using filesort, and about 380,000 rows (388451 in the plan above) take part in the sort, so the driving table needs to be changed. Try the user table as the driving table, using straight_join to force the join order:


mysql> explain SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rows
FROM USER b straight_join test_log a
WHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4
GROUP BY DATE(practicetime)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: b
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 42806
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: a
type: ref
possible_keys: ix_test_log_userid
key: ix_test_log_userid
key_len: 96
ref: user_app_test.b.userid
rows: 38
Extra: Using index
2 rows in set (0.00 sec)

The execution time drops dramatically, from 1 min 12.62 sec to 2.56 seconds:


mysql>SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rows
FROM USER b straight_join test_log a
WHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4
GROUP BY DATE(practicetime);
 ... ..
143 rows in set (2.56 sec)
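
The row estimates in the two plans help explain why the optimizer originally chose test_log as the driving table. In a nested-loop join, the estimated work is roughly the number of rows read from the driving table multiplied by the rows fetched per lookup in the driven table. The arithmetic below uses only the rows values from the plans in steps 5 and 6; treating their product as a cost is a simplification.

```sql
SELECT 388451 * 1  AS est_rows_test_log_drives,  -- step 5: a (388451 rows), then b (1 row per lookup)
       42806  * 38 AS est_rows_user_drives;      -- step 6: b (42806 rows), then a (38 rows per lookup)
```

The second product (1,626,628) is larger than the first (388,451), which is consistent with the optimizer preferring test_log. These estimates do not show how many user rows the isfree and LENGTH(username) conditions discard before test_log is probed, which may be why the forced order is faster in practice.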

7. Row 1 of the execution plan shows Using where; Using temporary; Using filesort. The user table can also use a covering index, so that the where conditions are checked against the index instead of the table rows. Continue adjusting the indexes:


mysql> show index from user;
+-------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name         | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| user  |          0 | PRIMARY          |            1 | userid      | A         |       43412 |     NULL | NULL   |      | BTREE      |         |
| user  |          0 | ix_user_email    |            1 | email       | A         |       43412 |     NULL | NULL   |      | BTREE      |         |
| user  |          1 | ix_user_username |            1 | username    | A         |         202 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
3 rows in set (0.01 sec)

mysql>alter table user drop index ix_user_username,add index ix_user_username(username,isfree);
Query OK, 42722 rows affected (0.73 sec)
Records: 42722 Duplicates: 0 Warnings: 0

mysql>explain SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rows
FROM USER b straight_join test_log a
WHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4
GROUP BY DATE(practicetime)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: b
type: index
possible_keys: PRIMARY
key: ix_user_username
key_len: 125
ref: NULL
rows: 42466
Extra: Using where; Using index; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: a
type: ref
possible_keys: ix_test_log_userid
key: ix_test_log_userid
key_len: 96
ref: user_app_test.b.userid
rows: 38
Extra: Using index
2 rows in set (0.00 sec)

8. Execution time reduced to 1.43 seconds:


mysql>SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rows
FROM USER b straight_join test_log a
WHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4
GROUP BY DATE(practicetime);
 . 
143 rows in set (1.43 sec)
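
On MySQL 8.0 and later, the same join order can also be requested with an optimizer hint instead of changing the join syntax. This alternative is not used in the original case and is shown only as a sketch; whether it yields the same plan depends on the server version and the table statistics.

```sql
-- JOIN_ORDER asks the optimizer to join b (user) before a (test_log).
SELECT /*+ JOIN_ORDER(b, a) */
       DATE(practicetime) date_time, COUNT(DISTINCT a.userid) people_rows
FROM test_log a, USER b
WHERE a.userid = b.userid AND b.isfree = 0 AND LENGTH(b.username) > 4
GROUP BY DATE(practicetime);
```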