Optimizing MySQL's LIMIT offset: an example


A problem I run into often is a LIMIT with a very large offset, such as LIMIT 100000, 20: the server has to read 100,020 rows and then throw away the first 100,000, which is expensive and makes the query slow. Assuming every page is requested with equal frequency, such queries scan half the table on average. To optimize, either restrict how deep pagination can go (for example with keyset pagination, sketched below) or make the high-offset query itself more efficient.
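A minimal sketch of the first approach is keyset pagination: remember where the previous page ended and seek directly to that point instead of counting off an offset. The sketch assumes pin is the table's primary key (the EXPLAIN output below confirms it) and uses a placeholder for the last key seen:


select * from user_order_info
 where pin > 'last_pin_of_previous_page' -- placeholder: pin value of the final row on the previous page
 order by pin
 limit 20;


This reads only 20 rows no matter how deep the user has paged, at the cost of not supporting jumps to an arbitrary page number.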

For the second approach, making the high-offset query itself faster, a simple optimization is to page through a covering index and then join back to the table for the full rows. For example:


SQL>select * from user_order_info limit 1000000,5;

This statement can be optimized to:

select * from user_order_info inner join (select pin from user_order_info limit 1000000,5) as lim using(pin);
Comparing the two execution plans:

SQL>explain select * from user_order_info limit 1000000,5;
+----+-------------+-----------------+------+---------------+------+---------+------+----------+-------+
| id | select_type | table           | type | possible_keys | key  | key_len | ref  | rows     | Extra |
+----+-------------+-----------------+------+---------------+------+---------+------+----------+-------+
|  1 | SIMPLE      | user_order_info | ALL  | NULL          | NULL | NULL    | NULL | 23131886 |       |
+----+-------------+-----------------+------+---------------+------+---------+------+----------+-------+
1 row in set (0.00 sec)
SQL>explain extended select * from user_order_info inner join (select pin from user_order_info limit 1000000,5) as lim using(pin);
+----+-------------+-----------------+--------+---------------+---------+---------+---------+----------+----------+-------------+
| id | select_type | table           | type   | possible_keys | key     | key_len | ref     | rows     | filtered | Extra       |
+----+-------------+-----------------+--------+---------------+---------+---------+---------+----------+----------+-------------+
|  1 | PRIMARY     | <derived2>      | ALL    | NULL          | NULL    | NULL    | NULL    |        5 |   100.00 |             |
|  1 | PRIMARY     | user_order_info | eq_ref | PRIMARY       | PRIMARY | 42      | lim.pin |        1 |   100.00 |             |
|  2 | DERIVED     | user_order_info | index  | NULL          | PRIMARY | 42      | NULL    | 23131886 |   100.00 | Using index |
+----+-------------+-----------------+--------+---------------+---------+---------+---------+----------+----------+-------------+
3 rows in set, 1 warning (0.66 sec)


Comparing the two EXPLAIN outputs makes the difference clear. The first query cannot use an index and scans all 23,131,886 rows. The second scans the same number of rows, but the derived query reads them from the primary-key index alone (type index, Extra "Using index") instead of from the full rows. Once the five matching pin values are found, the outer join fetches the remaining columns with just five primary-key lookups (type eq_ref). Scanning a compact index is much cheaper than scanning entire rows, which is where the speedup comes from.
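One caveat worth adding: without an ORDER BY, the rows a LIMIT clause returns are in no guaranteed order, so page boundaries can shift between executions. A minimal sketch of the same deferred-join pattern with deterministic paging follows; it orders by pin, which the EXPLAIN above shows is the primary key:


select *
  from user_order_info
 inner join (
        select pin
          from user_order_info
         order by pin        -- fix the row order so pages are stable
         limit 1000000,5
       ) as lim using(pin);


Because pin is the primary key, the ORDER BY follows the same index the derived query already scans, so the plan should remain an index-only scan rather than adding a filesort.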

