Optimizing a Slow IN Subquery in MySQL: Worked Examples

  • 2021-09-12 02:31:19
  • OfStack

The table structure is as follows, and there are only 690 articles.


 Article table: article(id, title, content)
 Tag table: tag(tid, tag_name)
 Article-tag junction table: article_tag(id, tag_id, article_id)

One of the tags has tid 135, and we want the list of articles that carry this tag.

Even though there are only 690 articles, the following statement is extremely slow:


select id,title from article where id in(
select article_id from article_tag where tag_id=135
)

The subquery on its own is very fast:


select article_id from article_tag where tag_id=135

It returns 5 articles, with ids 428, 429, 430, 431 and 432.

Fetching those articles with a literal id list is also very fast:


select id,title from article where id in(
428,429,430,431,432
)

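Solution: wrap the subquery in a derived table. The likely reason this helps is that older MySQL versions such as 5.1 and 5.5 materialize a derived table in the FROM clause into a temporary table, so the IN check probes that small result instead of re-running the original subquery: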


select id,title from article where id in(
select article_id from (select article_id from article_tag where tag_id=135) as tbt
)
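
A rewrite like this is only safe if it returns the same rows as the original. The following is a minimal sketch that checks this for the three forms shown above. It assumes an in-memory SQLite database as a stand-in for MySQL and uses made-up sample rows (the titles and the extra tag 7 are hypothetical), so it says nothing about MySQL's speed, only about result equivalence.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL purely to compare result sets.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, content TEXT);
    CREATE TABLE article_tag (id INTEGER PRIMARY KEY, tag_id INTEGER, article_id INTEGER);
""")

# Hypothetical sample data: 690 articles, five of them carrying tag 135.
conn.executemany(
    "INSERT INTO article (id, title, content) VALUES (?, ?, '')",
    [(i, f"title {i}") for i in range(1, 691)],
)
conn.executemany(
    "INSERT INTO article_tag (tag_id, article_id) VALUES (?, ?)",
    [(135, a) for a in (428, 429, 430, 431, 432)] + [(7, 1), (7, 428)],
)

IN_SUBQUERY = """
    SELECT id, title FROM article
    WHERE id IN (SELECT article_id FROM article_tag WHERE tag_id = 135)
    ORDER BY id
"""
DERIVED_TABLE = """
    SELECT id, title FROM article
    WHERE id IN (
        SELECT article_id
        FROM (SELECT article_id FROM article_tag WHERE tag_id = 135) AS tbt
    )
    ORDER BY id
"""
LITERAL_LIST = """
    SELECT id, title FROM article
    WHERE id IN (428, 429, 430, 431, 432)
    ORDER BY id
"""

rows_in = conn.execute(IN_SUBQUERY).fetchall()
rows_derived = conn.execute(DERIVED_TABLE).fetchall()
rows_literal = conn.execute(LITERAL_LIST).fetchall()

# All three forms should select the same five articles.
print([row[0] for row in rows_derived])  # [428, 429, 430, 431, 432]
```

To judge the performance effect, the same three statements would have to be run with desc on the actual MySQL server.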

Another solution (worked example):


mysql> select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');

To save space, the query output is omitted here and in the examples below.

67 rows in set (12.00 sec)

Only 67 rows are returned, yet the query takes 12 seconds. If many such queries run at the same time, the system will not be able to cope. Inspect the execution plan with desc (explain works as well):


mysql> desc select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
| 1 | PRIMARY | abc_number_prop | ALL | NULL | NULL | NULL | NULL | 2679838 | Using where |
| 2 | DEPENDENT SUBQUERY | abc_number_phone | eq_ref | phone,number_id | phone | 70 | const,func | 1 | Using where; Using index |
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
2 rows in set (0.00 sec)

As you can see, more than two million rows are scanned when this query is executed. Is an index missing? Let's take a look:


mysql>show index from abc_number_phone;
+------------------+------------+-------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------------+------------+-------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| abc_number_phone | 0 | PRIMARY | 1 | number_phone_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 0 | phone | 1 | phone | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 0 | phone | 2 | number_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | number_id | 1 | number_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | created_by | 1 | created_by | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | modified_by | 1 | modified_by | A | 36879 | NULL | NULL | YES | BTREE | | |
+------------------+------------+-------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
6 rows in set (0.06 sec)
mysql>show index from abc_number_prop;
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| abc_number_prop | 0 | PRIMARY | 1 | number_prop_id | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | number_id | 1 | number_id | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | created_by | 1 | created_by | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | modified_by | 1 | modified_by | A | 311268 | NULL | NULL | YES | BTREE | | |
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.15 sec)

As the output above shows, both tables have an index on the number_id field.
Next, check whether the subquery itself is the problem.


mysql> desc select number_id from abc_number_phone where phone = '82306839';
+----+-------------+------------------+------+---------------+-------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------+------+---------------+-------+---------+-------+------+--------------------------+
| 1 | SIMPLE | abc_number_phone | ref | phone | phone | 66 | const | 6 | Using where; Using index |
+----+-------------+------------------+------+---------------+-------+---------+-------+------+--------------------------+
1 row in set (0.00 sec)

No problem there: only a few rows are scanned, and the index is being used.

Run the subquery to see what it returns:


mysql> select number_id from abc_number_phone where phone = '82306839';
+-----------+
| number_id |
+-----------+
| 8585 |
| 10720 |
| 148644 |
| 151307 |
| 170691 |
| 221897 |
+-----------+
6 rows in set (0.00 sec)

Put the values returned by the subquery directly into the original query:


mysql> select * from abc_number_prop where number_id in (8585, 10720, 148644, 151307, 170691, 221897);

This is also fast, so it seems that MySQL does not handle subqueries well. I tried this on both MySQL 5.1.42 and MySQL 5.5.19.

A quick web search shows that many people have run into this problem:

Reference 1: MySQL optimization using joins (join) instead of subqueries

Reference 2: MYSQL subquery and nested query optimization instance parsing

Following the suggestions in these references, try a join instead.
Before modification:


mysql> select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');

After modification:


mysql> select a.* from abc_number_prop a inner join abc_number_phone b on a.number_id = b.number_id where phone = '82306839';

This works well: the query time drops to almost 0. Let's see how MySQL executes this query:


mysql>desc select a.* from abc_number_prop a inner join abc_number_phone b on a.number_id = b.number_id where phone = '82306839';
+----+-------------+-------+------+-----------------+-----------+---------+-----------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-----------------+-----------+---------+-----------------+------+--------------------------+
| 1 | SIMPLE | b | ref | phone,number_id | phone | 66 | const | 6 | Using where; Using index |
| 1 | SIMPLE | a | ref | number_id | number_id | 4 | eap.b.number_id | 3 | |
+----+-------------+-------+------+-----------------+-----------+---------+-----------------+------+--------------------------+
2 rows in set (0.00 sec)

Summary: when a query with a subquery is slow, try rewriting it as a JOIN.

Some articles online go further and claim that queries using JOIN are always faster than those using subqueries.

The MySQL manual discusses this as well. The original text can be found in the following sections of the MySQL documentation:
I.3. Restrictions on Subqueries
13.2.8. Subquery Syntax

Excerpts:

1) For subqueries using IN:

Subquery optimization for IN is not as effective as for the = operator or for IN(value_list) constructs.

A typical case for poor IN subquery performance is when the subquery returns a small number of rows but the outer query returns a large number of rows to be compared to the subquery result.

The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery. Consider the following statement that uses an uncorrelated subquery:

SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);

The optimizer rewrites the statement to a correlated subquery:

SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);

If the inner and outer queries return M and N rows, respectively, the execution time becomes on the order of O(M × N), rather than O(M + N) as it would be for an uncorrelated subquery.

An implication is that an IN subquery can be much slower than a query written using an IN(value_list) construct that lists the same values that the subquery would return.
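
The difference between O(M × N) and O(M + N) can be made concrete with a toy cost model. The sketch below is only an illustration in plain Python, not how MySQL's executor works: the function names and the "steps" counter are assumptions made for this example, and a real correlated probe that uses an index costs less than a full scan of the inner rows. The sizes mirror the first example (N = 690 outer rows, M = 5 inner rows).

```python
def correlated_in(outer_rows, inner_rows):
    """Re-scan the inner rows for every outer row (correlated evaluation)."""
    steps = 0
    matches = []
    for a in outer_rows:
        found = False
        # No early exit, so the count is the worst case of M steps per outer row.
        for b in inner_rows:
            steps += 1
            if a == b:
                found = True
        if found:
            matches.append(a)
    return matches, steps


def uncorrelated_in(outer_rows, inner_rows):
    """Evaluate the inner rows once, then do one lookup per outer row."""
    steps = 0
    values = set()
    for b in inner_rows:
        steps += 1
        values.add(b)
    matches = []
    for a in outer_rows:
        steps += 1
        if a in values:
            matches.append(a)
    return matches, steps


outer = list(range(1, 691))          # N = 690 article ids
inner = [428, 429, 430, 431, 432]    # M = 5 ids returned by the subquery

slow_matches, slow_steps = correlated_in(outer, inner)
fast_matches, fast_steps = uncorrelated_in(outer, inner)

print(slow_steps, fast_steps)  # 3450 695
```

Both functions return the same matches. Only the amount of work differs, and the gap widens as the outer table grows.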

2) For converting subqueries to join:

The optimizer is more mature for joins than for subqueries, so in many cases a statement that uses a subquery can be executed more efficiently if you rewrite it as a join.

An exception occurs for the case where an IN subquery can be rewritten as a SELECT DISTINCT join. Example:

SELECT col FROM t1 WHERE id_col IN (SELECT id_col2 FROM t2 WHERE condition);

That statement can be rewritten as follows:

SELECT DISTINCT col FROM t1, t2 WHERE t1.id_col = t2.id_col AND condition;

But in this case, the join requires an extra DISTINCT operation and is not more efficient than the subquery.
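
The need for DISTINCT can be seen with a small experiment. This is a minimal sketch, assuming SQLite as a stand-in for MySQL and a made-up flag column in place of the manual's "condition". The join condition uses id_col2 to match the subquery form. It also assumes the col values are unique in t1, because DISTINCT col would otherwise collapse distinct t1 rows that share a value.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL purely to compare result sets.
conn = sqlite3.connect(":memory:")
# id 1 appears twice in t2, so a plain join matches the t1 row 'a' twice.
conn.executescript("""
    CREATE TABLE t1 (id_col INTEGER, col TEXT);
    CREATE TABLE t2 (id_col2 INTEGER, flag INTEGER);
    INSERT INTO t1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO t2 VALUES (1, 1), (1, 1), (2, 1), (3, 0);
""")

in_subquery = conn.execute(
    "SELECT col FROM t1 "
    "WHERE id_col IN (SELECT id_col2 FROM t2 WHERE flag = 1) ORDER BY col"
).fetchall()
plain_join = conn.execute(
    "SELECT col FROM t1, t2 "
    "WHERE t1.id_col = t2.id_col2 AND flag = 1 ORDER BY col"
).fetchall()
distinct_join = conn.execute(
    "SELECT DISTINCT col FROM t1, t2 "
    "WHERE t1.id_col = t2.id_col2 AND flag = 1 ORDER BY col"
).fetchall()

print(in_subquery)    # [('a',), ('b',)]
print(plain_join)     # [('a',), ('a',), ('b',)]
print(distinct_join)  # [('a',), ('b',)]
```

In the abc_number_phone example above, the unique index on (phone, number_id) means a given phone value matches each number_id at most once, so the inner join does not duplicate rows there and DISTINCT is not needed.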
