Common Scenarios and Avoidance Methods of Index Failure in MySQL

  • 2021-12-19 07:09:54
  • OfStack

Preface

I have read many similar articles before, and mentioned that the improper use of sql statements will cause the index of MySQL to fail. There are also some MySQL "military regulations" or specifications that state that some sql cannot be written in this way, otherwise the index will be invalid.

Most of the content is approved by the author, but in some examples, the author thinks that the words are too absolute, and the reasons are not explained. Many people don't know why. Therefore, the author definitely reorganizes the common scenes of index failure in MySQL, and analyzes the reasons for your reference.

Of course, please remember that explain is a good habit!

Common Scenarios of MySQL Index Failure

When validating the following scenario, prepare enough data, because when the amount of data is small, the optimizer of MySQL sometimes decides that the full table scan is harmless and will not hit the index.

1. Indexes may be invalidated when or is included in the where statement

Using or does not mean that 1 will invalidate the index. You need to see if the query columns on the left and right sides of or hit the same index.

Assume that the column user_id in the USER table has an index and the column age has no index.

The following statement actually hits the index (it is said that the new version of MySQL can only be used. If you use the old version of MySQL, you can use explain to verify it).


select * from `user` where user_id = 1 or user_id = 2;

However, this statement cannot hit the index.


select * from `user` where user_id = 1 or age = 20;

Assuming that the age column also has an index, it still cannot hit the index.


select * from `user` where user_id = 1 or age = 20;

Therefore, it is suggested to avoid using or statement as much as possible, and use union all or in as much as possible according to the situation, and the execution efficiency of these two statements is better than that of or.

2. Indexed columns in the where statement use negative queries, which may invalidate the index

Negative queries include: NOT,! =, < > ,! < ,! > , NOT IN, NOT LIKE, etc.

A "military regulations" said that the use of negative query 1 will definitely invalidate the index. The author checked some articles, and some netizens refuted this point and gave evidence.

In fact, negative query does not absolutely invalidate the index, which depends on the judgment of MySQL optimizer, full table scanning or indexing which cost is lower.

3. The index field can be null, and using is null or is not null may invalidate the index

In fact, a single index field can hit the index when using is null or is not null, but the netizen said that when two different index fields are connected with or, the index will fail. The author thinks that the index is indeed invalid, but this pot should be recited by or, which belongs to the first scenario ~ ~

Assume that the user_id column in the USER table has an index and allows null, and the age column has an index and allows null.


select * from `user` where user_id is not null or age is not null;

However, it is emphasized in some "military regulations" and specifications that the fields should be set to not null and provide default values for a reason.

The columns of null make index/index statistics/value comparisons more complex and more difficult to optimize for MySQL. null This type of MySQL needs special processing internally, which increases the complexity of database processing records; Under the same conditions, when there are more empty fields in the table, the processing performance of the database will be greatly reduced. The null value requires more storage space, and the null columns in each row in both the table and the index require additional space to identify. When dealing with null, only is null or is not null can be used, but =, in, < , < > ,! =, not in. For example: where name! = 'shenjian', if there are records with name values of null, the query results will not contain records with name values of null.

4. Using built-in functions on index columns, 1 will definitely invalidate the index

For example, if a function is used on the index column login_time in the following statement, the index will fail:


select * from `user` where DATE_ADD(login_time, INTERVAL 1 DAY) = 7;

Optimization recommendations, try to calculate and transform in the application.

In fact, there are two index failure scenarios mentioned by netizens, which should be attributed to the use of functions in index columns.

4.1 Index invalidation due to implicit type conversion

For example, in the following statement, the index column user_id is of type varchar and does not hit the index:


select * from `user` where user_id = 12;

This is because MySQL does an implicit type conversion, calling the function to convert user_id.


select * from `user` where CAST(user_id AS signed int) = 12;

4.2 Index invalidation due to implicit character encoding conversion

When making an association query between two tables, if the character codes of the fields associated in the two tables are not 1, MySQL may call CONVERT function to implicitly convert different character codes to achieve unification 1. When applied to the associated field, the index will be invalidated.

For example, the following statement, where the d. tradeid character is encoded as utf8 and the l. tradeid character is encoded as utf8mb4. Because utf8mb4 is a superset of utf8, MySQL will use CONVERT to convert utf8 to utf8mb4. Simply put, CONVERT acts on d. tradeid, so the index is invalid.


select l.operator from tradelog l , trade_detail d where d.tradeid=l.tradeid and d.id=4;

There are generally two solutions to this situation.

Scenario 1: Encode the characters of the associated fields into system 1.

Scheme 2: When it is really impossible to unify 1 character coding, manually apply CONVERT function to the right side of correlation = to achieve the purpose of character coding unification 1. Here, utf8mb4 is forced to be converted into utf8. Of course, there is a risk of data truncation when converting from superset to subset. As follows:


select d.* from tradelog l , trade_detail d where d.tradeid=CONVERT(l.tradeid USING utf8) and l.id=2; 

5. Operate on the index column, 1 will definitely cause the index to fail

Operations such as +,-, *,/, etc. are as follows:


select * from `user` where age - 1 = 10;

For optimization, you should put the operation on the value, or calculate it directly in the application, such as:


select * from `user` where user_id = 1 or age = 20;
0

6. The like wildcard character may invalidate the index

An like query that starts with% invalidates the index. There are two solutions:

Move% to the back, such as:


select * from `user` where user_id = 1 or age = 20;
1

Use the override index to hit the index.


select * from `user` where user_id = 1 or age = 20;
2

7. In the union index, the index column in where violates the leftmost matching principle, and 1 will definitely lead to the index failure

When creating a joint index, such as (k1, k2, k3), it is equivalent to creating three indexes (k1), (k1, k2) and (k1, k2, k3), which is the leftmost matching principle.

For example, the following statement will not hit the index:


select * from t where k2=2;
select * from t where k3=3;
slect * from t where k2=2 and k3=3;

The following statement hits only the index (k1):


select * from `user` where user_id = 1 or age = 20;
4

8. Final choice of MySQL optimizer, no index

As mentioned above, even if it completely conforms to the scenario where the index takes effect, considering the actual data volume and other reasons, it depends on the judgment of MySQL optimizer whether to use the index in the end. Of course, you can also write in the sql statement that you force a certain index.

Some Suggestions on Optimizing Index

It is forbidden to establish indexes on attributes with frequent updates of 10 points and low discrimination. Updates will change the B + tree, and indexing frequently updated fields will greatly reduce database performance. "Gender", an attribute with little discrimination, is meaningless to establish an index, which can not filter data effectively, and its performance is similar to that of full table scanning. To build a composite index, you must put the highly differentiated fields first.

Summarize

Reference

"Why do these SQL statements have the same logic, but the performance is so different? "

Essential for Back-end Programmers: Top 10 Miscellaneous Diseases of Index Failure

[Interpretation of 30 Military Regulations in 58 Home Database]

or/in/union for MySQL and the Road to Index Optimization Architects


Related articles: