A brief discussion on 30 common methods of optimizing sql statement query in MySQL

  • 2020-05-27 07:23:54
  • OfStack

1. Optimize the query to avoid full table scans as much as possible, and first consider indexing the columns involved in where and order by.

2. Use in where clauses should be avoided as much as possible! = or < > Otherwise the engine will abandon the index for a full table scan.

3. null value judgment on the field in where clause should be avoided as far as possible, otherwise the engine will give up using the index and scan the whole table, such as:
select id from t where num is null
You can set the default value of 0 on num to ensure that the num column in the table does not have an null value, and then query:
select id from t where num=0

4. Try to avoid using or in the where clause to join conditions, as this will cause the engine to abandon the use of the index for a full table scan, such as:
select id from t where num=10 or num=20
You can query it like this:
select id from t where num=10
union all
select id from t where num=20

5. The following query will also result in a full table scan:
select id from t where name like '%abc%'
To improve efficiency, consider full-text search.

6.in and not in should also be used with caution, otherwise it will result in a full table scan, such as:
select id from t where num in(1,2,3)
For continuous values, between rather than in:
select id from t where num between 1 and 3

7. Using parameters in the where clause will also result in a full table scan. Because SQL resolves local variables only at run time, the optimizer cannot defer the selection of access plans to run time; It must be selected at compile time. However, if an access plan is established at compile time, the value of the variable is not known and therefore cannot be used as an input item for index selection. A full table scan is performed as follows:
select id from t where num=@num
You can force queries to use indexes instead:
select id from t with(index(index name)) where num= @num

8. Expression manipulation of fields in the where clause should be avoided as much as possible, which would cause the engine to abandon the use of indexes for a full table scan. Such as:
select id from t where num/2=100
Should be changed to:
select id from t where num=100*2

9. Functional manipulation of fields in the where clause should be avoided as far as possible, which would cause the engine to abandon the use of indexes for a full table scan. Such as:
select id from t where substring(name,1,3)='abc'--name id starting with abc
select id from t where datediff(day,createdate,'2005-11-30')=0-- id generated from '2005-11-30'
Should be changed to:
select id from t where name like 'abc%'
select id from t where createdate > ='2005-11-30' and createdate < '2005-12-1'

10. Do not perform a function, arithmetic operation, or other expression operation to the left of "=" in the where clause, otherwise the system may not be able to use the index properly.

11. When using an index field as a condition, if the index is a composite index, the system can only use the index if the first field in the index is used as a condition. Otherwise, the index will not be used, and the order of the fields should be the same as the order of the index as much as possible.

12. Do not write a meaningless query, such as generating an empty table structure:
select col1,col2 into #t from t where 1=0
If this type of code does not return any result sets, but consumes system resources, it should look like this:
create table #t(...)

13. It is a good choice to replace in with exists in many cases:
select num from a where num in(select num from b)
Replace it with the following statement:
select num from a where exists(select 1 from b where num=a.num)

14. Not all indexes are valid for the query. SQL is based on the data in the table for query optimization.

15. More indexes is not always better. Although indexes can improve the efficiency of select, they also reduce the efficiency of insert and update. The number of indexes in a table should not exceed 6, and if there are too many, you should consider whether it is necessary to build indexes on some infrequently used columns.

16. Updates to the clustered index data columns should be avoided as much as possible, because the order of the clustered index data columns is the physical storage order of the table records. Once the column value changes, the order of the entire table records will be adjusted, which will cost considerable resources. If your application needs to update the clustered index data column frequently, consider whether the index should be clustered.

17. Use numeric fields whenever possible. Do not use character fields that contain only numeric information. This will reduce query and join performance and increase storage overhead. This is because the engine processes queries and joins by comparing each character in the string one by one, whereas for numeric types it only needs to compare once.

18. Use varchar/nvarchar instead of char/nchar whenever possible, because first of all, the storage space of the variable field is small, which can save the storage space. Secondly, for the query, it is obviously more efficient to search in a relatively small field.

19. Do not use select * from t anywhere, replace "*" with a list of specific fields, and do not return any fields that are not available.

20. Use table variables instead of temporary tables whenever possible. If the table variables contain a large amount of data, note that the indexes are very limited (only primary key indexes).

Avoid frequent creation and deletion of temporary tables to reduce consumption of system table resources.

22. Temporary tables are not unusable, and their proper use can make some routines more efficient, for example, when a large table or a dataset in a common table needs to be repeatedly referenced. However, for a primary event, it is best to use an export table.

23. When creating a new temporary table, if a large amount of data is inserted once, select into can be used instead of create table to avoid creating a large amount of log, so as to improve the speed; If the amount of data is small, create table and then insert should be used to mitigate the resources of the system table.

24. If temporary tables are used, make sure that all temporary tables are explicitly deleted at the end of the stored procedure, truncate table, then drop table, to avoid prolonged locking of system tables.

25. Avoid cursors as much as possible because they are inefficient and should be considered for rewrites if they operate on more than 10,000 rows of data.

26. Before using a vernier or temporary table approach, look for a set-based solution to the problem. A set-based approach is usually more effective.

27. Unlike temporary table 1, cursors are not unusable. Using an FAST_FORWARD cursor on a small data set is generally preferable to other line-by-line processing methods, especially if you have to reference several tables to get the required data. Routines that include "summing" in the result set are generally faster to execute than those that use cursors. If development time allows, both the cursor-based method and the set-based method can try 1 to see which one works better.

28. Set SET NOCOUNT ON at the beginning of all stored procedures and triggers, and SET NOCOUNT OFF at the end. There is no need to send an DONE_IN_PROC message to the client after each statement of the stored procedure and trigger is executed.

29. Try to avoid returning large amount of data to the client. If the amount of data is too large, we should consider whether the corresponding demand is reasonable.

Avoid large transaction operations as much as possible to improve the system's concurrency.

Related articles: