Basic tutorial on sorting data using indexes in MySQL

  • 2020-12-05 17:23:57
  • OfStack

In MySQL, there are two ways to generate an ordered result set: 1 using filesort and 2 scanning in index order. Sorting operations using indexes are very fast, and you can use both search and sort operations with the same 1 index. You can use indexes to sort when the order of the indexes is the same as the order of the columns in ORDER BY and all the columns are in the same direction (all ascending or all descending). If the query is joining multiple tables, the index will only be used if all the columns in ORDER BY are columns of the first table. filesort is used in all other cases.

The MySQL index is typically used to speed up the search for rows matching WHERE criteria or matching rows from other tables when performing join operations.
MySQL can also use indexes to quickly perform sorting and grouping operations on ORDER BY and GROUP BY statements.
To achieve ORDER BY statement optimization of MySQL by index optimization:


create table actor(

actor_id int unsigned NOT NULL AUTO_INCREMENT,

name   varchar(16) NOT NULL DEFAULT '',

password    varchar(16) NOT NULL DEFAULT '',

PRIMARY KEY(actor_id),

 KEY   (name)

) ENGINE=InnoDB

insert into actor(name,password) values('cat01','1234567');

insert into actor(name,password) values('cat02','1234567');

insert into actor(name,password) values('ddddd','1234567');

insert into actor(name,password) values('aaaaa','1234567');


mysql> explain select actor_id from actor order by actor_id \G


*************************** 1. row ***************************

      id: 1

 select_type: SIMPLE

    table: actor

     type: index

possible_keys: NULL

     key: PRIMARY

   key_len: 4

     ref: NULL

     rows: 4

    Extra: Using index

1 row in set (0.00 sec)


mysql> explain select actor_id from actor order by password \G


*************************** 1. row ***************************

      id: 1

 select_type: SIMPLE

    table: actor

     type: ALL

possible_keys: NULL

     key: NULL

   key_len: NULL

     ref: NULL

     rows: 4

    Extra: Using filesort

1 row in set (0.00 sec)


mysql> explain select actor_id from actor order by name \G


*************************** 1. row ***************************

      id: 1

 select_type: SIMPLE

    table: actor

     type: index

possible_keys: NULL

     key: name

   key_len: 18

     ref: NULL

     rows: 4

    Extra: Using index

1 row in set (0.00 sec)

Here's a list of 1 common index optimizations of ORFER BY:

1. If an SQL statement is as follows:


SELECT [column1],[column2], ... . FROM [TABLE] ORDER BY [sort];

Create an index in the [sort] column to achieve order by optimization using indexes.
2. Index optimization of WHERE + ORDER BY, such as:


SELECT [column1],[column2], ... . FROM [TABLE] WHERE [columnX] = [value] ORDER BY [sort];

A joint index (columnX,sort) was established to achieve order by optimization.
Note: If columnX has multiple values, the following statement cannot optimize order by using indexes


SELECT [column1],[column2], ... . FROM [TABLE] WHERE [columnX] IN ([value1],[value2], ... ) ORDER BY[sort];

3. WHERE+ multiple fields ORDER BY


mysql> explain select actor_id from actor order by actor_id \G

0

Establishing index (uid,x,y) to optimize order by is much better than establishing index (x,y,uid).
MySQL Order By cannot use indexes to optimize sorting
* ORDER BY for different index keys :(key1,key2 are indexed separately)


mysql> explain select actor_id from actor order by actor_id \G

1

* do ORDER BY on discontinuous key parts :(key_part1,key_part2 to create joint index ; key2 index building)


mysql> explain select actor_id from actor order by actor_id \G

2

* both ASC and DESC are used :(key_part1,key_part2 for joint index)


SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC;

* not the same key used to search records as ORDER BY :(key1,key2 are indexed separately)


mysql> explain select actor_id from actor order by actor_id \G

4

* If expressions (functions) are applied to the fields of WHERE and ORDER BY, indexes cannot be used to optimize order by


mysql> explain select actor_id from actor order by actor_id \G

5

When MySQL index cannot be used for sorting, will use their sorting algorithm (quick sort algorithm) in memory (sort buffer) in the data sorting, if memory load, it will be the data on the disk partition, then for each block of data sorting, then each piece of merged into an orderly result set (is actually sort out). For filesort, MySQL has two sorting algorithms.
1. Two-pass scanning algorithm (Two passes)
The method is to extract the fields that need to be sorted and the pointer information that can be directly positioned to the relevant row data, then sort them in the set memory (set by the parameter sort_buffer_size), and then extract the required Columns through the row pointer information again after the sorting.
Note: This algorithm is the algorithm adopted before 4.1. It needs to access data twice, especially the second read operation will lead to a large number of random I/O operations. On the other hand, there is less memory overhead.
2. One-sweep algorithm (single pass)
The algorithm takes out all the Columns required once and outputs the results directly after sorting in memory.
Note: This algorithm has been used since version 4.1 of MySQL. It reduces the number of I/O times, which is more efficient, but it also has a large memory overhead. If we took out Columns, which we didn't need, we would waste a lot of memory for the sorting process. In versions after MySQL 4.1, MySQL can select the first sorting algorithm or the second sorting algorithm by setting the max_length_for_sort_data parameter. When the total size of all the extracted large fields is greater than the setting of max_length_for_sort_data, MySQL chooses to use the first sorting algorithm, or vice versa. In order to improve the sorting performance as much as possible, we naturally prefer to use the second sorting algorithm, so it is necessary to extract only the required Columns from Query.

When sorting join operations, EXPLAIN prints "Using filesort" if ORDER BY only refers to the column of the first table, filesort performs the filesort operation on the table and then joins. Otherwise, MySQL must generate a temporary table from the result set of the query and perform filesort operation after the join is completed. At this time, EXPLAIN prints "Using temporary; Using filesort ".


Related articles: