A tutorial on joint (composite) indexes in MySQL

  • 2020-12-07 04:33:16
  • OfStack

A joint index is also called a composite index. MySQL uses the fields of a composite index from left to right. A query can use just part of the index, but only a leftmost part. For example, with the index key index (a,b,c), searches on a, on a,b, or on a,b,c can use the index, but a search on b,c cannot. The index is very effective when the leftmost field is compared with a constant.


Indexes on two or more columns are called composite indexes.
With additional columns in the index you can narrow your search, but using one index with two columns is different from using two separate indexes. A composite index has a structure similar to that of a phone book, where a person's first and last names are grouped together: the phone book is sorted first by last name and then, for people with the same last name, by first name. If you know the last name, the phone book is very useful; it is more useful still if you know both first and last names, but it is useless if you only know the first name.
So when creating a composite index, think carefully about the order of the columns. A composite index is useful when searching on all of its columns or on only the first few; it is useless if the search is performed only on the later columns.
For example, consider a composite index on name, age, and sex. The generic table below has the same shape, with a composite index on (a, b, c):


create table test(
a int,
b int,
c int,
KEY a(a,b,c)
);
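
To confirm the column order MySQL has recorded for the composite index on the table above, one option is SHOW INDEX. This is only a quick check added for illustration; the Seq_in_index column of the output gives each column's position within the key.

```sql
-- Lists one row per indexed column of test.
-- For the key named a, Seq_in_index values 1, 2, 3 correspond to
-- columns a, b, c: the left-to-right order MySQL uses when matching queries.
SHOW INDEX FROM test;
```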

Principles for the establishment of composite indexes:

If you often search on a single column by itself, that column should be the first column of the composite index. If you also need to search separately on the second column of a two-column index, create another index that contains only the second column.
With the (name, age, sex) index above, for example, a query that filters on age and sex alone needs a new composite index on age and sex.
A primary key with multiple columns is automatically a composite index. In MySQL its columns are indexed in the order listed in the primary key definition, not the order in which they appear in the table definition, so decide which column comes first with future primary-key searches in mind.
Note that a composite index should contain only a few columns, and these should be columns that are often used in select queries. Too many columns in a composite index does little good, and because considerable memory and disk space is needed to store the column values, the result is higher resource use and degraded performance.
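
The principles above can be sketched with a small table. The table name people, the column types, and the index names are assumptions made for illustration; only the (name, age, sex) and (age, sex) column lists come from the text.

```sql
-- Hypothetical table for illustration; names and types are assumptions.
CREATE TABLE people (
  id   INT NOT NULL AUTO_INCREMENT,
  name VARCHAR(50) NOT NULL,
  age  TINYINT UNSIGNED NOT NULL,
  sex  CHAR(1) NOT NULL,
  PRIMARY KEY (id),
  -- Serves searches on (name), (name, age) and (name, age, sex).
  KEY idx_name_age_sex (name, age, sex),
  -- Added because age and sex are also searched without name.
  KEY idx_age_sex (age, sex)
);
```

Each extra index costs space and write time, so the second index is only worth adding if searches on age and sex without name really occur.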


Sorting optimization by composite index:

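A composite index can optimize only those order by clauses whose column order and directions are exactly the same as the index, or exactly the reverse of it.
When a composite index is created, each column is defined as ascending or descending (MySQL honors DESC in an index definition only from version 8.0; earlier versions parse it but build the index ascending). For example, define a composite index: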



CREATE INDEX idx_example  
ON table1 (col1 ASC, col2 DESC, col3 ASC) 


There are three columns: col1 ascending, col2 descending, col3 ascending. Now if we execute two queries
1:


Select col1, col2, col3 from table1 order by col1 ASC, col2 DESC, col3 ASC

Same order as index
2:


Select col1, col2, col3 from table1 order by col1 DESC, col2 ASC, col3 DESC 

In reverse order
Queries 1 and 2 can both be optimized by the composite index.
If the query is:


Select col1, col2, col3 from table1 order by col1 ASC, col2 ASC, col3 ASC

This sort order matches the index neither forward nor in reverse, so the query will not be optimized by the composite index.
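
One way to check these three cases is to run each query under EXPLAIN and read the Extra column. This is a sketch that assumes table1 and idx_example exist as defined above, on MySQL 8.0 or later; the optimizer's actual choice also depends on the data in the table.

```sql
-- Same order as the index.
EXPLAIN SELECT col1, col2, col3 FROM table1
ORDER BY col1 ASC, col2 DESC, col3 ASC;

-- Exact reverse of the index.
EXPLAIN SELECT col1, col2, col3 FROM table1
ORDER BY col1 DESC, col2 ASC, col3 DESC;

-- Matches neither direction; "Using filesort" in the Extra column
-- would indicate that MySQL sorts the rows itself.
EXPLAIN SELECT col1, col2, col3 FROM table1
ORDER BY col1 ASC, col2 ASC, col3 ASC;
```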


The role of query optimizer in where queries:

If a multi-column index exists on col1 and col2, the statement Select * from table where col1=val1 AND col2=val2 can fetch the matching rows directly from that index. If separate single-column indexes exist on col1 and col2 instead, the query optimizer tries to determine which index will find fewer rows and uses that index to fetch them.
1. If there is a multi-column index, the optimizer can use any leftmost prefix of it. The column order of a joint index therefore affects whether it can be chosen; try to put the column whose condition matches the fewest rows first.
For example, if the multi-column index is (col1, col2, col3),
then searches on (col1), (col1, col2), and (col1, col2, col3) can use it.


SELECT * FROM tb WHERE col1 = val1 
SELECT * FROM tb WHERE col1 = val1 and col2 = val2 
SELECT * FROM tb WHERE col1 = val1 and col2 = val2 AND col3 = val3 

2. If the columns in the condition do not form a leftmost prefix of the index, the index cannot be used for the lookup.
For example:


SELECT * FROM tb WHERE col3 = val3 
SELECT * FROM tb WHERE col2 = val2 
SELECT * FROM tb WHERE col2 = val2 and col3=val3 
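
The difference between the two groups of queries can be observed with EXPLAIN. The definition of tb below is hypothetical (the text does not give one); the extra payload column is an assumption added so that SELECT * cannot be answered from the index alone. Load some realistic data before comparing plans, since the optimizer's choice depends on table statistics.

```sql
-- Hypothetical definition of tb; names and types are assumptions.
CREATE TABLE tb (
  id      INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  col1    INT,
  col2    INT,
  col3    INT,
  payload VARCHAR(100),
  KEY idx_cols (col1, col2, col3)
);

-- Leftmost prefix present: idx_cols is a candidate in the key column.
EXPLAIN SELECT * FROM tb WHERE col1 = 1 AND col2 = 2;

-- No leftmost prefix: idx_cols cannot be used to look the rows up,
-- so the plan is expected to fall back to scanning.
EXPLAIN SELECT * FROM tb WHERE col2 = 2 AND col3 = 3;
```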


3. A Like condition can use an index only if its pattern does not start with a wildcard.
For example, '%car' and '%car%' do not use the index.
'car%' uses the index.
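
The Like rule can be tried out as follows. The vehicles table and its columns are assumptions made for illustration.

```sql
-- Hypothetical table; names and types are assumptions.
CREATE TABLE vehicles (
  id    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name  VARCHAR(50) NOT NULL,
  notes VARCHAR(200),
  KEY idx_name (name)
);

-- Constant prefix: the index on name can narrow the search to a range.
EXPLAIN SELECT * FROM vehicles WHERE name LIKE 'car%';

-- Leading wildcard: the index cannot be used to narrow the search.
EXPLAIN SELECT * FROM vehicles WHERE name LIKE '%car';
```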
Disadvantages of indexes:
1. They take up disk space.
2. They increase the time taken by inserts and deletes. The more indexes a table has, the slower inserts and deletes become. If the system requires fast writes, do not build too many indexes.

Here are some common index restriction problems

1. Use of the not-equal operator (<>, !=)
In the following example, the query still performs a full table scan even if there is an index on column staff_num:
select * from dept where staff_num <> 1000;
But such queries are genuinely needed in development. Is there a way around the problem?
There is!
Replace the not-equal operator with an or of two range conditions, which lets the query use the index and avoid a full table scan. The statement above becomes:


select * from dept where staff_num < 1000 or staff_num > 1000; 

2. is null or is not null
Using is null or is not null can also limit the use of indexes, because null is not treated as an ordinary indexed value in the database these examples come from. If an indexed column contains many nulls, the index will not be used (unless it is a bitmap index, which will be explained in more detail in a future blog article). Using null in SQL statements can cause a lot of trouble.
The solution is to define the columns that need to be indexed as not null when creating the table.

3. Use functions
If no function-based indexes are used, the use of functions on columns with existing indexes in the where clause causes the optimizer to ignore these indexes. The following queries will not use the index:


select * from staff where trunc(birthdate) = '01-MAY-82'; 


However, if the function is applied to the comparison value instead of the indexed column, the index remains usable. If the statement above is rewritten as follows, the index can be searched.


select * from staff where birthdate < (to_date('01-MAY-82') + 0.9999); 
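
The two statements above use Oracle functions. A rough MySQL counterpart is sketched below; the staff table definition, the DATETIME type, and the index name are assumptions. The idea is the same: keep the indexed column bare and express the condition as a range.

```sql
-- Hypothetical MySQL version of the staff table; names and types are assumptions.
CREATE TABLE staff (
  id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name      VARCHAR(50),
  birthdate DATETIME NOT NULL,
  KEY idx_birthdate (birthdate)
);

-- Function wrapped around the indexed column:
-- idx_birthdate cannot be used to look the rows up.
EXPLAIN SELECT * FROM staff WHERE DATE(birthdate) = '1982-05-01';

-- Same day expressed as a half-open range on the bare column:
-- a range scan on idx_birthdate becomes possible.
EXPLAIN SELECT * FROM staff
WHERE birthdate >= '1982-05-01' AND birthdate < '1982-05-02';
```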

4. Compare mismatched data types
Comparing mismatched data types is also one of the harder performance problems to spot.
In the following example, dept_id is a varchar2 column with an index on it, but the statement performs a full table scan.


select * from dept where dept_id = 900198;


This is because Oracle automatically converts the where clause to to_number(dept_id)=900198, which is the case described in item 3 above and therefore prevents the index from being used.
To use the index, change the SQL statement to the following form:


select * from dept where dept_id = '900198';
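
MySQL has the same pitfall: when a string column is compared with a number, the values are compared as numbers and the index on the string column cannot be used for a lookup. The sketch below uses a hypothetical dept_demo table to show both forms under EXPLAIN.

```sql
-- Hypothetical MySQL table; names and types are assumptions.
CREATE TABLE dept_demo (
  dept_id   VARCHAR(10) NOT NULL,
  dept_name VARCHAR(50),
  KEY idx_dept_id (dept_id)
);

-- Numeric literal against a string column:
-- idx_dept_id cannot be used to look the row up.
EXPLAIN SELECT * FROM dept_demo WHERE dept_id = 900198;

-- String literal matches the column type: an index lookup is possible.
EXPLAIN SELECT * FROM dept_demo WHERE dept_id = '900198';
```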

Well, here's another caveat:


Let's say we have a list of articles. We want to display a list in reverse chronological order under a certain category:


SELECT * FROM articles WHERE category_id = ... ORDER BY created DESC LIMIT ...

Queries like this are very common; almost every application contains many similar statements. Academically minded readers may object that SELECT * is bad practice and that only the required fields should be queried, so let's go all the way and change the SQL to the following form:


SELECT id FROM articles WHERE category_id = ... ORDER BY created DESC LIMIT ...

Let's assume that id is the primary key here, and that the article content is stored in a key-value cache such as memcached. Academically minded readers should then have no objection, so let's consider how to build an index for this SQL:

Setting aside special cases such as data distribution, any qualified web developer knows that a query like this calls for a composite index on "category_id, created". But is that the best answer? Not necessarily: indexing in MySQL should take the storage engine into account!

If the storage engine is InnoDB, the "category_id, created" composite index is the best answer. InnoDB's index structure has a special property: each non-primary-key index stores the corresponding primary key value in the leaf nodes of its B-tree. The most immediate benefit is a covering index: the value of id can be read directly from the index, without going to the data file.

If the storage engine is MyISAM, a composite index on "category_id, created" is not the best answer. A MyISAM non-primary-key index does not store the primary key value, so to get a covering index you should create the composite index "category_id, created, id".
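
A minimal sketch of the InnoDB case follows. The column types, the title column, the index names, and the literal values 1 and 10 are assumptions chosen for illustration; the text leaves the category and the limit unspecified.

```sql
-- Hypothetical articles table; column types are assumptions.
CREATE TABLE articles (
  id          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  category_id INT NOT NULL,
  created     DATETIME NOT NULL,
  title       VARCHAR(200),
  KEY idx_category_created (category_id, created)
) ENGINE=InnoDB;

-- On InnoDB the secondary index already carries id, so "Using index"
-- in the Extra column would indicate a covering index.
EXPLAIN SELECT id FROM articles
WHERE category_id = 1 ORDER BY created DESC LIMIT 10;

-- For a MyISAM table, the text suggests listing id explicitly instead:
-- CREATE INDEX idx_category_created_id ON articles (category_id, created, id);
```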

That's enough rambling; you should see my point. I hope you will think a little more comprehensively when designing indexes in the future. Many similar issues come up in practice. For example, most people do not consider Cardinality when building an index (see SHOW INDEX). Cardinality represents the number of unique values in the index. Generally speaking, if the number of unique values is less than 20% of the total number of rows, the cardinality can be considered too small; in that case the index does little for select and only slows down insert/update/delete. Another overlooked detail is the effect of the character set on the index. For a username field that only allows English letters, underscores, and similar symbols, do not use a multi-byte character set such as gbk or utf8; with a simple character set such as latin1 or ascii the index file is much smaller and lookups are naturally faster. These details require the reader's own attention, and I will not dwell on them.
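
Both details can be checked with a few statements. The articles and users tables here are hypothetical, and the 20% figure is the rule of thumb from the paragraph above, not a MySQL setting.

```sql
-- Cardinality estimate per indexed column, as maintained by MySQL.
SHOW INDEX FROM articles;

-- Exact ratio of distinct values to rows for one column
-- (this scans the data, so run it sparingly on large tables).
SELECT COUNT(DISTINCT category_id) / COUNT(*) AS selectivity
FROM articles;

-- A username column restricted to ASCII characters can declare a
-- single-byte character set, which keeps its index entries small.
CREATE TABLE users (
  id       INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  username VARCHAR(32) CHARACTER SET ascii NOT NULL,
  UNIQUE KEY idx_username (username)
);
```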

