Method of Query Optimization for Android SQLite Database

  • 2021-10-24 23:38:33
  • OfStack

Preface

Database performance optimization industry is generally less, and today this article hopes to bring some help to everyone

SQLite is a typical embedded DBMS. It has many advantages. It is lightweight and small after compilation. One of the reasons is that it is relatively simple in query optimization

When we use SQLite for data store queries, To optimize queries, indexes will be used here. Although the amount of data on C is not very large in most cases, good index building habits often bring good query performance improvement, and at the same time, it can stand the test of larger data in the unknown future. How to optimize database queries is demonstrated by Example 11 below.

First, build a test table table1, which contains three indexes:


sqlite> .schem
CREATE TABLE table1(id integer primary key not null default 0,a integer,b integer, c integer);
CREATE INDEX a_i on table1 (a);
CREATE INDEX a_i2 on table1 (a,b);
CREATE INDEX a_i3 on table1 (c);

In common database systems, explain keywords are used for SQL query verification, such as:


sqlite> explain select * from table1;
addr opcode   p1 p2 p3 p4    p5 comment  
---- ------------- ---- ---- ---- ------------- -- -------------
0  Init   0  10 0     00 Start at 10 
1  OpenRead  0  2  0  4    00 root=2 iDb=0; table1
2  Rewind   0  9  0     00    
3  Rowid   0  1  0     00 r[1]=rowid 
4  Column   0  1  2     00 r[2]=table1.a
5  Column   0  2  3     00 r[3]=table1.b
6  Column   0  3  4     00 r[4]=table1.c
7  ResultRow  1  4  0     00 output=r[1..4]
8  Next   0  3  0     01    
9  Halt   0  0  0     00    
10 Transaction 0  0  4  0    01 usesStmtJournal=0
11 Goto   0  1  0     00 

The output will be immediately, and these outputs show that SQLite executes every instruction used by this SQL, which is actually not very intuitive. We use EXPLAIN QUERY PLAN more, as follows:


sqlite> explain QUERY PLAN select * from table1;
0|0|0|SCAN TABLE table1

This SQL statement queries the whole table, so the result keyword SCAN means to traverse completely, which is the lowest efficiency. Next, let's try to add a query condition:


sqlite> explain QUERY PLAN select * from table1 where a=1;
0|0|0|SEARCH TABLE table1 USING INDEX a_i2 (a=?)

After adding where a=1, the keyword becomes SEARCH, which means that traversal is no longer needed, but the index is used for partial retrieval. In addition, this output has more information, such as using the index a_i2, and a= in parentheses? Is caused by this query condition

Let's modify SQL slightly:


sqlite> explain QUERY PLAN select a from table1 where a=1;
0|0|0|SEARCH TABLE table1 USING COVERING INDEX a_i (a=?)

select is changed into select a, and it is found that the output of explain has a slight change, from INDEX to COVERING INDEX. CONVERING INDEX means that the results can be obtained by using index query directly, and there is no need to look back at the data table again, so the efficiency is higher. Because the previous query is used, there are only a records in the index, so it is necessary to query the original records to get b and c fields. Let's try this SQL again:


sqlite> explain QUERY PLAN select a,b from table1 where a=1 and b=1;
0|0|0|SEARCH TABLE table1 USING COVERING INDEX a_i2 (a=? AND b=?)

Agree to use CONVERING INDEX because the index a_i2 already contains a and b. Some students may ask, when we build the index, we all add other fields. Although the query is not needed, it is efficient to query the original records twice. In theory, this is feasible, but there is an important problem here, that is, the data redundancy is too serious, which leads to the index being as large as the original data. Disk consumption is a problem in the database with massive data storage, so how to choose may need to be balanced.

Next, we replace and with or:


sqlite> explain QUERY PLAN select a,b from table1 where a=1 or b=1;
0|0|0|SCAN TABLE table1 USING COVERING INDEX a_i2

I found that I changed back to SCAN, but I still used the index a_i2. Compare this SQL:


sqlite> explain QUERY PLAN select a,b from table1 where a=1;
0|0|0|SEARCH TABLE table1 USING COVERING INDEX a_i2 (a=?)

After adding one more query condition b=1, the efficiency becomes worse. Why? This leads to the most critical principle we use to create indexes: prefix indexes.

Index 1 generally uses B tree. Prefix index is simply to use this index, and the query conditions must meet the fields involved in index establishment, and the order used by query is 1.

Looking back at the example of or, he can use the index a_i2 (a, b) for the query condition a=1, because the index order also begins with a. However, in the example of or, or the previous query condition b=1, there is no index available for this query, because there is no index at the beginning of b. Although there is b in the index of a_i2 (a, b), b is not in front of the query condition of b=1, which does not satisfy the prefix index principle.

For the example of and just now, we can use the index completely, because there are indexes a_i2 (a, b), so we can imagine filtering the data by index a first, and then filtering the remaining data by index b. For and conditions, it doesn't matter if the order of fields in the index is changed by 1, and the database will automatically optimize the selection, such as:


sqlite> .schem
CREATE INDEX a_i22 on table2 (b,a);
sqlite> explain QUERY PLAN select a,b from table2 where a=1 and b=1;
0|0|0|SEARCH TABLE table2 USING COVERING INDEX a_i22 (b=? AND a=?)

If the or query also makes full use of indexes, as smart reader 1 must have thought, it is to build two indexes, as follows:


CREATE TABLE table3(id integer primary key not null default 0,a integer,b integer, c integer);
CREATE INDEX a_i222 on table3(a);
CREATE INDEX a_i2222 on table3(b);
sqlite> explain QUERY PLAN select a,b from table3 where a=1 or b=1;
0|0|0|SEARCH TABLE table3 USING INDEX a_i222 (a=?)
0|0|0|SEARCH TABLE table3 USING INDEX a_i2222 (b=?)

Let's look at an advanced one, plus a sort:


sqlite> explain select * from table1;
addr opcode   p1 p2 p3 p4    p5 comment  
---- ------------- ---- ---- ---- ------------- -- -------------
0  Init   0  10 0     00 Start at 10 
1  OpenRead  0  2  0  4    00 root=2 iDb=0; table1
2  Rewind   0  9  0     00    
3  Rowid   0  1  0     00 r[1]=rowid 
4  Column   0  1  2     00 r[2]=table1.a
5  Column   0  2  3     00 r[3]=table1.b
6  Column   0  3  4     00 r[4]=table1.c
7  ResultRow  1  4  0     00 output=r[1..4]
8  Next   0  3  0     01    
9  Halt   0  0  0     00    
10 Transaction 0  0  4  0    01 usesStmtJournal=0
11 Goto   0  1  0     00 
0

Comparing these two queries, we find that the following one has an additional USE TEMP B-TREE FOR ORDER BY. For the first query, we can see that the sorting also meets the prefix index principle (first filter the data by index a, and then sort the remaining data by index b). For the second query, because this principle is not met, there is an extra temporary table to sort. Seeing this, everyone should understand the meaning of prefix index.

Let's look at this again and replace the query criteria and sorting:


sqlite> explain select * from table1;
addr opcode   p1 p2 p3 p4    p5 comment  
---- ------------- ---- ---- ---- ------------- -- -------------
0  Init   0  10 0     00 Start at 10 
1  OpenRead  0  2  0  4    00 root=2 iDb=0; table1
2  Rewind   0  9  0     00    
3  Rowid   0  1  0     00 r[1]=rowid 
4  Column   0  1  2     00 r[2]=table1.a
5  Column   0  2  3     00 r[3]=table1.b
6  Column   0  3  4     00 r[4]=table1.c
7  ResultRow  1  4  0     00 output=r[1..4]
8  Next   0  3  0     01    
9  Halt   0  0  0     00    
10 Transaction 0  0  4  0    01 usesStmtJournal=0
11 Goto   0  1  0     00 
1

Obviously, the prefix index principle is not satisfied, because the data needs to be filtered by index b first, but b is not the first one.

Most of the general query statements are the combination of and, or and order. Only by mastering the above principles, 1 will be able to write high-performance database query statements.

For more advanced lists, you can continue to read official documents:

https://www.sqlite.org/eqp.html

https://www.sqlite.org/lang_e...

Summarize


Related articles: