Detailed Explanation of count without Filter Condition in MySQL
- 2021-11-13 03:00:15
- OfStack
count(*)
Realization
1. MyISAM: The total number of rows of the table is stored on the disk, and the query without filtering conditions can be returned directly
If there are filtered count (*), MyISAM will not return quickly
2. InnoDB: Read data line by line from the storage engine 1, and then accumulate the count
Because of MVCC, it is uncertain how many rows InnoDB should return at the same time
Sample
Suppose the table t has 10,000 records
session A | session B | session C |
---|---|---|
BEGIN; | ||
SELECT COUNT(*) FROM t;(返回10000) | ||
INSERT INTO t;(插入1行) | ||
BEGIN; | ||
INSERT INTO t(插入1行); | ||
SELECT COUNT(*) FROM t;(返回10000) | SELECT COUNT(*) FROM t;(返回10002) | SELECT COUNT(*) FROM T;(返回10001) |
At the last moment, three sessions query the total number of rows of t at the same time, but the results are different
The default transaction isolation level for InnoDB is RR, implemented through MVCC
Optimization
1. InnoDB is the index organization table
2, 2-level index trees take up much less space than clustered index trees
3. The optimizer will traverse the smallest index tree and minimize the amount of scanned data on the premise of ensuring correct logic
show table status
mysql> SHOW TABLE STATUS\G;
*************************** 1. row ***************************
Name: t
Engine: InnoDB
Version: 10
Row_format: Dynamic
Rows: 100256
Avg_row_length: 47
Data_length: 4734976
Max_data_length: 0
Index_length: 5275648
Data_free: 0
Auto_increment: NULL
Create_time: 2019-02-01 17:49:07
Update_time: NULL
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment:
SHOW TABLE STATUS is also estimated by sampling (very inaccurate), and the error can reach 40% ~ 50%
Maintenance count
Cache
Program
Disadvantages
Missing update
1. Redis may lose updates
2. Solution: After Redis is restarted abnormally, execute count (*) once in the database
Logic imprecision is fatal
1. Scenario: Display the total number of operation records and 100 records of recent operations
2. Redis and MySQL are two different storage systems, which do not support distributed transactions, so it is impossible to get an accurate 1-dimensional view
Time series A
session B at the time of T3, the 100 lines of results found have the latest inserted records, but Redis has not +1, and the logic is not 1
时刻 | session A | session B |
---|---|---|
T1 | ||
T2 | 插入1行数据R; | |
T3 | 读取Redis计数; 查询最近100条记录; |
|
T4 | Redis计数+1; |
Time series B
session B at the time of T3, there is no newly inserted record in the 100 lines of results found, but Redis has +1, and the logic is not 1
时刻 | session A | session B |
---|---|---|
T1 | ||
T2 | Redis计数+1; | |
T3 | 读取Redis计数; 查询最近100条记录; |
|
T4 | 插入1行数据R; |
Database
时刻 | session A | session B |
---|---|---|
T1 | ||
T2 | BEGIN; 表C中的计数值+1; |
|
T3 | BEGIN; 读表C计数值; 查询最新100条记录; COMMIT; |
|
T4 | 插入1行数据R; COMMIT; |
Performance of count
Semantics
1. count () is an aggregate function, which judges the returned result set line by line
If the parameter value of count function is not NULL, the cumulative value is +1, otherwise, it is not added, and finally the cumulative value is returned
2. count (Field F)
3. count (primary key ID), count (1), count (*)
4. The InnoDB engine returns whatever field the Server layer wants
Performance comparison
count (Field F)
1. If the field F is not allowed to be NULL, read this field from the record line by line, and accumulate it line by line after passing the judgment
2. If the field F is defined as allowing NULL, the field is read out from the record line by line, and accumulated according to the line after the judgment is passed
3. If there is no level 2 index on the field F, you can only traverse the whole table (clustered index)
4. Because InnoDB must return the field F, the optimizer can make fewer optimization decisions
count (primary key ID)
count(1)
count(*)
Efficiency ranking
Sample
mysql> SHOW CREATE TABLE prop_action_batch_reward\G;
*************************** 1. row ***************************
Table: prop_action_batch_reward
Create Table: CREATE TABLE `prop_action_batch_reward` (
`id` bigint(20) NOT NULL,
`source` int(11) DEFAULT NULL,
`serial_id` bigint(20) NOT NULL,
`create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`user_ids` mediumtext,
`serial_index` tinyint(4) DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `uniq_serial_id_source_index` (`serial_id`,`source`,`serial_index`),
KEY `idx_create_time` (`create_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
count (Field F)
No index
There is no index on user_ids, and InnoDB must return the user_ids field, only the clustered index can be traversed
mysql> EXPLAIN SELECT COUNT(user_ids) FROM prop_action_batch_reward;
+----+-------------+--------------------------+------+---------------+------+---------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------------+------+---------------+------+---------+------+----------+-------+
| 1 | SIMPLE | prop_action_batch_reward | ALL | NULL | NULL | NULL | NULL | 16435876 | NULL |
+----+-------------+--------------------------+------+---------------+------+---------+------+----------+-------+
mysql> SELECT COUNT(user_ids) FROM prop_action_batch_reward;
+-----------------+
| count(user_ids) |
+-----------------+
| 17689788 |
+-----------------+
1 row in set (10.93 sec)
Have an index
1. There is an index on serial_id, which can traverse uniq_serial_id_source_index
2. However, since InnoDB must return the serial_id field, it does not traverse the better choice idx_create_time with equivalent logical results
mysql> EXPLAIN SELECT COUNT(serial_id) FROM prop_action_batch_reward;
+----+-------------+--------------------------+-------+---------------+-----------------------------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------------+-------+---------------+-----------------------------+---------+------+----------+-------------+
| 1 | SIMPLE | prop_action_batch_reward | index | NULL | uniq_serial_id_source_index | 15 | NULL | 16434890 | Using index |
+----+-------------+--------------------------+-------+---------------+-----------------------------+---------+------+----------+-------------+
mysql> SELECT COUNT(serial_id) FROM prop_action_batch_reward;
+------------------+
| count(serial_id) |
+------------------+
| 17705069 |
+------------------+
1 row in set (5.04 sec)
count (primary key ID)
The optimizer selects the best index idx_create_time to traverse rather than cluster the index
mysql> EXPLAIN SELECT COUNT(id) FROM prop_action_batch_reward;
+----+-------------+--------------------------+-------+---------------+-----------------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------------+-------+---------------+-----------------+---------+------+----------+-------------+
| 1 | SIMPLE | prop_action_batch_reward | index | NULL | idx_create_time | 5 | NULL | 16436797 | Using index |
+----+-------------+--------------------------+-------+---------------+-----------------+---------+------+----------+-------------+
mysql> SELECT COUNT(id) FROM prop_action_batch_reward;
+-----------+
| count(id) |
+-----------+
| 17705383 |
+-----------+
1 row in set (4.54 sec)
count(1)
mysql> EXPLAIN SELECT COUNT(1) FROM prop_action_batch_reward;
+----+-------------+--------------------------+-------+---------------+-----------------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------------+-------+---------------+-----------------+---------+------+----------+-------------+
| 1 | SIMPLE | prop_action_batch_reward | index | NULL | idx_create_time | 5 | NULL | 16437220 | Using index |
+----+-------------+--------------------------+-------+---------------+-----------------+---------+------+----------+-------------+
mysql> SELECT COUNT(1) FROM prop_action_batch_reward;
+----------+
| count(1) |
+----------+
| 17705808 |
+----------+
1 row in set (4.12 sec)
count(*)
mysql> EXPLAIN SELECT COUNT(*) FROM prop_action_batch_reward;
+----+-------------+--------------------------+-------+---------------+-----------------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------------+-------+---------------+-----------------+---------+------+----------+-------------+
| 1 | SIMPLE | prop_action_batch_reward | index | NULL | idx_create_time | 5 | NULL | 16437518 | Using index |
+----+-------------+--------------------------+-------+---------------+-----------------+---------+------+----------+-------------+
mysql> SELECT COUNT(*) FROM prop_action_batch_reward;
+----------+
| count(*) |
+----------+
| 17706074 |
+----------+
1 row in set (4.06 sec)
References
[MySQL Practical 45 Lectures]
Summarize