MySQL LIMIT optimization: fast paging across millions to tens of millions of rows with composite indexes, applied to a lightweight framework

  • 2020-05-10 23:01:09
  • OfStack

MySQL is a database that truly rewards DBA-level expertise. A small news system with ten thousand or so articles can be written any way you like, and any xx framework will get it built quickly. But once the data volume reaches 100,000, then millions to tens of millions of rows, can its performance stay that high? One small mistake can force a rewrite of the whole system, or even stop it from running at all! All right, enough talk. Let's use facts and look at an example:

The data table collect (id, title, info, vtype) has just these four fields: title is fixed length, info is text, id is the auto-increment primary key, and vtype is a tinyint with an index on it. This is a simple model of a basic news system. Now fill it with data: 100,000 news articles.
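The original does not give the exact DDL, so here is a minimal sketch of what the table might look like; the column sizes, index name and storage details are assumptions for illustration only:

-- hypothetical DDL for the collect table described above
CREATE TABLE collect (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- auto-increment primary key
  title CHAR(100)    NOT NULL,                 -- fixed-length title
  info  TEXT,                                  -- article body
  vtype TINYINT      NOT NULL,                 -- category flag, indexed
  PRIMARY KEY (id),
  KEY idx_vtype (vtype)                        -- idx_vtype is an assumed name
);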

In the end collect holds 100,000 records and the table takes up 1.6 GB of disk. OK, now look at this SQL statement:

select id,title from collect limit 1000,10; Very fast; basically done in 0.01 seconds. Now look at the following:

select id,title from collect limit 90000,10; Paging from an offset of 90,000. The result?

8-9 seconds to finish. My god, what went wrong?? Searching online for ways to optimize this turns up the following statement:

select id from collect order by id limit 90000,10; Pretty fast, done in about 0.04 seconds.
Why? Because walking the primary key index on id is faster. The version suggested online goes a step further:

select id,title from collect where id >= (select id from collect order by id limit 90000,1) limit 10;

That is the speed you get by walking the id index. But as soon as the problem gets even slightly more complicated, it falls apart. Look at the following statement:

select id from collect where vtype=1 order by id limit 90000,10;
Very slow, 8-9 seconds!

At this point I believe many people will, like me, feel close to a breakdown! Isn't vtype indexed? How can it be this slow? Having vtype indexed does help: a plain select id from collect where vtype=1 limit 1000,10; is very fast, basically 0.05 seconds. But raise the offset 90-fold to 90,000 and you would expect roughly 0.05*90 = 4.5 seconds, and the measured 8-9 seconds is in the same order of magnitude. This is where the idea of splitting off a separate table came from, the same line of thinking the Discuz forum uses. The idea is as follows:

Create an index table t (id, title, vtype) with fixed-length rows, do the pagination on it, and then take the resulting page of ids back to collect to fetch info. Is it feasible? The experiment will tell.
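A minimal sketch of what that index table and its population might look like; the names, column sizes and the INSERT ... SELECT copy step are assumptions, not from the original:

-- narrow, fixed-length paging table: no TEXT column
CREATE TABLE t (
  id    INT UNSIGNED NOT NULL,
  title CHAR(100)    NOT NULL,
  vtype TINYINT      NOT NULL,
  PRIMARY KEY (id),
  KEY idx_vtype (vtype)
);

-- copy the paging columns over from collect
INSERT INTO t (id, title, vtype)
SELECT id, title, vtype FROM collect;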

100,000 records went into t (id, title, vtype), and the table is only about 20 MB. With

select id from t where vtype=1 order by id limit 90000,10;
Very fast, basically done in 0.1-0.2 seconds. Why? I suspect it is because collect simply has too much data, so limit has to travel a long way through it; the cost of limit scales with the size of the table. In fact this is still essentially a full table scan, and it is only fast because the data volume is small, just 100,000 rows. OK, time for a crazy experiment: grow it to a million rows and test the performance.

Add ten times the data and the t table immediately passes 200 MB, still fixed length. The same query statement still completes in 0.1-0.2 seconds! So the split-table performance is fine? Wrong! It is only fast because our limit offset is still 90,000. Let's use a big one and start from 900,000:

select id from t where vtype=1 order by id limit 900000,10; Look at the result: 1-2 seconds!

Why?? Even after splitting the table the time is still this long; very depressing! Some people claimed that fixed-length rows would improve limit performance: since every record has the same length, MySQL should be able to compute the position of record 900,000 directly, right? But we overestimate MySQL's intelligence; it is not a commercial database, and it turns out fixed-length versus variable-length rows makes little difference to limit. No wonder people say Discuz gets slow once it reaches a million records. I now believe it, and it comes down to database design!

Can't MySQL break the one-million barrier?? Is a million rows really the ceiling for paging??

The answer is: NO!!
The reason we cannot get past a million rows is that we do not know how to design for MySQL. What follows is the no-table-splitting approach, put to a crazy test: one table handling a million records, a 10 GB database, and fast paging!

OK, the test goes back to the collect table. The conclusion so far: with 300,000 rows the table-splitting approach is workable, but beyond 300,000 its speed slows to a point you cannot bear! Of course, combining table splitting with my method would be absolutely perfect; but with my method alone the problem can be solved perfectly without splitting tables at all!

The answer is: composite indexes! Once, while designing MySQL indexes, I happened to notice that the index name can be chosen freely and that an index can cover several fields. What is that good for? The starting point is that select id from collect order by id limit 90000,10; is fast because it walks the index, but as soon as you add a where clause it stops using that index. On the off chance it might help, I added an index like search(vtype, id).
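Creating such a composite index is a single statement; the sketch below uses standard MySQL DDL with the author's chosen index name search:

-- composite index: the where column first, the primary key second
ALTER TABLE collect ADD INDEX search (vtype, id);

With that index in place, test again: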

select id from collect where vtype=1 limit 90000,10; Very fast! Done in 0.04 seconds!

Test again: select id,title from collect where vtype=1 limit 90000,10;
Very sorry: 8-9 seconds; it did not use the search index!

Test again with the column order reversed, search(id,vtype), still selecting only id: sorry, 0.5 seconds.

To sum up: when there is a where condition and you want to use limit, you must design a composite index that puts the where column in the first position and the primary key used by limit in the second position, and you must select only the primary key!

That solves the paging problem perfectly: the ids come back quickly and the limit is well optimized. By this logic, a limit in the millions should finish in 0.0x seconds. Clearly, statement optimization and indexing matter enormously in MySQL!
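One way to confirm that the query really walks the composite index is MySQL's standard EXPLAIN; assuming the search index sketched above exists, a plan that shows the search key and "Using index" in the Extra column means the result is served from the index alone:

EXPLAIN SELECT id FROM collect WHERE vtype=1 LIMIT 90000,10;
-- expect: key = search, Extra containing "Using index"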

OK, back to the original question: how do you apply this quickly and successfully in development? If I have to use this composite query directly, my lightweight framework is useless, and writing the pagination strings by hand is a lot of trouble. So here is another example, and here is the idea:

select * from collect where id in (9000,12,50,7000); Zero seconds!

My god, MySQL's index works for in statements too! It seems the claim floating around online that in cannot use an index is wrong!

With this in mind, it's easy to apply to lightweight frameworks:

The code is as follows:
 
<?php 
$db=dblink(); 
$db->pagesize=20; 

$sql="select id from collect where vtype=$vtype"; 
$db->execute($sql); 
// save the paging string in a temporary variable for easy output later 
$strpage=$db->strpage(); 

// collect the ids of the current page into a comma-separated string 
$strid=''; 
while($rs=$db->fetch_array()){ 
    $strid.=$rs['id'].','; 
} 
$strid=substr($strid,0,strlen($strid)-1); // drop the trailing comma 

// critical: reset pagesize to 0 without destroying the class, so only one 
// database connection is used and a second one never has to be opened 
$db->pagesize=0; 
$db->execute("select id,title,url,sTime,gTime,vtype,tag from collect where id in ($strid)"); 
?> 

<table> 
<?php while($rs=$db->fetch_array()): ?> 
<tr> 
<td><?php echo $rs['id'];?></td> 
<td><?php echo $rs['url'];?></td> 
<td><?php echo $rs['sTime'];?></td> 
<td><?php echo $rs['gTime'];?></td> 
<td><?php echo $rs['vtype'];?></td> 
<td><a href="?act=show&id=<?php echo $rs['id'];?>" target="_blank"><?php echo $rs['title'];?></a></td> 
<td><?php echo $rs['tag'];?></td> 
</tr> 
<?php endwhile; ?> 
</table> 
<?php echo $strpage; ?> 
The idea is simple: 1) use the optimized index query to find the ids and join them into a comma-separated string like "123,90000,12000"; 2) run a second query to fetch the full rows for those ids.
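In plain SQL the two steps look roughly like this (the offset and the id values are illustrative only):

-- step 1: covering-index paging, returns only ids 
select id from collect where vtype=1 limit 90000,10; 
-- step 2: fetch the full rows for exactly those ids 
select id,title,url,sTime,gTime,vtype,tag from collect where id in (123,90000,12000);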

With one small index plus one small change to the code, MySQL can support paging over millions or even tens of millions of rows!

Through this example I came away with one reflection: for large systems, PHP must never rely on a framework, especially one where you cannot even see the SQL statements! My lightweight framework nearly collapsed at the start. Frameworks are only suited to the rapid development of small applications; for ERP, OA, and large websites, nothing in the data layer, or even the logic layer, should be handed to a framework. Once programmers lose control of the SQL statements, the project's risk grows exponentially! With MySQL in particular, a professional DBA is needed to draw out its best performance. A single index can make a performance difference of thousands of times!

PS:
After real testing: at 1 million rows, then 1.6 million rows (a 15 GB table with a 190 MB index), even walking the index, limit took 0.49 seconds. So it is best not to let users page past the first 100,000 records, or it will be very slow even with the index. After this optimization, MySQL still hits its limit around the million-row page mark, but a result like that is already very good; if you were using SQL Server it would surely grind to a halt! And with 1.6 million rows, id in (str) is still very fast, basically 0 seconds. At that rate, MySQL should handle data in the tens of millions with ease.
