In mysql RAND of randomly queries to record efficiency problems and share solutions

  • 2020-05-13 03:36:37
  • OfStack

Recently, the random selection implementation method of MYSQL was roughly studied due to the need. For example, to randomly extract a record from the tablename table, you can write it like this: SELECT * FROM tablename ORDER BY RAND() LIMIT 1.
There are two ways to do this.
1. Create a new table with Numbers between -5 and 5 in it, and use order by rand() to get random Numbers.
Create the specified range data table
 
#auther:  Jack Bauer ( Nostradamus ) 
#date: 2008-03-31 
create table randnumber 
select -1 as number 
union 
select -2 
union 
select -3 
union 
select -4 
union 
select -5 
union 
select 0 
union 
select 1 
union 
select 2 
union 
select 3 
union 
select 4 
union 
select 5 

# Get a random number  
#auther:  Jack Bauer ( Nostradamus ) 
#date: 2008-03-31 
select number 
from randnumber order by rand() limit 1 

Advantages: random Numbers can specify a part of the data, not need to be continuous.
Disadvantages: when the range of random Numbers is very wide, it is difficult to build a table.
2. Using MySQL's ROUND() and RAND() function
The sql statement is done
#auther: jack Bauer (soothsayer)
#date: 2008-03-31
 
SELECT ROUND((0.5-RAND())*2*5) 
# annotation  
#0.5-rand() You can get -0.5  to  +0.5 The random number  
#(0.5-rand())*2 You can get -1  to  +1 The random number  
#(0.5-rand())*2*5 You can get -5  to  +5 The random number  
#ROUND((0.5-RAND())*2*5) You can get -5  to  +5 Random integer of  

But then I checked the official MYSQL manual, and the hint for RAND() basically said that you can't use the RAND() function in ORDER BY clause, because it would cause the data column to be scanned multiple times. However, in version 3.23 of MYSQL, randomization is still possible through ORDER BY RAND().
But the real test 1 found that this was very inefficient. A library of more than 150,000 pieces of data, query 5 pieces of data, it takes more than 8 seconds. Looking at the official manual, it also says that rand() will be executed multiple times in ORDER BY clause, which is naturally inefficient and inefficient.

Search for Google, the Internet is basically to query max(id) * rand() to obtain data at random.
 
SELECT * FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * (SELECT MAX(id) FROM `table`)) AS id) AS t2 WHERE t1.id >= t2.id ORDER BY t1.id ASC LIMIT 5; 

But this will result in five consecutive records. The solution is to query one item at a time, five times. Even that is worth it, because with 150,000 tables, the query takes less than 0.01 seconds.
The following statement is JOIN, which is used on mysql's forums
 
SELECT * FROM `table` WHERE id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `table` ) ORDER BY id LIMIT 1; 

I tested 1 time, need 0.5 seconds, the speed is also good, but with the statement is still a big gap. I always feel something is wrong.
So I just rewrote 1.
 
SELECT * FROM `table` 
WHERE id >= (SELECT floor(RAND() * (SELECT MAX(id) FROM `table`))) 
ORDER BY id LIMIT 1; 

This time, the efficiency is improved again, the query time is only 0.01 seconds
Finally, add MIN(id) to the sentence. At the beginning of the test, because I did not add MIN(id), the result was that the first few rows in the table were always queried half of the time.
The full query statement is:
 
SELECT * FROM `table` 
WHERE id >= (SELECT floor( RAND() * ((SELECT MAX(id) FROM `table`)-(SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`))) 
ORDER BY id LIMIT 1; 
SELECT * 
FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * ((SELECT MAX(id) FROM `table`)-(SELECT MIN(id) FROM `table`))+(SELECT MIN(id) FROM `table`)) AS id) AS t2 
WHERE t1.id >= t2.id 
ORDER BY t1.id LIMIT 1; 

Finally, the two statements are queried 10 times in php,
The former takes 0.147433 seconds
The latter takes 0.015130 seconds
It seems that the syntax of JOIN is much more efficient than using functions directly in WHERE.
After many tests, we found that using join's syntax is much faster than using it directly in where. There are better submitting friends to ask for a chat.

Related articles: