Detailed speed comparison of curl_init() and curl_multi_init() multithreading in PHP
- 2021-10-27 06:47:12
- OfStack
This article compares the speed of PHP's curl_init() with that of curl_multi_init() multithreading. It is shared here for your reference.
curl_init() plays a very important role in PHP, especially when crawling web page content or file information. For example, the previous article "php uses curl to obtain header detection and turns on GZip compression" showed how powerful curl_init() is.
A curl_init() handle processes one transfer at a time, so requests run one after another. To process several transfers at once, PHP provides curl_multi_init(), which is commonly described as the multithreaded mode. Strictly speaking, curl_multi runs the transfers concurrently with non-blocking I/O inside a single PHP thread rather than in separate threads.
Speed comparison of curl_init() and curl_multi_init()
Can curl_multi_init() multithreading speed up the fetching of web pages? Today I will test this with an experiment.
The test is very simple: fetch the www.webkaka.com page 5 times in a row, once with curl_init() and once with curl_multi_init(), record the time each approach takes, and draw a conclusion by comparing the two.
First, use curl_init() in a single thread to fetch the www.webkaka.com page five times in a row.
The program code is as follows:
<?php
// Record the start time as a millisecond timestamp.
$mtime = (int) floor(microtime(true) * 1000);
echo $mtime;
echo "<br>";

for ($i = 1; $i <= 5; $i++) {
    $szUrl = 'http://www.webkaka.com/';
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $szUrl);
    curl_setopt($curl, CURLOPT_HEADER, 0);         // do not include response headers in the output
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
    curl_setopt($curl, CURLOPT_ENCODING, '');      // accept every encoding curl supports (e.g. gzip)
    $data = curl_exec($curl);
    echo $data;
    echo "<br>";

    // Print the current timestamp and the time elapsed since the start (cumulative, in ms).
    $mtime_ = (int) floor(microtime(true) * 1000);
    echo $mtime_;
    echo "<br>";
    echo $mtime_ - $mtime;
}
?>
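The timing logic can also be isolated from the requests. The following is a minimal sketch, not the article's code: elapsed_ms() is a hypothetical helper name, it assumes PHP 7.3 or later for hrtime(), and the usleep() call is only a stand-in for the five curl requests.

```php
<?php
// Hypothetical helper (not from the article): run $fn and return the elapsed time in milliseconds.
function elapsed_ms(callable $fn): float
{
    $start = hrtime(true);                 // monotonic clock in nanoseconds (PHP 7.3+)
    $fn();
    return (hrtime(true) - $start) / 1e6;  // nanoseconds -> milliseconds
}

// Stand-in workload; in the article's test this would be the five curl requests.
$ms = elapsed_ms(function () {
    usleep(20000); // sleep for about 20 ms
});
printf("%.1f ms\n", $ms);
```

Using a monotonic clock avoids odd results if the system clock is adjusted while the test runs.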
Then, use curl_multi_init() multithreading to fetch the www.webkaka.com page five times in a row.
The code is as follows:
<?php
// Record the start time as a millisecond timestamp.
$mtime = (int) floor(microtime(true) * 1000);
echo date("Y-m-d H:i:s", intdiv($mtime, 1000)); // "i" is minutes; "m" would print the month
echo " ";
echo $mtime % 1000; // millisecond part of the current second
echo "<br>";
echo $mtime;
echo "<br>";

$urls = array(
    'http://www.webkaka.com',
    'http://www.webkaka.com',
    'http://www.webkaka.com',
    'http://www.webkaka.com',
    'http://www.webkaka.com'
);
print_r(async_get_url($urls)); // [0] => body of the first URL, [1] => body of the second URL, ...
echo "<br>";

// Record the end time and print the elapsed milliseconds.
$mtime_ = (int) floor(microtime(true) * 1000);
echo date("Y-m-d H:i:s", intdiv($mtime_, 1000));
echo " ";
echo $mtime_ % 1000;
echo "<br>";
echo $mtime_;
echo "<br>";
echo $mtime_ - $mtime;

// Fetch several URLs concurrently; returns the response bodies indexed like $url_array.
function async_get_url($url_array, $wait_usec = 0)
{
    if (!is_array($url_array)) {
        return false;
    }
    $wait_usec = intval($wait_usec);
    $data = array();
    $handle = array();
    $running = 0;
    $mh = curl_multi_init(); // multi curl handle
    $i = 0;
    foreach ($url_array as $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body, don't print it
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)');
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects (e.g. 302)
        curl_setopt($ch, CURLOPT_MAXREDIRS, 7);
        curl_multi_add_handle($mh, $ch); // add the curl handle to the multi curl handle
        $handle[$i++] = $ch;
    }
    /* Execute; with $wait_usec = 0 this loop polls continuously */
    do {
        curl_multi_exec($mh, $running);
        if ($wait_usec > 0) { /* pause between polls */
            usleep($wait_usec); // 250000 = 0.25 sec
        }
    } while ($running > 0);
    /* Read data */
    foreach ($handle as $i => $ch) {
        $content = curl_multi_getcontent($ch);
        $data[$i] = (curl_errno($ch) == 0) ? $content : false;
    }
    /* Remove handles */
    foreach ($handle as $ch) {
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    return $data;
}
?>
To reduce the effect of random variation, I ran each test five times (forcing a refresh with CTRL+F5). The data are as follows:
curl_init():
Run | 1 | 2 | 3 | 4 | 5 | Average |
Time (ms) | 3724 | 3615 | 2540 | 1957 | 2794 | 2926 |
curl_multi_init():
Run | 1 | 2 | 3 | 4 | 5 | Average |
Time (ms) | 4275 | 2912 | 3691 | 4198 | 3891 | 3793 |
The test results show that the two methods differ little in total time: the averages are about 870 milliseconds apart (2926 ms versus 3793 ms). Many people assume that multithreading takes much less time than a single thread, but that is not the case here. In this data, multithreading actually took slightly longer than single threading. Keep in mind, however, that multi-threaded processing is not always chosen purely for speed.
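The averages can be rechecked with a few lines of PHP. This is only an illustrative sketch: the numbers are copied from the two tables, and the variable names are made up for the example.

```php
<?php
// Timings in milliseconds, copied from the two tables in this article.
$single = [3724, 3615, 2540, 1957, 2794]; // curl_init()
$multi  = [4275, 2912, 3691, 4198, 3891]; // curl_multi_init()

$avgSingle = array_sum($single) / count($single);
$avgMulti  = array_sum($multi) / count($multi);

printf("curl_init():       %.0f ms\n", $avgSingle);
printf("curl_multi_init(): %.0f ms\n", $avgMulti);
printf("difference:        %.0f ms\n", $avgMulti - $avgSingle);
```

With only five runs per method and individual runs ranging from under 2 seconds to over 4 seconds, a gap of this size is well within the run-to-run variation.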
About curl_multi_init()
Generally speaking, the reason to use curl_multi_init() is to request multiple URLs at the same time instead of one after another. Otherwise, curl_init() is enough.
However, when using curl_multi you may run into problems such as excessive CPU usage and web pages that appear to hang. See "PHP uses curl_multi_select to solve the problem of curl_multi web pages hanging".
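One common way to avoid the high CPU usage is to wait with curl_multi_select() between calls to curl_multi_exec() instead of polling in a tight loop. The sketch below is not taken from the referenced article; multi_run() is a hypothetical helper name and the 1-second timeout is an assumption.

```php
<?php
// Hypothetical helper: drive every transfer on $mh to completion without busy-waiting.
// Returns the last status code reported by curl_multi_exec().
function multi_run($mh, float $timeout = 1.0): int
{
    do {
        $status = curl_multi_exec($mh, $running);
        if ($status !== CURLM_OK && $status !== CURLM_CALL_MULTI_PERFORM) {
            break; // the multi handle reported an error
        }
        // Sleep until a connection has activity or the timeout expires.
        if ($running > 0 && curl_multi_select($mh, $timeout) === -1) {
            usleep(1000); // select failed; back off briefly instead of spinning
        }
    } while ($running > 0);

    return $status;
}

$mh = curl_multi_init();
// ... call curl_multi_add_handle($mh, $ch) here for each request ...
$status = multi_run($mh);
curl_multi_close($mh);
```

With no handles attached, as in this usage stub, the loop exits after a single pass.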
The steps to use curl_multi are summarized as follows:
Step 1: Call curl_multi_init().
Step 2: Call curl_multi_add_handle() in a loop. Note that the second parameter of curl_multi_add_handle() is the child handle returned by curl_init().
Step 3: Keep calling curl_multi_exec() until all transfers have finished.
Step 4: Call curl_multi_getcontent() in a loop as needed to get the results.
Step 5: Call curl_multi_remove_handle(), and call curl_close() for each child handle.
Step 6: Call curl_multi_close().
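The six steps can be sketched as one compact function. This is an illustrative outline rather than the article's async_get_url(): fetch_all() is a hypothetical name, the 30-second timeout is an assumption, and the wait uses curl_multi_select() to avoid a busy loop.

```php
<?php
// Hypothetical function: fetch all URLs concurrently and return the bodies keyed like $urls.
function fetch_all(array $urls): array
{
    $mh = curl_multi_init();                          // Step 1: create the multi handle
    $handles = [];
    foreach ($urls as $key => $url) {                 // Step 2: add one child handle per URL
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    do {                                              // Step 3: run until nothing is left
        curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, 1.0);              // wait for activity instead of spinning
        }
    } while ($running > 0);

    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);  // Step 4: read each body
        curl_multi_remove_handle($mh, $ch);           // Step 5: detach the child handle
        // Before PHP 8.0, also call curl_close($ch) here; since 8.0 it has no effect.
    }
    curl_multi_close($mh);                            // Step 6: close the multi handle
    return $results;
}
```

Calling fetch_all() with an array of URLs returns an array with the same keys, so string keys such as 'home' can be used to label the responses.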
Explanation of each function:
curl_multi_init()
Initializes a curl batch handle resource.
curl_multi_add_handle()
Adds an individual curl handle resource to the curl batch session. The function takes two arguments: the first is the curl batch handle resource and the second is the individual curl handle resource.
curl_multi_exec()
Runs the transfers of a curl batch handle. The function takes two arguments: the first is the batch handle resource, and the second is a variable passed by reference that receives the number of individual curl handles still being processed.
curl_multi_remove_handle()
Removes a handle resource from the curl batch handle resource. The function takes two arguments: the first is the curl batch handle resource and the second is the individual curl handle resource.
curl_multi_close()
Closes a batch handle resource.
curl_multi_getcontent()
If CURLOPT_RETURNTRANSFER is set, returns the text stream of the obtained output.
curl_multi_info_read()
Gets transfer information for the handles currently being processed.
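As a rough illustration of curl_multi_info_read(), the sketch below drains the message queue while the transfers run and records each finished handle with its result code. multi_collect_results() is a hypothetical helper name, not part of the curl extension.

```php
<?php
// Hypothetical helper: run all transfers on $mh and return one entry per finished transfer.
function multi_collect_results($mh): array
{
    $done = [];
    do {
        curl_multi_exec($mh, $running);
        // Each message describes one completed transfer.
        while (($info = curl_multi_info_read($mh)) !== false) {
            if ($info['msg'] === CURLMSG_DONE) {
                $done[] = [
                    'handle' => $info['handle'], // the child handle that finished
                    'result' => $info['result'], // CURLE_OK (0) on success, otherwise an error code
                ];
            }
        }
        if ($running > 0) {
            curl_multi_select($mh, 1.0); // wait for activity instead of spinning
        }
    } while ($running > 0);

    return $done;
}
```

A result code other than CURLE_OK can be turned into a readable message with curl_strerror().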
Example
See the curl_multi_init() example code earlier in this article.
I hope this article is helpful to everyone's PHP programming.