php implementation code for reading large files

  • 2020-05-27 04:38:17
  • OfStack

In php, the fastest way to read a file is to use some functions such as file, file_get_contents, and so on. A few lines of code can do what we need very nicely. However, when the operating file is a relatively large file, these functions may be inadequate, the following will start with a requirement to explain the common operation method for reading large files.
demand

There is a log file of 800M, which has more than 5 million lines, and php returns the last few lines.

Implementation method

1. Directly use file function to operate

Note: as file function is all content of one sex will read into memory, and php written 1 in order to prevent some poor program takes up too much memory and lead to system memory is insufficient, the server downtime, so by default only maximum 16 M USES memory, this is done through php. ini memory_limit = 16 M to set, in this value if set to 1, the memory usage is not restricted.

The following is a snippet of code using file to fetch the last line of the file.
The entire code takes 116.9613 (s) to complete.
 
$fp = fopen($file, "r"); 
$num = 10; 
$chunk = 4096; 
$fs = sprintf("%u", filesize($file)); 
$max = (intval($fs) == PHP_INT_MAX) ? PHP_INT_MAX : filesize($file); 
for ($len = 0; $len < $max; $len += $chunk) { 
$seekSize = ($max - $len &gt; $chunk) ? $chunk : $max - $len; 
fseek($fp, ($len + $seekSize) * -1, SEEK_END); 
$readData = fread($fp, $seekSize) . $readData; 

if (substr_count($readData, "\n") &gt;= $num + 1) { 
preg_match("!(.*?\n){".($num)."}$!", $readData, $match); 
$data = $match[0]; 
break; 
} 
} 
fclose($fp); 
echo $data; 

My machine is 2 G memory, when press F5 operation, system directly gray, nearly 20 minutes to recover, visible all such a big file read into memory directly, how much is serious, so not all may not, memory_limit this things can't be too high, otherwise just make a phone call to the room, let the reset machine.

2. Directly call linux's tail command to display the last few lines

From the linux command line, you can use tail-n 10 access.log to easily display the last few lines of the log file. You can use php to invoke the tail command.
The entire code execution takes 0.0034 (s)
 
file = 'access.log'; 
$file = escapeshellarg($file); //  Secure escape of command line arguments  
$line = `tail -n 1 $file`; 
echo $line; 


3. Directly use fseek of php for file operation

This is the most common method, which does not need to read all the contents of the file into the content, but operates directly through the pointer, so the efficiency is quite efficient. There are many different methods to operate the file using fseek, and the efficiency may be slightly different.

Method 1
First, find the last bit of EOF through fseek, then find the starting position of the last row, take the data of this row, find the starting position of this row again, take the position of this row, and so on, until you find the $num row.
The implementation code is as follows
The entire code execution takes 0.0095 (s)
 
function tail($fp,$n,$base=5) 
{ 
assert($n>0); 
$pos = $n+1; 
$lines = array(); 
while(count($lines)< =$n){ 
try{ 
fseek($fp,-$pos,SEEK_END); 
} catch (Exception $e){ 
fseek(0); 
break; 
} 
$pos *= $base; 
while(!feof($fp)){ 
array_unshift($lines,fgets($fp)); 
} 
} 
return array_slice($lines,0,$n); 
} 
var_dump(tail(fopen("access.log","r+"),10)); 

Method 2
The method of fseek is still used to read from the end of the file, but instead of 1 bit and 1 bit, it is 1 block and 1 block. For each block of data read, the read data will be put into 1 buf, and then the number of line feed characters (\n) will be used to judge whether the last $num line data has been read.
The implementation code is as follows
The entire code takes 0.0009(s) to complete.
 
$fp = fopen($file, "r"); 
$line = 10; 
$pos = -2; 
$t = " "; 
$data = ""; 
while ($line > 0) { 
while ($t != "\n") { 
fseek($fp, $pos, SEEK_END); 
$t = fgetc($fp); 
$pos --; 
} 
$t = " "; 
$data .= fgets($fp); 
$line --; 
} 
fclose ($fp); 
echo $data 

Methods 3
The entire code execution takes 0.0003(s)
 
ini_set('memory_limit','-1'); 
$file = 'access.log'; 
$data = file($file); 
$line = $data[count($data)-1]; 
echo $line; 

Related articles: