chunk result set processing in Laravel

  • 2021-10-27 06:46:46
  • OfStack

Preface

If you need to process thousands of Eloquent records, you can use the chunk method. The chunk method retrieves a "chunk" of Eloquent models and passes it to a given closure for processing. Using the chunk method can significantly reduce memory consumption when working with large data sets:


Flight::chunk(200, function ($flights) {
 foreach ($flights as $flight) {
  //
 }
});

// Collect every id into a global array, 50,000 rows per chunk.
$all_ark = Arkvolume::chunk(50000, function ($flights) {
 foreach ($flights as $flight) {
  $GLOBALS['something'][] = $flight['id'];
 }
});

var_dump($GLOBALS['something']);
exit;

This code processes one chunk of records at a time and only moves on to the next chunk after the current one has been handled.

In other words, chunk operates on one block of data at a time instead of loading the whole table at once.
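To see this block-by-block behavior for yourself, you can log every query Eloquent runs. Below is a small sketch, reusing the Flight model from the example above together with Laravel's DB::listen query listener; the chunk size of 200 is arbitrary:

use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Log;

// Log each SQL statement so the per-chunk "limit ... offset ..."
// queries become visible in the log.
DB::listen(function ($query) {
 Log::info($query->sql, $query->bindings);
});

Flight::chunk(200, function ($flights) {
 foreach ($flights as $flight) {
  // process one flight at a time
 }
});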

Note, however, that when chunk is combined with a where condition and the callback updates the very column being filtered on, part of the data will be skipped. Look at the following code:


User::where('approved', 0)->chunk(100, function ($users) {
 foreach ($users as $user) {
  $user->update(['approved' => 1]);
 }
});

Running this code produces no error, but notice that the where clause filters for users whose approved value is 0 while the callback sets that same column to 1.
Once the rows of the first chunk have been updated, they no longer match the condition, so the next chunk is selected from a shrunken result set; at the same time page is incremented by 1, so the offset skips over rows that were never processed. After the loop finishes, only about half of the data has been updated.

If that is still not clear, let's look at the underlying implementation of chunk. Taking the code above as an example, suppose there are 400 matching rows in total and the chunk size is 100:

page = 1: page starts at 1, so the first 100 matching rows (originally rows 1-100) are selected and processed;

page = 2: the approved value of those first 100 rows is now 1, so the matching set begins at what was originally row 101; because page is now 2, the query skips another 100 rows and processes what were originally rows 201-300, leaving rows 101-200 untouched;

and so on for the remaining pages; a full trace of the queries is sketched below.
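Here is a hypothetical trace of the three queries chunk ends up issuing for this 400-row example, assuming the rows come back in id order:

// page 1: select * from users where approved = 0 limit 100 offset 0;
//         -> original rows 1-100; the callback sets their approved to 1
// page 2: select * from users where approved = 0 limit 100 offset 100;
//         -> only rows 101-400 still match, so skipping 100 of them
//            returns original rows 201-300; rows 101-200 are never seen
// page 3: select * from users where approved = 0 limit 100 offset 200;
//         -> only 200 rows (101-200 and 301-400) still match, so the
//            result is empty and the loop stops
// Net result: rows 1-100 and 201-300 (half of the data) were updated.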


public function chunk($count, callable $callback)
{
 $results = $this->forPage($page = 1, $count)->get();
 
 while (count($results) > 0) {
  // On each chunk result set, we will pass them to the callback and then let the
  // developer take care of everything within the callback, which allows us to
  // keep the memory low for spinning through large result sets for working.
  if (call_user_func($callback, $results) === false) {
   return false;
  }
 
  $page++;
 
  $results = $this->forPage($page, $count)->get();
 }
 
 return true;
}
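The skipping comes from forPage, which is essentially just an offset/limit pair on the query. A simplified sketch of what it boils down to (the real method lives in Illuminate\Database\Query\Builder, paraphrased here):

// Roughly what forPage() does: page N maps to
// "limit $perPage offset ($page - 1) * $perPage".
public function forPage($page, $perPage = 15)
{
 return $this->offset(($page - 1) * $perPage)->limit($perPage);
}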

Problems in the Use of Laravel chunk

Laravel's chunk can be used to optimize queries over large result sets by processing the data in blocks, but the following example is problematic:


User::where('approved', 0)->chunk(100, function ($users) {
 foreach ($users as $user) {
  $user->update(['approved' => 1]);
 }
});

The reason is that the first query is:


select * from users where approved = 0 limit 100 offset 0;

After the approved value of this batch of rows has been updated to 1, look at the second query:


select * from users where approved = 0 limit 100 offset 100;

Because the where approved = 0 condition still applies while the offset starts at 100, 100 rows with approved = 0 are silently skipped.

Therefore, avoid using chunk when the callback updates (and thereby filters out) the very column used in the query's where condition.
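If you do need to update the column you are filtering on, there are safer options. Below is a minimal sketch, assuming the users table has an auto-incrementing id primary key: chunkById pages by the primary key instead of an offset, so rows updated in earlier chunks cannot shift the window; and for a simple flag flip like this, a single mass update avoids chunking altogether.

// Option 1: page by primary key instead of offset.
User::where('approved', 0)->chunkById(100, function ($users) {
 foreach ($users as $user) {
  $user->update(['approved' => 1]);
 }
});

// Option 2: for a simple change like this, skip chunking entirely
// and let the database do the work in one statement.
User::where('approved', 0)->update(['approved' => 1]);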

Summary

Laravel's chunk keeps memory usage low by processing a large result set in fixed-size blocks, but because it pages with limit and offset, combining it with a where condition on a column that the callback itself updates causes roughly half of the matching rows to be skipped. In that situation, use chunkById or a plain mass update instead.
