A Detailed Explanation of How to Implement Coroutines in PHP 7

  • 2021-08-28 19:37:35
  • OfStack

Preface

I believe everyone has heard of the concept of a "coroutine".

However, some readers only half understand the concept: they don't know how to implement it, how to use it, or where to use it. Some people even think that yield is a coroutine!

I have always believed that if you can't explain a piece of knowledge accurately, you don't really understand it.

If you have looked into implementing coroutines in PHP before, you have probably read Brother Bird's article: Using coroutines to implement multitask scheduling in PHP

Brother Bird's article is a translation of an article by a foreign author. The translation is concise and clear, and it gives concrete examples.

The purpose of this article is to supplement Brother Bird's article more fully. After all, some readers' foundations are not yet solid enough, and they are still left in a fog after reading it.

What Is a Coroutine

First, let's work out what a coroutine is.

You have probably heard of the concepts of "process" and "thread".

A process is a running instance of a binary executable in memory. If your .exe file is a class, then a process is an instance created from it with new.

A process is the basic unit of resource allocation and scheduling in a computer system (let's not get hung up here on whether the scheduling unit is the thread or the process), and each CPU can run only one process at any given moment.

So-called parallelism only looks parallel: the CPU is actually switching between processes very quickly.

Switching processes requires a system call. The CPU has to save all the state of the current process, and the CPU cache is invalidated as well.

Therefore, process switches should be avoided unless there is really no alternative.

So how do we achieve "don't switch processes unless you really have to"?

First of all, a process gets switched out when it finishes executing, when its CPU time slice runs out, when a system interrupt needs to be handled, or when it blocks waiting for a resource it needs. If you think about it, there is nothing to be done about the first few cases, but if the process is blocked waiting, isn't that time simply wasted?

In fact, when our program blocks, there is often other code it could be executing. It doesn't necessarily have to sit there waiting!

So threads were introduced.

A thread can be understood simply as a "micro-process" that runs one function (one logic flow).

So when writing a program, we can use threads for the functions that can run at the same time.

There are two types of threads. One type is managed and scheduled by the kernel.

Anything that involves the kernel in management and scheduling is expensive. This kind of thread does solve one problem: when the running thread in a process blocks, another runnable thread can be scheduled, and since it is still in the same process, there is no process switch.

The other type of thread is scheduled by the programmer's own code and is invisible to the kernel. This kind of thread is called a "user-space thread".

A coroutine can be understood as a kind of user-space thread.

Coroutines have several characteristics:

  • Cooperative: because the scheduling policy is written by the programmer, switching happens through cooperation rather than preemption
  • Created, switched, and destroyed entirely in user mode
  • From a programming point of view, the idea of a coroutine is essentially a mechanism for voluntarily yielding (yield) and resuming (resume) control flow
  • Iterators are often used to implement coroutines
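
The yield/resume idea can be seen in a few lines of PHP. This is only a minimal sketch: the function name worker and the $log array are made up for illustration. The caller resumes the generator, the generator runs until its next yield, and then control comes back to the caller.

```php
<?php
// Hypothetical example: $log records, in order, who is running.
$log = [];

function worker(array &$log) {
    $log[] = 'worker: step 1';
    yield;                      // give control back to the caller
    $log[] = 'worker: step 2';
    yield;                      // give control back again
    $log[] = 'worker: step 3';
}

$gen = worker($log);
$gen->current();                // runs the body up to the first yield
$log[] = 'caller: doing other work';
$gen->next();                   // resume: runs up to the second yield
$log[] = 'caller: doing other work';
$gen->next();                   // resume: runs to the end

echo implode("\n", $log), "\n";
```

Nothing here is preempted: the worker only stops where it chooses to yield, and it only continues when the caller chooses to resume it.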

Having said all that, you should now understand the basic concept of a coroutine.

Implementing coroutines in PHP

Let's go step by step, starting with the concepts!

Iterable objects

PHP 5 provides a way to define an object so that its items can be traversed, for example with a foreach statement.

To make an object iterable, you implement the Iterator interface:


<?php
class MyIterator implements Iterator
{
 private $var = array();
 public function __construct($array)
 {
  if (is_array($array)) {
   $this->var = $array;
  }
 }
 public function rewind() {
  echo "rewinding\n";
  reset($this->var);
 }
 public function current() {
  $var = current($this->var);
  echo "current: $var\n";
  return $var;
 }
 public function key() {
  $var = key($this->var);
  echo "key: $var\n";
  return $var;
 }
 public function next() {
  $var = next($this->var);
  echo "next: $var\n";
  return $var;
 }
 public function valid() {
  $var = $this->current() !== false;
  echo "valid: {$var}\n";
  return $var;
 }
}
$values = array(1,2,3);
$it = new MyIterator($values);
foreach ($it as $a => $b) {
 print "$a: $b\n";
}

Generators

You could say that, previously, making an object traversable with foreach meant implementing a whole pile of methods. The yield keyword exists to simplify this.

Generators provide an easier way to implement simple object iteration, with much lower overhead and complexity than defining a class that implements the Iterator interface.


<?php
function xrange($start, $end, $step = 1) {
 for ($i = $start; $i <= $end; $i += $step) {
  yield $i;
 }
}
foreach (xrange(1, 1000000) as $num) {
 echo $num, "\n";
}

Remember: a function that uses yield is a generator function. Calling it directly does not run its body the way an ordinary function call would; it only returns a Generator object!

So yield is just yield. The next time anyone says yield is a coroutine, I will definitely xxxx them.
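
A quick way to convince yourself of this is to check what a call to a generator function returns. The sketch below uses a made-up function numbers and a flag $ran that is only set once the body actually starts running.

```php
<?php
$ran = false;

function numbers(bool &$ran) {
    $ran = true;   // only reached once the generator is actually advanced
    yield 1;
    yield 2;
}

$gen = numbers($ran);
$isGenerator  = $gen instanceof Generator;  // the call returned a Generator object
$isIterator   = $gen instanceof Iterator;   // Generator implements Iterator
$ranAfterCall = $ran;                       // the body has not started yet

$values = [];
foreach ($gen as $v) {                      // iterating is what runs the body
    $values[] = $v;
}

var_dump($isGenerator, $isIterator, $ranAfterCall, $ran, $values);
```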

PHP coroutines

As mentioned in the introduction to coroutines above, coroutines require the programmer to write the scheduling mechanism. Let's look at how to write it.

0) Using generators correctly

Since a generator can't be called directly like a function, how do we call it?

The ways are as follows:

  • iterate over it with foreach
  • send($value)
  • current() / next() ...
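
send(), current() and next() are all methods of PHP's built-in Generator class. Here is a small sketch of how they fit together; the accumulator function is invented for this example and is not part of the scheduler code.

```php
<?php
function accumulator() {
    $total = 0;
    while (true) {
        // yield hands $total to the caller; send() makes this yield
        // expression evaluate to the value that was sent in.
        $received = yield $total;
        if ($received === null) {
            return;
        }
        $total += $received;
    }
}

$gen    = accumulator();
$first  = $gen->current();   // run to the first yield and read its value
$second = $gen->send(5);     // resume with 5, run to the next yield
$third  = $gen->send(10);    // resume with 10, run to the next yield
$gen->next();                // next() resumes with null, so the generator returns
$done   = !$gen->valid();

var_dump($first, $second, $third, $done);
```

The point to notice is that send() does two things at once: it passes a value in, and it returns whatever the generator yields next.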

1) Task implementation

Task is an abstraction of a task. As we just said, a coroutine is a user-space thread, and a thread can be understood as running a function.

Therefore, the constructor of Task accepts a generator (the object returned by calling a function that uses yield), which we name coroutine.


/**
 * Task class
 */
class Task
{
 protected $taskId;
 protected $coroutine;
 protected $beforeFirstYield = true;
 protected $sendValue;

 /**
  * Task constructor.
  * @param $taskId
  * @param Generator $coroutine
  */
 public function __construct($taskId, Generator $coroutine)
 {
  $this->taskId = $taskId;
  $this->coroutine = $coroutine;
 }
 /**
  * Gets the ID of the current Task
  * 
  * @return mixed
  */
 public function getTaskId()
 {
  return $this->taskId;
 }
 /**
  * Checks whether the Task has finished executing
  * 
  * @return bool
  */
 public function isFinished()
 {
  return !$this->coroutine->valid();
 }
 /**
  * Sets the value to send to the coroutine on its next run; for example, with $id = (yield $xxxx), this value is what gets assigned to $id
  * 
  * @param $value
  */
 public function setSendValue($value)
 {
  $this->sendValue = $value;
 }
 /**
  *  Run a task 
  * 
  * @return mixed
  */
 public function run()
 {
  // Note: a generator is rewound when it is first used, so the first value has to be fetched with current()
  if ($this->beforeFirstYield) {
   $this->beforeFirstYield = false;
   return $this->coroutine->current();
  } else {
   // As discussed above, use send() to resume the generator
   $retval = $this->coroutine->send($this->sendValue);
   $this->sendValue = null;
   return $retval;
  }
 }
}
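
Why does run() need the $beforeFirstYield flag? The sketch below (with a made-up generator twoValues) shows what happens if send() is the very first call on a generator: PHP first advances it to the first yield, then immediately resumes it with the sent value, so the caller never sees the first yielded value.

```php
<?php
function twoValues() {
    yield 'first';
    yield 'second';
}

// send() on a fresh generator: it is first advanced to the first yield,
// then resumed with the sent value, and send() returns what the *next*
// yield produces. 'first' is never seen by the caller.
$fresh   = twoValues();
$skipped = $fresh->send(null);

// Reading current() first, as Task::run() does, keeps the first value.
$careful = twoValues();
$a = $careful->current();
$b = $careful->send(null);

var_dump($skipped, $a, $b);
```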

2) Scheduler implementation

Next comes the core part, the Scheduler, which plays the role of dispatcher.


/**
 * Class Scheduler
 */
class Scheduler
{
 /**
  * @var SplQueue
  */
 protected $taskQueue;
 /**
  * @var int
  */
 protected $tid = 0;

 /**
  * Scheduler constructor.
  */
 public function __construct()
 {
  /* The principle is to maintain a queue.
   * As mentioned earlier, from a programming point of view, a coroutine is essentially a mechanism for voluntarily yielding (yield) and resuming (resume) control flow.
   */
  $this->taskQueue = new SplQueue();
 }
 /**
  * Adds a task
  *
  * @param Generator $task
  * @return int
  */
 public function newTask(Generator $task)
 {
  $tid = $this->tid;
  $task = new Task($tid, $task);
  $this->taskQueue->enqueue($task);
  $this->tid++;
  return $tid;
 }
 /**
  *  Queue a task 
  *
  * @param Task $task
  */
 public function schedule(Task $task)
 {
  $this->taskQueue->enqueue($task);
 }
 /**
  *  Run scheduler 
  */
 public function run()
 {
  while (!$this->taskQueue->isEmpty()) {
   // Dequeue a task
   $task = $this->taskQueue->dequeue();
   $res = $task->run(); // Run the task until its next yield

   if (!$task->isFinished()) {
    $this->schedule($task); // If the task has not finished, put it back in the queue to wait for its next turn
   }
  }
 }
}

With this, we have basically implemented a coroutine scheduler.

You can use the following code to test:


<?php
function task1() {
 for ($i = 1; $i <= 10; ++$i) {
  echo "This is task 1 iteration $i.\n";
  yield; // Voluntarily give up the CPU
 }
}
function task2() {
 for ($i = 1; $i <= 5; ++$i) {
  echo "This is task 2 iteration $i.\n";
  yield; // Voluntarily give up the CPU
 }
}
$scheduler = new Scheduler; // Instantiate a scheduler
$scheduler->newTask(task1()); // Add different generators as tasks
$scheduler->newTask(task2());
$scheduler->run();
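
To see the order in which two tasks get interleaved without pasting the full classes, here is a stripped-down sketch of the same round-robin idea. It is not the Scheduler class above: it queues raw generators in an SplQueue together with a "started" flag, and records lines in a $log array. Those choices, and the name countTo, are only for illustration.

```php
<?php
function countTo(string $name, int $max, array &$log) {
    for ($i = 1; $i <= $max; ++$i) {
        $log[] = "$name:$i";
        yield;                       // hand control back to the loop below
    }
}

$log   = [];
$queue = new SplQueue();
// Each queue entry is [generator, hasItStartedYet].
$queue->enqueue([countTo('task1', 3, $log), false]);
$queue->enqueue([countTo('task2', 2, $log), false]);

while (!$queue->isEmpty()) {
    list($gen, $started) = $queue->dequeue();
    if ($started) {
        $gen->next();                // later turns: resume after the last yield
    } else {
        $gen->current();             // first turn: run up to the first yield
    }
    if ($gen->valid()) {
        $queue->enqueue([$gen, true]);   // not finished: back of the queue
    }
}

echo implode("\n", $log), "\n";
```

Each task gets exactly one step per turn, so the shorter task simply drops out of the queue first and the longer one finishes alone.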

The key question is where PHP coroutines are actually useful.


function task1() {
  /* Suppose there is a remote task that takes 10s, for example a remote machine crawling and analyzing URLs; we only need to submit it and fetch the result from the remote machine at the end */
  remote_task_commit();
  // Once the request has been sent, we should not wait here; voluntarily give up the CPU so that task2, which does not depend on this result, can run
  yield;
  yield (remote_task_receive());
  // ...
}
function task2() {
 for ($i = 1; $i <= 5; ++$i) {
  echo "This is task 2 iteration $i.\n";
  yield; // Voluntarily give up the CPU
 }
}

In this way, the program does useful work while it would otherwise be waiting, which improves its execution efficiency.
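
The following is a runnable toy version of that idea, under loud assumptions: FakeRemoteJob is an invented stand-in for a remote job (it simply reports "not ready" for a fixed number of checks), and the loop at the bottom is a tiny round-robin driver rather than the Scheduler class. A real program would poll a socket or a remote API instead.

```php
<?php
// Hypothetical stand-in for a remote job: not ready for the first $polls checks.
class FakeRemoteJob
{
    private $remaining;
    public function __construct(int $polls) { $this->remaining = $polls; }
    public function isReady(): bool { return --$this->remaining < 0; }
    public function result(): string { return 'remote result'; }
}

function remoteTask(FakeRemoteJob $job, array &$log) {
    $log[] = 'remote: submitted';
    while (!$job->isReady()) {
        yield;                         // not ready yet: let other tasks run
    }
    $log[] = 'remote: got ' . $job->result();
}

function localTask(array &$log) {
    for ($i = 1; $i <= 3; ++$i) {
        $log[] = "local: step $i";
        yield;
    }
}

$log   = [];
$tasks = [remoteTask(new FakeRemoteJob(2), $log), localTask($log)];

foreach ($tasks as $gen) {
    $gen->current();                   // start every task once
}
while ($tasks) {                       // then give each live task one turn per pass
    foreach ($tasks as $k => $gen) {
        if (!$gen->valid()) {
            unset($tasks[$k]);         // finished: drop it
            continue;
        }
        $gen->next();
    }
}

echo implode("\n", $log), "\n";
```

The local task keeps making progress while the remote one is still waiting, which is exactly the benefit described above.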

About implementing "system calls": Brother Bird has already explained that very clearly, so I won't go over it here.

3) Coroutine stack

Brother Bird's article also includes an example of a coroutine stack.

As we said above, a function that uses yield can no longer be called as an ordinary function.

So suppose you nest one coroutine function inside another:


<?php
function echoTimes($msg, $max) {
 for ($i = 1; $i <= $max; ++$i) {
  echo "$msg iteration $i\n";
  yield;
 }
}
function task() {
 echoTimes('foo', 10); // print foo ten times
 echo "---\n";
 echoTimes('bar', 5); // print bar five times
 yield; // force it to be a coroutine
}
$scheduler = new Scheduler;
$scheduler->newTask(task());
$scheduler->run();

echoTimes here is never executed! So we need a coroutine stack.
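
The problem can be reproduced in isolation. In this sketch (inner and outer are made-up names), the call to inner() only creates a Generator object that nobody ever advances, so its body never runs.

```php
<?php
$log = [];

function inner(array &$log) {
    $log[] = 'inner ran';
    yield;
}

function outer(array &$log) {
    inner($log);          // creates a Generator object and throws it away
    $log[] = 'outer ran';
    yield;
}

foreach (outer($log) as $_) {
    // drive the outer generator to completion
}

var_dump($log);           // only the outer generator's line is recorded
```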

But that's all right; let's modify the code we just wrote.

Change the constructor of Task, because when we run a Task we need to work out which child coroutines it contains and keep them on a stack. (Readers who know C well will naturally understand this; if you don't, I suggest learning how the process memory model handles function calls.)


 /**
  * Task constructor.
  * @param $taskId
  * @param Generator $coroutine
  */
 public function __construct($taskId, Generator $coroutine)
 {
  $this->taskId = $taskId;
  // $this->coroutine = $coroutine;
  // With this change, Task->run() actually runs the stackedCoroutine() generator rather than the generator stored in $coroutine
  $this->coroutine = stackedCoroutine($coroutine); 
 }

When Task->run() is called, a loop keeps analyzing the generator:


/**
 * @param Generator $gen
 */
function stackedCoroutine(Generator $gen)
{
 $stack = new SplStack;
 // Keep traversing the generator that was passed in
 for (; ;) {
  // $gen can be understood as pointing to the coroutine (generator) that is currently running
  $value = $gen->current(); // Get the value at the suspension point, i.e. the value that was yielded
  if ($value instanceof Generator) {
   // If it is a Generator, it is a child coroutine: push the currently running coroutine onto the stack to save it
   $stack->push($gen);
   $gen = $value; // Hand the child coroutine to $gen; the next iteration continues by running the child
   continue;
  }
  // Return values of child coroutines are wrapped in CoroutineReturnValue, described below
  $isReturnValue = $value instanceof CoroutineReturnValue; // The child coroutine returned `$value`, which the parent coroutine needs to handle
  if (!$gen->valid() || $isReturnValue) {
   if ($stack->isEmpty()) {
    return;
   }
   // Either $gen has finished, or the child coroutine needs to return a value to the parent coroutine
   $gen = $stack->pop(); // Pop the parent coroutine that was saved on the stack earlier
   $gen->send($isReturnValue ? $value->getValue() : NULL); // Resume the parent coroutine with the child's output value
   continue;
  }
  $gen->send(yield $gen->key() => $value); // Yield to the scheduler, then continue executing the current coroutine
 }
}

Then we add an end marker for echoTimes:


class CoroutineReturnValue {
 protected $value;
 
 public function __construct($value) {
  $this->value = $value;
 }
 // Returns the child coroutine's output value, which is passed to the parent coroutine as the argument of send()
 public function getValue() {
  return $this->value;
 }
}
function retval($value) {
 return new CoroutineReturnValue($value);
}

Then modify echoTimes:


<?php
function echoTimes($msg, $max) {
 for ($i = 1; $i <= $max; ++$i) {
  echo "$msg iteration $i\n";
  yield;
 }
 yield retval(""); // end marker: tells the coroutine stack that this child coroutine is done
}

The task function becomes:


<?php
function task() {
 yield echoTimes('foo', 10); // print foo ten times
 echo "---\n";
 yield echoTimes('bar', 5); // print bar five times
}

With this, a coroutine stack is implemented, and you should be able to apply the same idea to other cases.

4) The yield from keyword in PHP 7

PHP 7 added yield from, so we no longer need to implement the coroutine stack ourselves, which is really great.

Change the constructor of Task back:


 public function __construct($taskId, Generator $coroutine)
 {
  $this->taskId = $taskId;
  $this->coroutine = $coroutine;
  // $this->coroutine = stackedCoroutine($coroutine); // no longer needed with yield from
 }

echoTimes function:


function echoTimes($msg, $max) {
 for ($i = 1; $i <= $max; ++$i) {
  echo "$msg iteration $i\n";
  yield;
 }
}

The task1 generator:


<?php
function task1() {
 yield from echoTimes('bar', 5);
}

In this way, it is easy to call child coroutines.
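
yield from also covers what CoroutineReturnValue was doing by hand: in PHP 7 a generator can return a value, and the yield from expression evaluates to it. A minimal sketch follows; child, parentTask and $log are names invented for this example.

```php
<?php
function child(array &$log) {
    $log[] = 'child: 1';
    yield;
    $log[] = 'child: 2';
    yield;
    return 'child result';             // PHP 7: a generator may return a value
}

function parentTask(array &$log) {
    // yield from runs the child, passing each of its yields through to
    // whoever drives parentTask, and evaluates to the child's return value.
    $result = yield from child($log);
    $log[] = "parent: got $result";
    return 'parent done';
}

$log   = [];
$gen   = parentTask($log);
$steps = 0;
for ($gen->current(); $gen->valid(); $gen->next()) {
    ++$steps;                          // one step per yield that reached us
}
$final = $gen->getReturn();            // return value of the outer generator

var_dump($log, $steps, $final);
```

From the driver's point of view there is only one generator to advance; the nesting is handled by PHP itself.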

Summary

Now you should understand how to implement coroutines in PHP, right?

