Use cluster to extend your Node server to a multi-process server

  • 2020-03-30 04:16:38
  • OfStack

Node.js users all know that Node runs JavaScript on a single thread, which means that on an 8-core CPU it can use only one core's power.
This has long been a limitation of Node, but cluster, introduced in version 0.6, changed that: developers can now easily scale a Node server across multiple processes.

What is a Cluster

Cluster is a multi-process library provided by Node. You can use it to create several worker processes that share one listening port. When an external request arrives on that port, cluster hands the connection to one of the workers (by default in round-robin order on every platform except Windows). Because each Node process takes up tens of megabytes of memory, you can't create one process per request, as PHP does, and you generally don't create more workers than there are CPU cores.
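As a rough illustration of the "no more workers than cores" rule, the sketch below caps a requested worker count at the number of CPU cores. The helper name workerCount is hypothetical and not part of cluster; it only shows one way the limit might be computed before calling cluster.fork().

```javascript
var os = require('os');

// Hypothetical helper (not part of cluster): decide how many workers to fork.
// Falls back to one worker per core when no sensible count is requested.
function workerCount(requested) {
  // Guard against environments where os.cpus() reports no cores.
  var cores = Math.max(1, os.cpus().length);
  if (!requested || requested < 1) {
    return cores;
  }
  // Never fork more workers than there are cores.
  return Math.min(Math.floor(requested), cores);
}

console.log('would fork %d workers', workerCount(4));
```

The master would then call cluster.fork() that many times instead of looping over os.cpus().length directly.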


var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork one worker per CPU core.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Log when a worker process exits.
  cluster.on('exit', function(worker, code, signal) {
    console.log('worker ' + worker.process.pid + ' died');
  });
} else {
  // Workers can share any TCP connection.
  // In this case it is an HTTP server.
  http.createServer(function(req, res) {
    res.writeHead(200);
    res.end("hello world\n");
  }).listen(8000);
}

As the code above shows, cluster.isMaster is true when the program first runs. Each call to cluster.fork() creates a new process that runs the same script again, this time with cluster.isMaster set to false. We mainly use this flag to tell whether the current process is the master or a worker.

Notice also that every worker listens on port 8000 without causing a conflict; this is cluster's shared-port feature.
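To see the "run again" behaviour on its own, here is a minimal sketch that forks a single worker and prints which role each process has. The role helper is hypothetical, and the isPrimary check is an assumption made so the sketch also runs on newer Node releases, where that name was added alongside isMaster. Save it to a file and run it with node.

```javascript
var cluster = require('cluster');

// Newer Node releases call this flag isPrimary; older ones only have isMaster.
var isMaster = cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;

// Hypothetical helper: describe the role of the current process.
function role() {
  return isMaster ? 'master' : 'worker ' + cluster.worker.id;
}

// This line runs once in the master and once more in the forked worker.
console.log('%s running in pid %d', role(), process.pid);

if (isMaster) {
  // The same script is started again from the top in a new process.
  cluster.fork();
  cluster.on('exit', function(worker) {
    console.log('worker %d exited', worker.id);
  });
} else {
  // Nothing else to do in this sketch, so the worker exits right away.
  process.exit(0);
}
```

The first log line appears twice with different pids, which shows that the worker is a separate process executing the same file rather than a thread inside the master.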

Communication between processes

Once created, the processes share no memory or data with each other. All data exchange has to go through messages: the master uses worker.send and worker.on('message', handler), and a worker uses process.send and process.on('message', handler). Here is an example of a broadcast system.


var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  var workers = [];

  // Create a new worker
  function newWorker() {
    var worker = cluster.fork();

    // Listen for messages from this worker; type 'broadcast' means relay the event
    worker.on('message', function(msg) {
      if (msg.type == 'broadcast') {
        var event = msg.event;
        // Send this event to all workers
        workers.forEach(function(worker) {
          worker.send(event);
        });
      }
    });
    return worker;
  }

  for (var i = 0; i < numCPUs; i++) {
    workers.push(newWorker());
  }

  cluster.on('online', function(worker) {
    console.log('worker %d is online', worker.id);
  });
} else {
  var worker = cluster.worker;

  // To broadcast, send the master a message of type 'broadcast' that carries the event
  worker.broadcast = function(event) {
    worker.send({
      type: 'broadcast',
      event: event
    });
  };

  // worker.on did not seem to receive the relayed messages here, so listen on process instead
  process.on('message', function(event) {
    console.log('worker: ' + worker.id + ' received event from ' + event.workerId);
  });

  // Send a broadcast
  worker.broadcast({
    message: 'online',
    workerId: worker.id
  });
}

Something to be aware of

As mentioned above, data cannot be shared between processes; it can only be exchanged through messages. The data in a message must also be serializable, so functions, HTTP response objects and the like cannot be passed in a message.
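The effect of serialization can be tried without forking anything. The sketch below assumes the default JSON-based message channel, so a JSON round trip approximates what the receiving process gets; the asReceived helper and the sample message are made up for illustration.

```javascript
// Assumption: messages use the default JSON serialization, so a JSON
// round trip approximates what arrives on the other side.
function asReceived(message) {
  return JSON.parse(JSON.stringify(message));
}

// Made-up sample message containing values that do not survive the trip intact.
var sent = {
  type: 'broadcast',
  workerId: 3,
  when: new Date(0),
  callback: function() {}
};

var received = asReceived(sent);

console.log(Object.keys(received)); // the 'callback' key is gone
console.log(typeof received.when);  // the Date arrives as a string
```

Plain objects, arrays, strings and numbers come through unchanged, which is why the broadcast example only sends simple fields such as message and workerId.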

If you use cluster, you need to take data exchange into account when designing the program. My own approach is to store data such as sessions in Redis and let each worker read and write it there, so that no shared data is kept in Node's memory.

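Finally, cluster was officially marked as Experimental by Node in its early releases, which meant the API could change. Current Node documentation lists it as Stable, so check the stability level for the version you run.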

