Miyud Miyud - 2 months ago 8
Node.js Question

Node.js cluster - optimal number of workers

I have 4 cores and ran this code, based on this example:

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

var id = 0;
if (cluster.isWorker) {
  id = cluster.worker.id;
}

var iterations = 1000000000;
console.time('Function #' + id);
for (var i = 0; i < iterations; i++) {
  var test = 0;
}
console.timeEnd('Function #' + id);

if (cluster.isMaster) {
  // Fork workers.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
}


With 4 forks (the code above), I got:


Function #0: 1698.801ms

Function #1: 3282.679ms

Function #4: 3290.384ms

Function #3: 3425.090ms

Function #2: 3424.922ms


With 3 forks, I got:


Function #0: 1695.155ms

Function #2: 1822.867ms

Function #3: 2444.156ms

Function #1: 2606.680ms


With 2 forks, I got:


Function #0: 1684.929ms

Function #1: 1682.897ms

Function #2: 1686.123ms


I don't understand these results. Isn't 1 fork per core the optimal number? Here I see that 4 forks are no better than 2 forks.

Answer

My guess is that your hardware actually only has 2 physical cores. However, because of hyper-threading (HT), the OS will say that there are 4 (logical) cores present.

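The workers in your code each keep a core entirely occupied, which is something that HT can't deal with very well: the two logical cores of a physical core share its execution resources, so two busy workers on them compete with each other. As a result, performance when keeping all 4 logical cores busy will be worse than when you keep only the 2 physical cores busy.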

My hardware (quad core, so 4 physical and 8 logical cores) shows the same pattern:

  • 8 workers:

    Function #5: 926ms
    Function #3: 916ms
    Function #1: 928ms
    Function #4: 895ms
    Function #7: 934ms
    Function #6: 905ms
    Function #8: 928ms
    Function #2: 928ms
    
  • 4 workers:

    Function #3: 467ms
    Function #2: 467ms
    Function #1: 473ms
    Function #4: 472ms
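
To put a number on that pattern, here is a rough sketch that turns the timings above into a throughput figure (completed loops per second). The `throughput` helper is made up for this illustration, and it assumes all workers start at about the same moment, so the slowest worker approximates the wall-clock time of the run.

```javascript
// Hypothetical helper: loops completed per second for one run, assuming
// all workers started together so the slowest one sets the wall time.
function throughput(timesMs) {
  const wallMs = Math.max(...timesMs);
  return timesMs.length / (wallMs / 1000);
}

// Timings copied from the two runs listed above.
const eightWorkers = [926, 916, 928, 895, 934, 905, 928, 928];
const fourWorkers = [467, 467, 473, 472];

console.log('8 workers:', throughput(eightWorkers).toFixed(2), 'loops/s');
console.log('4 workers:', throughput(fourWorkers).toFixed(2), 'loops/s');
```

Both runs come out at about 8.5 loops per second, so for this CPU-bound loop doubling the workers from 4 to 8 roughly doubles each worker's time and leaves the overall rate about the same.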
    

That said, the rule of thumb of making the number of workers equal to the number of logical cores in your hardware still makes sense if your workers are I/O bound (which most Node apps are).

If you really want to perform heavy, blocking calculations, count one worker per physical core.
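
As a minimal sketch of that rule: `os.cpus().length` gives the number of logical cores, so the physical count has to be estimated. The `pickWorkerCount` helper and its `cpuBound` and `threadsPerCore` options are hypothetical names for this example, and the default of 2 threads per core is an assumption that only holds on hyper-threaded hardware like the machines discussed here.

```javascript
const os = require('os');

// Hypothetical helper: choose a worker count from the logical core count.
// threadsPerCore is an assumption supplied by the caller (2 with
// hyper-threading); os.cpus() only reports logical cores.
function pickWorkerCount(logicalCores, options) {
  const opts = options || {};
  const threadsPerCore = opts.threadsPerCore || 2;
  if (opts.cpuBound) {
    // Heavy, blocking calculations: one worker per (estimated) physical core.
    return Math.max(1, Math.floor(logicalCores / threadsPerCore));
  }
  // I/O-bound workers: one worker per logical core.
  return Math.max(1, logicalCores);
}

const logicalCores = os.cpus().length;
console.log('logical cores:', logicalCores);
console.log('CPU-bound workers:', pickWorkerCount(logicalCores, { cpuBound: true }));
console.log('I/O-bound workers:', pickWorkerCount(logicalCores, { cpuBound: false }));
```

In the code from the question, the loop bound `numCPUs` would be replaced by the result of `pickWorkerCount`. On a CPU without hyper-threading, pass `threadsPerCore: 1` so the count is not halved.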
