sfinnie - 5 months ago
Node.js Question

Idiomatic sync of concurrent activities ('map-reduce') in node?

EDIT

Thanks to the answers, I have a working version (code at the end of the question). Thanks to @estus and @Jared for their help.

Original Question

I'm working my way into Node and trying to get a handle on concurrency, starting with a simple example: given the names of two files, determine which is bigger. The conventional (sequential) solution:

var fs = require('fs');

var fname1 = process.argv[2];
var fname2 = process.argv[3];

// Blocking calls: each statSync returns before the next line runs.
var stats1 = fs.statSync(fname1);
var size1 = stats1["size"];

var stats2 = fs.statSync(fname2);
var size2 = stats2["size"];

if (size1 > size2) {
    console.log(fname1 + " is bigger");
} else if (size2 > size1) {
    console.log(fname2 + " is bigger");
} else {
    console.log("The files are the same size");
}


Now suppose I want to stat the files in parallel*. I can convert the code to use the async stat function:

var fs = require('fs');

var fname1 = process.argv[2];
var fname2 = process.argv[3];

fs.stat(fname1, function doneReading(err, stats) {
    if (err) throw err;
    var size1 = stats["size"];
    fs.stat(fname2, function doneReading(err, stats) {
        if (err) throw err;
        var size2 = stats["size"];
        if (size1 > size2) {
            console.log(fname1 + " is bigger");
        } else if (size2 > size1) {
            console.log(fname2 + " is bigger");
        } else {
            console.log("The files are the same size");
        }
    });
});


However:


  1. It's less readable;

  2. It won't scale well if I want to compare >2 files;

  3. Not sure it would even stat the files in parallel (I'm unclear atm how the background threading works).



So, to be specific, what's the idiomatic way to:


  1. Spawn multiple actions concurrently, then

  2. Use their combined results once all are complete?



Perhaps promises might be a candidate? Promise.all looks like the way to await all promises, but it's not clear how to actually use their results.

Thanks.

SOLUTION

'use strict';

const co = require('co');
const fs = require('fs-promise');

var fname1 = process.argv[2];
var fname2 = process.argv[3];

co(function* () {
    // Yielding an array of promises waits for all of them.
    let res = yield [fs.stat(fname1), fs.stat(fname2)];
    let size1 = res[0]["size"];
    let size2 = res[1]["size"];
    if (size1 > size2) {
        console.log(fname1 + " is bigger");
    } else if (size2 > size1) {
        console.log(fname2 + " is bigger");
    } else {
        console.log("The files are the same size");
    }
});


It's very readable, succinct, and completely devoid of callback nastiness, and it extends readily to comparing n files.
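For what it's worth, here is a rough sketch of how the n-file version might look. It is not the code from the answers: findLargest is a made-up helper name, and it assumes a Node version that ships the built-in fs.promises API, so it uses plain Promise.all rather than co and fs-promise.

```javascript
const fs = require('fs');

// findLargest is a hypothetical helper, not part of the original solution.
// It starts a stat for every file up front, then waits for all of them.
function findLargest(fnames) {
  // Assumption: this Node version provides fs.promises.
  const pending = fnames.map(function (fname) {
    return fs.promises.stat(fname);
  });

  // Promise.all resolves with results in the same order as the input.
  return Promise.all(pending).then(function (allStats) {
    let best = null;
    allStats.forEach(function (stats, i) {
      // Strict ">" means the first file wins when sizes are tied.
      if (best === null || stats.size > best.size) {
        best = { name: fnames[i], size: stats.size };
      }
    });
    return best; // null when fnames is empty
  });
}
```

It could be called as findLargest(process.argv.slice(2)).then(...), printing best.name in the handler and adding a .catch for files that can't be read.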

--

*Yes, I know there's no need to do so for this scenario; the purpose is to understand the pattern using a simple example.

Answer
fs.stat(fname1, function doneReading(err, stats) {
    ...
    fs.stat(fname2, function doneReading(err, stats) {
    ...

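is still sequential, not parallel: the second fs.stat is only issued from inside the first callback, so the two calls run one after the other. The difference from fs.statSync is that fs.stat is non-blocking.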
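To make the contrast concrete, here is a minimal sketch of what a parallel version could look like with plain callbacks: every fs.stat is issued up front and a counter tracks how many are still outstanding. statAll is a hypothetical helper name, not something from fs.

```javascript
const fs = require('fs');

// statAll is a hypothetical helper: it starts every fs.stat call at once
// and invokes done(err, sizes) a single time, after all have reported back
// or as soon as the first one fails.
function statAll(fnames, done) {
  const sizes = new Array(fnames.length);
  let pending = fnames.length;
  let failed = false;

  if (pending === 0) return done(null, sizes);

  fnames.forEach(function (fname, i) {
    fs.stat(fname, function (err, stats) {
      if (failed) return;      // an earlier call already reported an error
      if (err) {
        failed = true;
        return done(err);
      }
      sizes[i] = stats.size;   // store by index to keep input order
      pending -= 1;
      if (pending === 0) done(null, sizes);
    });
  });
}
```

This works, but the bookkeeping (counter, error flag, result ordering) is exactly the part that promises take over.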

The suggested 'readable' approach in modern Node is promises and co. fs.stat may be promisified (with pify or Bluebird's Promise.promisify/Promise.promisifyAll), or an existing promisified fs package such as fs-promise may be used.
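For illustration, promisifying by hand is only a few lines. This is a sketch of the general idea rather than how pify or Bluebird are implemented, and statAsync is a hypothetical name:

```javascript
const fs = require('fs');

// statAsync is a hypothetical wrapper: it turns the callback-style
// fs.stat into a function that returns a promise.
function statAsync(fname) {
  return new Promise(function (resolve, reject) {
    fs.stat(fname, function (err, stats) {
      if (err) reject(err);   // failure rejects the promise
      else resolve(stats);    // success resolves with the fs.Stats object
    });
  });
}
```

Usage would be along the lines of statAsync(fname).then(function (stats) { console.log(stats.size); }). Newer Node versions also ship util.promisify, which does the same job for any standard error-first callback function.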

A sequential, non-blocking alternative to the code above may look like:

'use strict';

const co = require('co');
const fs = require('fs-promise');

co(function* () {
    let stat1 = yield fs.stat(fname1);
    let stat2 = yield fs.stat(fname2);
    ...
});

If we want to make it parallel, Promise.all steps in:

co(function* () {
    let [stat1, stat2] = yield [fs.stat(fname1), fs.stat(fname2)];
    // a shortcut for
    // let [stat1, stat2] = yield Promise.all([fs.stat(fname1), fs.stat(fname2)]);
    ...
});
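As a further sketch, the same parallel shape can be written without co on Node versions that support async functions, where await plays the role of yield. This assumes the built-in fs.promises API instead of fs-promise; compareSizes is a hypothetical name. Promise.all resolves with an array in the same order as its input, which is why the destructuring lines up.

```javascript
const fs = require('fs');

// compareSizes is a hypothetical helper.
// Assumptions: async functions and fs.promises are available.
async function compareSizes(fname1, fname2) {
  // Both stat calls are started before either is awaited.
  const [stat1, stat2] = await Promise.all([
    fs.promises.stat(fname1),
    fs.promises.stat(fname2),
  ]);

  if (stat1.size > stat2.size) return fname1 + ' is bigger';
  if (stat2.size > stat1.size) return fname2 + ' is bigger';
  return 'The files are the same size';
}
```

A caller would then do something like compareSizes(fname1, fname2).then(console.log, console.error).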