Matt Bryson - 4 months ago
Node.js Question

How to execute / abort long running tasks in Node JS?

We have a Node.js server with a MongoDB database. One feature generates a report JSON file from the DB, which can take a while (60 seconds or more, since it has to process hundreds of thousands of entries).

We want to run this as a background task. We need to be able to start a report build process, monitor it, and abort it if the user decides to change the params and rebuild it.

What is the simplest approach with Node? We don't really want to get into the realms of separate worker servers processing jobs, message queues, etc. We need to keep this on the same box, with a fairly simple implementation.

1) Start the build as an async method and return to the user, with socket.io reporting progress?

2) Spin off a child process for the build script?

3) Use something like https://www.npmjs.com/package/webworker-threads?

With the few approaches I've looked at, I get stuck on the same two areas:

1) How to monitor progress?
2) How to abort an existing build process if the user re-submits data?

Any pointers would be greatly appreciated...

Answer

The best approach would be to separate this task from your main application. That said, it is easy to run it in the background on the same box. To run it in the background and monitor it without a message queue etc., the easiest option is a child_process.

  1. Spawn the job from an endpoint (or URL) called by the user.
  2. Next, set up a socket to report live monitoring of the child process.
  3. Add another endpoint to stop the job, using a unique id returned by step 1 (or not, depending on your concurrency needs).

Some coding ideas:

var spawn = require('child_process').spawn

var job = null //keeping the job in memory to kill it

app.get('/save', function(req, res) {

    if(job && job.pid)
        return res.status(409).send('Job is already running') //409 Conflict rather than a server error

    var child = spawn('node', ['/path/to/save/job.js'],
    {
        detached: false, //not detached: the child stays in the main process's group, so e.g. Ctrl+C in the terminal reaches both
        stdio: [process.stdin, process.stdout, process.stderr] //those can be file streams for logs or whatever
    })
    job = child

    child.on('close', function(code) {
        if(job === child) //don't clear a newer job started in the meantime
            job = null
        //send socket information about the job ending
    })

    return res.status(201).end() //created; without end() the request would never complete
})

app.get('/stop', function(req, res) {
    if(!job || !job.pid)
        return res.status(404).end()

    job.kill('SIGTERM')
    //or process.kill(job.pid, 'SIGTERM')
    job = null
    return res.status(200).end()
})

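app.get('/isAlive', function(req, res) {
    if(!job || !job.pid)
        return res.status(404).end() //no job to check

    try {
        process.kill(job.pid, 0) //signal 0 only tests that the process exists; throws if it doesn't
        return res.status(200).end()
    } catch(e) { return res.status(500).send(e.message) }
})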
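
The routes above can tell you whether the job is alive, but not how far along it is. One possible way to get real progress, assuming you can edit the job script, is to start it with child_process.fork instead of spawn: fork opens an IPC channel, so the job can call process.send and the server receives 'message' events. The sketch below also aborts the previous build when a new one is requested, instead of refusing it. The worker script, startBuild, and the two callbacks are hypothetical names for illustration; in the real app the callbacks would emit socket.io events.

```javascript
var fork = require('child_process').fork
var fs = require('fs')
var os = require('os')
var path = require('path')

// Hypothetical stand-in for the report script: it "processes" five batches
// and reports after each one. It is written to a temp file only so that
// this sketch runs on its own.
var workerPath = path.join(os.tmpdir(), 'report-worker-' + process.pid + '.js')
fs.writeFileSync(workerPath, [
  'var done = 0, total = 5',
  'var timer = setInterval(function () {',
  '  done++',
  '  process.send({ done: done, total: total })',
  '  if (done === total) clearInterval(timer) // nothing left to do, so the worker exits',
  '}, 20)'
].join('\n'))

// Remove the temp file when the server process exits.
process.on('exit', function () {
  try { fs.unlinkSync(workerPath) } catch (e) {}
})

var current = null // the running build, if any

// Start a build, aborting any build that is still running.
function startBuild(onProgress, onDone) {
  if (current) current.kill('SIGTERM') // user re-submitted: drop the old build

  var child = fork(workerPath) // fork() gives the child an IPC channel
  current = child

  child.on('message', function (msg) {
    onProgress(msg.done, msg.total)
  })
  child.on('close', function (code, signal) {
    if (current === child) current = null // don't clear a newer build
    onDone(code, signal) // signal is 'SIGTERM' when the build was aborted
  })
  return child
}
```

Calling startBuild a second time while a build is running kills the first child and starts a fresh one, which covers the "user changes the params" case without a separate stop request.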

To monitor the child process you could use pidusage; we use it in PM2, for example. Add a route that reports a job's resource usage and call it every second. Don't forget to release memory when the job ends.
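
A minimal sketch of that polling loop, done on the server side with a timer rather than a route, could look like the following. The sampler is passed in as a parameter so the example runs without extra packages; with pidusage you would pass the module's own function, which in recent versions is assumed to take (pid, callback) and call back with (err, stats). Check the README of the version you install. watchJob and fakeSample are hypothetical names.

```javascript
var spawn = require('child_process').spawn

// Poll a child process at a fixed interval until it exits.
// `sample` is whatever reads the stats, called as sample(pid, function (err, stats) {}).
function watchJob(child, sample, intervalMs, onStats) {
  var timer = setInterval(function () {
    sample(child.pid, function (err, stats) {
      if (!err) onStats(stats)
    })
  }, intervalMs)

  // Stop polling when the job ends, so the timer and its closures can be freed.
  child.on('close', function () {
    clearInterval(timer)
  })
  return timer
}

// Stand-in sampler (hypothetical); replace it with pidusage in practice.
function fakeSample(pid, cb) {
  cb(null, { pid: pid, sampledAt: Date.now() })
}

// A short-lived child that only waits 200 ms, standing in for the report job.
var child = spawn(process.execPath, ['-e', 'setTimeout(function () {}, 200)'])
var samples = []
watchJob(child, fakeSample, 50, function (stats) { samples.push(stats) })
child.on('close', function () {
  console.log('job ended after ' + samples.length + ' samples')
})
```

Clearing the interval in the 'close' handler is the "release memory" part: a timer left running would keep polling a dead pid and hold on to the child object.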


You might want to check out this library which will help you manage multi processing across microservices.
