druidvav - 3 months ago
HTTP Question

Lots of parallel HTTP requests in Node.js

I've created a Node.js script that scans the network for available HTTP pages, so there are a lot of connections I want to run in parallel, but it seems that some of the requests wait for previous ones to complete.

The following is the code fragment:

var http = require('http');

var reply = { };
reply.started = new Date().getTime();

var req = http.request(options, function (res) {
    reply.status = res.statusCode;
    reply.rawHeaders = res.headers;
    reply.headers = JSON.stringify(res.headers);
    reply.body = '';
    res.setEncoding('utf8');
    // Accumulate the body as it streams in.
    res.on('data', function (chunk) {
        reply.body += chunk;
    });
    // When the response finishes, record the elapsed time and hand the reply back.
    res.on('end', function () {
        reply.finished = new Date().getTime();
        reply.time = reply.finished - reply.started;
        callback(reply);
    });
});
req.on('error', function (e) {
    // Ignore connections dropped by the remote side.
    if (e.message == 'socket hang up') {
        return;
    }
    errCallback(e.message);
});
req.end();


This code performs only 10-20 requests per second, but I need 500-1k requests per second. Every queued request is made to a different HTTP server.

I've tried to do something like this, but it didn't help:

http.globalAgent.maxSockets = 500;
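
A related knob is opting out of the pooling agent per request with agent: false, so each request opens a dedicated one-shot connection instead of waiting for a free socket on http.globalAgent. A minimal sketch, assuming options, callback, and errCallback are the same as in the fragment above (not the author's code):

var http = require('http');

// agent: false disables pooling for this request: it gets its own
// Connection: close socket rather than queueing on the shared agent.
options.agent = false;

var req = http.request(options, function (res) {
    res.resume(); // drain the body so 'end' fires and the socket closes
    res.on('end', function () {
        callback({ status: res.statusCode });
    });
});
req.on('error', function (e) {
    errCallback(e.message);
});
req.end();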

Answer

I've found a solution for my case; it is not very good, but it works:

var childProcess = require('child_process');

I'm using curl:

childProcess.exec('curl --max-time 20 --connect-timeout 10 -iSs "' + options.url + '"', function (error, stdout, stderr) { });

This allows me to run 800-1000 curl processes simultaneously. Of course, this solution has its weaknesses, like the requirement for lots of open file descriptors, but it works.
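
For completeness, a rough sketch of how the exec call can be wrapped to produce the same kind of reply object as the http.request version above. The split of curl's -i output into headers and body is simplified (it ignores intermediate header blocks such as 100-continue), and the helper name is hypothetical:

var childProcess = require('child_process');

function curlRequest(options, callback, errCallback) {
    var reply = { started: new Date().getTime() };
    var cmd = 'curl --max-time 20 --connect-timeout 10 -iSs "' + options.url + '"';
    // Raise maxBuffer so larger pages don't abort the child process.
    childProcess.exec(cmd, { maxBuffer: 1024 * 1024 }, function (error, stdout, stderr) {
        if (error) {
            return errCallback(error.message);
        }
        // -i prepends the response headers; split them off the body
        // at the first blank line.
        var sep = stdout.indexOf('\r\n\r\n');
        reply.rawHeaders = sep >= 0 ? stdout.slice(0, sep) : '';
        reply.body = sep >= 0 ? stdout.slice(sep + 4) : stdout;
        reply.finished = new Date().getTime();
        reply.time = reply.finished - reply.started;
        callback(reply);
    });
}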

I've tried node-curl bindings, but they were very slow too.
