Steven Booth - 15 days ago
Javascript Question

Post 1 million object to Mongo DB without running out of memory

For the sake of simplicity, I've used a for loop here to create an array of 1 million objects.

I'd like to post each object in the array to mongo db.

I keep running out of memory.

I know I can increase the memory allocation, but that isn't the solution I want. I also don't want to use MongoDB's bulk insert method, for reasons beyond the scope of this example.

I'm aware that there must be some way of posting these objects that releases memory at each iteration (garbage collection?) - can anyone tell me what it is?

Thanks!

Current code here for reference (which causes allocation error):

// Build an array of 1 million objects
var results = [];
for (var i = 1; i < 1000001; i++) {
    results.push({ "num": i });
}

var async = require("async");
var mongodb = require('mongodb');

var MongoClient = mongodb.MongoClient;
var mongoUrl = 'mongodb://root@localhost:27017/results';

MongoClient.connect(mongoUrl, function (err, db) {

    var collection = db.collection('results');

    // 1st param in async.each() is the array of items
    async.each(results,
        // 2nd param is the function that each item is passed to
        function (item, callback) {
            // Call an asynchronous function, often a save() to DB
            collection.insert(item, function (err, result) {
                if (err) console.log(err);
                console.log(item);
                callback();
            });
        },
        // 3rd param is the function to call when everything's done
        function (err) {
            // All tasks are done now
            db.close();
        }
    );
});

Answer

async.each() runs the inserts in parallel, so it basically starts 1000000 concurrent insert operations.

You may want to limit that to, say, 100 concurrent operations by using async.eachLimit().
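For example, a minimal sketch using async.eachLimit() in place of async.each(), reusing the results array, collection, and db variables from the question's code (the limit of 100 is just an illustrative value to tune for your setup):

// Same loop as before, but with concurrency capped at 100 in-flight inserts
async.eachLimit(results, 100,
    // Each item is passed to this function, at most 100 at a time
    function (item, callback) {
        collection.insert(item, function (err, result) {
            if (err) console.log(err);
            callback();
        });
    },
    // Called once every insert has finished (or on the first error)
    function (err) {
        if (err) console.log(err);
        db.close();
    }
);

With the limit in place, only a bounded number of insert operations (and their callbacks) are pending at any moment, so memory from completed inserts can be reclaimed as the loop progresses.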