Cuga Cuga - 3 months ago 12
Java Question

What explains this unexpected size of the object from MongoDB?

I'm testing the limits of storing data in Mongo.

I wrote this test class, which inserts 1,000,000 random doubles into an array and stores that document in a test collection.

MongoCollection<Document> collection = mongo.getCollection("TestEmbedded");
Random random = new Random();
Document document = new Document();
document.append("easyFinder", "oneMillion");
List<Double> values = new ArrayList<>(1000000);
for (int i = 0; i < 1000000; i++) {
    double randomCost = 1000 * random.nextDouble();
    values.add(randomCost);
}
document.append("costs", values);
collection.insertOne(document);


Fetching this object in the command line, I see the million records are stored:

db.TestEmbedded.find()
{ "_id" : ObjectId("57ac6cffc75e5e2a6ffe24cc"), "easyFinder" : "oneMillion", "costs" : [ 102.58052971628796, 522.5775655563692, 537.8794277847542, ... ]}


I'm trying to see how close I can get to the 16MB BSON document size limit, in an effort to demonstrate why we don't store this much data in an embedded document. I know there are alternatives like GridFS and better ways of modeling this data (which is what we are actually doing).

But what perplexed me is that the Object.bsonsize() operation shows this document as taking up less than a kilobyte of space:

Object.bsonsize(db.TestEmbedded.find())
877


So what gives? Knowing Java uses 8 bytes to store a double and Mongo would have to use at least that much space per data point, why isn't this bson size closer to 8 megabytes?

Thanks!

Answer

db.TestEmbedded.find() does not return the document; it returns a database cursor, which is a small object. Object.bsonsize() was measuring that cursor, not the stored document.

If you use Object.bsonsize(db.TestEmbedded.findOne()) instead, you will get the real BSON size of the document.
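
As a rough sanity check on what the real number should look like, here is a minimal sketch that estimates the size by hand from the BSON layout. It assumes the document has exactly the three fields shown in the question (an ObjectId _id, the string easyFinder, and the costs array of doubles). The class and method names are made up for illustration and are not part of any MongoDB API. The key point is that a BSON array stores each element with a one-byte type tag and its index as a null-terminated string key, so every double costs noticeably more than 8 bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: estimates the BSON size of the test document by hand.
public class BsonSizeEstimate {

    // Length of a BSON cstring: the UTF-8 bytes plus a trailing null byte.
    static long cstring(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length + 1;
    }

    // A BSON array is a document whose keys are the indexes "0", "1", ...
    static long arrayOfDoublesSize(int count) {
        long size = 4; // int32 length prefix
        for (int i = 0; i < count; i++) {
            // type byte + index key as cstring + 8-byte double
            size += 1 + cstring(Integer.toString(i)) + 8;
        }
        return size + 1; // trailing null terminator
    }

    // Assumes the three fields from the question: _id, easyFinder, costs.
    static long estimateDocumentSize(int count) {
        long size = 4; // int32 length prefix
        size += 1 + cstring("_id") + 12; // ObjectId is 12 bytes
        size += 1 + cstring("easyFinder") + 4 + cstring("oneMillion"); // string value has its own int32 length
        size += 1 + cstring("costs") + arrayOfDoublesSize(count);
        return size + 1; // trailing null terminator
    }

    public static void main(String[] args) {
        long bytes = estimateDocumentSize(1_000_000);
        long limit = 16L * 1024 * 1024; // 16MB document size limit
        System.out.println(bytes + " bytes, limit " + limit + " bytes");
        System.out.println(bytes <= limit ? "fits under the limit" : "exceeds the limit");
    }
}
```

By this estimate, a million doubles in one array comes to roughly 15.9 MB rather than 8 MB, because the index keys and type bytes nearly double the per-element cost. That would put this test document just under the 16MB limit, which is worth comparing against what Object.bsonsize(db.TestEmbedded.findOne()) reports.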