Quinton Pike Quinton Pike - 5 months ago 23
Node.js Question

Querying MongoDB GridFS?

I have a blogging system that stores uploaded files into the GridFS system. Problem is, I dont understand how to query it!

I am using Mongoose with NodeJS which doesnt yet support GridFS so I am using the actual mongodb module for the GridFS operations. There doesn't SEEM to be a way to query the files metadata like you do documents in a regular collection.

Would it be wise to store the metadata in a document pointing to the GridFS objectId? to easily be able to query?

Any help would be GREATLY appreciated, im kinda stuck :/

Answer

GridFS works by storing a number of chunks for each file. This way, you can deliver and store very large files without having to store the entire file in RAM. Also, this enables you to store files that are larger than the maximum document size. The recommended chunk size is 256kb.

The file metadata field can be used to store additional file-specific metadata, which can be more efficient than storing the metadata in a separate document. This greatly depends on your exact requirements, but the metadata field, in general, offers a lot of flexibility. Keep in mind that some of the more obvious metadata is already part of the fs.files document, by default:

> db.fs.files.findOne();
{
    "_id" : ObjectId("4f9d4172b2ceac15506445e1"),
    "filename" : "2e117dc7f5ba434c90be29c767426c29",
    "length" : 486912,
    "chunkSize" : 262144,
    "uploadDate" : ISODate("2011-10-18T09:05:54.851Z"),
    "md5" : "4f31970165766913fdece5417f7fa4a8",
    "contentType" : "application/pdf"
}

To actually read the file from GridFS you'll have to fetch the file document from fs.files and the chunks from fs.chunks. The most efficient way to do that is to stream this to the client chunk-by-chunk, so you don't have to load the entire file in RAM. The chunks collection has the following structure:

> db.fs.chunks.findOne({}, {"data" :0});
{
    "_id" : ObjectId("4e9d4172b2ceac15506445e1"),
    "files_id" : ObjectId("4f9d4172b2ceac15506445e1"),
    "n" : 0, // this is the 0th chunk of the file
    "data" : /* loads of data */
}

If you want to use the metadata field of fs.files for your queries, make sure you understand the dot notation, e.g.

> db.fs.files.find({"metadata.OwnerId": new ObjectId("..."), 
                    "metadata.ImageWidth" : 280});

also make sure your queries can use an index using explain().