deformitysnot - 1 year ago
Java Question

Does mongo-hadoop support replacing documents?

I'm trying to replace documents (not update them with $set) in MongoDB using mongo-hadoop from Spark (mongo-hadoop-core-1.4.2.jar and mongo-java-driver-3.2.1.jar):

BasicDBObject query = new BasicDBObject();
query.append("_id", 6972);

BasicDBObject update = new BasicDBObject();
update.append("_id", 6988);
update.append("f1", "ACTIVE_USER");

Then I write something like this:

new MongoUpdateWritable(query, update, false, true);

But this fails with:

Caused by: java.lang.IllegalArgumentException: Invalid BSON field name f1

I can do this instead:

new MongoUpdateWritable(query, new BasicDBObject("$set", update), false, true);

But I want to replace the whole document.

Answer Source

mongo-hadoop added support for replacing documents in 2.0.0-rc0. From that version on, you can replace a document entirely with:

new MongoUpdateWritable(query, update, false /* upsert; may be true */, false /* multiUpdate */, true /* replace */);

Note: replacing can't be combined with multiUpdate = true, and it doesn't change the _id value. See "Replace a Document Entirely" in the MongoDB documentation:

The update() method does not replace the _id value.
update() cannot update multiple documents.
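
To make the difference between $set and a full replace concrete, here is a rough sketch that uses plain java.util maps as stand-ins for stored documents. It doesn't touch MongoDB or mongo-hadoop at all; the class and method names are made up for illustration, and keeping the stored _id while ignoring any _id in the incoming document is a simplification (a real server may reject a replacement whose _id differs).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: plain maps stand in for stored documents.
final class ReplaceVsSet {

    // $set-style update: the named fields are added or overwritten, all others survive.
    static Map<String, Object> applySet(Map<String, Object> stored, Map<String, Object> fields) {
        Map<String, Object> result = new LinkedHashMap<>(stored);
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!"_id".equals(e.getKey())) { // simplification: _id is never touched
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    // Replacement: only the stored _id and the replacement's own fields remain.
    static Map<String, Object> applyReplace(Map<String, Object> stored, Map<String, Object> replacement) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("_id", stored.get("_id"));
        for (Map.Entry<String, Object> e : replacement.entrySet()) {
            if (!"_id".equals(e.getKey())) { // simplification: stored _id wins
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("_id", 6972);
        stored.put("f1", "NEW_USER");
        stored.put("f2", "kept only by $set");

        Map<String, Object> update = new LinkedHashMap<>();
        update.put("f1", "ACTIVE_USER");

        System.out.println(applySet(stored, update));     // {_id=6972, f1=ACTIVE_USER, f2=kept only by $set}
        System.out.println(applyReplace(stored, update)); // {_id=6972, f1=ACTIVE_USER}
    }
}
```

With $set the untouched f2 field survives; with a replace it is gone, and in both cases the _id stays what it was.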

Before 2.0.0-rc0 (which includes the 1.4.2 jar from the question), that doesn't seem to be possible. Looking at the code, MongoOutputCommitter always issues an update command, which can only contain $operators (see line 155 and below). It uses BulkUpdateRequestBuilder's update / updateOne methods, but issuing a replace command requires the replaceOne method.
So those versions have no such feature.
Maybe you'll make a pull request :)
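
The "Invalid BSON field name f1" error in the question is consistent with that rule: a document sent through an update command is expected to have only $operators at the top level, while a replacement has none. A loose model of the distinction, using plain maps, is sketched below. UpdateShape and classify are hypothetical names, and this is not the driver's actual validator.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of mongo-hadoop or the MongoDB Java driver.
final class UpdateShape {

    enum Kind { OPERATOR_UPDATE, REPLACEMENT, MIXED }

    // Classifies a document by its top-level keys:
    // some '$' keys and no plain keys -> operator update,
    // no '$' keys (including an empty document) -> replacement,
    // both kinds present -> mixed.
    static Kind classify(Map<String, ?> doc) {
        boolean sawOperator = false;
        boolean sawPlainField = false;
        for (String key : doc.keySet()) {
            if (key.startsWith("$")) {
                sawOperator = true;
            } else {
                sawPlainField = true;
            }
        }
        if (sawOperator && sawPlainField) {
            return Kind.MIXED;
        }
        return sawOperator ? Kind.OPERATOR_UPDATE : Kind.REPLACEMENT;
    }

    public static void main(String[] args) {
        // Same shape as the question's 'update' object.
        Map<String, Object> update = new LinkedHashMap<>();
        update.put("_id", 6988);
        update.put("f1", "ACTIVE_USER");

        // Same shape as new BasicDBObject("$set", update).
        Map<String, Object> wrapped = new LinkedHashMap<>();
        wrapped.put("$set", update);

        System.out.println(classify(update));  // REPLACEMENT
        System.out.println(classify(wrapped)); // OPERATOR_UPDATE
    }
}
```

The question's bare update object has the shape of a replacement, so it only fits a code path that issues a replace; wrapping it in $set gives it the shape an update command accepts.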