x0v x0v - 1 year ago 61
Python Question

Delete partial data in mongoDB

I have a mongoDB collection which has a count of 372985 names. I want to delete the entries after 200000, so that the total number of entries after deletion is reduced from 372985 to 200000.


How can I do this with a mongoDB query?

Usecase

My Python code cannot process this much data on my machine's configuration, so I want to reduce the size of the mongo collection so that the code can run in limited RAM.

If this cannot be done with a mongo query, can someone give a hint for doing the same in Python?

Answer Source

You need to do it in two steps, because MongoDB needs a query that matches the documents to be deleted; it cannot use skip or limit when removing documents.

  1. find (the ids of) documents that you want to delete, using skip to jump to documents after 200000
  2. delete the documents that belong to the list found in 1

You can try this in the mongo shell:

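// Step 1: collect the _id of every document after the first 200000
var to_delete = db.collection.find({}, {_id : 1})
        .skip(200000)
        .toArray()
        .map(function(doc) { return doc._id; });

// Step 2: remove exactly the documents whose _id is in that list
db.collection.remove({_id: {$in: to_delete}});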
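
Since the question also asks for a Python hint, the same two steps could be sketched roughly as below. This is only a sketch under assumptions: it expects a pymongo-style collection object (one offering `find(...).skip(...)` and `delete_many(...)`), the helper name `trim_collection` and its `batch_size` parameter are made up for illustration, and deleting in batches is an addition not present in the shell snippet, meant only to keep each `$in` list small.

```python
def trim_collection(collection, keep, batch_size=10000):
    """Delete every document that find() returns after the first `keep`.

    `collection` is assumed to behave like a pymongo Collection.
    Returns the number of documents deleted.
    """
    # Step 1: fetch only the _id of each document past the first `keep`.
    cursor = collection.find({}, {"_id": 1}).skip(keep)
    ids = [doc["_id"] for doc in cursor]

    # Step 2: delete those documents, one batch of ids at a time
    # (batching is an assumption of this sketch, not part of the answer).
    deleted = 0
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        result = collection.delete_many({"_id": {"$in": batch}})
        deleted += result.deleted_count
    return deleted


# Example usage (assumes pymongo is installed and a server is running
# locally; "mydb" and "names" are placeholder names):
#
#   from pymongo import MongoClient
#   names = MongoClient("mongodb://localhost:27017")["mydb"]["names"]
#   print(trim_collection(names, keep=200000))
```

As in the shell version, which documents count as "after 200000" depends on the order in which `find` returns them; it may be worth trying this on a copy of the collection first.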