www123 www123 - 1 month ago 17
Python Question

MongoDB aggregate: how to suppress _id, yet keep the content inside it?

I need help with the output format of a MongoDB aggregate query.

Each document in my collection looks something like this:

{'parent_id': '133', 'status_id': '209101162445115_1199071210114767', 'author_id': '10209422198664172', 'comment_published': '2016-08-15 08:57:09'}


I need to count how many times each author_id occurs among the documents with a given parent_id. I did that with aggregate:

m = collection.aggregate([
    {"$match": {"parent_id": "437325203079413_1543639"}},
    {"$group": {"_id": {"author_id": "$author_id"}, "count": {"$sum": 1}}},
    {"$project": {"_id": 1, "count": 1}},  # this stage does not make any difference in the output
])

page = []
for i in m:
    page.append(i)
print(page)


The output looks like this:

[{'_id': {'author_id': '10155430875324466'}, 'count': 1},
{'_id':{'author_id': '1249853341715138'}, 'count': 2},
{'_id': {'author_id': '10153804689530108'}, 'count': 1}]


I want the output to be in this format:

[{'author_id': '10155430875324466', 'count': 1},
{'author_id': '1249853341715138', 'count': 2},
{'author_id': '10153804689530108', 'count': 1}]


Or this:

[{'10155430875324466': 1},
{'1249853341715138': 2},
{'10153804689530108': 1}]


I know a slow way of doing this in Python, but I feel there should be a better solution. Is it possible to accomplish it within the aggregate query itself? Can anyone advise?

Answer

You could try this: use author_id directly as the grouping _id, then in the final $project stage suppress _id ("_id": 0) and project its value under the name author_id.

db.collection.aggregate([
    { "$match" : { "parent_id" : "437325203079413_1543639" } }, 
    { "$group" : { "_id" : "$author_id", "count": { "$sum" : 1 } } }, 
    { "$project" : { "_id" : 0, "author_id" : "$_id", "count" : 1 } } 
]);
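
Since the question uses Python, the same pipeline can be handed to pymongo as is, because the stages are plain dicts. What follows is only a minimal sketch: it assumes collection is a pymongo Collection (anything with an aggregate method would do), and the helper names author_count_pipeline and count_authors are made up for illustration.

```python
def author_count_pipeline(parent_id):
    """Build the three-stage pipeline for one parent_id."""
    return [
        # Keep only the documents under the given parent.
        {"$match": {"parent_id": parent_id}},
        # Group on the bare author_id, so _id is a plain value, not a sub-document.
        {"$group": {"_id": "$author_id", "count": {"$sum": 1}}},
        # Suppress _id and re-expose its value under the name author_id.
        {"$project": {"_id": 0, "author_id": "$_id", "count": 1}},
    ]


def count_authors(collection, parent_id):
    """Run the pipeline and collect the cursor into a list of dicts."""
    # Assumption: collection behaves like a pymongo Collection.
    return list(collection.aggregate(author_count_pipeline(parent_id)))
```

With a live connection this would be called as count_authors(collection, "437325203079413_1543639"), which replaces the explicit for loop from the question.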

Or you can keep your original $group stage and change only the final $project stage, reading the value from the nested field with "$_id.author_id":

db.collection.aggregate([
    { "$match" : { "parent_id" : "437325203079413_1543639" } }, 
    { "$group" : { "_id" : { "author_id": "$author_id"}, "count": { "$sum" : 1 } } }, 
    { "$project" : { "_id" : 0, "author_id" : "$_id.author_id", "count":1 } } 
]);
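
For the second format asked about (author_id mapped straight to its count), one option is a small reshaping step on the client after the aggregate call. This is a sketch rather than a definitive answer: it builds a single dict instead of a list of one-entry dicts, which is an assumption about what is wanted, and the names author_id_of and counts_by_author are hypothetical.

```python
def author_id_of(row):
    """Return the author id from a result row, whichever shape it has."""
    if "author_id" in row:
        # Shape produced by the $project stages above.
        return row["author_id"]
    group_key = row["_id"]
    # Either the nested {"author_id": ...} key or the bare value.
    return group_key["author_id"] if isinstance(group_key, dict) else group_key


def counts_by_author(rows):
    """Collapse aggregate output into one {author_id: count} dict."""
    return {author_id_of(row): row["count"] for row in rows}


# Sample rows copied from the output shown in the question.
rows = [
    {"_id": {"author_id": "10155430875324466"}, "count": 1},
    {"_id": {"author_id": "1249853341715138"}, "count": 2},
    {"_id": {"author_id": "10153804689530108"}, "count": 1},
]
print(counts_by_author(rows))
# {'10155430875324466': 1, '1249853341715138': 2, '10153804689530108': 1}
```

This makes a single pass over rows that are already grouped, so it touches one entry per distinct author rather than one per comment.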