Rajshri Mohan K S Rajshri Mohan K S - 2 months ago 7
Javascript Question

Mongoose: Aggregating and Counting

I have a NodeJS/MongoDB app which has data stored in a collection called

"Feedback"
. The data looks like the following:

[
{
"__v": 0,
"_id": "57d6b2d09f46ca14440ac14e",
"customerFeedback": [
{
"_id": "57d6b2d09f46ca14440ac14f",
"answer": [
{
"_id": "57d6b2d09f46ca14440ac150",
"answerValue": "cat",
"answerWeight": 0
}
],
"question": "What is your favourite thing about this shop?",
"questionId": "57d65edc0132461120fa0afd"
},
{
"_id": "57d6b2d09f46ca14440ac151",
"answer": [
{
"_id": "57d6b2d09f46ca14440ac152",
"answerValue": "Okay",
"answerWeight": 0
}
],
"question": "How was your experience today?",
"questionId": "57d69ef6dbb25611e46e6bc9"
}
],
"shopId": "SH0001",
"feedbackCreatedOn": "2016-09-12T13:51:12.703Z",
"questionsForDay": "2016-09-12T00:00:00Z"
},
{
"__v": 0,
"_id": "57d6b3389f46ca14440ac157",
"customerFeedback": [
{
"_id": "57d6b3389f46ca14440ac158",
"answer": [
{
"_id": "57d6b3389f46ca14440ac159",
"answerValue": "cat",
"answerWeight": 0
}
],
"question": "What is your favourite thing about this shop?",
"questionId": "57d65edc0132461120fa0afd"
},
{
"_id": "57d6b3389f46ca14440ac15a",
"answer": [
{
"_id": "57d6b3389f46ca14440ac15b",
"answerValue": "Very Good",
"answerWeight": 0
}
],
"question": "How was your experience today?",
"questionId": "57d69ef6dbb25611e46e6bc9"
},
{
"_id": "57d6b3389f46ca14440ac15c",
"answer": [
{
"_id": "57d6b3389f46ca14440ac15d",
"answerValue": "Cost",
"answerWeight": 0
}
],
"question": "What would you like us to improve on?",
"questionId": "57d6b32d9f46ca14440ac153"
}
],
"shopId": "SH0001",
"feedbackCreatedOn": "2016-09-12T13:52:56.939Z",
"questionsForDay": "2016-09-12T00:00:00Z"
},
{
"__v": 0,
"_id": "57d6c8eb97157f10a4e5c2e7",
"customerFeedback": [
{
"_id": "57d6c8eb97157f10a4e5c2e8",
"answer": [
{
"_id": "57d6c8eb97157f10a4e5c2ea",
"answerValue": "Customer Experience",
"answerWeight": 0
},
{
"_id": "57d6c8eb97157f10a4e5c2e9",
"answerValue": "Others",
"answerWeight": 0
}
],
"question": "What would you like us to improve on?",
"questionId": "57d6b7d99ee61e47f01e5334"
}
],
"shopId": "SH0003",
"feedbackCreatedOn": "2016-09-12T15:25:31.724Z",
"questionsForDay": "2016-09-12T00:00:00Z"
}
]


There are many such entries in the array of results, but the above data serves to illustrate it.

My question is, for a given
shopId
and
questionId
, I want to count the number of times each individual
answerValue
occurs. How do I do it?

I am able to filter the results down to the required datasets using the find method as (for example)

db.Feedback.find({shopId:"SH0001",'customerFeedback.questionId':"57d65edc0132461120fa0afd"})


But I'm not sure how to aggregate the data to the format I want.

Answer

This pipeline should give you the desired result

db.getCollection("yourCollection").aggregate([
    { $unwind: "$customerFeedback" },
    { $unwind: "$customerFeedback.answer" },
    { 
        $group: {
            _id: {
                shopId: "$shopId",
                questionId: "$customerFeedback.questionId",
                answerValue: "$customerFeedback.answer.answerValue"
            },
            count: { $sum: 1 }
        }
    }
])

which for your sample data gives the following output

/* 1 */
{
    "_id" : {
        "shopId" : "SH0003",
        "questionId" : "57d6b7d99ee61e47f01e5334",
        "answerValue" : "Others"
    },
    "count" : 1.0
}

/* 2 */
{
    "_id" : {
        "shopId" : "SH0003",
        "questionId" : "57d6b7d99ee61e47f01e5334",
        "answerValue" : "Customer Experience"
    },
    "count" : 1.0
}

/* 3 */
{
    "_id" : {
        "shopId" : "SH0001",
        "questionId" : "57d65edc0132461120fa0afd",
        "answerValue" : "cat"
    },
    "count" : 2.0
}

/* 4 */
{
    "_id" : {
        "shopId" : "SH0001",
        "questionId" : "57d69ef6dbb25611e46e6bc9",
        "answerValue" : "Okay"
    },
    "count" : 1.0
}

/* 5 */
{
    "_id" : {
        "shopId" : "SH0001",
        "questionId" : "57d6b32d9f46ca14440ac153",
        "answerValue" : "Cost"
    },
    "count" : 1.0
}

/* 6 */
{
    "_id" : {
        "shopId" : "SH0001",
        "questionId" : "57d69ef6dbb25611e46e6bc9",
        "answerValue" : "Very Good"
    },
    "count" : 1.0
}

Of course you can filter the result down with an additional $match stage if you are just interested in specific values


Update due to comment:

To filter your result down with $match you could use it after unwinding the nested array of customer feedbacks like so

...
{ $unwind: "$customerFeedback" },
{ $match: { shopId: "SH0001", "customerFeedback.questionId": "57d65edc0132461120fa0afd" } },
{ $unwind: "$customerFeedback.answer" },
{ 
    $group: {
        _id: {
            shopId: "$shopId",
            questionId: "$customerFeedback.questionId",
            answerValue: "$customerFeedback.answer.answerValue"
        },
        count: { $sum: 1 }
    }
}
...  

which would result in

{
    "_id" : {
        "shopId" : "SH0001",
        "questionId" : "57d65edc0132461120fa0afd",
        "answerValue" : "cat"
    },
    "count" : 2.0
}

If you have a large number of documents or an index on shopId and / or customerFeedback.questionId then you might want to duplicate that $match stage to the front of your pipeline so that you unwind only the documents of the corresponding shop and with at least one feedback of the desired question. So with that (optional optimization from the correctness point of view) it would look like so

...
{ $match: { shopId: "SH0001", "customerFeedback.questionId": "57d65edc0132461120fa0afd" } },
{ $unwind: "$customerFeedback" },
{ $match: { shopId: "SH0001", "customerFeedback.questionId": "57d65edc0132461120fa0afd" } },
{ $unwind: "$customerFeedback.answer" },
{ 
    $group: {
        _id: {
            shopId: "$shopId",
            questionId: "$customerFeedback.questionId",
            answerValue: "$customerFeedback.answer.answerValue"
        },
        count: { $sum: 1 }
    }
}
...