student student - 28 days ago 7
Node.js Question

What problem is $unwind fixing in MongoDB?


Note: I've added screenshots from running the code on my PC. I've seen this explanation multiple times, but I still can't explain it, even to myself.


In this 11:10-minute MongoDB $unwind tutorial for Node.js, the speaker says that:


This query:




db.companies.aggregate([
  { $match: { "funding_rounds.investments.financial_org.permalink": "greylock" } },
  { $project: {
      _id: 0,
      name: 1,
      amount: "$funding_rounds.raised_amount",
      year: "$funding_rounds.funded_year"
  } }
])



produces documents that have arrays for both amount and year.



Because we're accessing the raised amount and the funded year for every element within the funding rounds array. To fix this, we can include an unwind stage before our project stage in this aggregation pipeline, and parameterize this by saying that we want to unwind the funding rounds array:




db.companies.aggregate([
  { $match: { "funding_rounds.investments.financial_org.permalink": "greylock" } },
  { $unwind: "$funding_rounds" },
  { $project: {
      _id: 0,
      name: 1,
      amount: "$funding_rounds.raised_amount",
      year: "$funding_rounds.funded_year"
  } }
])





unwind has the effect of outputting to the next stage more documents than it receives as input.


My confusion:


  • What problem is the speaker referring to at the 1:21 mark?

  • What fix is he referring to?


Answer

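The problem is that we need a single document for each set of [company, amount, year], where amount and year are each a single scalar value. Without $unwind, the projected amount and year fields are arrays, because funding_rounds is an array and the projection collects the value from every one of its elements. $unwind emits one output document per element of funding_rounds, so in the next stage each projected field holds a single value, as explained in the lecture.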
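
To make the difference concrete, here is a minimal sketch in plain JavaScript that imitates both pipelines in memory. It is not MongoDB's implementation: the sample document, the company name "ExampleCo", and the helper names unwindField and projectRound are all made up for illustration, and the imitation only handles fields that are arrays.

```javascript
// Hypothetical sample document, shaped like the companies collection in the question.
const sampleCompanies = [
  {
    name: "ExampleCo",
    funding_rounds: [
      { raised_amount: 1000000, funded_year: 2005 },
      { raised_amount: 5000000, funded_year: 2007 },
    ],
  },
];

// Simplified imitation of { $unwind: "$funding_rounds" }:
// emit one copy of the document per array element, with the array
// field replaced by that single element. Documents where the field
// is missing or an empty array produce no output.
function unwindField(docs, field) {
  return docs.flatMap((doc) =>
    (Array.isArray(doc[field]) ? doc[field] : []).map((element) => ({
      ...doc,
      [field]: element,
    }))
  );
}

// Imitation of the $project stage applied to an unwound document.
function projectRound(doc) {
  return {
    name: doc.name,
    amount: doc.funding_rounds.raised_amount,
    year: doc.funding_rounds.funded_year,
  };
}

// Without the unwind step, projecting through the array yields arrays.
const withoutUnwind = sampleCompanies.map((doc) => ({
  name: doc.name,
  amount: doc.funding_rounds.map((round) => round.raised_amount),
  year: doc.funding_rounds.map((round) => round.funded_year),
}));

// With the unwind step, each funding round becomes its own document.
const withUnwind = unwindField(sampleCompanies, "funding_rounds").map(projectRound);

console.log(JSON.stringify(withoutUnwind));
// [{"name":"ExampleCo","amount":[1000000,5000000],"year":[2005,2007]}]
console.log(JSON.stringify(withUnwind));
// [{"name":"ExampleCo","amount":1000000,"year":2005},{"name":"ExampleCo","amount":5000000,"year":2007}]
```

One input document went in and two came out, which is what the speaker means by unwind outputting more documents than it receives.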

The lecture doesn't state a specific problem that this is trying to solve, but let me suggest that it would be something like: "Give me a report showing how much Greylock put into funding each year. Then give me a report for each year showing the companies funded, ordered by amount provided." If you think about how to produce that data using the aggregation framework, you can see that those reports require one document per funding round with scalar amount and year values, which is the set of documents shown in the lecture.
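
As a rough illustration of why the one-document-per-round shape helps, the sketch below builds both reports in plain JavaScript from documents shaped like the unwound output. The data and the names roundDocs, totalByYear, and companiesByYear are hypothetical, and it assumes the projected amount is the figure you want to total. In the aggregation framework itself, this kind of work would be done with $group and $sort stages added after the $unwind.

```javascript
// Hypothetical documents, shaped like the output of the
// $match + $unwind + $project pipeline from the question.
const roundDocs = [
  { name: "AlphaCo", amount: 1000000, year: 2005 },
  { name: "BetaCo", amount: 3000000, year: 2005 },
  { name: "AlphaCo", amount: 5000000, year: 2007 },
];

// Report 1: total amount per year.
// This is only a simple sum because amount and year are scalars.
const totalByYear = {};
for (const { amount, year } of roundDocs) {
  totalByYear[year] = (totalByYear[year] || 0) + amount;
}

// Report 2: for each year, the companies funded, largest amount first.
const companiesByYear = {};
for (const { name, amount, year } of roundDocs) {
  if (!companiesByYear[year]) {
    companiesByYear[year] = [];
  }
  companiesByYear[year].push({ name, amount });
}
for (const year of Object.keys(companiesByYear)) {
  companiesByYear[year].sort((a, b) => b.amount - a.amount);
}

console.log(JSON.stringify(totalByYear));
// {"2005":4000000,"2007":5000000}
console.log(JSON.stringify(companiesByYear["2005"]));
// [{"name":"BetaCo","amount":3000000},{"name":"AlphaCo","amount":1000000}]
```

If amount and year were still arrays on a single company document, neither loop could group by year without first pairing up the array elements, which is the step $unwind performs.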