user2183592 user2183592 - 4 months ago 49
Python Question

Workaround for preserveNullAndEmptyArrays in MongoDB 2.6

I am using a python script to query a MongoDB collection. The collection contains embedded documents with varying structures.

I am trying to simply "$unwind" an array contained in several documents. However, the array is not in ALL documents.

That means only the documents that contain the field are returned, the others are ignored. I am using Pymongo2.6 so I am unable to use "preserveNullAndEmptyArrays".

Is there a workaround to this? Something along the lines of "if the field path exists, unwind".

Manual: https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/

If you specify a path for a field that does not exist in an input document or the field is an empty array, $unwind, by default, ignores the input document and will not output documents for that input document.

New in version 3.2: To output documents where the array field is missing, null or an empty array, use the option preserveNullAndEmptyArrays.

preserveNullAndEmptyArrays boolean
Optional. If true, if the path is null, missing, or an empty array, $unwind outputs the document. If false, $unwind does not output a document if the path is null, missing, or an empty array.

The default value is false.*

UPDATE:

The structure of documents and code in question is outlined in detail in this separate but related question I asked earlier: Project different embedded structures to same name

ISSUE:

I am trying to "$unwind" the value of $hostnames.name. However, since the path doesn't exist in all documents, this results in several ignored documents.

Structure 1 Hostname stored as $hostnames.name

{
"_id" : "192.168.1.1",
"addresses" : {
"ipv4" : "192.168.1.1"
},
"hostnames" : [
{
"type" : "PTR",
"name" : "example.hostname.com"
}
]
}


Structure 2 Hostname stored as $hostname

{
"_id" : "192.168.2.1",
"addresses" : {
"ipv4" : "192.168.2.1"
},
"hostname" : "helloworld.com",

}


Script

cmp = db['computers'].aggregate([
{"$project": {
"u_hostname": {
"$ifNull": [
"$hostnames.name",
{ "$map": {
"input": {"$literal": ["A"]},
"as": "el",
"in": "$hostname"
}}
]
},
"_id": 0,
"u_ipv4": "$addresses.ipv4"
}},
{"$unwind": "$u_hostname"}
])


UPDATE 2:

The answer provided by SSDMS is very close but is not fully working.

I am still missing all documents that have an empty array for $hostnames.

Here is the structure of the documents that are still missing.

Structure 3

{
"_id" : "192.168.1.1",
"addresses" : { "ipv4" : "192.168.1.1" },
"hostnames" : [], }
}

Answer

We can still preserve all the documents where the array field is missing by playing with the $ifNull operator and use a logical $condition processing to assign a value to the newly computed field.

The condition here is $eq which returns True if the field is [None] or False when the condition expression evaluates to false.

cmp = db['computers'].aggregate(
    [
        {"$project":{ 
            "u_ipv4": "$addresses.ipv4",
            "u_hostname": {
                "$let": {
                    "vars": {
                        "hostnameName": {
                            "$cond": [
                                {"$eq": ["$hostnames", []]},
                                [None], 
                                {"$ifNull": ["$hostnames.name", [None]]}
                            ]
                        }, 
                        "hostname": {"$ifNull": ["$hostname", None]}
                    }, 
                    "in": {
                        "$cond": [ 
                            {"$eq": ["$$hostnameName", [None]]}, 
                            {"$map": {
                                "input": {"$literal": [None]}, 
                                "as": "el", 
                                "in": "$$hostname"
                            }}, 
                            "$$hostnameName"
                        ]
                    }
                }
            }
        }},        
        { "$unwind": "$u_hostname" }
    ]
)