ktgold ktgold - 6 months ago 61
JSON Question

JQ filtering on fields in nested Objects

I have a large set of data, I am using JQ to construct the object that contains only the data I am interested in for a record. My problem is that I am starting to see duplicate objects, it seems my syntax is incorrect.

I am working with an object that contains flat fields and an array of subObjects, there are particular fields I want to pull out and make new objects that have all the data I want. Including some flat fields and some fields from the array objects.

Here is a smaller sample that helps demonstrate problem

tmpData.json


{
"id": "0001",
"type": "donut",
"name": "Cake",
"ppu": 0.55,
"batter": [{
"id": "1001",
"type": "Regular"
},
{
"id": "1002",
"type": "Chocolate"
},
{
"id": "1003",
"type": "Blueberry"
},
{
"id": "1004",
"type": "Devil's Food"
}
]
}


I run this :
cat tmpData.txt | jq {'id: .id, type: .type, batter: .batter[].id'}


Which outputs this non-json set of objects (it's missing commas)

{
"id": "0001",
"type": "donut",
"batter": "1001"
}
{
"id": "0001",
"type": "donut",
"batter": "1002"
}
{
"id": "0001",
"type": "donut",
"batter": "1003"
}
{
"id": "0001",
"type": "donut",
"batter": "1004"
}


This is good. I now have objects each containing the parentID
0001
and the different items in the array are associated in each object.

When I run:
cat tmpData.txt | jq {'id: .id, type: .type, batterID: .batter[].id, batterType: .batter[].type'}


With the added
type
field I get a lot of duplicates that wrongly associate items

{
"id": "0001",
"type": "donut",
"batterID": "1001",
"batterType": "Regular"
}
{
"id": "0001",
"type": "donut",
"batterID": "1001",
"batterType": "Chocolate"
}
{
"id": "0001",
"type": "donut",
"batterID": "1001",
"batterType": "Blueberry"
}
{
"id": "0001",
"type": "donut",
"batterID": "1001",
"batterType": "Devil's Food"
}
{
"id": "0001",
"type": "donut",
"batterID": "1002",
"batterType": "Regular"
}
{
"id": "0001",
"type": "donut",
"batterID": "1002",
"batterType": "Chocolate"
}
{
"id": "0001",
"type": "donut",
"batterID": "1002",
"batterType": "Blueberry"
}
{
"id": "0001",
"type": "donut",
"batterID": "1002",
"batterType": "Devil's Food"
}
{
"id": "0001",
"type": "donut",
"batterID": "1003",
"batterType": "Regular"
}
{
"id": "0001",
"type": "donut",
"batterID": "1003",
"batterType": "Chocolate"
}
{
"id": "0001",
"type": "donut",
"batterID": "1003",
"batterType": "Blueberry"
}
{
"id": "0001",
"type": "donut",
"batterID": "1003",
"batterType": "Devil's Food"
}
{
"id": "0001",
"type": "donut",
"batterID": "1004",
"batterType": "Regular"
}
{
"id": "0001",
"type": "donut",
"batterID": "1004",
"batterType": "Chocolate"
}
{
"id": "0001",
"type": "donut",
"batterID": "1004",
"batterType": "Blueberry"
}
{
"id": "0001",
"type": "donut",
"batterID": "1004",
"batterType": "Devil's Food"
}


Now I see that each
batterID
is in an object with every type
regular, chocolate, blueberry
. But in fact
1002
is only ever
chocolate
.

My ideal output would be like this

[{
"id": "0001",
"type": "donut",
"batterID": "1001",
"batterType": "Regular"
},
{
"id": "0001",
"type": "donut",
"batterID": "1002",
"batterType": "Chocolate"
}]


Your expertise is appreciated!

EDIT SOLVED: working command:
cat tmpData.txt | jq '[{id, type} + (.batter[] | {batterId: .id, batterType: .type})]'

Answer Source
  1. The output "without commas" is a stream of JSON; to emit an array, wrap your jq filter in square brackets.
  2. You can abbreviate {id: id, type: .type} to {id, type}
  3. Your filter which repeats .batter[] has the effect of creating a Cartesian product. What you evidently want instead is to expand .batter just once.

Putting everything together:

[{id, type} + (.batter[] | {batterId: .id, batterType: .type})]
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download