dimid dimid - 1 month ago 5
JSON Question

Filter only specific keys from an external file in jq

I have a JSON file with the following format:

[
{
"id": "00001",
"attr": {
"a": "foo",
"b": "bar",
...
}
},
{
"id": "00002",
"attr": {
...
},
...
},
...
]


and a text file with a list of ids, one per line. I'd like to use jq to filter only the records whose ids are mentioned in the text file. I.e. if the list contains "00001", only the first one should be printed.

Note that I can't simply grep, since each record may have an arbitrary number of attributes and sub-attributes.

Answer

There are basically two ways to proceed:

  1. read the file of ids from STDIN
  2. read the JSON from STDIN

Both are feasible, but here we illustrate (2) as it leads to a simple but efficient solution.
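For comparison, approach (1) might look roughly like the sketch below. It is not the method developed in the rest of this answer. It assumes a jq that provides `inputs` and `--slurpfile` (jq 1.5 or later), and the sample files and the variable names `data`, `workdir` and `result` are made up for illustration.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir"
printf '%s\n' '[{"id":"00001","attr":{"a":"foo"}},{"id":"00002","attr":{"a":"baz"}}]' > in.json
printf '%s\n' 00001 00010 > ids.txt

# -R reads each line of STDIN as a raw string; -n stops jq from consuming
# the first line itself, so `inputs` yields every id.
# --slurpfile wraps the contents of in.json in an array, hence $data[0].
result=$(jq -c -R -n --slurpfile data in.json '
  [inputs] as $ids
  | $data[0]
  | map(select(.id as $id | $ids | index($id)))' < ids.txt)
echo "$result"
```

With this sample data only the record with id "00001" is kept, since "00010" does not occur in in.json.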

Suppose the JSON file is named in.json and the list of ids is in a file named ids.txt like so:

00001
00010

Notice that the ids in this file are not quoted. If they were, the following could be simplified significantly, as shown in the postscript.

The trick is to convert ids.txt into a JSON array. With the above assumption about quotation marks, this can be done by:

jq -R . ids.txt | jq -s .

Assuming a reasonable shell, a simple solution is now at hand:

jq --argjson ids "$(jq -R . ids.txt | jq -s .)" '
  map( select( .id as $id | $ids | index($id) ))' in.json
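To make the two steps concrete, here is a small end-to-end run. The three records and the variable names `ids_json`, `workdir` and `result` are invented for illustration; the jq filter is the one shown above, with `-c` added only to get compact output.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir"
cat > in.json <<'EOF'
[
  {"id": "00001", "attr": {"a": "foo", "b": "bar"}},
  {"id": "00002", "attr": {"a": "baz"}},
  {"id": "00010", "attr": {"a": "qux"}}
]
EOF
printf '%s\n' 00001 00010 > ids.txt

# Step 1: turn the raw lines into a JSON array of strings.
ids_json=$(jq -R . ids.txt | jq -s -c .)
echo "$ids_json"    # ["00001","00010"]

# Step 2: keep only the records whose .id occurs in that array.
result=$(jq -c --argjson ids "$ids_json" '
  map(select(.id as $id | $ids | index($id)))' in.json)
echo "$result"
```

The record with id "00002" is dropped; the other two are printed in their original order.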

Faster

Assuming your jq has any/2, a simpler and more efficient solution can be obtained by defining:

def isin($a): . as $in | any($a[]; $in == .);

The required jq filter is then just:

map( select( .id | isin($ids) ) )

If these two lines of jq are put into a file named select.jq, the required incantation is simply:

jq --argjson ids "$(jq -R . ids.txt | jq -s .)" -f select.jq in.json
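As a sketch of how the pieces fit together, the following creates select.jq with a here-document and runs it against made-up sample data. The file contents, `workdir` and `result` are assumptions for illustration, and `-c` is added only for compact output.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir"
printf '%s\n' '[{"id":"00001"},{"id":"00002"},{"id":"00010"}]' > in.json
printf '%s\n' 00001 00010 > ids.txt

# The definition and the filter from above, saved as a jq program file.
cat > select.jq <<'EOF'
def isin($a): . as $in | any($a[]; $in == .);
map( select( .id | isin($ids) ) )
EOF

# $ids is supplied on the command line and is visible inside select.jq.
result=$(jq -c --argjson ids "$(jq -R . ids.txt | jq -s .)" -f select.jq in.json)
echo "$result"
```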

Postscript

If the index file consists of a stream of valid JSON texts (e.g., strings with quotation marks) and if your jq supports the --slurpfile option, the invocation can be further simplified to:

jq --slurpfile ids ids.txt -f select.jq in.json 

Or if you want everything as a one-liner:

jq --slurpfile ids ids.txt 'map(select(.id as $id|any($ids[];$id==.)))' in.json 
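A quick way to try the one-liner, assuming a jq with --slurpfile and an ids.txt in which every id is already a quoted JSON string. The sample data, `workdir` and `result` are made up for illustration.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir"
printf '%s\n' '[{"id":"00001"},{"id":"00002"},{"id":"00010"}]' > in.json

# Each line is a valid JSON string, so no jq -R conversion is needed.
printf '%s\n' '"00001"' '"00010"' > ids.txt

# --slurpfile collects the JSON texts in ids.txt into the array $ids.
result=$(jq -c --slurpfile ids ids.txt \
  'map(select(.id as $id | any($ids[]; $id == .)))' in.json)
echo "$result"
```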