mwag - 6 months ago
JSON Question

How to get the intersection of two JSON arrays using jq

Given arrays X and Y (preferably both as inputs, but otherwise with one as input and the other hardcoded), how can I use jq to output the array containing all elements common to both? For example, what is a definition of f such that

echo '[1,2,3,4]' | jq 'f([2,4,6,8,10])'

would output

[2,4]

I've tried the following:

map(select(in([2,4,6,8,10]))) --> outputs [1,2,3,4]
select(map(in([2,4,6,8,10]))) --> outputs [1,2,3,4,5]


A simple and quite fast (but somewhat naive) filter that probably does essentially what you want can be defined as follows:

# x and y are arrays
def intersection(x;y):
  ( (x|unique) + (y|unique) | sort) as $sorted
  | reduce range(1; $sorted|length) as $i
       ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;

If x is provided as input on STDIN, and y is provided in some other way (e.g. def y: ...), then you could use this as: intersection(.;y)
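As a rough sketch of that usage, assuming jq is on the PATH: the question's sample array arrives on STDIN and the second array is hardcoded as def y. The shell variable name hardcoded is only illustrative.

```shell
# Input array on STDIN; the second array is hardcoded inside the program as y.
hardcoded=$(echo '[1,2,3,4]' | jq -c '
  def intersection(x;y):
    ( (x|unique) + (y|unique) | sort) as $sorted
    | reduce range(1; $sorted|length) as $i
        ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end);
  def y: [2,4,6,8,10];
  intersection(.; y)')
echo "$hardcoded"   # prints [2,4]
```

Because both arrays are deduplicated before being concatenated and sorted, any value that appears twice in a row in $sorted must have come from both arrays.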

Other ways to provide two distinct arrays as input include:

 * using the --slurp option
 * using "--arg a v", which passes v as a string that must then be parsed with "$a | fromjson" (or "--argjson a v", which passes v as JSON, if available in your jq)
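The options listed above might be used roughly as follows. This is a sketch, assuming jq is on the PATH and that your jq supports --argjson; the shell variable names (defs, slurped, via_argjson, via_arg) are illustrative only.

```shell
# The intersection filter, kept in a shell variable so each call can reuse it.
defs='
def intersection(x;y):
  ( (x|unique) + (y|unique) | sort) as $sorted
  | reduce range(1; $sorted|length) as $i
      ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end);
'

# Both arrays on STDIN: --slurp collects them into one array of two arrays.
slurped=$(echo '[1,2,3,4] [2,4,6,8,10]' | jq -c --slurp "$defs intersection(.[0]; .[1])")
echo "$slurped"       # prints [2,4]

# One array on STDIN, the other passed as a JSON value bound to $y.
via_argjson=$(echo '[1,2,3,4]' | jq -c --argjson y '[2,4,6,8,10]' "$defs intersection(.; \$y)")
echo "$via_argjson"   # prints [2,4]

# With --arg, $y is a string, so it has to be parsed with fromjson first.
via_arg=$(echo '[1,2,3,4]' | jq -c --arg y '[2,4,6,8,10]' "$defs intersection(.; \$y | fromjson)")
echo "$via_arg"       # prints [2,4]
```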

Here's an even shorter def that's slower but often quite fast in practice:

def i(x;y):
  (x|unique) as $x
  | (y|unique) as $y
  | (($x + $y) | unique) - (($x - $y) + ($y - $x));
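A quick way to try the shorter definition, assuming jq is on the PATH, is to run it with -n (no input) on the two arrays from the question. It computes the union and then subtracts the elements found in only one of the arrays. The shell variable name short is illustrative.

```shell
# -n: run the program without reading input; both arrays are literals here.
short=$(jq -c -n '
  def i(x;y):
    (x|unique) as $x
    | (y|unique) as $y
    | (($x + $y) | unique) - (($x - $y) + ($y - $x));
  i([1,2,3,4]; [2,4,6,8,10])')
echo "$short"   # prints [2,4]
```

For these inputs the union is [1,2,3,4,6,8,10] and the elements unique to one side are [1,3,6,8,10], which leaves [2,4].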

Here's a standalone filter for finding the intersection of arbitrarily many arrays:

# Input: an array of arrays
def intersection:
  def i(y): ((unique + (y|unique)) | sort) as $sorted
  | reduce range(1; $sorted|length) as $i
       ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
  reduce .[1:][] as $a (.[0]; i($a)) ;


[ [1,2,4], [2,4,5], [4,5,6]] #=> [4]
[[]]                         #=> []
[]                           #=> null
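The three sample inputs above can be checked in one run, since jq applies the program to each JSON value on STDIN in turn. A minimal sketch, assuming jq is on the PATH; the shell variable names prog and many are illustrative.

```shell
# The standalone filter, followed by a call to it on each input value.
prog='
def intersection:
  def i(y): ((unique + (y|unique)) | sort) as $sorted
    | reduce range(1; $sorted|length) as $i
        ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end);
  reduce .[1:][] as $a (.[0]; i($a));
intersection
'

# Three separate inputs: an array of three arrays, [[]], and [].
many=$(echo '[[1,2,4],[2,4,5],[4,5,6]] [[]] []' | jq -c "$prog")
echo "$many"
# prints:
# [4]
# []
# null
```

The null for an empty input comes from .[0] being null when there is no first array to start the reduction from.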

Of course if x and y are already known to be sorted and/or unique, more efficient solutions are possible. See in particular