OkyDokyman - 2 months ago
JSON Question

Bash with JQ grouping

I have a file with a stream of JSON objects as follows:

{"id":4496,"status":"Analyze","severity":"Critical","severityCode":1,"state":"New","code":"RNPD.DEREF","title":"Suspicious dereference of pointer before NULL check","message":"Suspicious dereference of pointer \u0027peer-\u003esctSapCb\u0027 before NULL check at line 516","file":"/home/build/branches/mmm/file1","method":"CzUiCztGpReq","owner":"unowned","taxonomyName":"C and C++","dateOriginated":1473991086512,"url":"http://xxx/yyy","issueIds":[4494]}
{"id":4497,"status":"Analyze","severity":"Critical","severityCode":1,"state":"New","code":"NPD.GEN.CALL.MIGHT","title":"Null pointer may be passed to function that may dereference it","message":"Null pointer \u0027tmpEncodedPdu\u0027 that comes from line 346 may be passed to function and can be dereferenced there by passing argument 1 to function \u0027SCpyMsgMsgF\u0027 at line 537.","file":"/home/build/branches/mmm/file1","method":"CzUiCztGpReq","owner":"unowned","taxonomyName":"C and C++","dateOriginated":1473991086512,"url":"http://xxx/yyy/zzz","issueIds":[4495]}
{"id":4498,"status":"Analyze","severity":"Critical","severityCode":1,"state":"New","code":"NPD.GEN.CALL.MIGHT","title":"Null pointer may be passed to function that may dereference it","message":"Null pointer \u0027tmpEncodedPdu\u0027 that comes from line 346 may be passed to function and can be dereferenced there by passing argument 1 to function \u0027SCpyMsgMsgF\u0027 at line 537.","file":"/home/build/branches/mmm/otherfile.c","method":"CzUiCztGpReq","owner":"unowned","taxonomyName":"C and C++","dateOriginated":1473991086512,"url":"http://xxx/yyy/zzz","issueIds":[4495]}


I would like to get, with jq (or in some other way), three lines: one each for the ids, the URLs, and the file name.

This is what I have so far:

cat /tmp/file.json | ~/bin_compciv/jq --raw-output '.id,.url,.file'


Result:

4496
http://xxx/yyy
/home/build/branches/mmm/file1
.
.
.


BUT - I would like to group them by file name, so that I will get comma-separated lists of urls and ids on the same line, like this:

4496,4497
http://xxx/yyy,http://xxx/yyy/zzz
/home/build/branches/mmm/file1

Answer

With one minor exception (the URLs come out wrapped in quotation marks), you can readily achieve the stated goals using jq as follows:

jq -scr 'map({id,url,file})
  | group_by(.file)
  | .[]
  | ((map(.id) | @csv) , (map(.url) | @csv))'

Given your input, the output would be:

4496,4497
"http://xxx/yyy","http://xxx/yyy/zzz"
4498
"http://xxx/yyy/zzz"

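The output above covers the ids and URLs but not the file name the question asks for. As a rough sketch, one way to add it is to emit `.[0].file` as a third value per group, since every object in a group shares the same file. The trimmed sample data and the helper name `group_with_file` are assumptions made for illustration:

```shell
#!/usr/bin/env bash
# Sample stream, trimmed to the three fields the filter uses.
sample='{"id":4496,"url":"http://xxx/yyy","file":"/home/build/branches/mmm/file1"}
{"id":4497,"url":"http://xxx/yyy/zzz","file":"/home/build/branches/mmm/file1"}
{"id":4498,"url":"http://xxx/yyy/zzz","file":"/home/build/branches/mmm/otherfile.c"}'

# Hypothetical helper: reads a stream of JSON objects on stdin.
# -s slurps the stream into one array; -r prints strings without JSON quoting.
group_with_file() {
  jq -sr 'group_by(.file)
    | .[]
    | (map(.id) | @csv),
      (map(.url) | @csv),
      .[0].file'
}

printf '%s\n' "$sample" | group_with_file
```

Each group then produces three lines: the ids, the quoted URLs, and the shared file name.
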
You could then remove the quotation marks with a text-editing tool such as sed, with another invocation of jq, or as described below. However, doing so is risky if there is any chance that a URL contains a comma, because the unquoted list would then be ambiguous.
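
To make the comma concern concrete, here is a small sketch that formats the same two URLs both ways. The URLs are hypothetical and chosen only because the first one contains a comma:

```shell
#!/usr/bin/env bash
# Hypothetical URLs; the first contains a comma.
urls='["http://xxx/a,b","http://xxx/c"]'

# @csv quotes each string, so the embedded comma stays inside its field.
quoted=$(printf '%s\n' "$urls" | jq -r '@csv')

# join(",") adds no quoting, so two URLs look like three fields.
plain=$(printf '%s\n' "$urls" | jq -r 'join(",")')

printf '%s\n' "$quoted" "$plain"
```

The first line printed is `"http://xxx/a,b","http://xxx/c"`, which a CSV reader can split correctly; the second is `http://xxx/a,b,http://xxx/c`, which cannot be split back into the original two URLs.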

Here's the filter for eliminating the quotation marks with just one invocation of jq:

map({id,url,file})
| group_by(.file)
| .[]
| ((map(.id) | @csv),
   ([map(.url) | join(",")] | @csv | .[1:-1]))
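
If the URLs are known to be free of commas and quotation marks, a possibly simpler sketch is to use `join(",")` directly for both lists and to print the file name as a third line, which should reproduce the layout requested in the question. The trimmed sample data and the helper name `group_plain` are assumptions; `tostring` is applied to the ids because some older jq releases only accept strings in `join`:

```shell
#!/usr/bin/env bash
# Sample stream, trimmed to the three fields the filter uses.
sample='{"id":4496,"url":"http://xxx/yyy","file":"/home/build/branches/mmm/file1"}
{"id":4497,"url":"http://xxx/yyy/zzz","file":"/home/build/branches/mmm/file1"}
{"id":4498,"url":"http://xxx/yyy/zzz","file":"/home/build/branches/mmm/otherfile.c"}'

# Hypothetical helper: reads a stream of JSON objects on stdin and prints
# three unquoted lines per file: ids, URLs, file name.
group_plain() {
  jq -sr 'group_by(.file)
    | .[]
    | (map(.id | tostring) | join(",")),
      (map(.url) | join(",")),
      .[0].file'
}

printf '%s\n' "$sample" | group_plain
```

This trades the safety of CSV quoting for plain output, so it is only appropriate when the values cannot contain the separator.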