Brad Parks Brad Parks - 1 month ago 10
JSON Question

Convert arbitrary output to json by column in the terminal?

I'd like to be able to pipe the output from any command line program to a command that converts it to json.

For example my unknown program could accept target columns, a delimiter and output field names

# select columns 1 and 3 from the output and convert it to simple json
netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other


and would generate something like so:

[
{"name": "tcp4", "other": "31"},
{"name": "tcp4", "other": "0"}
...
]


I'm looking for something that works for any program, not just
netstat
!



I'm open to installing any 3rd party tool/opensource project, and tend to run things on linux/osx - does not have to be a bash script solution, can be written in node, perl, python, etc.



EDIT: I'm of course willing to pass in any more info that'd be required to make it work, for example a delimiter or multiple delimiters - I'd just like to avoid explicit parsing in the command line, and have the tool do that.

Answer

Here's my ruby version :

#! /usr/bin/env ruby
#
# Converts stdin columns to a JSON array of hashes
#
# Installation : Save as convert_to_json, make it executable and put it somewhere in PATH. Ruby must be installed
#
# Examples :
#
# netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other
# ls -l | convert_to_json
# ls -l | convert_to_json 6,7,8,9
# ls -l | convert_to_json 6,7,8,9 month,day,time,name
# convert_to_json 1,2 time,value ";" < some_file.csv
#
#
# http://stackoverflow.com/questions/40246134/convert-arbitrary-output-to-json-by-column-in-the-terminal

require 'json'

script_name = File.basename(__FILE__)
syntax = "Syntax : command_which_outputs_columns | #{script_name} column1_id,column2_id,...,columnN_id column1_name,column2_name,...,columnN_name delimiter"


if $stdin.tty? or $stdin.closed? then
  $stderr.puts syntax
else
  if ARGV[2]
    delimiter = ARGV[2]
    $stderr.puts "#{script_name} : Using #{delimiter} as delimiter"
  else
    delimiter = /\s+/
  end

  column_ids = (ARGV[0] || "").split(',').map{|column_id| column_id.to_i-1}
  column_names = (ARGV[1] || "").split(',')

  results = []
  $stdin.each do |stdin_line|
    if column_ids.empty?
      values = stdin_line.strip.split(delimiter)
    else
      values = stdin_line.strip.split(delimiter).values_at(*column_ids)
    end
    line_hash=Hash.new
    values.each_with_index.each{|value,i|
      colum_name = column_names[i] || "column#{(column_ids[i] || i)+1}"
      line_hash[colum_name]=value
    }
    results<<line_hash
  end
  puts JSON.pretty_generate(results)
end

It works as defined in your example :

netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other
[
  {
    "name": "tcp",
    "other": "0"
  },
  {
    "name": "tcp6",
    "other": "0"
  }
]

As a bonus, you can

  • omit to specify parameters : every column will be converted to json
  • omit to specify names : column will be called column1, column2, ...
  • choose a missing column : value will be null
  • define a delimiter as third parameter. Default is whitespace

Other examples :

netstat -a | grep CLOSE_WAIT | ./convert_to_json
# [
#   {
#     "column1": "tcp",
#     "column2": "1",
#     "column3": "0",
#     "column4": "10.0.2.15:51074",
#     "column5": "123.45.101.207:https",
#     "column6": "CLOSE_WAIT"
#   },
#   {
#     "column1": "tcp6",
#     "column2": "1",
#     "column3": "0",
#     "column4": "ip6-localhost:50293",
#     "column5": "ip6-localhost:ipp",
#     "column6": "CLOSE_WAIT"
#   }
# ]

netstat -a | grep CLOSE_WAIT | ./convert_to_json 1,3
# [
#   {
#     "column1": "tcp",
#     "column3": "0"
#   },
#   {
#     "column1": "tcp6",
#     "column3": "0"
#   }
# ]

ls -l | tail -n3 | convert_to_json 6,7,8,9 month,day,time,name
# [
#   {
#     "month": "Oct",
#     "day": "27",
#     "time": "10:35",
#     "name": "test.dot"
#   },
#   {
#     "month": "Nov",
#     "day": "2",
#     "time": "14:27",
#     "name": "uniq.rb"
#   },
#   {
#     "month": "Nov",
#     "day": "2",
#     "time": "14:27",
#     "name": "utf8_nokogiri.rb"
#   }
# ]


ls -l | tail -n3 | convert_to_json 9,12
# [
#   {
#     "column9": "test.dot",
#     "column12": null
#   },
#   {
#     "column9": "uniq.rb",
#     "column12": null
#   },
#   {
#     "column9": "utf8_nokogiri.rb",
#     "column12": null
#   }
# ]

convert_to_json 1,2 time,value ";" < some_file.csv
# convert_to_json : Using ; as delimiter
# [
#   {
#     "time": "1",
#     "value": "3"
#   },
#   {
#     "time": "2",
#     "value": "5"
#   }
# ]