HattrickNZ HattrickNZ - 4 months ago 8
JSON Question

converting a csv file to json + python with specific json format

can I convert a csv file into json as follows:

csv = headers in line1 with values below

json =

[{"key1":"value1",...},{"key1":"value2",...}...]


This is my csv file:

$ cat -v head_data.csv
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description"
"2016-05-30","686","230","63979","Samsung SM-G935FD ","Samsung SM-G935FD","$29.95 Carryover Plan (1GB)"
"2016-05-30","533","970","171631866","Apple iPhone 6 (A1586)","iPhone 6 (A1586)","$69.95 Plan"
"2016-05-30","191","610","145713","Samsung GT-I9195","Samsung GT-I9195","$29.95 Plan"
"2016-05-30","660","660","2994742","Samsung SM-N920I","Samsung SM-N920I","GOVERNMENT TIER 2 PLAN"
"2016-05-30","182","970","37799939","Samsung SM-J200Y","Samsung SM-J200Y","PREPAY PLUS - $0 -"
"2016-05-30","993","360","14096114","Samsung SM-A300Y","Samsung SM-A300Y","$39.95 Carryover Plan"
"2016-05-30","894","730","9851177","Samsung GT-N7105","Samsung GT-N7105","PREPAY STD - $0 - #2"
"2016-05-30","600","070","18420650","Apple iPhone 5C (A1529)","Apple iPhone 5C (A1529)","PREPAY PLUS - $0 -"
"2016-05-30","234","000","1769661","Galaxy S7 SM-G930F ","Galaxy S7 SM-G930F","$39.95 Plan"


This is my script:

$ cat csv_to_json.py

#!/usr/bin/python

#from here
#http://stackoverflow.com/a/7550352/2392358

import csv, json
csvreader = csv.reader(open('head_data.csv', 'rb'), delimiter='\t',
quotechar='"')
data = []
for row in csvreader:
r = []
for field in row:
if field == '': field = None
else: field = unicode(field, 'ISO-8859-1')
r.append(field)
data.append(r)
jsonStruct = {
'header': data[0],
'data': data[1:]
}
open('head_data.json', 'wb').write(json.dumps(jsonStruct))


Runnning my script and output

$ python csv_to_json.py


$ cat -v head_data.json
{"header": ["Rec Open Date,\"MSISDN\",\"IMEI\",\"Data Volume (Bytes)\",\"Device Manufacturer\",\"Device Model\",\"Product Description\""], "data": [["2016-05-30,\"686\",\"230\",\"63979\",\"Samsung SM-G935FD \",\"Samsung SM-G935FD\",\"$29.95 Carryover Plan (1GB)\""], ["2016-05-30,\"533\",\"970\",\"171631866\",\"Apple iPhone 6 (A1586)\",\"iPhone 6 (A1586)\",\"$69.95 Plan\""], ["2016-05-30,\"191\",\"610\",\"145713\",\"Samsung GT-I9195\",\"Samsung GT-I9195\",\"$29.95 Plan\""], ["2016-05-30,\"660\",\"660\",\"2994742\",\"Samsung SM-N920I\",\"Samsung SM-N920I\",\"GOVERNMENT TIER 2 PLAN\""], ["2016-05-30,\"182\",\"970\",\"37799939\",\"Samsung SM-J200Y\",\"Samsung SM-J200Y\",\"PREPAY PLUS - $0 -\""], ["2016-05-30,\"993\",\"360\",\"14096114\",\"Samsung SM-A300Y\",\"Samsung SM-A300Y\",\"$39.95 Carryover Plan\""], ["2016-05-30,\"894\",\"730\",\"9851177\",\"Samsung GT-N7105\",\"Samsung GT-N7105\",\"PREPAY STD - $0 - #2\""], ["2016-05-30,\"600\",\"070\",\"18420650\",\"Apple iPhone 5C (A1529)\",\"Apple iPhone 5C (A1529)\",\"PREPAY PLUS - $0 -\""], ["2016-05-30,\"234\",\"000\",\"1769661\",\"Galaxy S7 SM-G930F \",\"Galaxy S7 SM-G930F\",\"$39.95 Plan\""]]}


Can i slightly modify the code so that I can get output like this:

[{"Rec Open Date":"2016-07-03","MSISDN":540,"IMEI":990,"Data Volume (Bytes)":36671453,"Device Manufacturer":"HUAWEI Technologies Co Ltd","Device Model":"H1512","Product Description":"PREPAY PLUS - $0 -"},
{"Rec Open Date":"2016-07-03","MSISDN":334,"IMEI":340,"Data Volume (Bytes)":129835114,"Device Manufacturer":"Apple Inc","Device Model":"Apple iPhone S (A1530)","Product Description":"$29.95 Plan"},
{"Rec Open Date":"2016-07-03","MSISDN":133,"IMEI":870,"Data Volume (Bytes)":42213030,"Device Manufacturer":"Apple Inc","Device Model":"Apple iPhone 6 Plus (A1524)","Product Description":"$49.95 Plan"}]


related Q here and here

edit1 found this here but this does the conversion in the browser and I think it uses js.

EDIT2 - based on the answer below this is what I want



This is the file I want to convert

$ cat -v head_data.csv
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description"
"2016-05-30","686","230","63979","Samsung SM-G935FD ","Samsung,A, SM-G935FD","$29.95 Carryover Plan (1GB)"
"2016-05-30","533","970","171631866","Apple iPhone 6 (A1586)","iPhone 6 (A1586)","$69.95 Plan"
"2016-05-30","191","610","145713","Samsung GT-I9195","Samsung GT-I9195","$29.95 Plan"
"2016-05-30","660","660","2994742","Samsung SM-N920I","Samsung SM-N920I","GOVERNMENT TIER 2 PLAN"
"2016-05-30","182","970","37799939","Samsung SM-J200Y","Samsung SM-J200Y","PREPAY PLUS - $0 -"
"2016-05-30","993","360","14096114","Samsung SM-A300Y","Samsung SM-A300Y","$39.95 Carryover Plan"
"2016-05-30","894","730","9851177","Samsung GT-N7105","Samsung GT-N7105","PREPAY STD - $0 - #2"
"2016-05-30","600","070","18420650","Apple iPhone 5C (A1529)","Apple iPhone 5C (A1529)","PREPAY PLUS - $0 -"
"2016-05-30","234","000","1769661","Galaxy S7 SM-G930F ","Galaxy S7 SM-G930F","$39.95 Plan"


This is the script:

$ cat -v csv_to_json2.py
#!/usr/bin/python

#from here
#http://stackoverflow.com/a/38193687/2392358

import csv
import json
from collections import OrderedDict

dR=csv.DictReader(open("head_data.csv"))
oD=[ OrderedDict(
sorted(dct.iteritems(),
key=lambda item:dR.fieldnames.index(item[0])))
for dct in dR ]

#print to terminal
print json.dumps(oD)

#write to file
#json.dump(oD,"head_op.json")
open('head_op.json', 'wb').write(json.dumps(oD))


Running the script:

$ python csv_to_json2.py
[{"Rec Open Date": "2016-05-30", "MSISDN": "686", "IMEI": "230", "Data Volume (Bytes)": "63979", "Device Manufacturer": "Samsung SM-G935FD ", "Device Model": "Samsung,A, SM-G935FD", "Product Description": "$29.95 Carryover Plan (1GB)"}, {"Rec Open Date": "2016-05-30", "MSISDN": "533", "IMEI": "970", "Data Volume (Bytes)": "171631866", "Device Manufacturer": "Apple iPhone 6 (A1586)", "Device Model": "iPhone 6 (A1586)", "Product Description": "$69.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "191", "IMEI": "610", "Data Volume (Bytes)": "145713", "Device Manufacturer": "Samsung GT-I9195", "Device Model": "Samsung GT-I9195", "Product Description": "$29.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "660", "IMEI": "660", "Data Volume (Bytes)": "2994742", "Device Manufacturer": "Samsung SM-N920I", "Device Model": "Samsung SM-N920I", "Product Description": "GOVERNMENT TIER 2 PLAN"}, {"Rec Open Date": "2016-05-30", "MSISDN": "182", "IMEI": "970", "Data Volume (Bytes)": "37799939", "Device Manufacturer": "Samsung SM-J200Y", "Device Model": "Samsung SM-J200Y", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "993", "IMEI": "360", "Data Volume (Bytes)": "14096114", "Device Manufacturer": "Samsung SM-A300Y", "Device Model": "Samsung SM-A300Y", "Product Description": "$39.95 Carryover Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "894", "IMEI": "730", "Data Volume (Bytes)": "9851177", "Device Manufacturer": "Samsung GT-N7105", "Device Model": "Samsung GT-N7105", "Product Description": "PREPAY STD - $0 - #2"}, {"Rec Open Date": "2016-05-30", "MSISDN": "600", "IMEI": "070", "Data Volume (Bytes)": "18420650", "Device Manufacturer": "Apple iPhone 5C (A1529)", "Device Model": "Apple iPhone 5C (A1529)", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "234", "IMEI": "000", "Data Volume (Bytes)": "1769661", "Device Manufacturer": "Galaxy S7 SM-G930F ", "Device Model": "Galaxy S7 SM-G930F", "Product Description": "$39.95 Plan"}]


This is the output:

$ cat -v head_op.json
[{"Rec Open Date": "2016-05-30", "MSISDN": "686", "IMEI": "230", "Data Volume (Bytes)": "63979", "Device Manufacturer": "Samsung SM-G935FD ", "Device Model": "Samsung,A, SM-G935FD", "Product Description": "$29.95 Carryover Plan (1GB)"}, {"Rec Open Date": "2016-05-30", "MSISDN": "533", "IMEI": "970", "Data Volume (Bytes)": "171631866", "Device Manufacturer": "Apple iPhone 6 (A1586)", "Device Model": "iPhone 6 (A1586)", "Product Description": "$69.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "191", "IMEI": "610", "Data Volume (Bytes)": "145713", "Device Manufacturer": "Samsung GT-I9195", "Device Model": "Samsung GT-I9195", "Product Description": "$29.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "660", "IMEI": "660", "Data Volume (Bytes)": "2994742", "Device Manufacturer": "Samsung SM-N920I", "Device Model": "Samsung SM-N920I", "Product Description": "GOVERNMENT TIER 2 PLAN"}, {"Rec Open Date": "2016-05-30", "MSISDN": "182", "IMEI": "970", "Data Volume (Bytes)": "37799939", "Device Manufacturer": "Samsung SM-J200Y", "Device Model": "Samsung SM-J200Y", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "993", "IMEI": "360", "Data Volume (Bytes)": "14096114", "Device Manufacturer": "Samsung SM-A300Y", "Device Model": "Samsung SM-A300Y", "Product Description": "$39.95 Carryover Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "894", "IMEI": "730", "Data Volume (Bytes)": "9851177", "Device Manufacturer": "Samsung GT-N7105", "Device Model": "Samsung GT-N7105", "Product Description": "PREPAY STD - $0 - #2"}, {"Rec Open Date": "2016-05-30", "MSISDN": "600", "IMEI": "070", "Data Volume (Bytes)": "18420650", "Device Manufacturer": "Apple iPhone 5C (A1529)", "Device Model": "Apple iPhone 5C (A1529)", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "234", "IMEI": "000", "Data Volume (Bytes)": "1769661", "Device Manufacturer": "Galaxy S7 SM-G930F ", "Device Model": "Galaxy S7 SM-G930F", "Product Description": "$39.95 Plan"}]

Answer

If you don't care about key's order, just do:

import csv
import json
json.dumps(list(csv.DictReader(open("file.csv"))))

Check pretty printing section on the manual for more options, or do

json.dumps(list( csv.DictReader(open("file.csv")) ])).replace("}, ","},\n")

To get your expected output.


If you want ordered printing, you may order the keys via OrderedDict:

import csv
import json
from collections import OrderedDict

dR=csv.DictReader(open("/tmp/ah.csv"))
oD=[ OrderedDict(
         sorted(dct.iteritems(),
                key=lambda item:dR.fieldnames.index(item[0])))
     for dct in dR ]
json.dumps(oD)
Comments