Dex' ter Dex' ter - 2 months ago 6
JSON Question

How to add an object to already existing json object if a key from csv already exists?

I have a

csv
file that looks like this:

NDB_No,Seq,Amount,Msre_Desc,Gm_Wgt,Num_Data_Pts,Std_Dev
01001,1,1,pat (1" sq, 1/3" high),5,,
01001,2,1,tbsp,14.2,,
01001,3,1,cup,227,,
01001,4,1,stick,113,,
01002,1,1,pat (1" sq, 1/3" high),3.8,,
01002,2,1,tbsp,9.4,,
01002,3,1,cup,151,,
01002,4,1,stick,76,,


And I want to export it as a json like this:

{
"NDB_No": [
{
"01001":
{
"Seq 1":
{
"Amount": 1,
"Msre_Desc": "pat (1 in sq, 1/3 in high)",
"Gm_Wgt": 5,
"Num_Data_Pts": null,
"Std_Dev": null
},
"Seq 2":
{
"Amount": 1,
"Msre_Desc": "tbsp",
"Gm_Wgt": 14.2,
"Num_Data_Pts": null,
"Std_Dev": null
},
"Seq 3":
{
"Amount": 1,
"Msre_Desc": "cup",
"Gm_Wgt": 227,
"Num_Data_Pts": null,
"Std_Dev": null
},

"Seq 4":
{
"Amount": 1,
"Msre_Desc": "stick",
"Gm_Wgt": 113,
"Num_Data_Pts": null,
"Std_Dev": null
}
}
},
{
"01002":
{
"Seq 1":
{
"Amount": 1,
"Msre_Desc": "pat (1 in sq, 1/3 in high)",
"Gm_Wgt": 3.8,
"Num_Data_Pts": null,
"Std_Dev": null
},
"Seq 2":
{
"Amount": 1,
"Msre_Desc": "tbsp",
"Gm_Wgt": 9.4,
"Num_Data_Pts": null,
"Std_Dev": null
},
"Seq 3":
{
"Amount": 1,
"Msre_Desc": "cup",
"Gm_Wgt": 151,
"Num_Data_Pts": null,
"Std_Dev": null
},

"Seq 4":
{
"Amount": 1,
"Msre_Desc": "stick",
"Gm_Wgt": 76,
"Num_Data_Pts": null,
"Std_Dev": null
}
}
}]
}


As you can see, if
NDB_No
from the csv is repeated I just add a new
Seq
to it. Otherwise, I continue with another object.

What I've got was:

{
"NDB_No": [
{
"01001": {
"Seq 1": {
"Msre_Desc": "pat (1\" sq",
"Std_Dev": "NULL",
"Amount": "1",
"Gm_Wgt": " 1/3\" high)",
"Num_Data_Pts": "5"
}
}
}
]
}
{
"NDB_No": [
{
"01001": {
"Seq 2": {
"Msre_Desc": "tbsp",
"Std_Dev": "NULL",
"Amount": "1",
"Gm_Wgt": "14.2",
"Num_Data_Pts": "NULL"
}
}
}
]
}
{
"NDB_No": [
{
"01001": {
"Seq 3": {
"Msre_Desc": "cup",
"Std_Dev": "NULL",
"Amount": "1",
"Gm_Wgt": "227",
"Num_Data_Pts": "NULL"
}
}
}
]
}
{
"NDB_No": [
{
"01001": {
"Seq 4": {
"Msre_Desc": "stick",
"Std_Dev": "NULL",
"Amount": "1",
"Gm_Wgt": "113",
"Num_Data_Pts": "NULL"
}
}
}
]
}
{
"NDB_No": [
{
"01002": {
"Seq 1": {
"Msre_Desc": "pat (1\" sq",
"Std_Dev": "NULL",
"Amount": "1",
"Gm_Wgt": " 1/3\" high)",
"Num_Data_Pts": "3.8"
}
}
}
]
}
{
"NDB_No": [
{
"01002": {
"Seq 2": {
"Msre_Desc": "tbsp",
"Std_Dev": "NULL",
"Amount": "1",
"Gm_Wgt": "9.4",
"Num_Data_Pts": "NULL"
}
}
}
]
}
{
"NDB_No": [
{
"01002": {
"Seq 3": {
"Msre_Desc": "cup",
"Std_Dev": "NULL",
"Amount": "1",
"Gm_Wgt": "151",
"Num_Data_Pts": "NULL"
}
}
}
]
}
{
"NDB_No": [
{
"01002": {
"Seq 4": {
"Msre_Desc": "stick",
"Std_Dev": "NULL",
"Amount": "1",
"Gm_Wgt": "76",
"Num_Data_Pts": "NULL"
}
}
}
]
}


By writing it in Python like this:

import json
from collections import OrderedDict, defaultdict

with open('Example_Measures.csv') as csvfile, open('x.json', 'w') as json_file:
reader = csv.DictReader(csvfile)

for row in reader:
source = {'NDB_No': []}

source['NDB_No'].append({
row['NDB_No']: {
'Seq {}'.format(row['Seq']): {
'Amount': row['Amount'],
'Msre_Desc': row['Msre_Desc'],
'Gm_Wgt': row['Gm_Wgt'],
'Num_Data_Pts': 'NULL' if not row['Num_Data_Pts'] else row['Num_Data_Pts'],
'Std_Dev': 'NULL' if not row['Std_Dev'] else row['Std_Dev']
}
}
})

data_json = json.dumps(dict(OrderedDict(source)), indent=2, sort_keys=False)
print(data_json)


The problems that I'm facing are:


  • how can I only add
    Seq
    to
    NDB_No
    if it already exists ?

  • how can I avoid printing
    NDB_No
    if
    NDB_No
    already exists ?



I'd like to know if it's possible to get the above JSON from the csv. I've been struggling a lot but I coulnd't figure it out (all that I've got was the above JSON - which is not so correct). Any help/advice is welcome!

Answer

You have two keys and a dictionary of data. To make things simpler change {"NDB_No": [{key1: {key2: data}}]} to {key1: {key2: data}}. This is as you're just making noise.

Make key1, key2 and data:

key1, key2 = row['NDB_No'], 'Seq {}'.format(row['Seq'])
data = {
    'Amount': row['Amount'],
    'Msre_Desc': row['Msre_Desc'],
    'Gm_Wgt': row['Gm_Wgt'],
    'Num_Data_Pts': row['Num_Data_Pts'] or None,
    'Std_Dev': row['Std_Dev'] or None,
}

Then you want to build the data, and wrap it in the additional dictionary:

src = {}
for row in reader:
    src.setdefault(key1, {})[key2] = data
src = {"NDB_No": src}

You then want to write your own setdefault to build a list rather than a dictionary.

def setdefault(list, key, value):
    for item in list:
        if key in item:
            return item[key]
    list.append({key: value})
    return value

And then adapt the above:

src = []
for row in reader:
    key1, key2 = row['NDB_No'], 'Seq {}'.format(row['Seq'])
    data = ...
    setdefault(src, key1, {})[key2] = data
src = {"NDB_No": src}