user1717931 user1717931 - 7 months ago 40
JSON Question

write pandas data frame to a json file

I have a small data frame and want to convert to json file. I want the json file to be in a nice format (with indents etc). Also, I want one of the columns of the data frame to be the key to the json data. Here is an example:

input dataframe:

start_date Nums ID tag
51 2016-03-13 3110 2 787623
55 2016-03-17 676754 1 787623


Desired output format of json: [note: the indices 51 and 55 are gone, the tag value 78723 has become the 'key']. The below desired

{
"787623": [{
"start_date": "2016-03-13",
"Nums": "3110",
"ID": "2"
}, {
"start_date": "2016-03-17",
"Nums": "676754",
"ID": "1"
}]
}


Currently, I am doing a simple
df.to_json(jsonFile, date_format = 'iso', orient = 'records')
which yields a more compact json file like this:

[{"start_date":"2016-03-13T00:00:00.000Z","Nums":3110,"ID":2,"tag":787623},{"start_date":"2016-03-17T00:00:00.000Z","Nums":676754,"ID":1,"tag":787623}]


But, I like the nicely formatted style.

any help is appreciated.

Answer

The DataFrame.to_dict() method has a lot of options for how the dictionary is laid out. Then you can use the indent option in the json module functions to create the nicely formatted string.

Example:

d={}
for tag,group in df.set_index('tag').groupby(level='tag'):
    d[str(tag)]=group.to_dict(orient='records')
print(json.dumps(d,indent=1))

Gives

{
 "787623": [
  {
   "ID": 2,
   "Nums": 3110,
   "start_date": "2016-03-13"
  },
  {
   "ID": 1,
   "Nums": 676754,
   "start_date": "2016-03-17"
  }
 ]
}

If you didn't have duplicate entries in the tag field, a one-liner would work.