user1253493 user1253493 - 7 months ago 42
Python Question

Convert Pandas Dataframe to specific json format

I have been looking at different methods to export pandas dataframes into json files but I am not sure how to include other string 'constants' into the JSON.

The purpose is to spit out a JSON file that can be read by chart.js.

The format of the pandas dataframe for the example is:

df = pandas.DataFrame({
"labels" : [1,2,3,4,5],
"data" : [5,4,3,2,1]
})


The format of the JSON required file is:

{
labels: [1,2,3,4,5],
datasets: [
{
label: "My First dataset",
fillColor: "rgba(220,220,220,0.2)",
strokeColor: "rgba(220,220,220,1)",
pointColor: "rgba(220,220,220,1)",
pointStrokeColor: "#fff",
pointHighlightFill: "#fff",
pointHighlightStroke: "rgba(220,220,220,1)",
data: [5,4,3,2,1]
}
]
}


I can export the pandas file as JSON using the inbuilt functions of pandas but I do not know how to separate the vectors and add the constant values seen above.

Answer

I recommend using a templating library like mako or jinja. Here's a quick mako file that contains your JSON, but references a DataFrame argument df which mako will pass to it when asked to render the final form of the document (i.e. with actual data).

Here's the file "pandas_to_json.mako":

<%page args="df"/>

<%
"""Comments about this template file.
"""
# Python imports, helper function definitions, etc.
import pandas
%>

<%text> 
{
    labels: </%text> ${df.labels.values.tolist()} <%text>,
    datasets: [
        {
            label: "My First dataset",
            fillColor: "rgba(220,220,220,0.2)",
            strokeColor: "rgba(220,220,220,1)",
            pointColor: "rgba(220,220,220,1)",
            pointStrokeColor: "#fff",
            pointHighlightFill: "#fff",
            pointHighlightStroke: "rgba(220,220,220,1)",
            data: </%text> ${df.data.values.tolist()} <%text>
        }
    ]
}
</%text>

Then in a .py file called whatever you want (I called it "pandas_to_json.py") you can run this (from the same directory with the .mako file):

import pandas
from mako.template import Template
mako_file_name = "pandas_to_json.mako"

df = pandas.DataFrame({
    "labels" : [1,2,3,4,5],
    "data" : [5,4,3,2,1]
})

t = Template(filename=mako_file_name)
print t.render(df)

and it prints out this:

In [1]: %run pandas_to_json.py




{
    labels:  [1, 2, 3, 4, 5] ,
    datasets: [
        {
            label: "My First dataset",
            fillColor: "rgba(220,220,220,0.2)",
            strokeColor: "rgba(220,220,220,1)",
            pointColor: "rgba(220,220,220,1)",
            pointStrokeColor: "#fff",
            pointHighlightFill: "#fff",
            pointHighlightStroke: "rgba(220,220,220,1)",
            data:  [5, 4, 3, 2, 1] 
        }
    ]
}

For this to work, you just need to install mako. If you use the conda package management tool, then this code worked for me to install mako for Python 2.7 on Ubuntu 13:

conda install -c auto mako

but you can also use other install tools like pip or just follow mako's installation instructions.

For some reason, the more basic conda install mako did not work, claiming to find no package named mako in the linux-64 channels (despite the fact that use the "auto" channel from binstar in my first command does work, which ostensibly should be equivalent).

I've heard good things about jinja as well, and it may be more convenient, easier to install/use, or better for you for other reasons too. You should probably check them both out before choosing one to use.

The big point here is that you control what the template rendering process produces, instead of being at the mercy of pandas to do it for you.

Comments