Carlos Carlos - 11 months ago 72
Python Question

How to convert python JSON rows to dataframe columns without looping

I'm trying to figure out how to do the following without using a loop.

I have a dataframe that has several columns including one that has a JSON string. What I'm trying to do is convert the JSON string column into their own columns within the dataframe. For example I have the following dataframe:

Column 1 | column 2 | Json Column

123 | ABC | {"anotherNumber":345,"anotherString":"DEF"}

I want to convert to this:

Column 1 | column 2 | anotherNumber | anotherString

123 | ABC | 345 | DEF

Answer Source

You can first convert Json Column to dict if necessary by json.loads:

import json

df = pd.DataFrame({'Column 1':[123],
                   'Column 2':['ABC'], 
                   'Json Column':['{"anotherNumber":345,"anotherString":"DEF"}']})
print (df)

   Column 1 Column 2                                     Json Column
0       123      ABC  {'anotherString': 'DEF', 'anotherNumber': 345}

print (type(df.ix[0,'Json Column']))
<class 'str'>

df['Json Column'] =  df['Json Column'].apply((json.loads))

print (type(df.ix[0,'Json Column']))
<class 'dict'>

Then generate list of lists and create Dataframe from constructor:

print (df['Json Column'].values.tolist())
[{'anotherString': 'DEF', 'anotherNumber': 345}]

df1 = pd.DataFrame(df['Json Column'].values.tolist())
print (df1)
   anotherNumber anotherString
0            345           DEF

Last concat to original, where column Json Column was removed by drop:

print (pd.concat([df.drop('Json Column', axis=1), df1], axis=1))
   Column 1 Column 2  anotherNumber anotherString
0       123      ABC            345           DEF