Carla - 1 month ago
Python Question

Convert a column (in index format) to a dataframe?

I have a column in my dataframe that is formatted like an index:

0 [u'Basketball', u'Swimming', u'Gym']
1 [u'Gym', u'Soccer', u'Football']
2 [u'Ballet', u'Basketball', u'Volleyball']


Is there an easy way to clean this up (remove the u prefixes and the square brackets) and then split on the commas, so that the sports end up in three columns?

Answer

Consider the series s:

import pandas as pd

s = pd.Series([
    "[u'Basketball', 'Swimming', 'Gym']",
    "[u'Gym', u'Soccer', u'Football']",
    "[u'Ballet', u'Basketball', u'Volleyball']"
])
s

0           [u'Basketball', 'Swimming', 'Gym']
1             [u'Gym', u'Soccer', u'Football']
2    [u'Ballet', u'Basketball', u'Volleyball']
dtype: object

The quickest way is to apply eval, which parses each string into a Python list:

s.apply(eval)

0         [Basketball, Swimming, Gym]
1             [Gym, Soccer, Football]
2    [Ballet, Basketball, Volleyball]
dtype: object
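
One caveat: eval runs whatever expression each string contains, so it is only appropriate for data you trust. A safer variant, assuming the strings are plain Python list literals as in the example, is ast.literal_eval from the standard library. A minimal sketch:

```python
import ast

import pandas as pd

s = pd.Series([
    "[u'Basketball', 'Swimming', 'Gym']",
    "[u'Gym', u'Soccer', u'Football']",
    "[u'Ballet', u'Basketball', u'Volleyball']",
])

# literal_eval only accepts Python literals (strings, numbers, lists, ...),
# so a cell holding anything else raises an error instead of being executed.
parsed = s.apply(ast.literal_eval)
print(parsed.tolist())
```

Each element of parsed is then an ordinary list of strings, the same as with eval.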

To get a dataframe, expand each parsed list into its own row with pd.Series:

s.apply(eval).apply(pd.Series)

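
If you want named columns, something along these lines should work, assuming every row holds exactly three sports. The column names sport_1 to sport_3 are made up for illustration, and ast.literal_eval is swapped in for eval as a safer parser:

```python
import ast

import pandas as pd

s = pd.Series([
    "[u'Basketball', 'Swimming', 'Gym']",
    "[u'Gym', u'Soccer', u'Football']",
    "[u'Ballet', u'Basketball', u'Volleyball']",
])

# Parse each string into a list, then build the frame from the list of lists.
# Assumption: every row has exactly three entries; otherwise the fixed
# column list below would not match and pandas would raise an error.
df = pd.DataFrame(
    s.apply(ast.literal_eval).tolist(),
    index=s.index,
    columns=["sport_1", "sport_2", "sport_3"],  # hypothetical names
)
print(df)
```

Building the frame from the list of lists avoids creating one intermediate Series per row, which tends to matter on larger data.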