eleanora eleanora - 2 months ago 15
Python Question

How to separate a numpy array into separate columns in pandas

I have a dataframe that looks like

  ID_0 ID_1  ID_2
0    a    b  0.05
1    a    b  0.10
2    a    b  0.19
3    a    c  0.25
4    a    c  0.40
5    a    c  0.65
6    a    c  0.71
7    d    c  0.95
8    d    c  1.00


I want to group by ID_0 and ID_1 and compute a normalized histogram of the ID_2 column for each group. So I do

df.groupby(['ID_0', 'ID_1']).apply(lambda x: np.histogram(x['ID_2'], range = (0,1), density=True)[0]).reset_index(name='ID_2')


However, what I would really like is for the elements of each numpy array (10 values per group, since np.histogram defaults to 10 bins) to be in separate columns of the dataframe.

How can I do this?

Answer

You can construct a pandas Series from each numpy array. When the function passed to apply returns a Series, pandas expands its elements into separate columns:

import pandas as pd
import numpy as np
df.groupby(['ID_0', 'ID_1']).apply(lambda x: pd.Series(np.histogram(x['ID_2'], range = (0,1), density=True)[0])).reset_index()
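For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. The sample data is copied from the question; the names density_histogram and wide are only illustrative and do not come from the original post. Recent pandas versions may emit a deprecation warning about apply operating on the grouping columns, which should not affect the result here.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID_0": list("aaaaaaadd"),
    "ID_1": list("bbbcccccc"),
    "ID_2": [0.05, 0.10, 0.19, 0.25, 0.40, 0.65, 0.71, 0.95, 1.00],
})


def density_histogram(group):
    """Return the normalized histogram of ID_2 for one group as a Series."""
    # np.histogram returns (values, bin_edges). With the default bins=10,
    # values has 10 entries and bin_edges has 11.
    values, _edges = np.histogram(group["ID_2"], range=(0, 1), density=True)
    # Returning a Series makes apply spread the values across columns.
    return pd.Series(values)


# One row per (ID_0, ID_1) group, one column per histogram bin.
wide = df.groupby(["ID_0", "ID_1"]).apply(density_histogram).reset_index()
print(wide)
```

The result should have one row per group and, after reset_index, the two key columns followed by the bin columns labelled 0 through 9. Because density=True is used, each row's bin values multiplied by the bin width (0.1 here) sum to 1.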

