Nathan Miller - 18 days ago
Python Question

Create a numpy array from columns of a pandas dataframe

I have a dataframe that looks like this:

A B C
1 2 3
1 5 3
4 8 2
4 2 1


I would like to create a numpy array from this data using column A as the index, column B as the column headers and column C as the fill data. In the end, it should look like this:

  2 5 8
1 3 3
4 1   2


Is there a good way to do this? I have tried df.pivot_table, but I'm worried I have messed up the data, and I would rather do it in another, more intuitive way.

Answer

Manipulate the dataframe like this:

df.set_index(['A', 'B']).C.unstack()


Or

df.set_index(['A', 'B']).C.unstack(fill_value='')
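
For anyone who wants to run the reshape end to end, here is a minimal sketch. It assumes the question's sample data is rebuilt by hand as `df`; the names `indexed` and `wide` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": [1, 1, 4, 4],
    "B": [2, 5, 8, 2],
    "C": [3, 3, 2, 1],
})

# Step 1: move A and B into a two-level index and keep only column C
indexed = df.set_index(["A", "B"])["C"]

# Step 2: pivot the inner index level (B) out into column headers.
# (A, B) combinations that never occur in the data come back as NaN.
wide = indexed.unstack()

print(wide)
```

The result has A as the row index and the distinct B values as column headers. Because NaN is a float, the C values are shown as floats wherever a gap exists.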



Get the numpy array like this:

df.set_index(['A', 'B']).C.unstack().values

array([[  3.,   3.,  nan],
       [  1.,  nan,   2.]])

Or

df.set_index(['A', 'B']).C.unstack(fill_value='').values

array([[3, 3, ''],
       [1, '', 2]], dtype=object)
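
If a purely numeric array is preferred over an object array, a numeric placeholder can be passed as `fill_value` instead. The sketch below assumes 0 is an acceptable placeholder (an arbitrary choice for illustration, not something the question specifies) and also cross-checks the result against `df.pivot`, since the question raised doubts about pivoting. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": [1, 1, 4, 4],
    "B": [2, 5, 8, 2],
    "C": [3, 3, 2, 1],
})

# fill_value=0 is an assumed placeholder for this sketch; pick a value
# that cannot be confused with real data in column C.
wide = df.set_index(["A", "B"])["C"].unstack(fill_value=0)

arr = wide.to_numpy()                 # the 2-D data, equivalent to wide.values
row_labels = wide.index.to_numpy()    # distinct values of A
col_labels = wide.columns.to_numpy()  # distinct values of B

# Cross-check: pivot on the same columns, with gaps filled the same way
pivoted = df.pivot(index="A", columns="B", values="C").fillna(0)
same = np.array_equal(arr, pivoted.to_numpy())

print(arr)
print(row_labels, col_labels, same)
```

The array itself carries no labels, so keeping `row_labels` and `col_labels` alongside it preserves the A and B values. Both `unstack` and `pivot` raise an error if an (A, B) pair appears more than once, whereas `pivot_table` aggregates duplicates (mean by default), which is one way data can change silently.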