emax - 1 year ago
Python Question

Python: how to count the number of repeated value pairs in a pandas dataframe?

I have a dataframe df composed of 2 columns that denote the coordinates of a matrix M. I defined the matrix M as:

s = [5, 5]
M = np.zeros((s[1], s[0]))


Now I want to count how many times each cell is referenced in the dataframe:

df

x y
0 1 4
1 0 2
3 3 1
4 4 2
5 4 2
4 2 0


What I am doing is the following:

for i in df.index:
    M[df['x'][i]][df['y'][i]] += 1


I would like to do it in a more elegant way, maybe by grouping the pandas dataframe.

The output should be a dataframe df1 that counts the number of times each (x, y) pair is repeated, so:

df1

x y count
0 1 4 1
1 0 2 1
3 3 1 1
4 4 2 2
5 2 0 1


and the matrix M:


M

array([[ 0., 0., 1., 0., 0.],
       [ 0., 0., 0., 0., 1.],
       [ 1., 0., 0., 0., 0.],
       [ 0., 1., 0., 0., 0.],
       [ 0., 0., 2., 0., 0.]])

Answer Source

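You could group by both columns, count each group, unstack the result into a grid, and then reindex both axes so that coordinates that never occur still get a row and a column: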

M = (df.groupby(['x','y'])['x']
       .count()
       .unstack()
       .reindex(index=np.arange(df.x.max()+1),
                columns=np.arange(df.y.max()+1))
       .fillna(0)
       .values)

Output:

[[ 0.  0.  1.  0.  0.]
 [ 0.  0.  0.  0.  1.]
 [ 1.  0.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.]
 [ 0.  0.  2.  0.  0.]]
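
The question also asks for a dataframe df1 with one row per (x, y) pair and its count. A minimal sketch of one way to get it, assuming the sample data from the question with a default integer index (the original index labels are not reproduced here): group by both columns, take the group sizes, and turn the result back into a dataframe. The same counts can then fill a matrix of fixed shape, which avoids relying on the maximum values in df.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the default integer index is an assumption.
df = pd.DataFrame({"x": [1, 0, 3, 4, 4, 2],
                   "y": [4, 2, 1, 2, 2, 0]})

# One row per distinct (x, y) pair with its number of occurrences.
# sort=False keeps the pairs in order of first appearance.
df1 = df.groupby(["x", "y"], sort=False).size().reset_index(name="count")

# Fill a matrix of the shape defined in the question from the counts.
# Each (x, y) pair appears only once in df1, so plain index assignment is enough.
s = [5, 5]
M = np.zeros((s[1], s[0]))
M[df1["x"].to_numpy(), df1["y"].to_numpy()] = df1["count"].to_numpy()

print(df1)
print(M)
```

Because M is created with an explicit shape, it keeps all 5 rows and columns even if the largest coordinates never show up in df.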