cjds - 14 days ago
Python Question

Perform ID based averaging of one array with IDs from another array - NumPy

I have two NumPy arrays:

import numpy as np

A = np.array([1,1,1,1,0,0,0,0,0,1])
B = np.array([2,2,2,2,32,1,12,124,1,2])
# C = mean of B's elements where A is 1
# D = mean of B's elements where A is 0


How can I do this? I think it's some combination of np.mean and np.ma, but I don't understand how to calculate the mean with a mask.

Answer

You can use np.bincount, which also covers the generic case where A holds more than two IDs/tags (the IDs must be non-negative integers), like so -

np.bincount(A,B)/np.bincount(A)

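Basically, np.bincount(A,B) gives us the ID-based sums of B, where the IDs come from A and B is passed as the weights. We then divide those sums by np.bincount(A), the number of elements in each ID group, to get the average per ID. The result is indexed by ID, so element 0 is the mean for ID 0 and element 1 is the mean for ID 1.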
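To illustrate the generic case, here is a minimal sketch with more than two IDs. The arrays ids and vals are made-up data, not from the question, and the NaN handling for an ID that never occurs (ID 2 here) is one possible choice rather than part of the original answer -

```python
import numpy as np

# Hypothetical data: IDs 0, 1 and 3 occur; ID 2 never does.
ids = np.array([0, 1, 1, 3, 0, 3, 3])
vals = np.array([10.0, 2.0, 4.0, 9.0, 20.0, 3.0, 6.0])

sums = np.bincount(ids, weights=vals)  # per-ID sums of vals
counts = np.bincount(ids)              # per-ID element counts

# Divide only where an ID actually occurs; absent IDs stay NaN
# instead of triggering a divide-by-zero warning.
means = np.full(sums.shape, np.nan)
np.divide(sums, counts, out=means, where=counts > 0)

print(means)
```

A plain sums / counts works just as well when every ID from 0 to ids.max() appears at least once, as in the question's data.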

Sample run -

In [12]: A
Out[12]: array([1, 1, 1, 1, 0, 0, 0, 0, 0, 1])

In [13]: B
Out[13]: array([  2,   2,   2,   2,  32,   1,  12, 124,   1,   2])

In [14]: B[A==0].mean() # Using boolean indexing per ID and getting avg
Out[14]: 34.0

In [15]: B[A==1].mean()
Out[15]: 2.0

In [16]: np.bincount(A,B)/np.bincount(A)
Out[16]: array([ 34.,   2.])
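
Since the question asks about np.ma specifically, here is a rough sketch of the same two means with masked arrays. Keep in mind that in np.ma a True mask entry hides an element, so the mask is the inverse of the selection you want. The names C and D follow the question -

```python
import numpy as np

A = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 1])
B = np.array([2, 2, 2, 2, 32, 1, 12, 124, 1, 2])

# mask=True hides an element, so mask everything that is NOT the wanted ID.
C = np.ma.masked_array(B, mask=(A != 1)).mean()  # mean of B where A is 1
D = np.ma.masked_array(B, mask=(A != 0)).mean()  # mean of B where A is 0

print(C, D)
```

For just two groups this or plain boolean indexing is fine; np.bincount pays off when there are many IDs, since it avoids building one mask per ID.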