cjds cjds - 7 months ago 30
Python Question

Perform ID based averaging of one array with IDs from another array - NumPy

I have two numpy arrays

A= np.array([1,1,1,1,0,0,0,0,0,1])
B= np.array([2,2,2,2,32,1,12,124,1,2])
C= #mean of B's elements where A is 1
D= #mean of B's elements where A is 0

How can I do this? I think it's some combination of
but I don't understand how to calculate the mean with a mask.


You can use np.bincount for the generic case, where A may hold more than two such IDs/tags, like so -

np.bincount(A,B)/np.bincount(A)

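Basically, np.bincount(A,B) uses B as weights and gives us the ID based summations of B, where the IDs are from A, while np.bincount(A) gives the number of elements in each ID group. Dividing those summations by the counts gives the average value per ID group. Note that np.bincount requires the IDs in A to be non-negative integers.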
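The two steps described above can be sketched roughly as follows. This is only an illustration: the helper name mean_by_id is hypothetical (not part of NumPy), and the NaN handling for IDs that never occur is an added assumption rather than something the one-liner does.

```python
import numpy as np

def mean_by_id(ids, values):
    """Hypothetical helper: mean of `values` for each non-negative integer ID in `ids`."""
    ids = np.asarray(ids)
    values = np.asarray(values, dtype=float)
    sums = np.bincount(ids, weights=values)  # sums[k]: total of values where ids == k
    counts = np.bincount(ids)                # counts[k]: number of elements where ids == k
    # Assumption: IDs that never occur (count 0) get NaN instead of a 0/0 division.
    means = np.full(sums.shape, np.nan)
    np.divide(sums, counts, out=means, where=counts > 0)
    return means

A = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 1])
B = np.array([2, 2, 2, 2, 32, 1, 12, 124, 1, 2])

# Index 0 holds the mean where A == 0, index 1 the mean where A == 1.
D, C = mean_by_id(A, B)
print(C, D)
```

The result is indexed by ID, so the same call works unchanged if A contains IDs 2, 3, and so on.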

Sample run -

In [12]: A
Out[12]: array([1, 1, 1, 1, 0, 0, 0, 0, 0, 1])

In [13]: B
Out[13]: array([  2,   2,   2,   2,  32,   1,  12, 124,   1,   2])

In [14]: B[A==0].mean() # Using boolean indexing per ID and getting avg
Out[14]: 34.0

In [15]: B[A==1].mean()
Out[15]: 2.0

In [16]: np.bincount(A,B)/np.bincount(A)
Out[16]: array([ 34.,   2.])
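If the tags are not small non-negative integers (strings, or large or sparse IDs), one possible extension is to map them to consecutive integers first with np.unique(..., return_inverse=True) and then apply the same bincount division. The sketch below uses made-up illustrative data, not the arrays from the question.

```python
import numpy as np

# Illustrative data (an assumption, not from the question): string tags instead of 0/1 IDs.
tags = np.array(["b", "a", "b", "c", "a"])
vals = np.array([10.0, 1.0, 20.0, 5.0, 3.0])

# labels: the sorted unique tags.
# inverse: for each element, the index of its tag within labels,
# which is a valid non-negative integer ID array for np.bincount.
labels, inverse = np.unique(tags, return_inverse=True)
means = np.bincount(inverse, weights=vals) / np.bincount(inverse)

for label, m in zip(labels, means):
    print(label, m)
```

Every label returned by np.unique occurs at least once, so the counts are never zero here.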