Afflatus Afflatus - 8 months ago 52
Python Question

Python: Assigning # values in a list to bins, by rounding up

I want a function that can take a series and a set of bins, and basically round up to the nearest bin. For example:

my_series = [1, 1.5, 2, 2.3, 2.6, 3]

def my_function(my_series, bins):
    ...  # the function I'm looking for

my_function(my_series, bins=[1, 2, 3])
> [1, 2, 2, 3, 3, 3]

This seems very close to what NumPy's np.digitize is intended to do, but it produces the wrong values (marked with asterisks):

np.digitize(my_series, bins= [1,2,3], right=False)
> [1, 1*, 2, 2*, 2*, 3]

The reason why it's wrong is clear from the documentation:

Each index i returned is such that bins[i-1] <= x < bins[i] if bins is
monotonically increasing, or bins[i-1] > x >= bins[i] if bins is
monotonically decreasing. If values in x are beyond the bounds of
bins, 0 or len(bins) is returned as appropriate. If right is True,
then the right bin is closed so that the index i is such that
bins[i-1] < x <= bins[i] or bins[i-1] >= x > bins[i] if bins is
monotonically increasing or decreasing, respectively.

I can get closer to what I want if I pass the bins in decreasing order and set right to True...

np.digitize(my_series, bins= [3,2,1], right=True)
> [3, 2, 2, 1, 1, 1]

but then I'd have to find a way to systematically swap the lowest assignment (1) with the highest (3). That's simple with just 3 bins, but it gets hairier as the number of bins grows. There must be a more elegant way of doing all this.


We can use np.digitize with right=True to get the indices, then use np.take to pull the corresponding elements out of bins.
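
A minimal sketch of that idea, using the sample data from the question. The function name round_up_to_bins is hypothetical. It assumes bins is sorted in increasing order and that no value exceeds the largest bin; otherwise np.digitize returns len(bins) and np.take, in its default mode, raises an IndexError.

```python
import numpy as np

def round_up_to_bins(values, bins):
    """Map each value to the smallest bin that is >= the value.

    Assumes `bins` is increasing and max(values) <= bins[-1].
    """
    bins = np.asarray(bins)
    # With right=True, index i satisfies bins[i-1] < x <= bins[i],
    # so bins[i] is the bin each value rounds up to.
    idx = np.digitize(values, bins, right=True)
    return np.take(bins, idx)

my_series = [1, 1.5, 2, 2.3, 2.6, 3]
print(round_up_to_bins(my_series, bins=[1, 2, 3]))  # [1 2 2 3 3 3]
```

Values at or below the smallest bin get index 0, so they map to bins[0]. This works with increasing bins directly, so there is no need to reverse the bins or the index assignment.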