robintw - 1 year ago 182

Python Question

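I've got an `ndarray` of floating-point values and want to find its unique values, treating values that differ by less than some tolerance as equal. Is there a way to do this? At the moment I am simply doing: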

`unique(array)`

Which gives me something like:

`array([ -Inf, 0.62962963, 0.62962963, 0.62962963, 0.62962963, 0.62962963])`

where the values that look the same (to the number of decimal places being displayed) are obviously slightly different.

Answer

Don't `floor` and `round` both fail the OP's requirement in some cases?

```
import numpy as np
np.floor([5.99999999, 6.0])      # array([5., 6.])  nearly equal inputs, different results
np.round([6.50000001, 6.5], 0)   # array([7., 6.])  same problem at a rounding boundary
```

The way I would do it, which may not be optimal and is surely slower than the other answers, is something like this:

```
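import numpy as np

TOL = 1.0e-3
a = np.random.random((10, 10))
flat = a.ravel()          # 1-D view of the data
i = np.argsort(flat)      # indices that sort the values
d = np.diff(flat[i])      # gaps between sorted neighbours (length n - 1)
# Keep each value whose gap to the next larger value exceeds TOL.
# i[:-1] matches the length of d, so the overall maximum is never selected.
result = flat[i[:-1][d > TOL]]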
```

Of course, this method keeps only the largest member of each run of values that lie within the tolerance of a neighbour, and the overall maximum is never returned because it has no larger neighbour to compare against. As a result, you may not find any unique values at all if every value is sufficiently close to its neighbour, even though max - min is larger than the tolerance.
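One way around that limitation is to report one representative per run instead of testing each value against its successor. The following is only a sketch under assumptions: `unique_within_tol` is a hypothetical helper name (not a NumPy function), and it arbitrarily picks the smallest member of each run. Chained values still merge into a single run, so a run can span more than the tolerance.

```python
import numpy as np

def unique_within_tol(values, tol):
    """Return the smallest member of each run of close values.

    Hypothetical helper, not part of NumPy. A run is a maximal stretch of
    sorted values whose neighbouring gaps are all <= tol.
    """
    s = np.sort(np.asarray(values, dtype=float).ravel())
    if s.size == 0:
        return s
    # True wherever a new run begins; the first sorted value always starts one.
    starts = np.concatenate(([True], np.diff(s) > tol))
    return s[starts]

vals = np.array([0.5, 0.5004, 2.0, 2.0003, 2.0006, 7.0])
reps = unique_within_tol(vals, 1.0e-3)
print(reps)
```

Because the first sorted value always opens a run, the result is never empty for non-empty input, and the last run is reported like any other.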

Here is essentially the same algorithm, written to be easier to follow; it should also be faster because it avoids an indexing step:

```
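a = np.random.random((10,))
b = np.sort(a)              # sorted copy; a is left untouched
d = np.diff(b)              # gaps between sorted neighbours (length n - 1)
result = b[:-1][d > TOL]    # b[:-1] matches the length of d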
```

The OP may also want to look into `scipy.cluster` (for a fancy version of this method) or `numpy.digitize` (for a fancy version of the other two methods).
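As a rough illustration of the `numpy.digitize` route, the sketch below assigns each value to a bin of width `TOL` and keeps the first value seen in each occupied bin. The sample data and the choice of bin edges are assumptions made for the example. It also assumes finite input, so a `-Inf` entry would have to be handled separately. Like `floor` and `round`, fixed bins can split two close values that straddle an edge.

```python
import numpy as np

TOL = 1.0e-3  # same tolerance name as in the answer above
a = np.array([0.62962963, 0.62962971, 0.25, 0.2504, 0.9])  # made-up sample data

# Bin edges spaced TOL apart, covering the (finite) range of the data.
edges = np.arange(a.min(), a.max() + TOL, TOL)
# For each value x, digitize returns the index i with edges[i-1] <= x < edges[i].
bins = np.digitize(a, edges)
# One representative per occupied bin: the first value that landed in it.
_, first = np.unique(bins, return_index=True)
result = a[first]
print(result)
```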

Source (Stackoverflow)