webmaker - 10 months ago 48

Python Question

I reviewed the following posts beforehand. Is there a way to use `isin()` with an approximation factor or a tolerance value? Or is there another method that can do this?

How to filter the DataFrame rows of pandas by "within"/"in"?

use a list of values to select rows from a pandas dataframe

Example:

`df = DataFrame({'A' : [5,6,3.3,4], 'B' : [1,2,3.2, 5]})`

    In : df
    Out:
         A    B
    0    5    1
    1    6    2
    2  3.3  3.2
    3    4    5

What I would like is something like this (`tol` is not an actual `isin` argument):

    In : df[df['A'].isin([3, 6], tol=.5)]
    Out:
         A    B
    1    6    2
    2  3.3  3.2

Answer Source

You can do something similar with NumPy's `np.isclose`:

```
import numpy as np

df[np.isclose(df['A'].values[:, None], [3, 6], atol=.5).any(axis=1)]
Out:
     A    B
1  6.0  2.0
2  3.3  3.2
```

np.isclose returns this:

```
np.isclose(df['A'].values[:, None], [3, 6], atol=.5)
Out:
array([[False, False],
       [False,  True],
       [ True, False],
       [False, False]], dtype=bool)
```

This is a pairwise comparison of the elements of `df['A']` with `[3, 6]` (that's why we needed `df['A'].values[:, None]`, for broadcasting). Since you want to know whether each element is close to any one of the values in the list, we call `.any(axis=1)` at the end.
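If you need this in more than one place, the broadcasting trick above can be wrapped in a small helper. The following is only a sketch: `isin_tol` and its parameter names are made up for illustration and are not part of pandas or NumPy. It also passes `rtol=0`, which is an assumption about what you want: by default `np.isclose` adds a small relative tolerance (`rtol=1e-05`) on top of `atol`, and setting it to zero makes the check a plain `|x - v| <= tol`.

```python
import numpy as np
import pandas as pd


def isin_tol(series, values, tol):
    """Hypothetical helper: True where an element is within `tol` of any value."""
    column = series.to_numpy()[:, None]        # shape (n, 1), ready for broadcasting
    targets = np.asarray(values, dtype=float)  # shape (m,)
    # rtol=0 (an assumption) keeps the comparison purely absolute
    close = np.isclose(column, targets, rtol=0, atol=tol)  # shape (n, m)
    # A row matches if it is close to at least one target value
    return pd.Series(close.any(axis=1), index=series.index)


df = pd.DataFrame({'A': [5, 6, 3.3, 4], 'B': [1, 2, 3.2, 5]})
mask = isin_tol(df['A'], [3, 6], tol=0.5)
result = df[mask]  # keeps the rows with index 1 and 2
```

Returning a boolean `Series` that shares the original index lets the mask be combined with other conditions, for example `df[mask & (df['B'] > 2)]`.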