mar tin - 27 days ago
Python Question

Pandas: check rows for columns division match with rounding

Suppose I have a DataFrame df as

A B r
145 146 99.32
1 10 10
2 20 35


The column r is meant to be the ratio of A to B, except for cases like the third row. But, as you can see, this ratio in the first row has been rounded.

If I run

df[df.A/df.B == df.r]


I don't catch any rows because of the rounding. Obviously I could construct the column with the division, round it and then do the comparison, but is there a way to do this directly from the selection instruction above?

Answer

I would use the np.isclose() function:

In [32]: df
Out[32]:
   A  B          r
0  3  7   0.420000
1  3  7   0.428571
2  1  2  10.000000

In [33]: df.A/df.B
Out[33]:
0    0.428571
1    0.428571
2    0.500000
dtype: float64

In [34]: np.isclose(df.A/df.B, df.r)
Out[34]: array([False,  True, False], dtype=bool)

In [35]: np.isclose(df.A/df.B, df.r, atol=1e-2)
Out[35]: array([ True,  True, False], dtype=bool)

In [36]: df.loc[np.isclose(df.A/df.B, df.r, atol=1e-2)]
Out[36]:
   A  B         r
0  3  7  0.420000
1  3  7  0.428571

In [37]: df.loc[np.isclose(df.A/df.B, df.r)]
Out[37]:
   A  B         r
1  3  7  0.428571

It's pretty flexible - you can specify relative or absolute tolerance:

rtol : float
    The relative tolerance parameter (see Notes).

atol : float
    The absolute tolerance parameter (see Notes).

equal_nan : bool
    Whether to compare NaN’s as equal. If True, NaN’s in a will be considered equal to NaN’s in b in the output array.
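
As a rough sketch of how these parameters could apply to the data in the question: np.isclose(a, b) checks |a - b| <= atol + rtol * |b| element-wise, so setting rtol=0 leaves a purely absolute check. The snippet below assumes that r holds A/B as a percentage rounded to two decimals (the sample values suggest this, but the question does not say so), and the tolerance of half the last rounded digit is likewise an assumption, not something the question specifies.

```python
import numpy as np
import pandas as pd

# Data from the question. Assumption: r is A/B expressed as a percentage,
# rounded to two decimal places.
df = pd.DataFrame({"A": [145, 1, 2],
                   "B": [146, 10, 20],
                   "r": [99.32, 10.0, 35.0]})

# np.isclose(a, b) tests |a - b| <= atol + rtol * |b| element-wise.
# rtol=0 makes the check purely absolute; 0.005 is half of the last
# rounded digit, plus a tiny margin for floating-point error.
mask = np.isclose(100 * df.A / df.B, df.r, rtol=0, atol=0.005 + 1e-9)

matches = df[mask]       # rows where r agrees with the rounded ratio
mismatches = df[~mask]   # rows where r is something else

# equal_nan=True treats NaN at the same position in both inputs as a match.
nan_mask = np.isclose([np.nan, 1.0], [np.nan, 1.1], equal_nan=True)
```

Under these assumptions the first two rows are selected and the third is not, all within a single selection expression and without adding a helper column.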