tommy.carstensen - 1 month ago
Python Question

Filter Pandas dataframe based on strict inequality and missing data

I am new to Pandas. How do I filter based on either a strict inequality or missing data? In the code below I want column 'one' to be either above a threshold or missing. How do I achieve this? Thanks.

import pandas as pd
import numpy as np

d = {
    'one' : [1.1, np.nan, 3.1],
    'two' : [3.2, 2.2, 1.2],
}

df = pd.DataFrame(d)

for one in np.arange(0, 6, 1.):
    df1 = df[(df['one']>one) | (df['one']==np.nan)]
    if len(df1) == 0:
        continue
    for two in np.arange(0, 6, 1.):
        df2 = df1[(df1['two']>two)]
        if len(df2) == 0:
            continue
        print(one, two, len(df2))

Answer

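Comparing with == np.nan never matches, because NaN is not equal to anything, including itself, so (df['one']==np.nan) is False for every row. Use the isnull() method to identify missing values instead: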

df.loc[(df['one'] > 2) | (df['one'].isnull())]

#    one  two
# 1  NaN  2.2
# 2  3.1  1.2
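
Applied to the loop from the question, the same mask could look like the sketch below. The helper name filter_above_or_missing and the counts dictionary are made up here for illustration and are not part of pandas; the data and thresholds are taken from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'one': [1.1, np.nan, 3.1],
    'two': [3.2, 2.2, 1.2],
})

# NaN never compares equal to anything, including itself,
# so this mask is False for every row.
eq_mask = df['one'] == np.nan


def filter_above_or_missing(frame, column, threshold):
    """Hypothetical helper: keep rows where `column` is strictly
    above `threshold` or is missing (NaN)."""
    col = frame[column]
    return frame[(col > threshold) | col.isnull()]


# Row counts per (one, two) threshold pair, kept only for inspection.
counts = {}
for one in np.arange(0, 6, 1.):
    df1 = filter_above_or_missing(df, 'one', one)
    for two in np.arange(0, 6, 1.):
        df2 = df1[df1['two'] > two]
        if len(df2) == 0:
            continue
        counts[(one, two)] = len(df2)
        print(one, two, len(df2))
```

Because the row with the missing value always passes the first filter, df1 is never empty in this sketch, so the emptiness check on df1 from the question is left out.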