I have a DataFrame containing many NaN values. I want to delete rows that contain too many NaN values; specifically: 7 or more.
I tried using the dropna function several ways, but it seems to greedily delete any row or column that contains a NaN value.
This question (Slice Pandas DataFrame by Row) shows me that if I can just compile a list of the rows that have too many NaN values, I can delete them all with a simple
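total = len(df.columns)  # number of columns in each row
rows_to_drop = []  # illustrative name: index labels of rows with too many NaN values
for idx, row in df.iterrows():
    m = total - row.count()  # row.count() counts non-NaN values, so m is the NaN count
    if m >= 7:  # "7 or more"
        rows_to_drop.append(idx)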
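The way to do this is to take the number of columns, work out the minimum number of non-NaN values a row must have, and drop the rows that fall short of it. The thresh argument of dropna is that minimum. Dropping rows with 7 or more NaN values means keeping rows with at most 6, so a row needs at least len(df.columns) - 6 non-NaN values:
df = df.dropna(thresh=len(df.columns) - 6)
dropna returns a new DataFrame, so assign the result back.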
See the docs
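As a quick check of the threshold arithmetic, here is a minimal sketch on a made-up 10-column frame. The frame, the column names, and the max_nans variable are assumptions for illustration only. It also shows an equivalent boolean-mask filter, which some people find easier to read than thresh.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 10 columns, rows with 0, 6, 7 and 10 NaN values.
n_cols = 10
nan_counts = [0, 6, 7, 10]
df = pd.DataFrame(
    [[np.nan] * k + [1.0] * (n_cols - k) for k in nan_counts],
    columns=[f"c{i}" for i in range(n_cols)],
)

max_nans = 6  # rows with 7 or more NaN values should be dropped

# Option 1: thresh is the minimum number of non-NaN values a row must have.
kept = df.dropna(thresh=len(df.columns) - max_nans)

# Option 2: count NaN values per row and keep rows at or under the limit.
kept_mask = df[df.isna().sum(axis=1) <= max_nans]

print(kept.index.tolist())
```

Both options keep the rows with 0 and 6 NaN values and drop the rows with 7 and 10. The mask version states the limit directly in terms of NaN counts, which avoids the off-by-one that is easy to make when converting to a non-NaN threshold.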