Dennis Golomazov - 1 month ago
Python Question

Pandas Series any() vs all()

>>> import pandas as pd
>>> s = pd.Series([float('nan')])
>>> s.any()
False
>>> s.all()
True


Isn't that weird? The documentation for any (Return whether any element is True over requested axis) and all (Return whether all elements are True over requested axis) reads almost identically, so the difference in behavior doesn't make sense to me.

What gives?

Answer

This seems to come from pandas skipping NaN by default (skipna=True) unless told not to:

>>> pd.Series([float('nan')]).any()
False
>>> pd.Series([float('nan')]).all()
True
>>> pd.Series([float('nan')]).any(skipna=False)
True
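
One way to see what skipping NaN means here is to drop the NaN by hand and reduce what is left. This is only a sketch of the equivalent behavior, not how pandas implements skipna internally, and the variable names are illustrative:

```python
import pandas as pd

s = pd.Series([float('nan')])

# Dropping the NaN by hand leaves an empty Series.
dropped = s.dropna()

# Reducing an empty Series: no element is truthy, and no element is falsy.
any_after_drop = bool(dropped.any())
all_after_drop = bool(dropped.all())

# With NaN kept, the single element itself is tested for truthiness.
any_keep_nan = bool(s.any(skipna=False))

print(len(dropped), any_after_drop, all_after_drop, any_keep_nan)
# 0 False True True
```

The first three values match the default any() and all() results above, which suggests the defaults behave as if the Series were empty.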

Note that NaN is truthy, which is why any(skipna=False) returns True:

>>> bool(float('nan'))
True

Also note: this is consistent with the built-in any and all. Once the NaN is skipped, the Series is effectively empty, and empty iterables return True for all and False for any. Here is a relevant question on that topic.
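
The built-in behavior can be checked without pandas. A minimal sketch, with illustrative variable names:

```python
nan = float('nan')

# Empty iterable: any() finds no truthy element, all() finds no falsy one.
empty_any = any([])
empty_all = all([])

# NaN is a non-zero float, so it counts as truthy.
nan_any = any([nan])
nan_all = all([nan])

print(empty_any, empty_all, nan_any, nan_all)
# False True True True
```

The empty-iterable results line up with the default pandas results for an all-NaN Series, and the NaN results line up with skipna=False.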

Interestingly, the default behavior appears to be inconsistent with the documentation:

skipna : boolean, default True
    Exclude NA/null values. If an entire row/column is NA, the result will be NA

But observe:

>>> pd.Series([float('nan')]).any(skipna=None)
False
>>> pd.Series([float('nan')]).any(skipna=True)
False
>>> pd.Series([float('nan')]).any(skipna=False)
True