Patthebug - 1 month ago
Python Question

fillna() produces NaN values

I am using the following code to fill the NaN values and then add a column to the DataFrame that contains the number of values in each row which are greater than 0. Here's the code:

df.fillna(0, inplace=True)
dfMin10 = df
dfMin10['Sum'] = (dfMin10.iloc[1:len(dfMin10.columns)] > 0).sum(1)
dfMin10


When I look at the Sum column, I still see some NaN values. Why would this be? I'm assuming my DataFrame (df) still has some NaN values even after replacing NaN.

Any pointers would be highly appreciated.

Answer

Are you seeing NaN in the first sum entry? This line:

branchConceptsWithScoresMin10['Sum'] = (branchConceptsWithScoresMin10.iloc[1:len(branchConceptsWithScoresMin10.columns)] > 0).sum(1)

should probably be:

branchConceptsWithScoresMin10['Sum'] = (branchConceptsWithScoresMin10.iloc[0:len(branchConceptsWithScoresMin10.columns)] > 0).sum(1)

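Note the indexing starting from 0. A single slice passed to .iloc selects rows, not columns, so starting at 1 drops the first row; when the result is assigned back, pandas aligns on the index and that row's Sum becomes NaN. Rows past position len(columns) would be dropped the same way, so .iloc[0:] is the safer form.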
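To see where the NaN comes from, it can help to look at the intermediate result before it is assigned. The following is a minimal sketch on a made-up 4-row, 2-column frame (the data and the names df and counts are illustrative only, not the asker's actual data):

```python
import pandas as pd

# Hypothetical frame for illustration: 4 rows, 2 columns, no NaN anywhere.
df = pd.DataFrame(
    {"a": [1, 0, 2, 0], "b": [0, 0, 3, 1]},
    index=["w", "x", "y", "z"],
)

# A single slice in .iloc is a ROW slice. With 2 columns this is
# .iloc[1:2], which keeps only the row labelled "x".
subset = df.iloc[1:len(df.columns)]

# Count the values greater than 0 in each remaining row.
counts = (subset > 0).sum(axis=1)
print(counts.index.tolist())

# Assignment aligns on the index: rows missing from `counts`
# ("w", "y", "z") receive NaN in the new column.
df["Sum"] = counts
print(df)
```

The frame itself never contained NaN here, so the missing values in Sum come purely from index alignment during the assignment.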

Example:

import pandas

df = pandas.DataFrame(columns=['a','b','c','d'], index=['x','y','z'])
df.fillna(0, inplace=True)
branchConceptsWithScoresMin10 = df
# Your original code
branchConceptsWithScoresMin10['Sum'] = (branchConceptsWithScoresMin10.iloc[1:len(branchConceptsWithScoresMin10.columns)] > 0).sum(1)

# This should return
#    a  b  c  d  Sum
# x  0  0  0  0  NaN
# y  0  0  0  0  0.0
# z  0  0  0  0  0.0

branchConceptsWithScoresMin10['Sum'] = (branchConceptsWithScoresMin10.iloc[0:] > 0).sum(1)

# There should not be any NaNs here.
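If the original 1: was meant to skip the first column rather than the first row (an assumption; the question does not say), the slice belongs on the second axis of .iloc. A minimal sketch with hypothetical data, where the column names id, s1, s2 and s3 are invented for illustration:

```python
import pandas as pd

# Hypothetical data: an identifier column followed by score columns.
df = pd.DataFrame(
    {
        "id": [10, 11, 12],
        "s1": [0.5, None, 0.0],
        "s2": [None, 2.0, 0.0],
        "s3": [1.0, 3.0, None],
    }
)
df = df.fillna(0)

# First axis of .iloc is rows, second is columns:
# ":" keeps every row, "1:" skips the first column ("id").
scores = df.iloc[:, 1:]

# Every row is present in the result, so no NaN appears after assignment.
df["Sum"] = (scores > 0).sum(axis=1)
print(df)
```

Because every row is kept, the assigned Series has the same index as the frame, and the Sum column comes out as plain integers.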