pang2016 - 3 years ago
Python Question

fillna by other dataframe row

I have a problem:

import pandas
df1=pandas.DataFrame([['2017-1-22',25,None],['2017-1-23','',''],['2017-1-24',24,15]],columns=['date','high_tem','low_tem'])
df2=pandas.DataFrame([['2017-1-22',22,18],['2017-1-23',23,''],['2017-1-24',20,10]],columns=['date','high_tem','low_tem'])
df3=pandas.DataFrame([['2017-1-22',25,16],['2017-1-23',24,18],['2017-1-24',22,11]],columns=['date','high_tem','low_tem'])


df1, df2 and df3 look like this:

        date high_tem low_tem
0  2017-1-22       25    None
1  2017-1-23
2  2017-1-24       24      15

        date high_tem low_tem
0  2017-1-22       22      18
1  2017-1-23       23
2  2017-1-24       20      10

        date high_tem low_tem
0  2017-1-22       25      16
1  2017-1-23       24      18
2  2017-1-24       22      11


I am trying to get a result like this:

        date high_tem low_tem
0  2017-1-22       22      18
1  2017-1-23       24      18
2  2017-1-24       24      15


The None is in the first row of df1 (index 0), so I use the first row of df2 (which has the same date) to replace it.

The empty string is in the second row of df1 (index 1), and the second row of df2 also contains an empty string, so I use the second row of df3 to replace it.

The third row of df1 contains neither None nor an empty string, so I leave that row unchanged.

Thanks

Answer Source

You could loop through each dataframe and replace None and empty strings with np.nan. Whenever a row contains any NaN, set both temperature columns in that row to NaN, so the whole row gets filled from the next dataframe.

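import numpy as np
for df in [df1, df2, df3]:
    # Treat None and empty strings as missing values.
    df.replace({None: np.nan, "": np.nan}, inplace=True)
    # If any value in a row is missing, blank out both temperature columns.
    df.loc[df.isnull().any(axis=1), ['high_tem', 'low_tem']] = np.nan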

After this, df1, df2 and df3 are modified in place: every row that had a missing value now has NaN in both temperature columns.
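To see what the masking step does on its own, here is a minimal sketch applied to a copy of df1 only. The names `value_cols`, `cleaned` and `incomplete` are mine, not from the question, and it relies on `isnull()` already counting None as missing, so only the empty strings need replacing.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    [['2017-1-22', 25, None], ['2017-1-23', '', ''], ['2017-1-24', 24, 15]],
    columns=['date', 'high_tem', 'low_tem'],
)

# Illustrative names (assumptions, not part of the original answer).
value_cols = ['high_tem', 'low_tem']

# Work on a copy so the original df1 is left untouched.
cleaned = df1.copy()

# Empty strings become NaN; None is already treated as missing by isnull().
cleaned[value_cols] = cleaned[value_cols].replace('', np.nan)

# Boolean mask: True for rows where at least one temperature is missing.
incomplete = cleaned[value_cols].isnull().any(axis=1)

# Blank out both temperature columns in those rows.
cleaned.loc[incomplete, value_cols] = np.nan

print(incomplete.tolist())
print(cleaned)
```

Rows 0 and 1 are flagged as incomplete and end up with NaN in both columns, while row 2 keeps its values.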

Then chain DataFrame.combine_first() in df1→df2→df3 order, so that values missing from df1 are taken from df2, and anything still missing is taken from df3.

df1.combine_first(df2).combine_first(df3)
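Note that combine_first() aligns rows on the index, not on the date column. That works here because all three frames list the dates in the same order. If that is not guaranteed, one option is to use the date as the index first. The following is only a sketch of that variant, run end to end on the question's data; `blank_incomplete_rows` is a hypothetical helper name, not a pandas function.

```python
import numpy as np
import pandas as pd

cols = ['date', 'high_tem', 'low_tem']
df1 = pd.DataFrame([['2017-1-22', 25, None], ['2017-1-23', '', ''], ['2017-1-24', 24, 15]], columns=cols)
df2 = pd.DataFrame([['2017-1-22', 22, 18], ['2017-1-23', 23, ''], ['2017-1-24', 20, 10]], columns=cols)
df3 = pd.DataFrame([['2017-1-22', 25, 16], ['2017-1-23', 24, 18], ['2017-1-24', 22, 11]], columns=cols)


def blank_incomplete_rows(df):
    """Hypothetical helper: return a date-indexed copy with incomplete rows set to NaN."""
    # Index by date so that later alignment happens on the date, not the row position.
    out = df.set_index('date').replace('', np.nan)
    # Any row with a missing value is blanked out completely.
    out.loc[out.isnull().any(axis=1), :] = np.nan
    return out


frames = [blank_incomplete_rows(df) for df in (df1, df2, df3)]

# Fill df1 from df2 first, then fill whatever is still missing from df3.
result = frames[0].combine_first(frames[1]).combine_first(frames[2]).reset_index()
print(result)
```

With the sample data this should give the values asked for in the question: 22/18 for 2017-1-22 (from df2), 24/18 for 2017-1-23 (from df3) and 24/15 for 2017-1-24 (kept from df1). Depending on the pandas version the numbers may be displayed as floats.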
