Nishant ranjan - 7 months ago
Python Question

Pandas numpy.where() use - Not getting the desired result

I am trying to merge two columns into a third column based on the NaN values:

df['code2'] = np.where(df['code']==np.nan, df['code'], df['code1'])


I am getting only the values of the code1 column in code2. The result is shown in the image:
Output image

Please tell me what is wrong with the code I am writing. Thanks.

Answer

I think you need isnull to test for NaN, because NaN never compares equal to anything, including itself:

df['code2'] = np.where(df['code'].isnull(), df['code'], df['code1'])
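Note that the line above keeps the argument order from the question, so it returns code (that is, NaN) where code is missing and code1 everywhere else. If the goal is to keep code where it is present and fall back to code1 otherwise, the last two arguments would need to be swapped. A minimal sketch under that assumption, with made-up sample data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "code": [1.0, np.nan, 3.0, np.nan],
    "code1": [np.nan, 2.0, np.nan, 4.0],
})

# The comparison from the question is False on every row,
# so np.where always takes its third argument (code1).
never_true = df["code"] == np.nan

# isnull() flags the missing rows: take code1 there, code everywhere else.
df["code2"] = np.where(df["code"].isnull(), df["code1"], df["code"])

# Same result, assuming the same goal: fill the gaps in code from code1.
df["code2_alt"] = df["code"].fillna(df["code1"])
```

With this sample data both code2 and code2_alt contain no NaN, since every row has a value in at least one of the two source columns.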

Docs:

Warning

One has to be mindful that in python (and numpy), the nan's don’t compare equal, but None's do. Note that Pandas/numpy uses the fact that np.nan != np.nan, and treats None like np.nan.

In [11]: None == None
Out[11]: True

In [12]: np.nan == np.nan
Out[12]: False

So as compared to above, a scalar equality comparison versus a None/np.nan doesn’t provide useful information.

In [13]: df2['one'] == np.nan
Out[13]: 
a    False
b    False
c    False
d    False
e    False
f    False
g    False
h    False
Name: one, dtype: bool
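For contrast with the all-False result above, here is a small sketch of what isnull returns on the same kind of column. The Series below is a hypothetical stand-in, not the df2 from the docs:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a column like df2['one'] with some missing values.
one = pd.Series([0.5, np.nan, 1.2, np.nan], index=list("abcd"), name="one")

# Element-wise == against np.nan is False on every row, even the NaN rows.
eq_mask = one == np.nan

# isnull() is True exactly where the value is missing.
null_mask = one.isnull()

# None stored in a numeric Series is treated as missing as well.
with_none = pd.Series([1.0, None, 3.0])
none_mask = with_none.isnull()
```

The mask from isnull is what np.where needs as its condition; the == mask selects nothing.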