alphanumeric alphanumeric - 6 months ago 41
Python Question

How to fill missing DataFrame value by copying a value from another column

Mick Jagger's last name is missing in a data. Only the fist name and the band's name were specified

import pandas as pd

df = pd.DataFrame({ 'fist': ['John', 'Mick'],
'last':['Lennon', ''],
'band':['Beatles', 'Rolling Stones'] })

I can easily assign a new value to the missing field with:

df['last'][df['last'] == ''] = 'Jagger'

But instead of assigning a new value 'Jagger' I would rather fill it by copying a value taken from the corresponding 'band' column. So the end result would look like:

band fist last
0 Beatles John Lennon
1 Rolling Stones Mick Rolling Stones

How to achieve it?


You can use pandas.Series.where() to conditionally replace values of the current Series with values from another Series:

df['last'].where(df['last'] != "",

#0            Lennon
#1    Rolling Stones
#Name: last, dtype: object

df['last'] = df['last'].where(df['last'] != "",

#   band            fist              last
#0  Beatles         John            Lennon
#1  Rolling Stones  Mick    Rolling Stones