ArchieTiger ArchieTiger - 2 months ago 14
Python Question

how to shift single value of a pandas dataframe column

Using pandas

first_valid_index()
to get index of first non-null value of a column, how can I shifta single value of column rather than the whole column. i.e.

data = {'year': [2010, 2011, 2012, 2013, 2014, 2015, 2016,2017, 2018, 2019],
'columnA': [10, 21, 20, 10, 39, 30, 31,45, 23, 56],
'columnB': [None, None, None, 10, 39, 30, 31,45, 23, 56],
'total': [100, 200, 300, 400, 500, 600, 700,800, 900, 1000]}

df = pd.DataFrame(data)
df = df.set_index('year')
print df
columnA columnB total
year
2010 10 NaN 100
2011 21 NaN 200
2012 20 NaN 300
2013 10 10 400
2014 39 39 500
2015 30 30 600
2016 31 31 700
2017 45 45 800
2018 23 23 900
2019 56 56 1000

for col in df.columns:
if col not in ['total']:
idx = df[col].first_valid_index()
df.loc[idx, col] = df.loc[idx, col] + df.loc[idx, 'total'].shift(1)

print df

AttributeError: 'numpy.float64' object has no attribute 'shift'


desired result:

print df
columnA columnB total
year
2010 10 NaN 100
2011 21 NaN 200
2012 20 NaN 300
2013 10 310 400
2014 39 39 500
2015 30 30 600
2016 31 31 700
2017 45 45 800
2018 23 23 900
2019 56 56 1000

Answer

You can filter all column names, where is least one NaN value and then use union with column total:

for col in df.columns:
    if col not in pd.Index(['total']).union(df.columns[~df.isnull().any()]):
        idx = df[col].first_valid_index()
        df.loc[idx, col] += df.total.shift().loc[idx]
print (df)
      columnA  columnB  total
year                         
2010       10      NaN    100
2011       21      NaN    200
2012       20      NaN    300
2013       10    310.0    400
2014       39     39.0    500
2015       30     30.0    600
2016       31     31.0    700
2017       45     45.0    800
2018       23     23.0    900
2019       56     56.0   1000
Comments