Khaled Khaled - 2 months ago 7
Python Question

Calculate a new column with Pandas

Based on this question, I would like to know how I can use a function (def) to calculate a new column with Pandas when the function takes more than one argument (strings and integers).

Concrete example:

df_joined["IVbest"] = IV(df_joined["Saison"], df_joined["Wald_Typ"], df_joined["NS_Cap"])


"Saison" and "Wald_Typ" are strings; "NS_Cap" is an integer.

Now I want to run all those values through this function and get an x-value back for each row:

def IV(saison, wald, ns):
    if saison == "Sommer":
        if wald == "Laubwald":
            x = ns * 0.1
        elif wald == "Nadelwald":
            x = ns * 0.2
        elif wald == "Mischwald":
            x = ns * 0.3
    elif saison == "Winter":
        if wald == "Laubwald":
            x = ns * 0.01
        elif wald == "Nadelwald":
            x = ns * 0.02
        elif wald == "Mischwald":
            x = ns * 0.03
    return x


How would I accomplish that best?

I have tried stuff like

df_joined["IVbest"] = IV(df_joined["Saison", "Wald_Typ", "NS_Cap"])


or

df_joined["IVbest"] = df_joined["Saison", "Wald_Typ", "NS_Cap"].apply(IV)


but nothing works :(

Answer
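
To call the function as written, one option is a row-wise apply. The attempts in the question fail because df_joined["Saison", "Wald_Typ", "NS_Cap"] looks up a single column labelled by that tuple (selecting several columns needs a list), and because apply works column by column unless axis=1 is given. A minimal sketch, assuming a small made-up df_joined (the sample values are hypothetical, only the column names and IV come from the question):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_joined = pd.DataFrame({
    "Saison": ["Sommer", "Sommer", "Winter", "Winter"],
    "Wald_Typ": ["Laubwald", "Mischwald", "Nadelwald", "Laubwald"],
    "NS_Cap": [100, 200, 300, 400],
})


def IV(saison, wald, ns):
    # Function from the question, unchanged apart from indentation.
    if saison == "Sommer":
        if wald == "Laubwald":
            x = ns * 0.1
        elif wald == "Nadelwald":
            x = ns * 0.2
        elif wald == "Mischwald":
            x = ns * 0.3
    elif saison == "Winter":
        if wald == "Laubwald":
            x = ns * 0.01
        elif wald == "Nadelwald":
            x = ns * 0.02
        elif wald == "Mischwald":
            x = ns * 0.03
    return x


# axis=1 passes one row at a time; the lambda unpacks the three columns
# into the three arguments that IV expects.
df_joined["IVbest"] = df_joined.apply(
    lambda row: IV(row["Saison"], row["Wald_Typ"], row["NS_Cap"]),
    axis=1,
)
```

This is easy to read, but it calls Python once per row, so it gets slow on large frames. Also note that IV raises an UnboundLocalError for any combination it does not cover.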

I think in this case it would be better to build six boolean masks and use them to perform the calculation only on the matching rows:

sommer_laub = (df_joined['Saison'] == 'Sommer') & (df_joined['Wald_Typ'] == 'Laubwald')
sommer_nadel = (df_joined['Saison'] == 'Sommer') & (df_joined['Wald_Typ'] == 'Nadelwald')
sommer_misch = (df_joined['Saison'] == 'Sommer') & (df_joined['Wald_Typ'] == 'Mischwald')
winter_laub = (df_joined['Saison'] == 'Winter') & (df_joined['Wald_Typ'] == 'Laubwald')
winter_nadel = (df_joined['Saison'] == 'Winter') & (df_joined['Wald_Typ'] == 'Nadelwald')
winter_misch = (df_joined['Saison'] == 'Winter') & (df_joined['Wald_Typ'] == 'Mischwald')
df_joined.loc[sommer_laub, 'IVbest'] = df_joined.loc[sommer_laub, 'NS_Cap'] * 0.1
df_joined.loc[sommer_nadel, 'IVbest'] = df_joined.loc[sommer_nadel, 'NS_Cap'] * 0.2
df_joined.loc[sommer_misch, 'IVbest'] = df_joined.loc[sommer_misch, 'NS_Cap'] * 0.3
df_joined.loc[winter_laub, 'IVbest'] = df_joined.loc[winter_laub, 'NS_Cap'] * 0.01
df_joined.loc[winter_nadel, 'IVbest'] = df_joined.loc[winter_nadel, 'NS_Cap'] * 0.02
df_joined.loc[winter_misch, 'IVbest'] = df_joined.loc[winter_misch, 'NS_Cap'] * 0.03
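
The same mask idea can probably be written more compactly with numpy.select, which picks, for each row, the value belonging to the first condition that is true. A sketch, assuming the factors are collected in a dict keyed by (Saison, Wald_Typ); the name factors and the sample data are made up for illustration, while the factor values are the ones from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row has a season IV does not cover.
df_joined = pd.DataFrame({
    "Saison": ["Sommer", "Winter", "Herbst"],
    "Wald_Typ": ["Laubwald", "Mischwald", "Laubwald"],
    "NS_Cap": [100, 200, 300],
})

# Factors taken from the question's IV function.
factors = {
    ("Sommer", "Laubwald"): 0.1,
    ("Sommer", "Nadelwald"): 0.2,
    ("Sommer", "Mischwald"): 0.3,
    ("Winter", "Laubwald"): 0.01,
    ("Winter", "Nadelwald"): 0.02,
    ("Winter", "Mischwald"): 0.03,
}

# One boolean mask per (Saison, Wald_Typ) combination.
conditions = [
    (df_joined["Saison"] == saison) & (df_joined["Wald_Typ"] == wald)
    for saison, wald in factors
]
# The matching candidate result for each mask, in the same order.
choices = [df_joined["NS_Cap"] * factor for factor in factors.values()]

# Rows that match no mask get NaN.
df_joined["IVbest"] = np.select(conditions, choices, default=np.nan)
```

Unlike the step-by-step assignments, this fills the whole column in one call, and rows with an unknown combination end up as NaN explicitly.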