shm2008 - 9 months ago
Python Question

Is it possible to have a function that gets any data frame and any column as its inputs using Pandas in Python?

Imagine we have a dataframe like this:

my_df>>

column_1 column_2 column_3 column_4
0 0.276162 0.552951 0.866023 0.571535
1 0.112933 0.549487 0.626958 0.988705
2 0.916932 0.561641 0.220696 0.545019


Can I write a function that takes any dataframe like this one, together with any one of its columns, as inputs?

To clarify better, if I have a function like this:

def multiply_5(df,column):
    df.column=df.column.apply(lambda x:x*5-3)


Would it work if I called it like this:

multiply_5(my_df,column_2)


in order to get this:

my_df.column_2=my_df.column_2.apply(lambda x:x*5-3)


I know the code exactly as written does not work, but is there an easy way to make a function like this usable with other dataframes and columns?

Answer

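Pass the column name as a string and index with brackets. df.column looks for a column literally named "column", whereas df[column] uses the value of the variable: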

def multiply_5(df,column):
    df[column]=df[column].apply(lambda x:x*5-3)
    return df

my_df = multiply_5(my_df, "column_2")

print(my_df)
   column_1  column_2  column_3  column_4
0  0.276162 -0.235245  0.866023  0.571535
1  0.112933 -0.252565  0.626958  0.988705
2  0.916932 -0.191795  0.220696  0.545019

An even weirder way to do the same thing, through attributes:

def multiply_5(df,column):
    setattr(df,column, getattr(df,column).apply(lambda x:x*5-3))
    return df
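
Note that both versions above modify the dataframe that is passed in, so calling the function twice on the same frame applies the transformation twice. If that is not what you want, a minimal sketch of a non-mutating variant could look like the following. The name scale_column and the factor and offset parameters are my own illustration, not part of the question, and plain arithmetic on the column is used in place of apply:

```python
import pandas as pd

# Hypothetical helper: the name and the factor/offset parameters are
# assumptions made for illustration, not something from the question.
def scale_column(df, column, factor=5, offset=-3):
    """Return a copy of df where df[column] becomes column * factor + offset."""
    result = df.copy()  # work on a copy so the caller's frame is left unchanged
    # Arithmetic on a Series is applied element-wise, so apply() is not needed.
    result[column] = result[column] * factor + offset
    return result

# Sample data taken from the first two columns of the question's dataframe.
my_df = pd.DataFrame({
    "column_1": [0.276162, 0.112933, 0.916932],
    "column_2": [0.552951, 0.549487, 0.561641],
})

new_df = scale_column(my_df, "column_2")
print(new_df)
```

Because the function works on a copy, my_df keeps its original values and only new_df holds the transformed column. The column argument is still passed as a string, exactly as in the bracket-indexing version.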