eleanora eleanora - 2 months ago 24
Python Question

How to swap columns conditionally in pandas

I have a pandas dataframe

df
with 4 columns. For example here is a toy example:

foo1 foo2 foo3 foo4
egg cheese 2 1
apple pear 1 3
french spanish 10 1


The columns are foo1, foo2, foo3 and foo4

I would like to swap columns foo1 and foo2 and also swap columns foo3 and foo4 when foo3 < foo4. So the result would be:

foo1 foo2 foo3 foo4
cheese egg 1 2
apple pear 1 3
spanish french 1 10


I can find the rows that need swapping with
df[df['foo3'] < df['foo4']]
but how can I do the swapping efficiently. My dataframe is large.

Answer

You can use pandas.Series.where function to construct new data frame based on the condition:

pairs = [('foo1', 'foo2'), ('foo3', 'foo4')]  # construct pairs of columns that need to swapped

df_out = pd.DataFrame() 

# for each pair, swap the values if foo3 < foo4
for l, r in pairs:
    df_out[l] = df[l].where(df.foo3 < df.foo4, df[r])
    df_out[r] = df[r].where(df.foo3 < df.foo4, df[l])

df_out
#     foo1   foo2   foo3  foo4
#0  cheese    egg      1     2
#1   apple   pear      1     3
#2 spanish french      1    10
Comments