Ohumeronen - 4 months ago
Python Question

Sort pandas DataFrame with function over column values

This is based on the question "python, sort descending dataframe with pandas".

Given:

from pandas import DataFrame
import pandas as pd

d = {'one':[2,3,1,4,5],
     'two':[5,4,3,2,1],
     'letter':['a','a','b','b','c']}

df = DataFrame(d)


df then looks like this:

df:
  letter  one  two
0      a    2    5
1      a    3    4
2      b    1    3
3      b    4    2
4      c    5    1


I would like to have something like:

f = lambda x,y: x**2 + y**2
test = df.sort(f('one', 'two'))


This should sort the whole DataFrame by the sum of the squares of the values in columns 'one' and 'two', giving:

test:
  letter  one  two
2      b    1    3
3      b    4    2
1      a    3    4
4      c    5    1
0      a    2    5


Ascending or descending order does not matter. Is there a nice and simple way to do that? I have not found a solution yet.

Answer

You can create a temporary column to use in sort and then drop it:

df.assign(f = df['one']**2 + df['two']**2).sort_values('f').drop('f', axis=1)
Out: 
  letter  one  two
2      b    1    3
3      b    4    2
1      a    3    4
4      c    5    1
0      a    2    5
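
If you would rather not create a temporary column, another option is to compute the key as a separate Series and reorder the rows by the positions that sort it. The following is only a sketch: the helper name sort_by_func is made up for illustration and is not part of pandas, and it assumes the function you pass works element-wise on whole columns, as the lambda from the question does.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'one': [2, 3, 1, 4, 5],
                   'two': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})

# The function from the question; it is applied to whole columns at once.
f = lambda x, y: x**2 + y**2

def sort_by_func(frame, func, *cols):
    """Hypothetical helper: sort `frame` ascending by func(frame[col], ...)."""
    # Evaluate the key for every row, e.g. one**2 + two**2.
    key = func(*(frame[c] for c in cols))
    # Positions that would sort the key; 'stable' keeps tied rows in order.
    order = np.argsort(key.values, kind='stable')
    # Reorder rows by position, so duplicate index labels are not a problem.
    return frame.iloc[order]

test = sort_by_func(df, f, 'one', 'two')
print(test)
```

The rows come back in index order 2, 3, 1, 4, 0, the same as with the temporary-column approach, and df itself is left unchanged.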