HenryB HenryB - 7 months ago 27
Python Question

Max values in Pandas database to return corresponding values in another dataframe

I have 2 pandas dataframes with identical rows and columns but different values and I am trying to use the maximum value from each row of one dataframe and return the value from the corresponding location in the other dataframe.

For example:

A = DataFrame(np.arange(9).reshape((3, 3)), index=['d1', 'd2', 'd3'], columns=['s1', 's2', 's3'])

B = DataFrame(np.random.randn(3, 3), index=['d1', 'd2', 'd3'], columns=['s1', 's2', 's3'])


gives something like:

s1 s2 s3 s1 s2 s3
d1 0 1 2 d1 -0.02 0.01 0.05
d2 3 4 5 d2 0.01 -0.03 -0.03
d3 6 7 8 d3 0.04 -0.06 0.02


I can get the max values of B with 'B.max(axis=1)', but I want to return the corresponding values from A. I'm looking for something like:

d1 2
d2 3
d3 6

Answer

OK you can use eq to compare against the max values and use this boolean mask against df A:

In [265]:
A = pd.DataFrame(np.arange(9).reshape((3, 3)), index=['d1', 'd2', 'd3'], columns=['s1', 's2', 's3'])
​
B = pd.DataFrame(np.random.randn(3, 3), index=['d1', 'd2', 'd3'], columns=['s1', 's2', 's3'])

In [271]:
A

Out[271]:
    s1  s2  s3
d1   0   1   2
d2   3   4   5
d3   6   7   8

In [272]:
B

Out[272]:
          s1        s2        s3
d1 -0.171054 -0.608554  0.168851
d2 -0.514654 -2.281136  1.706951
d3 -1.193330  0.238856  0.770068

In [277]:
A[B.eq(B.max(axis=1), axis=0)]

Out[277]:
    s1  s2  s3
d1 NaN NaN   2
d2 NaN NaN   5
d3 NaN NaN   8

You can then get the row-wise max on the above:

In [279]:
A[B.eq(B.max(axis=1), axis=0)].max(axis=1)

Out[279]:
d1    2.0
d2    5.0
d3    8.0
dtype: float64
Comments