Brian - 1 year ago 129
Python Question

# Correlation matrix of one dataframe with another

I was reading through the answers to this question. Then a question came up about how to calculate the correlations of all columns from one dataframe with all columns from another dataframe. Since it seemed that question wasn't going to get answered, I wanted to ask it myself, as I need exactly that.

So say I have dataframes `A` and `B`:

```python
import pandas as pd
import numpy as np

A = pd.DataFrame(np.random.rand(24, 5), columns=list('abcde'))
B = pd.DataFrame(np.random.rand(24, 5), columns=list('ABCDE'))
```

How do I get a dataframe that looks like this:

```python
pd.DataFrame([], A.columns, B.columns)

     A    B    C    D    E
a  NaN  NaN  NaN  NaN  NaN
b  NaN  NaN  NaN  NaN  NaN
c  NaN  NaN  NaN  NaN  NaN
d  NaN  NaN  NaN  NaN  NaN
e  NaN  NaN  NaN  NaN  NaN
```

But filled with the appropriate correlations?

One way to do it would be:

```python
pd.concat([A, B], axis=1).corr().filter(B.columns).filter(A.columns, axis=0)
```
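To see what the two `filter` calls are doing: `corr()` on the concatenated frame returns the full 10 x 10 matrix (every column against every column), and only the block with rows from `A` and columns from `B` is wanted. A minimal sketch of the same selection written with `.loc`, assuming the column names of `A` and `B` do not overlap (the names `full` and `cross` are just for illustration):

```python
import numpy as np
import pandas as pd

A = pd.DataFrame(np.random.rand(24, 5), columns=list('abcde'))
B = pd.DataFrame(np.random.rand(24, 5), columns=list('ABCDE'))

# Full correlation matrix of all ten columns against each other.
full = pd.concat([A, B], axis=1).corr()

# Keep only the rows labelled by A's columns and the columns labelled by B's.
# Assumes A and B share no column names; otherwise the labels are ambiguous.
cross = full.loc[A.columns, B.columns]

print(full.shape)
print(cross.shape)
```

Three quarters of the full matrix (the A-with-A, B-with-B, and transposed B-with-A blocks) is computed and then thrown away, which is why this route does more work than necessary.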

A more efficient way would be:

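```python
# Center each column by subtracting its mean.
Az = (A - A.mean())
Bz = (B - B.mean())

# Cross-covariance (sum of products / n), then divide by the population
# standard deviations (ddof=0): B's along the columns, A's along the rows.
Az.T.dot(Bz).div(len(A)).div(Bz.std(ddof=0)).div(Az.std(ddof=0), axis=0)
```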

And you'd get the same as above.
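If you want to convince yourself, here is a small check that runs both approaches on the same data and compares them numerically. The fixed seed and the names `via_concat`, `via_dot`, and `same` are my own additions for illustration:

```python
import numpy as np
import pandas as pd

# Seeded generator only so the check is reproducible (an assumption, not required).
rng = np.random.default_rng(0)
A = pd.DataFrame(rng.random((24, 5)), columns=list('abcde'))
B = pd.DataFrame(rng.random((24, 5)), columns=list('ABCDE'))

# Approach 1: full correlation matrix, then keep the A-rows / B-columns block.
via_concat = pd.concat([A, B], axis=1).corr().filter(B.columns).filter(A.columns, axis=0)

# Approach 2: centered dot product divided by n and by both population stds.
Az = A - A.mean()
Bz = B - B.mean()
via_dot = Az.T.dot(Bz).div(len(A)).div(Bz.std(ddof=0)).div(Az.std(ddof=0), axis=0)

# Compare up to floating-point tolerance rather than with exact equality.
same = np.allclose(via_concat.values, via_dot.values)
print(same)
```

Note that `ddof=0` matters here: the covariance is divided by `len(A)`, so the standard deviations have to use the same denominator for the ratio to equal the Pearson correlation.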
