Mark Richardson - 21 days ago
Python Question

Pairwise regressions in Pandas


  1. I have a DataFrame df with n columns. The index is a DatetimeIndex. Given a reference column ref_col, I wish to compute the n-1 one-dimensional linear regressions of the remaining columns against this reference column. The following does not achieve this, but rather computes a single (n-1)-dimensional regression:

    pd.ols(y=df[ref_col], x=df.drop(ref_col, axis=1))

  2. Suppose now I wish to compute all possible pairwise regressions in order to produce an n x n matrix of betas with unit diagonal.



One can do both of the above relatively easily using loops. Is there a "vectorised" way?

Answer

You can get the list of the pairwise regressions to the reference column like this:

    models = [pd.ols(y=df[ref_col], x=df[col]) for col in df if col != ref_col]
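The list comprehension still fits one model per column, and pd.ols was removed in pandas 0.20. If only the slopes are needed, a loop-free alternative is to use the identity slope = cov(x, y) / var(x) for a one-variable regression with an intercept. The sketch below is an illustration rather than a drop-in replacement: the function name slopes_against_reference is made up for this example, it keeps the convention used above (y is the reference column, x is each other column), and it assumes numeric columns with no missing values.

```python
import numpy as np
import pandas as pd


def slopes_against_reference(df, ref_col):
    """Slope of y = df[ref_col] regressed on x = df[col], for every other column.

    Hypothetical helper; assumes numeric data without NaNs.
    """
    centered = df - df.mean()                      # remove each column's mean
    others = centered.drop(columns=ref_col)        # the x columns
    # Sum of cross products and sum of squares; the 1/(N-1) factors cancel.
    cov_xy = others.mul(centered[ref_col], axis=0).sum()
    var_x = (others ** 2).sum()
    return cov_xy / var_x                          # Series indexed by column name
```

To regress each remaining column on the reference column instead (the direction described in the question), divide cov_xy by the sum of squares of centered[ref_col] rather than by var_x.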

To get the matrix of models over all possible reference columns, the next step would be:

    models_matrix = [[pd.ols(y=df[ref_col], x=df[col]) for col in df if col != ref_col] for ref_col in df]

Finally, the matrix of betas can be obtained like this:

    betas = [[pd.ols(y=df[ref_col], x=df[col]).beta.x for col in df if col != ref_col] for ref_col in df]
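Note that the nested comprehension skips the diagonal, so it yields n lists of n-1 betas rather than an n x n matrix. A genuinely vectorised route to the full matrix, offered here as a sketch, is to divide the covariance matrix column-wise by the variances: the entry for (y, x) is cov(y, x) / var(x), which is 1 on the diagonal. The name pairwise_betas is hypothetical, and the code assumes numeric columns with no missing values.

```python
import numpy as np
import pandas as pd


def pairwise_betas(df):
    """Return a DataFrame where betas.loc[y, x] is the slope of y regressed on x.

    Hypothetical helper; assumes numeric data without NaNs.
    """
    values = df.to_numpy(dtype=float)
    centered = values - values.mean(axis=0)        # remove column means
    cross = centered.T @ centered                  # proportional to the covariance matrix
    variances = np.diag(cross)                     # proportional to each column's variance
    betas = cross / variances[np.newaxis, :]       # divide column x by var(x)
    return pd.DataFrame(betas, index=df.columns, columns=df.columns)
```

Rows correspond to the dependent variable and columns to the regressor, matching the ordering of the nested list above (outer loop over ref_col as y, inner loop over col as x). With missing data the per-pair sample would differ from column to column, so this shortcut would no longer match pair-by-pair fits.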