Mark Richardson - 2 months ago
Python Question

Pairwise regressions in Pandas

  1. I have a DataFrame df with n columns. The index is a DatetimeIndex. Given a reference column
    ref_col, I wish to compute the n-1 one-dimensional linear regressions of the remaining columns against this reference column. The following does not achieve this, but rather computes a single
    (n-1)-dimensional regression:

    pd.ols(y=df[ref_col], x=df.drop(ref_col, axis=1))

  2. Suppose now I wish to compute all possible pairwise regressions in order to produce an
    n x n matrix of betas with unit diagonal.

One can do both of the above relatively easily using loops. Is there a "vectorised" way?


You can get the list of the pairwise regressions to the reference column like this:

models = [pd.ols(y=df[ref_col], x=df[col]) for col in df if col != ref_col]
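pd.ols was removed in pandas 0.20, so the line above only runs on older versions. If only the slopes are needed, a loop-free sketch can use the fact that the slope of a one-variable regression with an intercept is cov(x, y) / var(x). The function name betas_against_reference is hypothetical, and the sketch assumes the data has no missing values and that ref_col plays the role of y, as in the answer's code.

```python
import numpy as np
import pandas as pd

def betas_against_reference(df, ref_col):
    """Slope of regressing df[ref_col] (y) on each other column (x), one x at a time.

    Assumes no missing values; an intercept is implied, as with pd.ols defaults.
    """
    # Covariance of every other column with the reference column.
    cov_with_ref = df.cov()[ref_col].drop(ref_col)
    # Variance of each regressor column.
    var_x = df.drop(ref_col, axis=1).var()
    # Slope of y on x is cov(x, y) / var(x); the division aligns on column names.
    return cov_with_ref / var_x
```

The result is a Series indexed by the non-reference column names, one slope per column.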

To get the matrix of models over all possible reference columns, the next step would be:

models_matrix = [[pd.ols(y=df[ref_col], x=df[col]) for col in df if col != ref_col] for ref_col in df]

Finally, the matrix of betas can be obtained like this (each row omits the diagonal entry, which is 1 by definition):

betas = [[pd.ols(y=df[ref_col], x=df[col]).beta.x for col in df if col != ref_col] for ref_col in df]
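For the "vectorised" part of the question, one possible approach avoids fitting models altogether: the full matrix of slopes can be read off the covariance matrix, since the slope of y on x is cov(x, y) / var(x). The sketch below is an assumption-laden illustration rather than the answer's method: the name pairwise_betas is hypothetical, the data is assumed to have no missing values, and it yields only the betas (no intercepts or fit statistics).

```python
import numpy as np
import pandas as pd

def pairwise_betas(df):
    """Return an n x n DataFrame where betas.loc[y, x] is the slope of
    regressing column y on column x (with an intercept).

    Assumes no missing values. The diagonal is 1 by construction.
    """
    cov = df.cov()
    # Variances sit on the diagonal of the covariance matrix.
    var = pd.Series(np.diag(cov.values), index=cov.columns)
    # Divide each column x of the covariance matrix by var(x).
    return cov.div(var, axis="columns")
```

With this layout, row ref_col of the result corresponds to one row of the betas list above, plus the unit diagonal entry.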