Carolina_G - 8 months ago
Python Question

Standard errors for multivariate regression coefficients

I've done a multivariate regression using sklearn.linear_model.LinearRegression and obtained the regression coefficients doing this:

import numpy as np
from sklearn import linear_model
clf = linear_model.LinearRegression()
# Stack the four predictors as rows, then transpose to shape (n_samples, 4)
TST = np.vstack([x1, x2, x3, x4])
TST = TST.transpose()
clf.fit(TST, y)

Now, I need the standard errors for these same coefficients. How can I do that?
Thanks a lot.


Based on this stats question and Wikipedia, my best guess is:

MSE = np.mean((y - clf.predict(TST).T)**2)
var_est = MSE * np.diag(np.linalg.pinv(np.dot(TST.T, TST)))
SE_est = np.sqrt(var_est)

However, my linear algebra and stats are both quite poor, so I could be missing something important. Another option might be to bootstrap the variance estimate.
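
For what it's worth, here is a minimal sketch of both options, not a definitive implementation. It differs from the guess above in two ways that I believe matter: np.mean divides the squared residuals by n, whereas the usual unbiased estimate divides by n - p - 1, and LinearRegression fits an intercept by default, so the design matrix gets a leading column of ones. The function names ols_standard_errors and bootstrap_standard_errors are made up for illustration, and the data is synthetic, standing in for x1..x4 and y.

```python
import numpy as np
from sklearn import linear_model


def ols_standard_errors(X, y):
    """Closed-form OLS standard errors, assuming an intercept is fitted.

    Returns (intercept_se, coef_se).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    model = linear_model.LinearRegression().fit(X, y)
    residuals = y - model.predict(X)
    # Residual variance with n - p - 1 degrees of freedom
    # (p slopes plus one intercept).
    sigma2 = residuals @ residuals / (n - p - 1)
    # Add the column of ones that corresponds to the fitted intercept.
    design = np.column_stack([np.ones(n), X])
    cov = sigma2 * np.linalg.pinv(design.T @ design)
    se = np.sqrt(np.diag(cov))
    return se[0], se[1:]


def bootstrap_standard_errors(X, y, n_boot=1000, seed=0):
    """Case-resampling bootstrap estimate of the slope standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    coefs = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        # Resample rows (cases) with replacement and refit.
        idx = rng.integers(0, n, size=n)
        coefs[b] = linear_model.LinearRegression().fit(X[idx], y[idx]).coef_
    # Spread of the refitted coefficients approximates their standard error.
    return coefs.std(axis=0, ddof=1)


# Synthetic data (an assumption; substitute your own TST and y).
rng = np.random.default_rng(42)
TST = rng.normal(size=(200, 4))
y = 1.0 + TST @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(scale=0.5, size=200)

intercept_se, coef_se = ols_standard_errors(TST, y)
boot_se = bootstrap_standard_errors(TST, y, n_boot=500)
print("closed-form SE:", coef_se)
print("bootstrap SE:  ", boot_se)
```

On well-behaved data like this the two estimates should roughly agree. If they diverge a lot on your real data, that may point to something the closed-form formula assumes away, such as non-constant error variance.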