Carolina_G - 7 months ago
Python Question

Standard errors for multivariate regression coefficients

I ran a multivariate regression using sklearn.linear_model.LinearRegression and obtained the regression coefficients like this:

import numpy as np
from sklearn import linear_model

clf = linear_model.LinearRegression()
# Stack the four predictors as columns: shape (n_samples, 4)
TST = np.vstack([x1, x2, x3, x4]).transpose()
clf.fit(TST, y)
clf.coef_


Now, I need the standard errors for these same coefficients. How can I do that?
Thanks a lot.

Answer

Based on this stats question and Wikipedia, my best guess is:

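X = np.column_stack([np.ones(len(TST)), TST])  # clf fits an intercept by default, so include it
resid = y - clf.predict(TST)
sigma2 = np.sum(resid**2) / (X.shape[0] - X.shape[1])  # divide by n - p (degrees of freedom), not n
var_est = sigma2 * np.diag(np.linalg.pinv(np.dot(X.T, X)))
SE_est = np.sqrt(var_est)  # SE_est[0] is the intercept, SE_est[1:] line up with clf.coef_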

However, my linear algebra and stats are both quite poor, so I could be missing something important. Another option might be to bootstrap the variance estimate.
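
For what it's worth, the bootstrap idea could be sketched roughly as below. This is only a sketch under assumptions: it runs on synthetic data, refits with np.linalg.lstsq on a design matrix with an explicit column of ones (which should give the same least-squares fit as LinearRegression with its default intercept), and resamples (x, y) rows with replacement. The helper names add_intercept, formula_se and bootstrap_se, the sample size, and the number of resamples are all made up for illustration.

```python
import numpy as np

def add_intercept(X):
    # Prepend a column of ones so the intercept is estimated with the slopes.
    return np.column_stack([np.ones(X.shape[0]), X])

def formula_se(X, y):
    # Closed-form OLS standard errors: sqrt(diag(sigma^2 * (A^T A)^-1)).
    A = add_intercept(X)
    beta = np.linalg.lstsq(A, y, rcond=None)[0]
    resid = y - A @ beta
    dof = A.shape[0] - A.shape[1]  # n minus number of fitted parameters
    sigma2 = resid @ resid / dof
    return np.sqrt(sigma2 * np.diag(np.linalg.pinv(A.T @ A)))

def bootstrap_se(X, y, n_boot=1000, seed=0):
    # Pairs bootstrap: resample rows with replacement, refit, and take the
    # standard deviation of each coefficient across the refits.
    rng = np.random.default_rng(seed)
    A = add_intercept(X)
    n = A.shape[0]
    betas = np.empty((n_boot, A.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        betas[b] = np.linalg.lstsq(A[idx], y[idx], rcond=None)[0]
    return betas.std(axis=0, ddof=1)

# Synthetic stand-in for x1..x4 and y (assumed data, not from the question).
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(size=500)

se_formula = formula_se(X, y)   # order: intercept, then the four slopes
se_boot = bootstrap_se(X, y)
print(se_formula)
print(se_boot)
```

On well-behaved data the two sets of standard errors should come out close to each other. A large gap may hint that the constant-variance assumption behind the closed-form estimate does not hold.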
