Carolina_G - 1 year ago

Python Question

I've done a multivariate regression using sklearn.linear_model.LinearRegression and obtained the regression coefficients doing this:

```
import numpy as np
from sklearn import linear_model

clf = linear_model.LinearRegression()

# Stack the four predictors, then transpose to shape (n_samples, 4)
TST = np.vstack([x1, x2, x3, x4])
TST = TST.transpose()

clf.fit(TST, y)
clf.coef_
```

Now, I need the standard errors for these same coefficients. How can I do that?

Thanks a lot.

Answer

Based on this stats question and Wikipedia, my best guess is:

```
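# LinearRegression fits an intercept by default, so the design matrix needs a column of ones
X = np.column_stack([np.ones(len(TST)), TST])
# Divide the residual sum of squares by the residual degrees of freedom (n - p - 1), not by n
MSE = np.sum((y - clf.predict(TST))**2) / (X.shape[0] - X.shape[1])
var_est = MSE * np.diag(np.linalg.pinv(np.dot(X.T, X)))
SE_est = np.sqrt(var_est)[1:]  # entry 0 is the intercept; the rest line up with clf.coef_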
```

However, my linear algebra and stats are both quite poor, so I could be missing something important. Another option might be to bootstrap the variance estimate.
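To make the bootstrap idea a bit more concrete, here is a rough sketch that compares it with the closed-form estimate. Everything in it is an assumption for illustration: the data are synthetic stand-ins for `x1`..`x4` and `y`, the helper names `analytic_se` and `bootstrap_se` are made up, and resampling whole rows (a "pairs" bootstrap) with 500 replicates is just one reasonable choice, not the only one.

```python
import numpy as np
from sklearn import linear_model

rng = np.random.default_rng(0)

# Synthetic stand-ins for x1..x4 and y (hypothetical data, not from the question)
n = 200
TST = rng.normal(size=(n, 4))
y = 1.0 + TST @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(size=n)

clf = linear_model.LinearRegression().fit(TST, y)

def analytic_se(X, y, model):
    """Classical OLS standard errors for the slopes of a model fitted with an intercept."""
    Xd = np.column_stack([np.ones(len(X)), X])            # add the intercept column
    resid = y - model.predict(X)
    sigma2 = resid @ resid / (Xd.shape[0] - Xd.shape[1])  # residual variance, n - p - 1 dof
    cov = sigma2 * np.linalg.pinv(Xd.T @ Xd)
    return np.sqrt(np.diag(cov))[1:]                      # skip the intercept entry

def bootstrap_se(X, y, rng, n_boot=500):
    """Spread of the refitted coefficients over row-resampled copies of the data."""
    coefs = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, len(X), size=len(X))        # sample rows with replacement
        coefs[b] = linear_model.LinearRegression().fit(X[idx], y[idx]).coef_
    return coefs.std(axis=0, ddof=1)

se_formula = analytic_se(TST, y, clf)
se_boot = bootstrap_se(TST, y, rng)
print(se_formula)
print(se_boot)
```

On well-behaved data like this the two estimates should land close to each other. A large gap between them on real data would be a hint that the assumptions behind the closed-form expression (for example, constant error variance) may not hold.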

Source (Stackoverflow)