cloud36 - 1 year ago 258
Python Question

# Python Pure RMSE vs Sklearn

I believe I'm making an error in my calculation of RMSE in pure Python. My code is below.

``````
import numpy as np

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
e = abs(np.matrix(y_pred) - np.matrix(y_true)).A1  # absolute errors as a flat array
ee = np.dot(e, e)  # sum of squared errors
np.sqrt(ee.sum() / 3)

# This returns: 0.707
``````

However, when I try with sklearn:

``````
from sklearn.metrics import mean_squared_error

mean_squared_error(np.matrix(y_true), np.matrix(y_pred)) ** 0.5

# This returns: 0.612
``````

Any idea what is going on? I'm pretty sure my Python code is correct.

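You're not making an error. You're dividing by `3` (that is, `n-1`), while `sklearn` divides by `4` (the number of samples, `n`). Dividing by `4` in your code reproduces the `sklearn` result: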

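``````
import numpy as np

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
e = abs(np.matrix(y_pred) - np.matrix(y_true)).A1  # absolute errors as a flat array
ee = np.dot(e, e)  # sum of squared errors
np.sqrt(ee.sum() / 4)  # divide by n = 4, as sklearn does

# Returns: 0.61237243569579447
``````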
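Since the question is about pure Python, here is a minimal sketch of the same calculation without NumPy. The helper name `rmse` and its `ddof` parameter are hypothetical, introduced only for illustration; they are not part of `sklearn`. With `ddof=0` the sum of squared errors is divided by `n`, and with `ddof=1` it is divided by `n-1`.

```python
import math


def rmse(y_true, y_pred, ddof=0):
    """Root mean squared error (hypothetical helper, not an sklearn function).

    ddof=0 divides by n; ddof=1 divides by n - 1.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    squared_errors = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / (n - ddof))


y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

rmse_n = rmse(y_true, y_pred)                  # divides by 4, about 0.612
rmse_n_minus_1 = rmse(y_true, y_pred, ddof=1)  # divides by 3, about 0.707
```

The two values match the `0.612` and `0.707` results from the question, so the only difference between the two calculations is the divisor.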

Dividing by `n-1` gives an unbiased estimate (Bessel's correction) and is used when calculating second moments, such as the variance, from a sample. When calculating the same moments for a whole population, we divide by `n`.
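As a small illustration of the `n` versus `n-1` choice, NumPy exposes it through the `ddof` argument of `np.var`. The sample values below are made up for the example and are not from the question.

```python
import numpy as np

# Made-up sample: mean is 4, squared deviations sum to 8
x = np.array([2.0, 4.0, 4.0, 6.0])

population_var = np.var(x)       # default ddof=0: 8 / 4
sample_var = np.var(x, ddof=1)   # ddof=1: 8 / 3

print(population_var, sample_var)
```

The sample estimate is larger than the population formula by a factor of `n / (n-1)`, which is why the two RMSE values in the question differ.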
