Erdem KAYA - 4 months ago
Python Question

How to get odds-ratios and other related features with scikit-learn

I'm going through this tutorial on odds ratios in logistic regression and trying to get exactly the same results with the logistic regression module of scikit-learn. With the code below, I am able to get the coefficient and intercept, but I could not find a way to get the other properties of the model listed in the tutorial, such as log-likelihood, Odds Ratio, Std. Err., z, P>|z|, and [95% Conf. Interval]. If someone could show me how to have them calculated with this package, I would appreciate it.

import pandas as pd
from sklearn import linear_model

url = ''
df = pd.read_csv(url, na_values=[''])
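y = df.hon.values                  # 1-D target, as scikit-learn expects
X = df.math.values.reshape(-1, 1)  # single feature as a 2-D column
clf = linear_model.LogisticRegression(C=1e5)  # large C: almost no regularization
clf.fit(X, y)
print(clf.coef_, clf.intercept_)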


You can get the odds ratios by taking the exponential of the coefficients:

import numpy as np
X = df.female.values.reshape(200,1)
clf.fit(X, y)
np.exp(clf.coef_)

# array([[ 1.80891307]])
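
To see why exponentiating the coefficient gives an odds ratio, here is a minimal sketch for a single binary predictor. The counts are made up for illustration and are not the tutorial's data. With a 0/1 predictor and negligible regularization, exp(coef) should match the odds ratio computed directly from the 2x2 table.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up counts for illustration (not the tutorial's data):
# group x=0: 20 of 100 positive; group x=1: 40 of 100 positive
x = np.repeat([0, 1], 100)
y = np.concatenate([
    np.repeat([1, 0], [20, 80]),   # outcomes for x == 0
    np.repeat([1, 0], [40, 60]),   # outcomes for x == 1
])

# Odds ratio straight from the 2x2 table
odds_0 = 20 / 80
odds_1 = 40 / 60
or_table = odds_1 / odds_0

# Odds ratio from the model: exp of the fitted coefficient.
# A large C makes the default L2 penalty negligible; the tight tol
# is only there so the two numbers agree closely.
clf = LogisticRegression(C=1e5, tol=1e-8, max_iter=1000)
clf.fit(x.reshape(-1, 1), y)
or_model = np.exp(clf.coef_[0, 0])

print(or_table, or_model)
```

The two printed values should agree closely. With a stronger penalty (a smaller C) the coefficient is shrunk toward zero and the match no longer holds.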

As for the other statistics, these are not easy to get from scikit-learn, where model evaluation is mostly done using cross-validation. If you need them, you're better off using a different library such as statsmodels.