Gyan Veda - 3 months ago
Python Question

How to pass argument to scoring function in scikit-learn's LogisticRegressionCV call

Problem

I am trying to use scikit-learn's LogisticRegressionCV with roc_auc_score as the scoring metric.

from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score

clf = LogisticRegressionCV(scoring=roc_auc_score)


But when I attempt to fit the model (clf.fit(X, y)), it throws an error.

ValueError: average has to be one of (None, 'micro', 'macro', 'weighted', 'samples')


That's cool. It's clear what's going on: roc_auc_score needs to be called with the average argument specified, per its documentation and the error above. So I tried that.

clf = LogisticRegressionCV(scoring=roc_auc_score(average='weighted'))


But it turns out that roc_auc_score can't be called with an optional argument alone, because this throws another error.

TypeError: roc_auc_score() takes at least 2 arguments (1 given)


Question

Any thoughts on how I can use roc_auc_score as the scoring metric for LogisticRegressionCV in a way that I can specify an argument for the scoring function?

I can't find an SO question on this issue or a discussion of this issue in scikit-learn's GitHub repo, but surely someone has run into this before?

Answer

I found a way to solve this problem!

scikit-learn offers a make_scorer function in its metrics module that lets a user create a scoring object from one of its native scoring functions, with arguments set to non-default values (see the scikit-learn docs for more information on this function).
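To illustrate what make_scorer returns, here is a minimal sketch. The toy data and the names X, y, model, score and manual are invented for the example. The point is that a scorer is called as scorer(estimator, X, y), which is the signature the scoring parameter expects, whereas a raw metric like roc_auc_score is called as (y_true, y_score). That mismatch is likely the real reason passing roc_auc_score directly produced the confusing error about average.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, roc_auc_score

# Toy binary data, invented purely for illustration.
rng = np.random.RandomState(0)
X = rng.randn(200, 3)
y = (X[:, 0] + 0.5 * rng.randn(200) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Bind average='weighted' now; the metric itself is only called later,
# when the scorer is invoked with an estimator and data.
roc_auc_weighted = make_scorer(roc_auc_score, average='weighted')

# A scorer takes (estimator, X, y), not (y_true, y_score).
score = roc_auc_weighted(model, X, y)

# With default settings the scorer passes model.predict(X) to the metric,
# so this manual call should give the same number.
manual = roc_auc_score(y, model.predict(X), average='weighted')
print(score, manual)
```

Note that with these defaults the metric sees hard class predictions rather than probabilities or decision scores, so the AUC is coarser than one computed from continuous scores.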

So, I created a scoring object with the average argument specified.

from sklearn.metrics import make_scorer, roc_auc_score
roc_auc_weighted = make_scorer(roc_auc_score, average='weighted')

Then, I passed that object in the call to LogisticRegressionCV and it ran without any issues!

clf = LogisticRegressionCV(scoring=roc_auc_weighted)
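For anyone who wants to try this end to end, here is a small self-contained sketch. The synthetic data and the Cs=5, cv=3 settings are assumptions chosen only to keep the run fast; they are not from my actual problem. The last part also shows scikit-learn's built-in 'roc_auc' scoring string, which may be the simpler option when the default arguments are fine, since it scores on the classifier's continuous outputs rather than on hard predictions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import make_scorer, roc_auc_score

# Synthetic binary classification data (illustrative only).
rng = np.random.RandomState(0)
X = rng.randn(300, 4)
y = (X[:, 0] - X[:, 1] + 0.5 * rng.randn(300) > 0).astype(int)

# Scorer with the average argument fixed to a non-default value.
roc_auc_weighted = make_scorer(roc_auc_score, average='weighted')

# Small grid and few folds, just to keep the example quick.
clf = LogisticRegressionCV(Cs=5, cv=3, scoring=roc_auc_weighted)
clf.fit(X, y)

# score() uses the same scoring object that was passed to the constructor.
custom_score = clf.score(X, y)
print("custom scorer:", custom_score)

# Alternative: the built-in scoring string, no make_scorer needed.
clf_builtin = LogisticRegressionCV(Cs=5, cv=3, scoring='roc_auc')
clf_builtin.fit(X, y)
builtin_score = clf_builtin.score(X, y)
print("built-in 'roc_auc':", builtin_score)
```

The two numbers will generally differ, because the custom scorer above feeds predicted labels to roc_auc_score while 'roc_auc' uses continuous scores.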