Georg Heiler - 1 month ago
Python Question

sklearn custom scorer multiple metrics at once

I have a function which returns an Observation object with multiple scorers. How can I integrate it into a custom sklearn scorer?
I defined it as:

class Observation():
    def __init__(self):
        self.statValues = {}
        self.modelName = ""

    def setModelName(self, nameOfModel):
        self.modelName = nameOfModel

    def addStatMetric(self, metricName, metricValue):
        self.statValues[metricName] = metricValue


A custom score is defined like:

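from sklearn.metrics import make_scorer

def myAllScore(y_true, y_predicted):
    # placeholder: the real function fills an Observation with all metrics
    return Observation()
my_scorer = make_scorer(myAllScore)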


The statValues of the returned Observation could look like:

{ 'AUC_R': 0.6892943119440752,
'Accuracy': 0.9815382629183745,
'Error rate': 0.018461737081625407,
'False negative rate': 0.6211453744493393,
'False positive rate': 0.0002660016625103907,
'Lift value': 33.346741089307166,
'Precision J': 0.9772727272727273,
'Precision N': 0.9815872808592603,
'Rate of negative predictions': 0.0293063938288739,
'Rate of positive predictions': 0.011361068973307943,
'Sensitivity (true positives rate)': 0.3788546255506608,
'Specificity (true negatives rate)': 0.9997339983374897,
'f1_R': 0.9905775376404309,
'kappa': 0.5384745595159575}

Answer

In short: you cannot.

Long version: a scorer has to return a single scalar, because its purpose is model selection and, more generally, comparing objects. There is no natural total ordering on a vector space, so a scorer cannot return a vector (or a dictionary, which from a mathematical perspective can be seen as a vector). Furthermore, other use cases such as cross-validation do not support arbitrarily structured return values either: they call np.mean over the list of values, and that operation is not defined for a list of Python dictionaries (which is what your method returns).
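To make the argument concrete, here is a minimal pure-Python sketch (no sklearn involved). The two dictionaries and the helper try_call are hypothetical, made up only for illustration; they stand in for per-fold results shaped like the dictionary in the question. It only shows that neither "pick the best" nor "average over folds" is defined for dictionaries, while both work for plain scalars.

```python
# Two hypothetical per-fold results shaped like the dictionary in the question.
fold_scores = [
    {"Accuracy": 0.98, "kappa": 0.54},
    {"Accuracy": 0.97, "kappa": 0.51},
]


def try_call(func, values):
    """Return func(values), or the exception's class name if a TypeError is raised."""
    try:
        return func(values)
    except TypeError as exc:
        return type(exc).__name__


# Picking the "best" result needs an ordering, which dicts do not have.
best = try_call(max, fold_scores)

# Averaging across folds needs addition, which dicts do not support either.
average = try_call(lambda v: sum(v) / len(v), fold_scores)

# With a single scalar per fold, both operations are well defined.
accuracy_only = [fold["Accuracy"] for fold in fold_scores]
best_accuracy = max(accuracy_only)
mean_accuracy = sum(accuracy_only) / len(accuracy_only)

print(best, average, best_accuracy)
```

Both attempts on the dictionaries end in a TypeError, which is roughly the situation a dictionary-returning scorer would run into.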

The only thing you can do is create a separate scorer for each of your metrics and use them independently.
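A rough sketch of what "one scorer per metric" might look like. The assumptions here are mine, not from the question: synthetic data from make_classification, a LogisticRegression model, only two of the metrics (accuracy and kappa), and a scikit-learn version that provides cross_validate with a dict-valued scoring argument (0.19 or later, as far as I know). Each scorer still returns a single scalar; the results are only collected into one dictionary afterwards.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, cohen_kappa_score, make_scorer
from sklearn.model_selection import cross_validate

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=200, random_state=0)

# One scalar-valued scorer per metric.
scorers = {
    "accuracy": make_scorer(accuracy_score),
    "kappa": make_scorer(cohen_kappa_score),
}

# cross_validate evaluates every scorer on every fold.
results = cross_validate(
    LogisticRegression(max_iter=1000), X, y, cv=3, scoring=scorers
)

# Gather the per-metric means into one plain dict, similar to statValues.
stat_values = {name: results["test_" + name].mean() for name in scorers}
print(stat_values)
```

If you want the Observation object back, you could loop over stat_values and call addStatMetric for each entry after the evaluation has finished.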
