Georg Heiler - 6 months ago 66

Python Question

I have a function which returns an `Observation` object.

How can I integrate it into a custom sklearn scorer?

I defined it as:

    class Observation():
        def __init__(self):
            self.statValues = {}
            self.modelName = ""

        def setModelName(self, nameOfModel):
            self.modelName = nameOfModel

        def addStatMetric(self, metricName, metricValue):
            self.statValues[metricName] = metricValue

A custom score is defined like:

    from sklearn.metrics import make_scorer

    def myAllScore(y_true, y_predicted):
        # returns an Observation holding all computed metrics
        return Observation()

    my_scorer = make_scorer(myAllScore)

which could look like

    {'AUC_R': 0.6892943119440752,
     'Accuracy': 0.9815382629183745,
     'Error rate': 0.018461737081625407,
     'False negative rate': 0.6211453744493393,
     'False positive rate': 0.0002660016625103907,
     'Lift value': 33.346741089307166,
     'Precision J': 0.9772727272727273,
     'Precision N': 0.9815872808592603,
     'Rate of negative predictions': 0.0293063938288739,
     'Rate of positive predictions': 0.011361068973307943,
     'Sensitivity (true positives rate)': 0.3788546255506608,
     'Specificity (true negatives rate)': 0.9997339983374897,
     'f1_R': 0.9905775376404309,
     'kappa': 0.5384745595159575}

Answer

In short: you cannot.

Long version: a scorer **has to** return a single scalar, because its value is used for model selection and, more generally, for comparing objects. Since there is no natural total ordering on a vector space, you cannot return a vector from a scorer (or a dictionary, which from a mathematical perspective can be seen as a vector). Furthermore, other use cases such as cross-validation do not support arbitrary structured return values either: they call `np.mean` over the list of scores, and this operation is not defined for a list of Python dictionaries (which is what your method returns).
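To make the `np.mean` point concrete, here is a minimal sketch. The `fold_scores` list is hypothetical data shaped like the metric dictionary in the question, with one dictionary per cross-validation fold. It is not taken from sklearn's internals and only shows that NumPy cannot average dictionaries, while a list of scalars per metric works.

```python
import numpy as np

# Hypothetical per-fold results, shaped like the metric dictionary in the question.
fold_scores = [
    {"Accuracy": 0.98, "kappa": 0.54},
    {"Accuracy": 0.97, "kappa": 0.51},
]

try:
    np.mean(fold_scores)
    averaged = True
except TypeError as exc:
    # NumPy tries to add the dictionaries together, which Python does not define.
    averaged = False
    print("cannot average dictionaries:", exc)

# Averaging works once a single metric is pulled out as a list of scalars.
mean_accuracy = np.mean([fold["Accuracy"] for fold in fold_scores])
print("mean accuracy:", mean_accuracy)
```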

The only thing you can do is create a separate scorer for each of your metrics and use them independently.
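A sketch of the one-scorer-per-metric approach could look like the following. The toy dataset, the `LogisticRegression` model, and the choice of three metrics are assumptions made for illustration. It also assumes a scikit-learn version in which `cross_validate` accepts a dictionary of scorers, which may be newer than the version this answer was written against. Each scorer still returns a single scalar.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             make_scorer, recall_score)
from sklearn.model_selection import cross_validate

# Toy binary classification data and model, purely illustrative.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

# One scalar-valued scorer per metric.
scorers = {
    "accuracy": make_scorer(accuracy_score),
    "sensitivity": make_scorer(recall_score),
    "kappa": make_scorer(cohen_kappa_score),
}

# Each scorer is evaluated independently on every fold.
results = cross_validate(model, X, y, cv=5, scoring=scorers)

# Collect the per-metric means into a plain dictionary afterwards.
mean_scores = {name: results["test_" + name].mean() for name in scorers}
print(mean_scores)
```

The resulting dictionary can then be copied into an `Observation` via `addStatMetric` after evaluation, instead of being returned from a scorer.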