Hack-R - 1 year ago
R Question

Disambiguating eval, obj (objective), and metric in LightGBM

I'm asking this in reference to the R library lightgbm, but I think it applies equally to the Python and Multiverso versions.

There are 3 parameters wherein you can choose statistics of interest for your model: metric, eval, and obj. I'm trying to clearly distinguish the different roles of these 3 in plain language.

The documentation says:


obj objective function, can be character or custom objective function. Examples include regression, regression_l1, huber, binary, lambdarank, multiclass

eval evaluation function, can be (list of) character or custom eval function


metric had no R documentation, except for the catch-all that says "see parameters.md", which also doesn't really explain it, but which lists the following options:


metric, default={l2 for regression}, {binary_logloss for binary
classification},{ndcg for lambdarank}, type=multi-enum,
options=l1,l2,ndcg,auc,binary_logloss,binary_error...
l1, absolute loss, alias=mean_absolute_error, mae
l2, square loss, alias=mean_squared_error, mse
l2_root, root square loss, alias=root_mean_squared_error, rmse
huber, Huber loss
fair, Fair loss
poisson, Poisson regression
ndcg, NDCG
map, MAP
auc, AUC
binary_logloss, log loss
binary_error. For one sample: 0 for correct classification, 1 for error classification.
multi_logloss, log loss for multi-class classification
multi_error. error rate for multi-class classification
Supports multiple metrics, separated by ,
metric_freq, default=1, type=int
frequency for metric output
is_training_metric, default=false, type=bool
set this to true if you need to output metric results on the training data
ndcg_at, default=1,2,3,4,5, type=multi-int, alias=ndcg_eval_at,eval_at
NDCG evaluation positions, separated by ,


My best guess is that:

  1. obj is the objective function of the algorithm, i.e. what it's trying to maximize or minimize, e.g. "regression" means it's minimizing squared residuals.

  2. eval, I'm guessing, is just one or more additional statistics you'd like to see computed as your algorithm is being fit.

  3. metric: I have no clue how this is used differently than obj and eval.


Answer

As you have said,

obj is the objective function of the algorithm, i.e. what it's trying to maximize or minimize, e.g. "regression" means it's minimizing squared residuals.
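
For example (a rough sketch against the R interface; the helper name logregobj is just illustrative), obj can be either one of the built-in names or a function that returns the gradient and hessian of your own loss:

    library(lightgbm)

    # Toy binary-classification data shipped with the package
    data(agaricus.train, package = "lightgbm")
    dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)

    # Built-in objective, passed as a character string
    model1 <- lgb.train(params = list(objective = "binary"), data = dtrain, nrounds = 10)

    # Custom objective: a function returning the gradient and hessian of the loss
    # with respect to the raw scores
    logregobj <- function(preds, dtrain) {
      labels <- getinfo(dtrain, "label")
      preds <- 1 / (1 + exp(-preds))  # raw scores -> probabilities
      list(grad = preds - labels, hess = preds * (1 - preds))
    }
    model2 <- lgb.train(data = dtrain, nrounds = 10, obj = logregobj)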

metric and eval are essentially the same. They only really differ in where they are used: eval is used with the cross-validation method (presumably because it can be used to evaluate the model for early stopping, etc.), while metric is used in the normal train situation.
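
A rough sketch of that split, assuming the usual R argument names (evalerror is just an illustrative custom eval function):

    # dtrain as in the snippet above
    params <- list(objective = "binary", metric = "binary_logloss")

    # lgb.train(): metric lives in params and controls what is reported
    # for each dataset in valids while boosting
    model <- lgb.train(params = params, data = dtrain, nrounds = 10,
                       valids = list(train = dtrain))

    # lgb.cv(): the extra evaluation goes through eval, which can be a
    # built-in name or a custom function returning (name, value, higher_better)
    evalerror <- function(preds, dtrain) {
      labels <- getinfo(dtrain, "label")
      err <- mean(as.numeric(preds > 0.5) != labels)
      list(name = "error", value = err, higher_better = FALSE)
    }
    cv <- lgb.cv(params = params, data = dtrain, nrounds = 10, nfold = 5,
                 eval = evalerror)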

The confusion arises from several GBM variants (xgboost, lightgbm, and sklearn's GBM, plus maybe an R package) all having slightly differing argument names. For example, xgb.cv() in Python uses eval but the R version uses metric, and then lgb.cv() uses eval for both Python and R.

I have been very confused switching between xgboost and lightgbm. There is an absolutely amazing resource by Laurae that helps you understand each parameter.
