Chris Parry - 1 month ago
Python Question

XGBoost CV and early stopping

I am trying to use xgb.cv with early stopping, based on mlogloss:

import xgboost as xgb

params = {'booster': 'gbtree', 'objective': 'multi:softprob',
          'num_class': len(le.classes_), 'eta': 0.1,
          'max_depth': 10, 'subsample': 1.0,
          'scale_pos_weight': 1, 'min_child_weight': 5,
          'colsample_bytree': 0.2, 'gamma': 0, 'reg_alpha': 0,
          'reg_lambda': 1, 'eval_metric': 'mlogloss'}

res = xgb.cv(params, dm_train, nfold=5,
             seed=42, early_stopping_rounds=10, verbose_eval=True,
             metrics={'mlogloss'}, show_stdv=False)

print(res)


My understanding of early stopping is that, if my eval metric does not improve for n rounds (in this case 10), the run will terminate. When I run this code, it terminates after 10 rounds, printing the output:

test-mlogloss-mean
0: 6.107054
1: 5.403606
2: 4.910938
3: 4.546221
4: 4.274113
5: 4.056968
6: 3.876368
7: 3.728714
8: 3.599812
9: 3.485113


The test mlogloss is falling with each round, so I expected the run not to terminate (the metric is still improving). Where am I going wrong?
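To make the stopping rule described above concrete, here is a minimal pure-Python sketch of it, fed with the ten test-mlogloss values printed above. It is not XGBoost's actual implementation: the function name rounds_run and its max_rounds and patience arguments are hypothetical, introduced only to show that a run can end either because the metric stalled or because a separate cap on the number of rounds was reached.

```python
def rounds_run(scores, max_rounds, patience):
    """Return (rounds_completed, reason) for a metric that should decrease.

    Hypothetical helper for illustration only, not part of XGBoost.
    """
    best = float('inf')
    best_round = -1
    for i, score in enumerate(scores[:max_rounds]):
        if score < best:
            # The metric improved, so remember this round as the best so far.
            best, best_round = score, i
        elif i - best_round >= patience:
            # No improvement for `patience` consecutive rounds.
            return i + 1, 'early stopping'
    # The loop ran out of rounds without the patience rule firing.
    return min(len(scores), max_rounds), 'round limit'


# Test mlogloss means from the output above: strictly decreasing.
scores = [6.107054, 5.403606, 4.910938, 4.546221, 4.274113,
          4.056968, 3.876368, 3.728714, 3.599812, 3.485113]

n, reason = rounds_run(scores, max_rounds=10, patience=10)
print(n, reason)  # prints: 10 round limit
```

With a metric that improves every round, the patience rule in this sketch never fires, so a stop after exactly 10 rounds has to come from the round cap.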

Thanks.

Answer

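I didn't set the num_boost_round param, which defaults to 10. The run simply completed its 10 rounds; early stopping never triggered. Setting num_boost_round to a larger value lets early stopping decide when to stop. Simple.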