
custom evaluation function is not used if name is lexicographically smaller than default #4421

Closed
JaakobKind opened this issue Apr 30, 2019 · 7 comments · Fixed by #4638

@JaakobKind

Dear xgboost developers,
I have found the following issue.
When I try to use a custom evaluation function, it only works if I either set the parameter disable_default_eval_metric or choose a name for my evaluation metric that is lexicographically larger than the name of the default metric.
These workarounds are acceptable once you know them, but it took me some time to figure out how to get around the issue, so it would be nice if this bug could be fixed.
I have attached a Jupyter notebook that reproduces the bug. I had to zip it to be able to upload it :-)

xgboost_bug.zip
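For readers who don't want to open the zip, here is a minimal sketch of the kind of setup that triggers the issue. The toy data and the metric name 'a_error' are made up for illustration; the only point is that the custom name sorts before the default metric 'error' of binary:logistic.

```python
import numpy as np
import xgboost as xgb

# Toy data, only for illustration.
rng = np.random.RandomState(0)
X = rng.rand(200, 5)
y = (X[:, 0] > 0.5).astype(int)
dtrain = xgb.DMatrix(X[:150], label=y[:150])
dvalid = xgb.DMatrix(X[150:], label=y[150:])

def custom_error(preds, dmat):
    """Custom metric returning (name, value); the name sorts before 'error'."""
    labels = dmat.get_label()
    return 'a_error', float(np.mean((preds > 0.5) != labels))

params = {'objective': 'binary:logistic'}  # default metric is 'error'

# In 0.82 the metrics end up sorted by name, so early stopping here silently
# watches the built-in 'error' instead of the custom 'a_error'.
bst = xgb.train(params, dtrain, num_boost_round=50,
                evals=[(dvalid, 'valid')],
                feval=custom_error,
                early_stopping_rounds=5)

# Workaround: disable the default metric so only the custom one remains.
# params = {'objective': 'binary:logistic', 'disable_default_eval_metric': 1}
```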

@JaakobKind
Author

Additional info: I use xgboost 0.82.
I checked whether I get the same issue in R and was not able to reproduce it there, so it seems to occur only in the Python version.

@trivialfis
Member

@JaakobKind Sorry for the late reply; I will look into this.

@trivialfis
Member

@hcho3 Can I borrow some help here? What's the convention for early stopping when multiple evaluation metrics are used? Monitor one of them or all of them?

@hcho3
Collaborator

hcho3 commented May 22, 2019

@trivialfis The convention is to always use the first metric for early stopping.

@JaakobKind
Author

What is the meaning of "first" here? I hope that it is not lexicographic order of the metric name.

hcho3 added a commit to hcho3/xgboost that referenced this issue Jul 4, 2019
…and always use last metric for early stopping
@hcho3
Collaborator

hcho3 commented Jul 4, 2019

@JaakobKind It turns out there was a bug that caused the metrics to be sorted lexicographically. #4638 restores the originally intended semantics, where the last validation set and the last evaluation metric are used for early stopping. (Here "last" means the order in which the user specifies the list of validation sets and metrics.)

Also, #4638 clarifies how early stopping works when multiple validation sets and metrics are given.
Preview: https://hcho3-xgboost.readthedocs.io/en/fix_early_stopping_python/python/python_api.html#xgboost.train

@trivialfis Sorry I was mistaken earlier. It was actually the last metric that should be used for early stopping. I clarified the exact behavior of early stopping in the Python API doc (#4638).
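
To illustrate the clarified semantics, here is a sketch with made-up data and metric choices (not the exact example from the docs): with several validation sets and metrics, early stopping monitors the last metric listed, evaluated on the last validation set listed.

```python
import numpy as np
import xgboost as xgb

# Toy data, only for illustration.
rng = np.random.RandomState(0)
X = rng.rand(200, 5)
y = (X[:, 0] > 0.5).astype(int)
dtrain = xgb.DMatrix(X[:150], label=y[:150])
dvalid = xgb.DMatrix(X[150:], label=y[150:])

params = {
    'objective': 'binary:logistic',
    'eval_metric': ['logloss', 'error'],  # 'error' is listed last
}

# Early stopping watches 'valid-error': the last metric ('error') evaluated on
# the last validation set in `evals` ('valid'), in the order given by the user.
bst = xgb.train(params, dtrain, num_boost_round=100,
                evals=[(dtrain, 'train'), (dvalid, 'valid')],
                early_stopping_rounds=10)
```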

@hcho3
Collaborator

hcho3 commented Jul 4, 2019

@JaakobKind And if a customized metric function is added, it will always be added after built-in metrics, so early stopping will always use the customized metric.
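
A sketch of that behavior (toy data, names invented for illustration): even when a built-in metric such as 'auc' is also requested, the custom metric passed via feval is appended after it, so it is the one early stopping monitors.

```python
import numpy as np
import xgboost as xgb

# Toy data, only for illustration.
rng = np.random.RandomState(0)
X = rng.rand(200, 5)
y = (X[:, 0] > 0.5).astype(int)
dtrain = xgb.DMatrix(X[:150], label=y[:150])
dvalid = xgb.DMatrix(X[150:], label=y[150:])

def custom_error(preds, dmat):
    # Returns (name, value); lower is better, matching the default
    # maximize=False behavior of early stopping.
    labels = dmat.get_label()
    return 'custom_error', float(np.mean((preds > 0.5) != labels))

params = {'objective': 'binary:logistic', 'eval_metric': 'auc'}

bst = xgb.train(params, dtrain, num_boost_round=100,
                evals=[(dvalid, 'valid')],
                feval=custom_error,        # appended after the built-in 'auc'
                early_stopping_rounds=10)
print(bst.best_iteration)  # chosen according to 'valid-custom_error'
```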

hcho3 added a commit that referenced this issue Jul 7, 2019
* Fix #4630, #4421: Preserve correct ordering between metrics, and always use last metric for early stopping

* Clarify semantics of early stopping in presence of multiple valid sets and metrics

* Add a test

* Fix lint
thesuperzapper pushed a commit to thesuperzapper/xgboost that referenced this issue Jul 8, 2019
sperlingxx pushed a commit to alipay/ant-xgboost that referenced this issue Jul 17, 2019
lock bot locked as resolved and limited conversation to collaborators Oct 5, 2019