I believe that early stopping is broken in the Spark package. Here is a minimal example:
scala> val data = sc.parallelize(List(1.0 -> Vectors.dense(Array(1.0)))).toDF("label", "features")
data: org.apache.spark.sql.DataFrame = [label: double, features: vector]

scala> val model = new XGBoostClassifier(Map[String, Any]()).setNumEarlyStoppingRounds(10).fit(data)
java.lang.IllegalArgumentException: custom_eval does not support early stopping
at ml.dmlc.xgboost4j.scala.spark.XGBoostExecutionParamsFactory.overrideParams(XGBoost.scala:150)
at ml.dmlc.xgboost4j.scala.spark.XGBoostExecutionParamsFactory.<init>(XGBoost.scala:96)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.trainDistributed(XGBoost.scala:535)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.train(XGBoostClassifier.scala:190)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.train(XGBoostClassifier.scala:40)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:118)
... 49 elided
However, I am not setting custom_eval in my params. Checking the logs, these are the parameters that XGBoost is running with (clipped for brevity):
21/01/28 21:04:57 INFO XGBoostSpark: Running XGBoost 1.1.2 with parameters:
num_early_stopping_rounds -> 10
custom_eval -> null

So, it seems that internally, XGBoost is setting custom_eval to null, and then later checking to see if the key exists. But it always exists, since it is set with a default value of null. I believe the correct behavior would be either to not set a default value at all, or to modify the check appropriately.

I am using version 1.1.2 and Spark 2.4.0.
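To make the suspected cause concrete, here is a minimal Scala sketch of the pattern described above. The parameter handling here is hypothetical and only illustrates how a null default can defeat a key-exists check; it is not the actual XGBoost4J-Spark source.

// Hypothetical illustration of the suspected bug; not the real XGBoost4J-Spark code.
val defaults   = Map[String, Any]("custom_eval" -> null)             // null default is always present
val userParams = Map[String, Any]("num_early_stopping_rounds" -> 10) // what the user actually set
val params     = defaults ++ userParams

// A contains-style check fires even though custom_eval was never set by the user:
if (params.contains("custom_eval")) {
  // would reject early stopping here, e.g. "custom_eval does not support early stopping"
}

// One possible fix: treat a null value the same as an absent key.
if (params.get("custom_eval").exists(_ != null)) {
  // only a genuinely user-supplied custom_eval would be rejected here
}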
So if you add .setMaximizeEvaluationMetrics(maximizeEvaluationMetrics) to your model, then it works. maximizeEvaluationMetrics should be a boolean value.
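A rough sketch of that workaround applied to the minimal example above (assuming the same data DataFrame; false is just an example value, use true if your evaluation metric should be maximized):

// Workaround sketch: set maximizeEvaluationMetrics explicitly so early stopping can proceed.
val model = new XGBoostClassifier(Map[String, Any]())
  .setNumEarlyStoppingRounds(10)
  .setMaximizeEvaluationMetrics(false) // boolean: false when lower metric values are better
  .fit(data)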