[jvm-packages] XGBoost-Spark reported "ERROR RabitTracker ......" #3834
Comments
@CodingCat, I am running on a YARN cluster. Everything about the cluster is set to defaults; I did not change anything explicitly.
Yes, you need to contact your cluster admin to check whether the hosts files are correctly written.
Thank you, @CodingCat. Is there another way to bind the IP address to the tracker with a manual setting?
See the discussion in #3833: you can add an xgboost-tracker.properties file to your resources directory.
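For example, a one-line xgboost-tracker.properties placed on the classpath (e.g. under src/main/resources). The host-ip key and the address below are assumptions based on the linked discussion, not confirmed here; substitute the IP the tracker should bind to:

# hypothetical example; replace with the driver's actual address
host-ip=192.168.1.100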
@CodingCat, I have checked the hosts file; its content is:
Has anyone got a solution for this? I am struggling with the same error.
@henokyen, the first thing is to check /etc/hosts on the job driver/workers and make sure it maps 127.0.0.1 to localhost. If localhost is correct, then check the Python version you are using: if it is lower than Python 2.7, the error is expected, because the Rabit tracker requires Python 2.7+. If that is the case, you can switch the tracker to the Scala implementation when setting the parameters for XGBoostClassifier.
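For reference, a minimal sketch of selecting the Scala tracker through the tracker_conf entry of the parameter map (the other parameter values are placeholders, not recommendations, and the 0L timeout is meant to disable the worker connection timeout; check the docs of your XGBoost4J-Spark version):

import ml.dmlc.xgboost4j.scala.spark.{TrackerConf, XGBoostClassifier}

// Use the Scala implementation of the Rabit tracker instead of the
// default Python-based one; the first argument is the worker
// connection timeout in milliseconds.
val xgbParam = Map(
  "objective" -> "multi:softprob",
  "num_class" -> 3,
  "num_round" -> 100,
  "num_workers" -> 2,
  "tracker_conf" -> TrackerConf(0L, "scala"))

val xgbClassifier = new XGBoostClassifier(xgbParam)
  .setFeaturesCol("features")
  .setLabelCol("classIndex")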
This is the content of my /etc/hosts:

# Host Database
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
127.0.0.1 localhost

So I do have 127.0.0.1 for localhost. I am on a MacBook Pro, trying to implement this tutorial, and I am using Python version 2.7.10. Any idea? Thanks
I was running Spark in local mode, following the example posted in the official XGBoost4J-Spark tutorial (a rough sketch of that setup appears after the error output below). When I called

val xgbClassificationModel = xgbClassifier.fit(xgbInput)

the following error occurred:

scala> val xgbClassificationModel = xgbClassifier.fit(xgbInput)
Tracker started, with env={}
2018-10-26 14:08:36 ERROR RabitTracker:91 - Uncaught exception thrown by worker:
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:202)
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:153)
at org.apache.spark.util.ThreadUtils$.awaitReady(ThreadUtils.scala:222)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:633)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2027)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2048)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2067)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2092)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:929)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:927)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:927)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$$anonfun$trainDistributed$4$$anon$1.run(XGBoost.scala:233)
[14:08:36] /xgboost/src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[14:08:36] /xgboost/src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[14:08:36] /xgboost/src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 4 extra nodes, 0 pruned nodes, max_depth=2
[14:08:36] /xgboost/src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 4 extra nodes, 0 pruned nodes, max_depth=2
[14:08:36] /xgboost/src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 4 extra nodes, 0 pruned nodes, max_depth=2
[14:08:36] /xgboost/src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 4 extra nodes, 0 pruned nodes, max_depth=2
[0] train-merror:0.026667
[0] train-merror:0.026667
.......
.......
[14:08:36] /xgboost/src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 0 extra nodes, 0 pruned nodes, max_depth=0
[14:08:36] /xgboost/src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[14:08:36] /xgboost/src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 2 extra nodes, 0 pruned nodes, max_depth=1
[99] train-merror:0.000000
ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.ml$dmlc$xgboost4j$scala$spark$XGBoost$$postTrackerReturnProcessing(XGBoost.scala:283)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$$anonfun$trainDistributed$4.apply(XGBoost.scala:240)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$$anonfun$trainDistributed$4.apply(XGBoost.scala:222)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.trainDistributed(XGBoost.scala:221)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.train(XGBoostClassifier.scala:191)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.train(XGBoostClassifier.scala:48)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:118)
... 49 elided
Does anyone have a clue about this issue?
Thanks in advance.
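For context on what xgbInput and xgbClassifier refer to: the sketch below roughly follows the Iris example from the official XGBoost4J-Spark tutorial mentioned above. The file path, schema, column names, and parameter values are assumptions taken from that tutorial, not from this report.

import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier
import org.apache.spark.ml.feature.{StringIndexer, VectorAssembler}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{DoubleType, StringType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Read the Iris data set with an explicit schema.
val schema = new StructType(Array(
  StructField("sepal length", DoubleType, true),
  StructField("sepal width", DoubleType, true),
  StructField("petal length", DoubleType, true),
  StructField("petal width", DoubleType, true),
  StructField("class", StringType, true)))
val rawInput = spark.read.schema(schema).csv("iris.data")

// Index the string label and assemble the numeric columns into a feature vector.
val labelIndexed = new StringIndexer()
  .setInputCol("class")
  .setOutputCol("classIndex")
  .fit(rawInput)
  .transform(rawInput)
  .drop("class")
val xgbInput = new VectorAssembler()
  .setInputCols(Array("sepal length", "sepal width", "petal length", "petal width"))
  .setOutputCol("features")
  .transform(labelIndexed)
  .select("features", "classIndex")

val xgbClassifier = new XGBoostClassifier(Map(
  "eta" -> 0.1,
  "max_depth" -> 2,
  "objective" -> "multi:softprob",
  "num_class" -> 3,
  "num_round" -> 100,
  "num_workers" -> 2))
  .setFeaturesCol("features")
  .setLabelCol("classIndex")

// This is the call that triggers the tracker error above.
val xgbClassificationModel = xgbClassifier.fit(xgbInput)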