Getting error when trying to run xgboost examples on Spark 3.1.2 #45
Hi @rpnkj29, which rapids plugin version are you using? It looks like it is old. Could you try the latest?
Hi @wbo4958, thanks for replying. I am using the below versions of the cuDF and RAPIDS jars: Earlier I tried with the latest versions, but that did not work, since the example does not support the latest versions as per this link (xgboost: https://github.com/NVIDIA/spark-xgboost-examples/blob/spark-3/examples/notebooks/python/taxi-gpu.ipynb):
Hi @rpnkj29, xgboost 1.4.2-0.1.0 should support cudf/rapids 21.08. Please try these jars.
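For anyone hitting the same version mismatch, here is a minimal sketch of how the cudf/rapids/xgboost4j jars and the RAPIDS SQL plugin can be wired into a PySpark session. The jar file names and paths below are assumptions (substitute the builds you actually deploy); `spark.plugins`, `spark.rapids.sql.enabled`, and `com.nvidia.spark.SQLPlugin` are the standard RAPIDS Accelerator settings.

```python
# Minimal sketch (not from the thread): point a PySpark session at the
# cudf / rapids-4-spark / xgboost4j jars and enable the RAPIDS SQL plugin.
# The jar paths and version numbers are assumptions -- they must match the
# jars actually mounted into the driver and executor images.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("taxi-gpu-example")
    # Hypothetical jar locations; adjust to your deployment.
    .config("spark.jars",
            "/opt/jars/cudf-21.08.2-cuda11.jar,"
            "/opt/jars/rapids-4-spark_2.12-21.08.0.jar,"
            "/opt/jars/xgboost4j_3.0-1.4.2-0.1.0.jar,"
            "/opt/jars/xgboost4j-spark_3.0-1.4.2-0.1.0.jar")
    # Enable the RAPIDS Accelerator plugin for GPU execution.
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    .config("spark.rapids.sql.enabled", "true")
    .getOrCreate()
)
```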
Hi @wbo4958, I tried with the newer jars as well; they give the below error for the example (https://github.com/NVIDIA/spark-xgboost-examples/blob/spark-3/examples/notebooks/python/taxi-gpu.ipynb) at the "Train the Data with Benchmark" stage:
21/10/20 12:23:05 INFO SparkContext: Successfully stopped SparkContext
21/10/20 12:23:05 INFO ShutdownHookManager: Shutdown hook called
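For context, the "Train the Data with Benchmark" step referenced above boils down to a timed `fit()` call. The sketch below is reconstructed from the traceback frames later in this thread; only the call shape comes from the trace, and the helper body is an assumption rather than the notebook's exact code.

```python
# Rough reconstruction of the notebook's benchmark helper, inferred from the
# traceback frames (with_benchmark at line 74, the training call at line 78).
import time

def with_benchmark(phrase, action):
    # Time an arbitrary zero-argument callable and print how long it took.
    start = time.time()
    result = action()
    print('{} takes {:.2f} seconds'.format(phrase, time.time() - start))
    return result

# In the notebook the helper wraps the GPU training call, matching the traceback:
#   model = with_benchmark('Training', lambda: classifier.fit(train_data))
with_benchmark('No-op example', lambda: sum(range(1000)))
```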
Hi @rpnkj29, could you provide more executor/driver logs?
Hi @wbo4958, I have attached the full log from the driver pod when it went into the error state. Please let me know if more information is required. Thanks.
Thx @rpnkj29. It seems you were running the xgboost sample in cluster mode (standalone?); the driver log looks good. The error should be caused by the executor side. Could you help provide the executor log?
@rpnkj29 Hi there, any update?
Hi @wbo4958, apologies for the delay. Please find the executor logs for this error.
Hi @wbo4958, were you able to check the executor pods?
Hi @wbo4958, any update?
I am getting an error when trying to run the NYC Taxi or mortgage examples with the Spark 3.1.2 operator in Kubernetes. We are submitting our SparkApplication via kubectl and getting the below error. I tried with different versions of the spark-catalyst jar (3.0.0 and 3.1.2), but the result is the same.
Traceback (most recent call last):
File "/tmp/spark-a0673c21-9c04-4ba0-ae54-13b825af94e7/mortgage.py", line 78, in
model = with_benchmark('Training', lambda: classifier.fit(train_data))
File "/tmp/spark-a0673c21-9c04-4ba0-ae54-13b825af94e7/mortgage.py", line 74, in with_benchmark
result = action()
File "/tmp/spark-a0673c21-9c04-4ba0-ae54-13b825af94e7/mortgage.py", line 78, in
model = with_benchmark('Training', lambda: classifier.fit(train_data))
File "/opt/spark/python/lib/pyspark.zip/pyspark/ml/base.py", line 161, in fit
File "/opt/spark/python/lib/pyspark.zip/pyspark/ml/wrapper.py", line 335, in _fit
File "/opt/spark/python/lib/pyspark.zip/pyspark/ml/wrapper.py", line 332, in _fit_java
File "/opt/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1304, in call
File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
File "/opt/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o82.fit.
: java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/expressions/TimeSub
at com.nvidia.spark.rapids.shims.spark300.Spark300Shims.getExprs(Spark300Shims.scala:251)
at com.nvidia.spark.rapids.shims.spark301.Spark301Shims.getExprs(Spark301Shims.scala:84)
at com.nvidia.spark.rapids.GpuOverrides$.<init>(GpuOverrides.scala:2544)
at com.nvidia.spark.rapids.GpuOverrides$.<clinit>(GpuOverrides.scala)
at org.apache.spark.sql.rapids.execution.InternalColumnarRddConverter$.convert(InternalColumnarRddConverter.scala:477)
at com.nvidia.spark.rapids.ColumnarRdd$.convert(ColumnarRdd.scala:47)
at com.nvidia.spark.rapids.ColumnarRdd.convert(ColumnarRdd.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at ml.dmlc.xgboost4j.scala.spark.rapids.GpuUtils$.toColumnarRdd(GpuUtils.scala:39)
at ml.dmlc.xgboost4j.scala.spark.rapids.GpuXGBoost$.trainOnGpuInternal(GpuXGBoost.scala:240)
at ml.dmlc.xgboost4j.scala.spark.rapids.GpuXGBoost$.trainDistributedOnGpu(GpuXGBoost.scala:186)
at ml.dmlc.xgboost4j.scala.spark.rapids.GpuXGBoost$.trainOnGpu(GpuXGBoost.scala:91)
at ml.dmlc.xgboost4j.scala.spark.rapids.GpuXGBoost$.fitOnGpu(GpuXGBoost.scala:52)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.fit(XGBoostClassifier.scala:170)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.fit(XGBoostClassifier.scala:41)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/expressions/TimeSub
... 29 more
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.catalyst.expressions.TimeSub
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 29 more
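The root failure in the trace is the ClassNotFoundException for org.apache.spark.sql.catalyst.expressions.TimeSub, which the spark300/spark301 shims reference but which is evidently not on this Spark 3.1.2 classpath. A quick, hedged way to confirm that from the same PySpark session (assuming `spark` is the active SparkSession) is to probe the driver JVM via py4j:

```python
# Hedged diagnostic sketch (not from the thread): check whether a catalyst class
# referenced by the rapids shims can actually be loaded by the driver JVM.
def jvm_class_exists(spark, class_name):
    """Return True if class_name is loadable in the driver JVM, False otherwise."""
    try:
        spark.sparkContext._jvm.java.lang.Class.forName(class_name)
        return True
    except Exception:
        return False

# Expected to print False here, matching the ClassNotFoundException in the trace.
print(jvm_class_exists(spark, "org.apache.spark.sql.catalyst.expressions.TimeSub"))
```

Consistent with the earlier suggestion in the thread, the Spark300Shims/Spark301Shims frames point to a rapids plugin built against Spark 3.0.x; a rapids-4-spark build whose shims cover Spark 3.1.x is the direction to look, rather than swapping spark-catalyst jars.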