
[Doc] add doc for kill_spark_context_on_worker_failure parameter #6097

Merged
merged 2 commits into dmlc:master from wbo4958:sparkcontext-doc
Sep 10, 2020

Conversation

wbo4958
Contributor

@wbo4958 wbo4958 commented Sep 8, 2020

doc for kill_spark_context_on_worker_failure


**SparkContext will be stopped by default when the XGBoost training task fails**.

To address this issue, we added a parameter **kill_spark_context_on_worker_failure** in `#6019 <https://github.com/dmlc/xgboost/pull/6019>`_, available from **XGBoost4j-Spark 1.2.0**. When **kill_spark_context_on_worker_failure** is set to **false**, the SparkContext will not be stopped even if the XGBoost training task fails; instead, the exception is thrown. Users who want to continue using the SparkContext should wrap the training code in a **try-catch** block.
Collaborator

Suggested change
To address this issue, we added a parameter **kill_spark_context_on_worker_failure** in `#6019 <https://github.com/dmlc/xgboost/pull/6019>`_ from **XGBoost4j-Spark 1.2.0**. When **kill_spark_context_on_worker_failure** is set to **false**, the SparkContext will not be stopped even XGBoost training task fails, instead, we throw out the exception. So for the users who want to continue using SparkContext should **try catch** the training code.
XGBoost4J-Spark 1.2.0+ exposes a parameter **kill_spark_context_on_worker_failure**. Set **kill_spark_context_on_worker_failure** to **false** so that the SparkContext is not stopped on training failure. Instead of stopping the SparkContext, XGBoost4J-Spark will throw an exception. Users who want to re-use the SparkContext should wrap the training code in a try-catch block.
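The try-catch pattern described above can be sketched as follows. This is a minimal, hypothetical illustration: `train` is a stand-in for the real XGBoost4J-Spark training call (e.g. fitting an `XGBoostClassifier` on a DataFrame), and no actual SparkContext is involved here; the point is only the control flow once **kill_spark_context_on_worker_failure** is **false**.

```java
public class TrainGuard {
    // Hypothetical stand-in for an XGBoost4J-Spark training call.
    // In real code this would be something like:
    //   new XGBoostClassifier(params).fit(trainingDf)
    // This stub merely simulates a worker failure.
    static String train(boolean shouldFail) {
        if (shouldFail) {
            throw new RuntimeException("XGBoost training task failed");
        }
        return "model";
    }

    public static void main(String[] args) {
        // With kill_spark_context_on_worker_failure set to false, a worker
        // failure surfaces as an exception instead of stopping the
        // SparkContext, so the caller can catch it and continue.
        String model;
        try {
            model = train(true);
        } catch (RuntimeException e) {
            System.out.println("Training failed, SparkContext still usable: "
                    + e.getMessage());
            model = null;
        }
        System.out.println(model == null ? "no model produced" : model);
    }
}
```

With the default (**true**), the same failure would stop the SparkContext, so any catch block could not reuse it afterwards.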

Contributor Author

Thx @hcho3

@hcho3 hcho3 merged commit 00b0ad1 into dmlc:master Sep 10, 2020
@wbo4958 wbo4958 deleted the sparkcontext-doc branch December 20, 2021 01:49