RandomForestClassifier got error when training on relative large dataset #1

Open
justinuliu opened this issue Jun 20, 2022 · 0 comments

justinuliu commented Jun 20, 2022

(accelweka) ~ $ weka -main weka.Run .RandomRBF -n 5000 -a 5000 > RBFa5kn5k.arff
(accelweka) ~ $ weka -memory 32g -main weka.Run weka.classifiers.rapids.CuMLDaskClassifier -learner RandomForestClassifier -t $(pwd)/RBFa5kn5k.arff
java.lang.Exception: Traceback (most recent call last):
  File "/home/justinliu/wekafiles/packages/wekaRAPIDS/resources/py/pyRapidsServer.py", line 320, in execute_script
    exec(script, _global_env)
  File "<string>", line 7, in <module>
  File "/home/justinliu/miniconda3/envs/accelweka/lib/python3.9/site-packages/cuml/dask/ensemble/randomforestclassifier.py", line 263, in fit
    self._fit(model=self.rfs,
  File "/home/justinliu/miniconda3/envs/accelweka/lib/python3.9/site-packages/cuml/dask/ensemble/base.py", line 158, in _fit
    wait_and_raise_from_futures(futures)
  File "/home/justinliu/miniconda3/envs/accelweka/lib/python3.9/site-packages/cuml/dask/common/utils.py", line 162, in wait_and_raise_from_futures
    raise_exception_from_futures(futures)
  File "/home/justinliu/miniconda3/envs/accelweka/lib/python3.9/site-packages/cuml/dask/common/utils.py", line 151, in raise_exception_from_futures
    raise RuntimeError("%d of %d worker jobs failed: %s" % (
RuntimeError: 1 of 1 worker jobs failed: radix_sort: failed on 2nd step: cudaErrorInvalidValue: invalid argument

    at weka.classifiers.rapids.CuMLDaskClassifier.buildClassifier(CuMLDaskClassifier.java:864)
    at weka.classifiers.evaluation.Evaluation.evaluateModel(Evaluation.java:1632)
    at weka.classifiers.Evaluation.evaluateModel(Evaluation.java:668)
    at weka.classifiers.AbstractClassifier.runClassifier(AbstractClassifier.java:141)
    at weka.classifiers.AbstractClassifier.run(AbstractClassifier.java:547)
    at weka.Run.main(Run.java:349)

This error occurs on a GTX 1080 Ti but not on a GTX 1660 SUPER.
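
For a sense of scale, here is a rough size estimate for the dataset generated by the first command (`-n 5000 -a 5000`). This is only a back-of-the-envelope sketch: it assumes the features are stored as float32 and ignores the class column and any copies made during training. Neither assumption has been checked against what wekaRAPIDS or cuML actually allocate, and `feature_matrix_bytes` is a hypothetical helper used only for illustration.

```python
# Rough size estimate for the RandomRBF dataset used in this report.
# Assumptions (not verified against wekaRAPIDS/cuML): float32 storage,
# class column and temporary copies not counted.
N_INSTANCES = 5000   # from `-n 5000`
N_ATTRIBUTES = 5000  # from `-a 5000`
BYTES_PER_VALUE = 4  # assumed float32


def feature_matrix_bytes(n_rows: int, n_cols: int, itemsize: int = BYTES_PER_VALUE) -> int:
    """Return the size in bytes of a dense n_rows x n_cols matrix."""
    return n_rows * n_cols * itemsize


total_values = N_INSTANCES * N_ATTRIBUTES
size_bytes = feature_matrix_bytes(N_INSTANCES, N_ATTRIBUTES)
print(f"{total_values:,} values, about {size_bytes / 2**20:.1f} MiB as float32")
```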

justinuliu changed the title from "Multi-GPU based RandomForestClassifier got error when training on relative large dataset" to "RandomForestClassifier got error when training on relative large dataset" on Jun 20, 2022