[QST] Best practices to achieve greater max_depth and n_trees parameters in RandomForestRegressor #1467
I believe @miguelangel did some work here (max_depth18.pdf), optimizing the code to squeeze out some memory, and achieved max_depth = 18 and n_trees = 30. But that still leaves more to be desired. |
Hi. Output for max_depth=20:
SKLearn accuracy: 0.08841762935540724
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
in
61 print("SKLearn accuracy: ", mean_squared_error(y_test, skl_y_pred))
62
---> 63 cuml_y_pred = cuml_model.predict(X_test)
64 print("CuML accuracy: ", mean_squared_error(y_test, cuml_y_pred))
/opt/anaconda/lib/python3.6/site-packages/cuml/dask/ensemble/randomforestregressor.py in predict(self, X)
397 rslts = list()
398 for d in range(len(f)):
--> 399 rslts.append(f[d].result())
400 indexes.append(0)
401
/opt/anaconda/lib/python3.6/site-packages/distributed/client.py in result(self, timeout)
225 result = self.client.sync(self._result, callback_timeout=timeout, raiseit=False)
226 if self.status == "error":
--> 227 six.reraise(*result)
228 elif self.status == "cancelled":
229 raise result
/opt/anaconda/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
693 value = tp()
694 if value.__traceback__ is not tb:
--> 695 raise value.with_traceback(tb)
696 raise value
697 finally:
/opt/anaconda/lib/python3.6/site-packages/cuml/dask/ensemble/randomforestregressor.py in _predict()
286 @staticmethod
287 def _predict(model, X, r):
--> 288 return model.predict(X)
289
290 def fit(self, X, y):
cuml/ensemble/randomforestregressor.pyx in cuml.ensemble.randomforestregressor.RandomForestRegressor.predict()
RuntimeError: ('Long error message', 'Exception occured! file=/conda/conda-bld/libcuml_1566588242169/work/cpp/src/decisiontree/decisiontree_impl.cuh line=392: Cannot predict w/ empty tree!\nObtained 37 stack frames\n#0 in /opt/anaconda/lib/python3.6/site-packages/cuml/common/../../../../libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f9d9806556e]\n#1 in /opt/anaconda/lib/python3.6/site-packages/cuml/common/../../../../libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x80) [0x7f9d98066080]\n#2 in /opt/anaconda/lib/python3.6/site-packages/cuml/common/../../../../libcuml++.so(_ZNK2ML12DecisionTree16DecisionTreeBaseIddE7predictERKNS_10cumlHandleEPKNS0_16TreeMetaDataNodeIddEEPKdiiPdb+0x20b) [0x7f9d9809d27b]\n#3 in /opt/anaconda/lib/python3.6/site-packages/cuml/common/../../../../libcuml++.so(_ZNK2ML11rfRegressorIdE7predictERKNS_10cumlHandleEPKdiiPdPKNS_20RandomForestMetaDataIddEEb+0x221) [0x7f9d9823e5a1]\n#4 in /opt/anaconda/lib/python3.6/site-packages/cuml/common/../../..') |
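For long errors like the one above, the useful part is usually the `file=... line=...: message` prefix that comes before the stack frames. Below is a minimal sketch for pulling that prefix out of the exception text. The helper name `summarize_cuml_error` is hypothetical, and the pattern is based only on the message shape seen in this thread, so it may not match errors from other cuML versions.

```python
import re

# Pattern assumed from the messages in this thread only:
#   "Exception occured! file=<path> line=<n>: <message>"
# The message ends at a real newline, at a repr-escaped "\n", or at end of text.
_CUML_ERR = re.compile(
    r"Exception occured! file=(?P<file>\S+) line=(?P<line>\d+): "
    r"(?P<msg>.*?)(?=\\n|\n|$)"
)


def summarize_cuml_error(text):
    """Return (file, line, message) for the first match, or None if absent."""
    m = _CUML_ERR.search(text)
    if m is None:
        return None
    return m.group("file"), int(m.group("line")), m.group("msg")


# Sample text copied from the traceback above, shortened after the first frame.
sample = (
    "Exception occured! file=/conda/conda-bld/libcuml_1566588242169/work/cpp/src/"
    "decisiontree/decisiontree_impl.cuh line=392: Cannot predict w/ empty tree!\n"
    "Obtained 37 stack frames\n#0 in ..."
)
print(summarize_cuml_error(sample))
```

Wrapping the `predict` call in `try/except RuntimeError` and passing `str(exc)` to this helper would give a one-line summary instead of the full dump.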
The number of trees should not affect memory consumption, so you can increase it to, say, 100 trees at depth 16. |
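As a rough illustration of why depth is the harder limit: the upper bound on the node count of a single binary tree roughly doubles with every extra level. The sketch below is plain arithmetic and an assumption-free bound on tree size only; it is not cuML's actual memory accounting, and `max_nodes` is a hypothetical helper name.

```python
def max_nodes(depth):
    """Upper bound on nodes in a binary tree whose root sits at depth 0."""
    return 2 ** (depth + 1) - 1


# Compare the depths mentioned in this thread.
for depth in (16, 18, 20):
    print(f"max_depth={depth}: up to {max_nodes(depth):,} nodes per tree")
```

Real trees are usually far smaller than this bound, since splitting stops early on small or pure nodes, but the bound shows how quickly the worst case grows between depth 16 and depth 20.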
Ok, but how to get deeper trees? |
We are trying to address these and other issues with RF via 2 parallel approaches:
|
@nikiforov-sm I have an implementation for classification, and I can give you regression for deep trees. Can you manage building cuml from source? Otherwise you would need to wait until we integrate it into the 0.12 nightly. |
@vishalmehta1991 Yes, please. |
Hi @nikiforov-sm, I have tested it to depths of 50. Hopefully it works for you as well. |
Great news! Thank you! |
Currently we have issues with building cuml from source.
(cuml_dev) root@nvidia-MLT:/opt/jupyter/cuml/cpp/build# cmake .. -DCMAKE_IGNORE_PATH=$CONDA_PREFIX/lib -DCMAKE_INSTALL_PREFIX=/opt/anaconda
Error in make:
(cuml_dev) root@nvidia-MLT:/opt/jupyter/cuml/cpp/build# make
[ 0%] Performing update step for 'cub'
protobuf gathered from here: |
@nikiforov-sm Hmm, I don't see this. To build from source I recommend using Anaconda. Make sure you update the code.
I use this approach and it works well. |
Do we need to build in Anaconda using Python 3.7? |
Anaconda python 3.6: Comparing specs that have this dependency: 27%|██████████████████████████▌ | 18/67 [23:06<1:02:55, 77.04s/it] Finding shortest conflict path for setuptools: 0%| | 0/1 [00:00<?, ?it/s] |
Not sure if we need 3.7, but I typically use 3.7:
conda env create --name cuml_dev --file cuml/conda/environments/cuml_dev_cuda10.1.yml |
Thank you!
SKLearn accuracy: 0.08639243151504108
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
in
61 print("SKLearn accuracy: ", mean_squared_error(y_test, skl_y_pred))
62
---> 63 cuml_y_pred = cuml_model.predict(X_test)
64 print("CuML accuracy: ", mean_squared_error(y_test, cuml_y_pred))
/opt/anaconda/lib/python3.7/site-packages/cuml/dask/ensemble/randomforestregressor.py in predict(self, X)
401
402 wait(futures)
--> 403 raise_exception_from_futures(futures)
404
405 indexes = list()
/opt/anaconda/lib/python3.7/site-packages/cuml/dask/common/utils.py in raise_exception_from_futures(futures)
129 if errs:
130 raise RuntimeError("%d of %d worker jobs failed: %s" % (
--> 131 len(errs), len(futures), ", ".join(map(str, errs))
132 ))
133
RuntimeError: 8 of 8 worker jobs failed: Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!
Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0aa1565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0aa15670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0aa15830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0aa1562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0aa138a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aacc29328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aacc24072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0aa11611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0aa1162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0aa138a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aacc29328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aacc24072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!
Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0aa1565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0aa15670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0aa15830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0aa1562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0aa138a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a93ac1328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a93abc072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0aa11611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0aa1162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0aa138a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a93ac1328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a93abc072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!
Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0aa1565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0aa15670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0aa15830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0aa1562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0aa138a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a93ac7328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a93ac2072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0aa11611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0aa1162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0aa138a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a93ac7328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a93ac2072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!
Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0a99565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0a995670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0a995830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0a99562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0a9938a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa4a71328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa4a6c072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0a991611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0a99162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0a9938a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa4a71328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa4a6c072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!
Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0a99565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0a995670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0a995830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0a99562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0a9938a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa4b6d328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa4b68072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0a991611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0a99162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0a9938a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa4b6d328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa4b68072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!
Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0a9d565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0a9d5670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0a9d5830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0a9d562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0a9d38a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa8b33328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa8b2e072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0a9d1611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0a9d162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0a9d38a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa8b33328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa8b2e072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!
Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0a9d565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0a9d5670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0a9d5830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0a9d562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0a9d38a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa8cb0328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa8cab072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0a9d1611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0a9d162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0a9d38a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa8cb0328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa8cab072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!
Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0a9d565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0a9d5670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0a9d5830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0a9d562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0a9d38a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a8fb07328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a8fb02072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0a9d1611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0a9d162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0a9d38a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a8fb07328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a8fb02072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f] |
(base) root@nvidia-MLT:/opt/jupyter/cudf# ./print_env.sh Click here to see environment details
|
@nikiforov-sm Sorry, I'm not able to understand the issue here. Are you building cudf? To build cuml you don't need to build cudf; you can use the one from conda. |
We have no errors in the build routines.
Traceback (most recent call last):
  File "", line 1, in
  File "/opt/anaconda/lib/python3.7/site-packages/cuml/dask/ensemble/randomforestregressor.py", line 403, in predict
    raise_exception_from_futures(futures)
  File "/opt/anaconda/lib/python3.7/site-packages/cuml/dask/common/utils.py", line 131, in raise_exception_from_futures
    len(errs), len(futures), ", ".join(map(str, errs))
RuntimeError: 8 of 8 worker jobs failed: Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[15:35:18] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!
Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f173d565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f173d5670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f173d5830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f173d562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f173d38a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f1749093328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x55ce6e2c59cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x55ce6e2dcc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f174908e072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x55ce6e327d2b] |
Can you try without dask, like this? I was able to train at a depth of 30-50.
import numpy as np
max_depth = 30
df = pd.read_csv('test.csv').drop('Unnamed: 0', axis=1)
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y,
from cuml.ensemble import RandomForestRegressor as cumlRFR |
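As a rough way to reason about how max_depth and n_trees interact with memory, one can bound the size of a single binary tree. This sketch assumes the root sits at depth 0 and that every leaf holds at least one training row; `bytes_per_node` is a placeholder value for illustration, not cuML's actual node size, and real trees are usually much smaller than this bound.

```python
def max_nodes_per_tree(max_depth, n_samples=None):
    """Upper bound on the node count of a binary tree with the root at depth 0."""
    max_leaves = 2 ** max_depth
    if n_samples is not None:
        # Every leaf needs at least one training row.
        max_leaves = min(max_leaves, n_samples)
    # A binary tree whose internal nodes all have two children
    # has (leaves - 1) internal nodes.
    return 2 * max_leaves - 1


def forest_upper_bound_bytes(max_depth, n_trees, n_samples=None, bytes_per_node=32):
    """Worst-case forest size; bytes_per_node is an assumed placeholder."""
    return n_trees * max_nodes_per_tree(max_depth, n_samples) * bytes_per_node


if __name__ == "__main__":
    for depth in (18, 20, 30):
        nodes = max_nodes_per_tree(depth)
        capped = max_nodes_per_tree(depth, n_samples=1_000_000)
        print(f"max_depth={depth}: <= {nodes:,} nodes, <= {capped:,} with 1M rows")
```

The bound doubles with each extra level until the number of rows becomes the limiting factor, which is why the row count matters as much as the depth setting for very deep trees.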
@vishalmehta1991 Thank you. With dask we have errors. The following error appears 8 times in the log:

distributed.nanny - ERROR - Failed while trying to start worker process: Could not serialize object of type RandomForestRegressor.
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 38, in dumps
    result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "cuml/ensemble/randomforestregressor.pyx", line 373, in cuml.ensemble.randomforestregressor.RandomForestRegressor.__getstate__
  File "cuml/ensemble/randomforestregressor.pyx", line 441, in cuml.ensemble.randomforestregressor.RandomForestRegressor._get_model_info
TypeError: an integer is required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 191, in serialize
    header, frames = dumps(x, context=context) if wants_context else dumps(x)
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 58, in pickle_dumps
    return {"serializer": "pickle"}, [pickle.dumps(x)]
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 51, in dumps
    return cloudpickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "/opt/anaconda/lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 952, in dumps
    cp.dump(obj)
  File "/opt/anaconda/lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 267, in dump
    return Pickler.dump(self, obj)
  File "/opt/anaconda/lib/python3.7/pickle.py", line 437, in dump
    self.save(obj)
  File "/opt/anaconda/lib/python3.7/pickle.py", line 524, in save
    rv = reduce(self.proto)
  File "cuml/ensemble/randomforestregressor.pyx", line 373, in cuml.ensemble.randomforestregressor.RandomForestRegressor.__getstate__
  File "cuml/ensemble/randomforestregressor.pyx", line 441, in cuml.ensemble.randomforestregressor.RandomForestRegressor._get_model_info
TypeError: an integer is required

It is followed by this warning, 5 times:

distributed.nanny - WARNING - Restarting worker

Then this error, 8 times:

ERROR:Task exception was never retrieved
future: exception=TypeError('exceptions must derive from BaseException')>
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/asyncio/tasks.py", line 603, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 251, in start
    response = await self.instantiate()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 334, in instantiate
    result = await self.process.start()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 528, in start
    msg = await self._wait_until_connected(uid)
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 642, in _wait_until_connected
    raise msg
TypeError: exceptions must derive from BaseException

And finally the "distributed.nanny - WARNING - Restarting worker" warning 3 more times. |
Hi,
After a rebuild from sources, this code:
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
cluster = LocalCUDACluster(threads_per_worker=1)
if 'c' in globals():
    c.close()
c = Client(cluster)
c.close()
returns errors:
distributed.nanny - ERROR - Failed while trying to start worker process: Could not serialize object of type RandomForestRegressor.
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 38, in dumps
    result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "cuml/ensemble/randomforestregressor.pyx", line 373, in cuml.ensemble.randomforestregressor.RandomForestRegressor.__getstate__
  File "cuml/ensemble/randomforestregressor.pyx", line 441, in cuml.ensemble.randomforestregressor.RandomForestRegressor._get_model_info
TypeError: an integer is required
Could you help us? |
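The traceback in this comment shows the failure inside `pickle.dumps`, in the estimator's `__getstate__`, so in principle it can be checked in a plain Python session without a cluster. A minimal sketch with the standard library only is shown here. `check_picklable` is a hypothetical helper, and `BrokenState` is a stand-in class that merely mimics a `__getstate__` raising the same TypeError; it is not cuML code.

```python
import pickle


def check_picklable(obj):
    """Try a pickle round trip; return (True, None) or (False, exception)."""
    try:
        pickle.loads(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))
    except Exception as exc:
        return False, exc
    return True, None


class BrokenState:
    """Hypothetical stand-in whose __getstate__ fails like the log above."""

    def __getstate__(self):
        raise TypeError("an integer is required")


if __name__ == "__main__":
    for candidate in ({"max_depth": 30}, BrokenState()):
        ok, err = check_picklable(candidate)
        status = "picklable" if ok else f"failed: {err!r}"
        print(type(candidate).__name__, status)
```

With cuML installed, passing a `RandomForestRegressor` instance to `check_picklable` would be one way to test whether the serialization error reproduces outside Dask.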
dask-cuda from rapidsai-nightly:
(base) root@nvidia-MLT:/opt/jupyter/cuml/python# conda list dask-cuda
# packages in environment at /opt/anaconda:
Name       Version        Build    Channel
dask-cuda  0.12.0a191218  py37_36  rapidsai-nightly |
@nikiforov-sm @oyilmaz-nvidia
# Shard the data across all workers
X_train_dask = dask_cudf.from_cudf(X_train, npartitions=n_partitions)
# Build and train the model
dask_model = daskRFR(max_depth=max_depth, n_estimators=n_trees,
dask_model.fit(X_train_dask, y_train_dask) |
@vishalmehta1991 |
Yes, daskRFR is the RF regressor from cuml.dask.ensemble.
# Dask RF
from cuml.dask.ensemble import RandomForestRegressor as daskRFR
# Start cluster
cluster = LocalCUDACluster(threads_per_worker=1, n_workers=n_partitions)
# Shard the data across all workers
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y,
# Build and train the model
dask_model = daskRFR(max_depth=max_depth, n_estimators=n_trees,
dask_model.fit(X_train_dask, y_train_dask) |
Hi, thank you for the update.
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
n_partitions = 8
cluster = LocalCUDACluster(threads_per_worker=1, n_workers=n_partitions)
c = Client(cluster)
distributed.nanny - WARNING - Restarting worker
distributed.nanny - ERROR - Failed while trying to start worker process: Could not serialize object of type RandomForestRegressor.
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 38, in dumps
    result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "cuml/ensemble/randomforestregressor.pyx", line 373, in cuml.ensemble.randomforestregressor.RandomForestRegressor.__getstate__
  File "cuml/ensemble/randomforestregressor.pyx", line 441, in cuml.ensemble.randomforestregressor.RandomForestRegressor._get_model_info
TypeError: an integer is required |
@nikiforov-sm did you try with the code-frozen 0.12 branch? |
@vishalmehta1991 No, I didn't. |
@vishalmehta1991 I'm trying to help get this resolved, but it doesn't seem to work for me on the 0.12 branch either. Can you provide detailed steps of what you're doing if you're able to run this code successfully? I tried building from cuML branch-0.12 per #1467 (comment), but I'm still hitting errors on the code from the original post: #1467 (comment) and also on Vishal's snippet here: #1467 (comment)
Building 0.12 branch
Verify version:
Original Code (Error)
There's an error like this from each of the 8 workers:
Vishal's Code Snippet (error)
import cuml
import cudf
import dask_cudf
import numpy as np
import pandas as pd
from dask_cuda import LocalCUDACluster
from dask.distributed import Client, wait
from cuml.dask.common import utils as dask_utils
from cuml.dask.ensemble import RandomForestRegressor as daskRFR
if __name__ == '__main__':
    # Start cluster
    n_partitions = 4
    cluster = LocalCUDACluster(threads_per_worker=1, n_workers=n_partitions)
    c = Client(cluster)
    workers = c.has_what().keys()
    # Desired parameters
    max_depth = 20
    n_trees = 30
    rows, cols = 10000, 74
    n_streams = len(workers)
    # Generate fake data for example's sake
    x = np.random.random((rows, cols))
    df = pd.DataFrame(x, columns=["C{}".format(i) for i in range(cols)])
    X = df.drop(['C2'],1).astype(np.float, 32) #.to_numpy().astype(np.float, 32)
    y = pd.DataFrame(df['C2'].astype(np.float, 32))
    """
    File "/opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/preprocessing/model_selection.py", line 251, in train_test_split
    return X_train, X_test, y_train, y_test
    UnboundLocalError: local variable 'X_train' referenced before assignment
    """
    X = cudf.DataFrame.from_pandas(X)
    y = cudf.DataFrame.from_pandas(y)
    # Shared the data across all workers
    X_train, X_test, y_train, y_test = cuml.preprocessing.model_selection.train_test_split(X, y,
                                                                                            test_size=0.2)
    X_train_dask = dask_cudf.from_cudf(X_train, npartitions=n_partitions)
    y_train_dask = dask_cudf.from_cudf(y_train, npartitions=n_partitions)
    X_train_dask, y_train_dask = dask_utils.persist_across_workers(c, [X_train_dask, y_train_dask], workers=workers)
    # Build and train the model
    dask_model = daskRFR(max_depth=max_depth, n_estimators=n_trees,
                         n_streams=n_streams,n_bins=16,split_algo=0,split_criterion=2)
    dask_model.fit(X_train_dask, y_train_dask)
    cuml_y_pred_dask = dask_model.predict(X_test.as_matrix()) # X_test is a cudf dataframe
I also get similar errors on each worker when running a slightly modified version of Vishal's example above:
|
@rmccorm4 here is the full training code using Dask as well as single-GPU RF |
@nikiforov-sm does this above snippet work well for you? #1467 (comment) If you're having trouble setting up env / building from source, using containers should make your life easier, for example the "Building 0.12 Branch" section of this comment: #1467 (comment)
This way you can try various configurations without messing up your host environment. |
@rmccorm4 We get multiple errors on the lines
cluster = LocalCUDACluster(threads_per_worker=1, n_workers=n_partitions)
c = Client(cluster)
like this:
distributed.nanny - WARNING - Restarting worker
distributed.nanny - ERROR - Failed while trying to start worker process: Could not serialize object of type RandomForestRegressor.
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 38, in dumps
    result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "cuml/ensemble/randomforestregressor.pyx", line 373, in cuml.ensemble.randomforestregressor.RandomForestRegressor.__getstate__
  File "cuml/ensemble/randomforestregressor.pyx", line 441, in cuml.ensemble.randomforestregressor.RandomForestRegressor._get_model_info
TypeError: an integer is required |
@nikiforov-sm I didn't encounter this error when trying the scripts above. Can you reproduce this issue in a container? |
We can't reproduce this issue in a container :) Thank you! We have successfully trained a model with max_depth<=28. |
@vishalmehta1991 said he's tested depths up to 50 here: #1467 (comment) So what's the root cause of this gap? Upper bound of |
@rmccorm4 I don't think it's an RF issue. Seems to me it's more of a Dask thing. |
@rmccorm4 @vishalmehta1991 It looks like an OOM error during training. If I use max_features='sqrt' to reduce the number of features considered, I can train a model with a larger max_depth (<=30) without any errors. By the way, is there no API to save a trained RandomForestRegressor model, like XXX.save_model(path)? |
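As a rough illustration of why max_features='sqrt' reduces the work per split, the sketch below counts how many candidate features are examined at each split for the 73 feature columns of the toy data (74 columns minus the target). The helper name features_per_split is hypothetical, and the rounding follows the scikit-learn convention; that cuML rounds the same way is an assumption.

```python
import math


def features_per_split(n_features, max_features):
    """Candidate features examined at each split (hypothetical helper).

    Rounding follows the scikit-learn convention; treating cuML as
    identical is an assumption made for illustration only.
    """
    if max_features == "sqrt":
        return max(1, int(math.sqrt(n_features)))
    if max_features == "log2":
        return max(1, int(math.log2(n_features)))
    if isinstance(max_features, float):
        # A float is read as a fraction of the available features.
        return max(1, int(max_features * n_features))
    # Anything else (e.g. None): assume all features are considered.
    return n_features


n_features = 73  # 74 columns in the toy data minus the target column
all_feats = features_per_split(n_features, 1.0)
sqrt_feats = features_per_split(n_features, "sqrt")
print("all features:", all_feats, "| sqrt:", sqrt_feats)
```

Under these assumptions each split looks at roughly a ninth of the columns, which is consistent with the report above that a larger max_depth fits in memory once max_features='sqrt' is set.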
There is pickle support for RF. Check the example here |
@vishalmehta1991 Does pickle work in version 0.12? I get an error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in
5 with open(model_path, 'wb') as pf:
----> 6 pickle.dump(fitted, pf)
7 except (TypeError, ValueError) as e:
TypeError: can't pickle dict_keys objects |
Hi @nikiforov-sm , Can you share the full script that's causing the failure above? i.e. What is the It looks like per vishal's post above, that the models can be pickled. It seems that your |
Hi @rmccorm4 ,
raw_df = dask_cudf.read_csv(train_csv,dtype=['float32']*ncols, partitions=10000000)
raw_df = raw_df.persist()
df = raw_df[x_cols + ['KPI_0']]
X_train_dask, y_train_dask = dask_utils.persist_across_workers(c, [df[x_cols], df['KPI_0']], workers=workers)
cuml_model = cumlDaskRF( max_depth=30, n_estimators=100, n_streams=1, n_bins=16,split_algo=0,split_criterion=2 )
fitted = cuml_model.fit(X_train_dask, y_train_dask)
wait(cuml_model.rfs)
I've tested with the fitted model (variable fitted) and the exact model (variable cuml_model) as in the example https://github.com/rapidsai/cuml/blob/branch-0.12/python/cuml/test/test_pickle.py:
model_path='/data/cuml.model'
with open(model_path, 'wb') as pf:
    pickle.dump(cuml_model, pf)
raises:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in
1 with open(model_path, 'wb') as pf:
----> 2 pickle.dump(cuml_model, pf)
TypeError: can't pickle dict_keys objects |
We currently do not have the option to pickle dask RF models. |
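The TypeError reported above can be reproduced without cuML or Dask, using only the standard library. The sketch below assumes, purely for illustration, that the object being pickled holds a dict_keys view such as the result of c.has_what().keys(); whether the Dask RF model actually stores one is not confirmed in this thread. The names worker_map and can_pickle are hypothetical, and the exact error wording differs between Python versions.

```python
import pickle

# Hypothetical stand-in for what c.has_what() returns: worker address -> keys.
worker_map = {"tcp://127.0.0.1:40001": (), "tcp://127.0.0.1:40002": ()}


def can_pickle(obj):
    """Return True if obj pickles cleanly, False if pickle raises TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


# State that keeps the dict_keys view itself: pickle rejects it.
state_with_view = {"workers": worker_map.keys()}
# State that keeps a plain list copy of the same addresses: pickles fine.
state_with_list = {"workers": list(worker_map.keys())}

view_ok = can_pickle(state_with_view)
list_ok = can_pickle(state_with_list)
print("dict_keys view picklable:", view_ok)
print("list copy picklable:", list_ok)
```

This only explains the message; it does not make the Dask RF model picklable, which the maintainers state above is not supported.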
Thank you! |
FYI, I believe upgrading to CuML 0.13 as mentioned above was the solution, thanks @Salonijain27 ! |
What is your question?
Hi, I have a sample script here that reads in a DF of 10k rows and 74 columns. This is just a toy example to mimic some real data that is being used.
The desire is to have large values for max_depth / n_trees on something like a DGX-1 / DGX-2, but on this toy example the system is hitting GPU OOM errors.
The goal is to use parameters such as these on large datasets:
Are there any tips or tricks to better manage memory here, so that large datasets can be used without running out of GPU memory?
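For a sense of scale, the back-of-the-envelope sketch below compares two upper bounds on the number of nodes in a single binary tree: a complete tree of a given depth, and the most nodes the training rows can support (at most one leaf per row). The function name max_nodes is hypothetical, and the root is counted as depth 0. Whether cuML at this version sizes its buffers for the complete tree is an assumption, not something stated in this thread.

```python
def max_nodes(depth, n_rows):
    """Two upper bounds on node count for one binary tree (hypothetical helper).

    Returns (complete_tree_nodes, tightest_bound), with the root at depth 0.
    """
    complete_tree = 2 ** (depth + 1) - 1  # every level 0..depth fully populated
    data_bound = 2 * n_rows - 1           # at most one leaf per training row
    return complete_tree, min(complete_tree, data_bound)


n_train_rows = 8000  # 80% of the 10k-row toy dataset
for depth in (18, 20):
    complete, tight = max_nodes(depth, n_train_rows)
    print("depth", depth, "complete tree:", complete, "data-limited:", tight)
```

If memory is reserved per complete tree, each extra level doubles the cost regardless of dataset size, whereas the data itself can never need more than 2 * n_rows - 1 nodes per tree.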