[QST] Best practices to achieve greater max_depth and n_trees parameters in RandomForestRegressor #1467

Closed
rmccorm4 opened this issue Dec 10, 2019 · 44 comments
Labels
? - Needs Triage (Need team to review and classify), question (Further information is requested)

Comments

@rmccorm4

rmccorm4 commented Dec 10, 2019

What is your question?

Hi, I have a sample script here that builds a DataFrame of 10k rows and 74 columns. This is just a toy example to mimic some real data that is being used.

The goal is to use large values for max_depth / n_trees on something like a DGX-1 / DGX-2, but even on this toy example the system hits GPU OOM errors.

import numpy as np
import sklearn

import pandas as pd
import cudf
import cuml

from sklearn import model_selection, datasets

from cuml.dask.common import utils as dask_utils
from dask.distributed import Client, wait
from dask_cuda import LocalCUDACluster
import dask_cudf

from sklearn.metrics import mean_squared_error
from cuml.dask.ensemble import RandomForestRegressor as cumlDaskRF
from sklearn.ensemble import RandomForestRegressor as sklRF

if __name__ == '__main__':
    # Desired parameters
    max_depth = 20
    n_trees = 30
    rows, cols = 10000, 74

    cluster = LocalCUDACluster(threads_per_worker=1)
    if 'c' in globals():
        c.close()
    c = Client(cluster)

    workers = c.has_what().keys()
    n_workers = len(workers)
    n_streams = 8

    # Generate fake data for example's sake
    x = np.random.random((rows, cols))
    df = pd.DataFrame(x, columns=["C{}".format(i) for i in range(cols)])

    X = df.drop(columns=['C2']).to_numpy().astype(np.float32)
    y = df['C2'].astype(np.float32)
    X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.2)

    n_partitions = n_workers
    X_train_cudf = cudf.DataFrame.from_pandas(pd.DataFrame(X_train))
    y_train_cudf = cudf.Series(y_train)
    X_train_dask = dask_cudf.from_cudf(X_train_cudf, npartitions=n_partitions)
    y_train_dask = dask_cudf.from_cudf(y_train_cudf, npartitions=n_partitions)
    X_train_dask, y_train_dask = \
      dask_utils.persist_across_workers(c, [X_train_dask, y_train_dask], workers=workers)

    skl_model = sklRF(max_depth=max_depth, n_estimators=n_trees, n_jobs=-1)
    skl_model.fit(X_train, y_train)

    cuml_model = cumlDaskRF(max_depth=max_depth, n_estimators=n_trees,
                            n_streams=n_streams,
                            workers = workers
                           )
    cuml_model.fit(X_train_dask, y_train_dask)

    wait(cuml_model.rfs)

    skl_y_pred = skl_model.predict(X_test)
    print("SKLearn accuracy:  ", mean_squared_error(y_test, skl_y_pred))

    cuml_y_pred = cuml_model.predict(X_test)
    print("CuML accuracy:     ", mean_squared_error(y_test, cuml_y_pred))                                                                                             

The goal is to use parameters such as these on large datasets:

    max_depth = 20
    n_trees = 30

Are there any tips or tricks for managing memory better here, so that large datasets can be used without running out of GPU memory?
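
One way to find the limit empirically is to sweep max_depth upward and stop at the first failure. The following is only a minimal sketch: `deepest_working_depth` and `try_depth` are hypothetical names, not part of cuML, and `try_depth` stands for any user-supplied callable that fits and predicts at the given depth and raises on failure (the traceback in this thread surfaces as a RuntimeError). The demo uses a simulated failure so that it runs without a GPU.

```python
from typing import Callable, Iterable, Optional


def deepest_working_depth(
    try_depth: Callable[[int], None],
    depths: Iterable[int],
) -> Optional[int]:
    """Return the largest depth for which try_depth() did not raise.

    try_depth is a hypothetical user-supplied callable that fits and
    predicts with the given max_depth and raises on failure.
    """
    best = None
    for depth in sorted(depths):
        try:
            try_depth(depth)
        except (RuntimeError, MemoryError):
            break  # assumption: deeper trees would fail as well
        best = depth
    return best


if __name__ == "__main__":
    # Stand-in for a real fit/predict call: pretend anything above 18 fails.
    def fake_try(depth: int) -> None:
        if depth > 18:
            raise RuntimeError("simulated out-of-memory failure")

    print(deepest_working_depth(fake_try, range(10, 25, 2)))
```

In real use, `try_depth` would wrap the fit and predict calls from the script above. The sweep stops at the first failure because it assumes that memory use does not shrink as depth grows.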

rmccorm4 added the "? - Needs Triage" and "question" labels Dec 10, 2019
@rmccorm4
Author

I believe some work was done by @miguelangel here (max_depth18.pdf) to try some optimizations to the code and squeeze out some memory, achieving:

max_depth = 18
n_trees = 30

But that still leaves more to be desired.

@nikiforov-sm

nikiforov-sm commented Dec 10, 2019

Hi.

Output for max_depth=20:

SKLearn accuracy:   0.08841762935540724

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
 in 
     61 print("SKLearn accuracy:  ", mean_squared_error(y_test, skl_y_pred))
     62 
---> 63 cuml_y_pred = cuml_model.predict(X_test)
     64 print("CuML accuracy:     ", mean_squared_error(y_test, cuml_y_pred))

/opt/anaconda/lib/python3.6/site-packages/cuml/dask/ensemble/randomforestregressor.py in predict(self, X)
    397         rslts = list()
    398         for d in range(len(f)):
--> 399             rslts.append(f[d].result())
    400             indexes.append(0)
    401 

/opt/anaconda/lib/python3.6/site-packages/distributed/client.py in result(self, timeout)
    225         result = self.client.sync(self._result, callback_timeout=timeout, raiseit=False)
    226         if self.status == "error":
--> 227             six.reraise(*result)
    228         elif self.status == "cancelled":
    229             raise result

/opt/anaconda/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
    693                 value = tp()
    694             if value.__traceback__ is not tb:
--> 695                 raise value.with_traceback(tb)
    696             raise value
    697         finally:

/opt/anaconda/lib/python3.6/site-packages/cuml/dask/ensemble/randomforestregressor.py in _predict()
    286     @staticmethod
    287     def _predict(model, X, r):
--> 288         return model.predict(X)
    289 
    290     def fit(self, X, y):

cuml/ensemble/randomforestregressor.pyx in cuml.ensemble.randomforestregressor.RandomForestRegressor.predict()

RuntimeError: ('Long error message', 'Exception occured! file=/conda/conda-bld/libcuml_1566588242169/work/cpp/src/decisiontree/decisiontree_impl.cuh line=392: Cannot predict w/ empty tree!\nObtained 37 stack frames\n#0 in /opt/anaconda/lib/python3.6/site-packages/cuml/common/../../../../libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f9d9806556e]\n#1 in /opt/anaconda/lib/python3.6/site-packages/cuml/common/../../../../libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x80) [0x7f9d98066080]\n#2 in /opt/anaconda/lib/python3.6/site-packages/cuml/common/../../../../libcuml++.so(_ZNK2ML12DecisionTree16DecisionTreeBaseIddE7predictERKNS_10cumlHandleEPKNS0_16TreeMetaDataNodeIddEEPKdiiPdb+0x20b) [0x7f9d9809d27b]\n#3 in /opt/anaconda/lib/python3.6/site-packages/cuml/common/../../../../libcuml++.so(_ZNK2ML11rfRegressorIdE7predictERKNS_10cumlHandleEPKdiiPdPKNS_20RandomForestMetaDataIddEEb+0x221) [0x7f9d9823e5a1]\n#4 in /opt/anaconda/lib/python3.6/site-packages/cuml/common/../../..')

@vishalmehta1991
Contributor

vishalmehta1991 commented Dec 10, 2019

The number of trees should not affect memory consumption, so you can increase it, say to 100 trees at depth 16.
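
To illustrate why depth rather than tree count is the parameter to watch, here is a back-of-the-envelope sketch. It relies only on the general fact that a binary tree of depth d (root at depth 0) has at most 2^(d+1) - 1 nodes. It is not a description of how cuML actually allocates memory, and the helper names are made up for illustration.

```python
def max_nodes(depth: int) -> int:
    """Upper bound on the node count of a binary tree with its root at depth 0."""
    return 2 ** (depth + 1) - 1


def max_leaves(depth: int, n_samples: int) -> int:
    """Leaves are bounded by both the depth and the number of training rows."""
    return min(2 ** depth, n_samples)


if __name__ == "__main__":
    n_train = 8000  # 80% of the 10,000 rows in the toy script
    for depth in (16, 18, 20):
        print(depth, max_nodes(depth), max_leaves(depth, n_train))
```

Each extra level doubles the bound on the node count, while adding more trees leaves the per-tree bound unchanged.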

@nikiforov-sm

OK, but how can we get deeper trees?

@teju85
Member

teju85 commented Dec 11, 2019

We are trying to address these and other issues with RF via two parallel approaches:

  1. Short term: @vishalmehta1991 will update the existing algorithm to reduce its memory footprint. This should be available in the 0.12 branch soon.
  2. Long term: Vishal and I are also designing a more robust (and potentially faster) implementation. There is currently no ETA for this.

@vishalmehta1991
Contributor

@nikiforov-sm I have an implementation for classification. I can give you regression for deep trees. Can you manage building cuml from source, or would you need to wait until we integrate it into the 0.12 nightly?

@nikiforov-sm

@vishalmehta1991 Yes, please.
I'm ready to try building cuml from source.

@vishalmehta1991
Contributor

vishalmehta1991 commented Dec 12, 2019

Hi @nikiforov-sm
Here is a branch you can use https://github.com/vishalmehta1991/cuml/tree/gather-tree-builder

I have tested it to depths of 50. Hopefully it works for you as well.
Feel free to write back if you see issues.

@nikiforov-sm

Great news! Thank you!

@nikiforov-sm

We are currently having issues building cuml from source.
We are trying to build and run it in a Docker container.

(cuml_dev) root@nvidia-MLT:/opt/jupyter/cuml/cpp/build# cmake .. -DCMAKE_IGNORE_PATH=$CONDA_PREFIX/lib -DCMAKE_INSTALL_PREFIX=/opt/anaconda
-- The CXX compiler identification is GNU 7.4.0
-- The CUDA compiler identification is NVIDIA 10.1.243
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Setting build type to 'Release' since none specified.
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found suitable version "10.1", minimum required is "9.0")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- OpenMP found in
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11")
-- ZLib found in /usr/include
-- Manually setting BLAS to
-- Found Protobuf: /usr/local/lib/libprotobuf.a;-lpthread (found version "3.8.0")
-- Found ClangFormat: /opt/anaconda/envs/cuml_dev/bin/clang-format (found suitable exact version "8.0.0")
-- Building with OpenMP support
Auto detection of gpu-archs: 70
-- Building for GPU_ARCHS = 70
-- Enabling the GLIBCXX11 ABI
-- Found NCCL: /usr/lib/x86_64-linux-gnu/libnccl.so (found version "2.5.6")
-- Found UCX: /usr/local/lib/libucp.so
-- Configuring done
-- Generating done
-- Build files have been written to: /opt/jupyter/cuml/cpp/build

Error in make:

(cuml_dev) root@nvidia-MLT:/opt/jupyter/cuml/cpp/build# make
[ 0%] Performing update step for 'cub'
[ 0%] No configure step for 'cub'
[ 1%] No build step for 'cub'
[ 1%] No install step for 'cub'
[ 2%] Completed 'cub'
[ 4%] Built target cub
[ 4%] Performing update step for 'cutlass'
[ 4%] No configure step for 'cutlass'
[ 5%] No build step for 'cutlass'
[ 6%] No install step for 'cutlass'
[ 6%] Completed 'cutlass'
[ 8%] Built target cutlass
[ 12%] Built target faiss
[ 13%] Run clang-format on the cpp source files
[ 13%] Built target format
[ 17%] Built target treelite
[ 32%] Built target cuml++
[ 32%] Linking CXX shared library libcuml.so
/usr/bin/ld: /usr/local/lib/libprotobuf.a(arena.o): relocation R_X86_64_TPOFF32 against symbol _ZN6google8protobuf8internal9ArenaImpl13thread_cache_E' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: /usr/local/lib/libprotobuf.a(descriptor.o): relocation R_X86_64_PC32 against symbol ZZN6google8protobuf8internal16OnShutdownDeleteINS0_25EncodedDescriptorDatabaseEEEPT_S5_ENUlPKvE_4_FUNES7' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
CMakeFiles/cuml.dir/build.make:157: recipe for target 'libcuml.so' failed
make[2]: *** [libcuml.so] Error 1
CMakeFiles/Makefile2:330: recipe for target 'CMakeFiles/cuml.dir/all' failed
make[1]: *** [CMakeFiles/cuml.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

Protobuf was obtained from here:
https://github.com/protocolbuffers/protobuf/releases/tag/v3.8.0
https://github.com/protocolbuffers/protobuf/releases/download/v3.8.0/protobuf-all-3.8.0.tar.gz

@vishalmehta1991
Contributor

@nikiforov-sm Hmm, I don't see this. To build from source, I recommend using Anaconda. Make sure you update the code.

  1. git clone https://github.com/vishalmehta1991/cuml.git -b gather-tree-builder
  2. conda activate && conda env update --name base --file cuml/conda/environments/cuml_dev_cuda10.1.yml
  3. cd cuml && ./build.sh

I use this approach and it works well.

@nikiforov-sm

Do we need to build in Anaconda using Python 3.7?

@nikiforov-sm

With Anaconda Python 3.6:

Comparing specs that have this dependency: 27%|██████████████████████████▌ | 18/67 [23:06<1:02:55, 77.04s/it]
Finding shortest conflict path for setuptools[version='>=40.0']: 50%|███████████████████████████████████████▌ | 14/28 [08:12<05:24, 23.18s/it]

Finding shortest conflict path for setuptools: 0%| | 0/1 [00:00<?, ?it/s]
Finding shortest conflict path for setuptools: 0%| | 0/1 [00:00<?, ?it/s]
Finding shortest conflict path for libnvstrings[version='>=0.10.0a.1191022,<0.11.0a0']: 91%|██████████████████████████████████████████████████▊ | 39/43 [00:15<00:01, 2.62it/s]
Finding shortest conflict path for libnvstrings[version='>=0.12.0b.191213,<0.13.0a0']: 87%|█████████████████████████████████████████████████▌ | 20/23 [02:11<00:20, 6.93s/it]
Finding shortest conflict path for cudf=0.12: 53%|████████████████████████████████████████████████████▏ | 25/47 [00:07<00:08, 2.69it/s]

@vishalmehta1991
Contributor

I'm not sure whether 3.7 is required, but I typically use 3.7.
Also, if the base env does not update because of dependency conflicts,
you can always create a new env, which is very quick.

conda env create --name cuml_dev --file cuml/conda/environments/cuml_dev_cuda10.1.yml
conda activate cuml_dev

@nikiforov-sm

nikiforov-sm commented Dec 19, 2019

Thank you!
We've made a build.
Running the test script from the top of this issue, we get errors:

SKLearn accuracy:   0.08639243151504108
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
 in 
     61 print("SKLearn accuracy:  ", mean_squared_error(y_test, skl_y_pred))
     62 
---> 63 cuml_y_pred = cuml_model.predict(X_test)
     64 print("CuML accuracy:     ", mean_squared_error(y_test, cuml_y_pred))

/opt/anaconda/lib/python3.7/site-packages/cuml/dask/ensemble/randomforestregressor.py in predict(self, X)
    401 
    402         wait(futures)
--> 403         raise_exception_from_futures(futures)
    404 
    405         indexes = list()

/opt/anaconda/lib/python3.7/site-packages/cuml/dask/common/utils.py in raise_exception_from_futures(futures)
    129     if errs:
    130         raise RuntimeError("%d of %d worker jobs failed: %s" % (
--> 131             len(errs), len(futures), ", ".join(map(str, errs))
    132             ))
    133 

RuntimeError: 8 of 8 worker jobs failed: Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!

Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0aa1565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0aa15670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0aa15830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0aa1562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0aa138a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aacc29328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aacc24072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]



Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0aa11611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0aa1162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0aa138a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aacc29328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aacc24072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!

Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0aa1565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0aa15670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0aa15830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0aa1562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0aa138a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a93ac1328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a93abc072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]



Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0aa11611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0aa1162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0aa138a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a93ac1328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a93abc072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!

Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0aa1565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0aa15670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0aa15830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0aa1562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0aa138a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a93ac7328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a93ac2072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]



Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0aa11611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0aa1162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0aa138a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a93ac7328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a93ac2072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!

Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0a99565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0a995670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0a995830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0a99562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0a9938a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa4a71328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa4a6c072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]



Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0a991611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0a99162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0a9938a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa4a71328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa4a6c072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!

Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0a99565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0a995670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0a995830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0a99562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f0a9938a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa4b6d328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa4b68072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]



Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0a991611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0a99162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0a9938a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa4b6d328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa4b68072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!

Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0a9d565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0a9d5670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0a9d5830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0a9d562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest<float, float>(void**, ML::RandomForestMetaData<float, float> const*, int, int, std::vector<unsigned char, std::allocator<unsigned char> >&)+0x2fa) [0x7f0a9d38a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa8b33328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa8b2e072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]



Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0a9d1611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0a9d162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0a9d38a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa8b33328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa8b2e072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!

Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0a9d565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0a9d5670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0a9d5830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0a9d562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest<float, float>(void**, ML::RandomForestMetaData<float, float> const*, int, int, std::vector<unsigned char, std::allocator<unsigned char> >&)+0x2fa) [0x7f0a9d38a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa8cb0328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa8cab072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]



Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0a9d1611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0a9d162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0a9d38a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0aa8cb0328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0aa8cab072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]
, Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[12:15:45] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!

Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f0a9d565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f0a9d5670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f0a9d5830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f0a9d562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest<float, float>(void**, ML::RandomForestMetaData<float, float> const*, int, int, std::vector<unsigned char, std::allocator<unsigned char> >&)+0x2fa) [0x7f0a9d38a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a8fb07328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a8fb02072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]



Obtained 28 stack frames
#0 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0a9d1611be]
#1 in /opt/anaconda/lib/libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71) [0x7f0a9d162011]
#2 in /opt/anaconda/lib/libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x7ab) [0x7f0a9d38a94b]
#3 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f0a8fb07328]
#4 in /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x56328c5359cf]
#5 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#6 in /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f0a8fb02072]
#7 in /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x56328c597d2b]
#8 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x537e) [0x56328c5f37ae]
#9 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#10 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#11 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#12 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#13 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#14 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#15 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#16 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x1e20) [0x56328c5f0250]
#17 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#18 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#19 in /opt/anaconda/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x56328c59679b]
#20 in /opt/anaconda/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x56328c5eead0]
#21 in /opt/anaconda/bin/python(_PyFunction_FastCallDict+0x10b) [0x56328c53550b]
#22 in /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x56328c54cc43]
#23 in /opt/anaconda/bin/python(PyObject_Call+0x6e) [0x56328c54195e]
#24 in /opt/anaconda/bin/python(+0x223037) [0x56328c641037]
#25 in /opt/anaconda/bin/python(+0x1e3468) [0x56328c601468]
#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f0c771626db]
#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f0c76e8b88f]

@nikiforov-sm

(base) root@nvidia-MLT:/opt/jupyter/cudf# ./print_env.sh

Click here to see environment details

 ***git***
 commit 724e237f28bef239fb55b3c30a98215ec9380c7e (HEAD -> branch-0.12, origin/branch-0.12, origin/HEAD)
 Merge: fbb273b15 a31a57332
 Author: Mark Harris <mharris@nvidia.com>
 Date:   Thu Dec 19 13:13:52 2019 +1100

 Merge pull request #3629 from rgsl888prabhu/hash_map_test_fail

 [REVIEW] Fix hash map test failure
 ***git submodules***
 -b165e1fb11eeea64ccf95053e40f2424312599cc thirdparty/cub
 -63f644be44201467e3938d59ed9d89cc8725c35d thirdparty/jitify
 -39125e0e476b960c2001f1ec76a3441335ff91b2 thirdparty/libcudacxx

 ***OS Information***
 DISTRIB_ID=Ubuntu
 DISTRIB_RELEASE=18.04
 DISTRIB_CODENAME=bionic
 DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
 NAME="Ubuntu"
 VERSION="18.04.3 LTS (Bionic Beaver)"
 ID=ubuntu
 ID_LIKE=debian
 PRETTY_NAME="Ubuntu 18.04.3 LTS"
 VERSION_ID="18.04"
 HOME_URL="https://www.ubuntu.com/"
 SUPPORT_URL="https://help.ubuntu.com/"
 BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
 PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
 VERSION_CODENAME=bionic
 UBUNTU_CODENAME=bionic
 Linux nvidia-MLT.gksm.local 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

 ***GPU Information***
 Thu Dec 19 18:56:17 2019
 +-----------------------------------------------------------------------------+
 | NVIDIA-SMI 410.129      Driver Version: 410.129      CUDA Version: 10.1     |
 |-------------------------------+----------------------+----------------------+
 | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
 | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
 |===============================+======================+======================|
 |   0  Tesla V100-SXM3...  On   | 00000000:34:00.0 Off |                    0 |
 | N/A   32C    P0    66W / 350W |   1016MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   1  Tesla V100-SXM3...  On   | 00000000:36:00.0 Off |                    0 |
 | N/A   34C    P0    66W / 350W |   1014MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   2  Tesla V100-SXM3...  On   | 00000000:39:00.0 Off |                    0 |
 | N/A   37C    P0    66W / 350W |   1014MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   3  Tesla V100-SXM3...  On   | 00000000:3B:00.0 Off |                    0 |
 | N/A   39C    P0    65W / 350W |   1014MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   4  Tesla V100-SXM3...  On   | 00000000:57:00.0 Off |                    0 |
 | N/A   34C    P0    68W / 350W |   1014MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   5  Tesla V100-SXM3...  On   | 00000000:59:00.0 Off |                    0 |
 | N/A   38C    P0    67W / 350W |   1014MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   6  Tesla V100-SXM3...  On   | 00000000:5C:00.0 Off |                    0 |
 | N/A   34C    P0    67W / 350W |   1014MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   7  Tesla V100-SXM3...  On   | 00000000:5E:00.0 Off |                    0 |
 | N/A   39C    P0    68W / 350W |   1014MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   8  Tesla V100-SXM3...  On   | 00000000:B7:00.0 Off |                    0 |
 | N/A   33C    P0    78W / 350W |   6454MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   9  Tesla V100-SXM3...  On   | 00000000:B9:00.0 Off |                    0 |
 | N/A   34C    P0    80W / 350W |   6611MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |  10  Tesla V100-SXM3...  On   | 00000000:BC:00.0 Off |                    0 |
 | N/A   41C    P0    82W / 350W |   6043MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |  11  Tesla V100-SXM3...  On   | 00000000:BE:00.0 Off |                    0 |
 | N/A   41C    P0    81W / 350W |   6303MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |  12  Tesla V100-SXM3...  On   | 00000000:E0:00.0 Off |                    0 |
 | N/A   34C    P0    78W / 350W |   6303MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |  13  Tesla V100-SXM3...  On   | 00000000:E2:00.0 Off |                    0 |
 | N/A   33C    P0    82W / 350W |   6045MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |  14  Tesla V100-SXM3...  On   | 00000000:E5:00.0 Off |                    0 |
 | N/A   41C    P0    82W / 350W |   6127MiB / 32480MiB |      1%      Default |
 +-------------------------------+----------------------+----------------------+
 |  15  Tesla V100-SXM3...  On   | 00000000:E7:00.0 Off |                    0 |
 | N/A   42C    P0    82W / 350W |   7755MiB / 32480MiB |      1%      Default |
 +-------------------------------+----------------------+----------------------+

 +-----------------------------------------------------------------------------+
 | Processes:                                                       GPU Memory |
 |  GPU       PID   Type   Process name                             Usage      |
 |=============================================================================|
 +-----------------------------------------------------------------------------+

 ***CPU***
 Architecture:        x86_64
 CPU op-mode(s):      32-bit, 64-bit
 Byte Order:          Little Endian
 CPU(s):              96
 On-line CPU(s) list: 0-95
 Thread(s) per core:  2
 Core(s) per socket:  24
 Socket(s):           2
 NUMA node(s):        2
 Vendor ID:           GenuineIntel
 CPU family:          6
 Model:               85
 Model name:          Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz
 Stepping:            4
 CPU MHz:             2274.754
 CPU max MHz:         3700.0000
 CPU min MHz:         1200.0000
 BogoMIPS:            5400.00
 Virtualization:      VT-x
 L1d cache:           32K
 L1i cache:           32K
 L2 cache:            1024K
 L3 cache:            33792K
 NUMA node0 CPU(s):   0-23,48-71
 NUMA node1 CPU(s):   24-47,72-95
 Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke md_clear flush_l1d

 ***CMake***
 /opt/anaconda/bin/cmake
 cmake version 3.14.5

 CMake suite maintained and supported by Kitware (kitware.com/cmake).

 ***g++***
 /usr/bin/g++
 g++ (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
 Copyright (C) 2017 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


 ***nvcc***
 /usr/local/cuda/bin/nvcc
 nvcc: NVIDIA (R) Cuda compiler driver
 Copyright (c) 2005-2019 NVIDIA Corporation
 Built on Sun_Jul_28_19:07:16_PDT_2019
 Cuda compilation tools, release 10.1, V10.1.243

 ***Python***
 /opt/anaconda/bin/python
 Python 3.7.3

 ***Environment Variables***
 PATH                            : /opt/anaconda/bin:/opt/anaconda/condabin:/opt/anaconda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
 LD_LIBRARY_PATH                 : /opt/oracle/instantclient_19_3
 NUMBAPRO_NVVM                   :
 NUMBAPRO_LIBDEVICE              :
 CONDA_PREFIX                    : /opt/anaconda
 PYTHON_PATH                     :

 ***conda packages***
 /opt/anaconda/bin/conda
 # packages in environment at /opt/anaconda:
 #
 # Name                    Version                   Build  Channel
 _anaconda_depends         2019.03                  py37_0
 _ipyw_jlab_nb_ext_conf    0.1.0                    py37_0
 _libgcc_mutex             0.1                        main    conda-forge
 alabaster                 0.7.12                   py37_0
 anaconda                  custom                   py37_1
 anaconda-client           1.7.2                    py37_0
 anaconda-navigator        1.9.7                    py37_0
 anaconda-project          0.8.2                    py37_0
 arrow-cpp                 0.15.0           py37h5ac5442_0    conda-forge
 asn1crypto                0.24.0                   py37_0
 astroid                   2.2.5                    py37_0
 astropy                   3.1.2            py37h7b6447c_0
 atomicwrites              1.3.0                    py37_1
 attrs                     19.1.0                   py37_1
 babel                     2.6.0                    py37_0
 backcall                  0.1.0                    py37_0
 backports                 1.0                      py37_1
 backports.os              0.1.1                    py37_0
 backports.shutil_get_terminal_size 1.0.0                    py37_2
 beautifulsoup4            4.7.1                    py37_1
 bitarray                  0.8.3            py37h14c3975_0
 bkcharts                  0.2                      py37_0
 blas                      2.14                   openblas    conda-forge
 bleach                    3.1.0                    py37_0
 blosc                     1.15.0               hd408876_0
 bokeh                     1.0.4                    py37_0
 boost-cpp                 1.70.0               ha2d47e9_1    conda-forge
 boto                      2.49.0                   py37_0
 bottleneck                1.2.1            py37h035aef0_1
 brotli                    1.0.7             he1b5a44_1000    conda-forge
 bzip2                     1.0.8                h516909a_2    conda-forge
 c-ares                    1.15.0            h516909a_1001    conda-forge
 ca-certificates           2019.11.28           hecc5488_0    conda-forge
 cairo                     1.14.12              h8948797_3
 certifi                   2019.11.28               py37_0    conda-forge
 cffi                      1.12.2           py37h2e261b9_1
 chardet                   3.0.4                    py37_1
 click                     7.0                      py37_0
 cloudpickle               0.8.0                    py37_0
 clyent                    1.2.2                    py37_1
 cmake                     3.14.5               hf94ab9c_0    conda-forge
 colorama                  0.4.1                    py37_0
 conda                     4.8.0                    py37_1    conda-forge
 conda-build               3.17.8                   py37_0
 conda-env                 2.6.0                         1
 conda-package-handling    1.6.0            py37h516909a_1    conda-forge
 conda-verify              3.1.1                    py37_0
 contextlib2               0.5.5                    py37_0
 cryptography              2.6.1            py37h1ba5d50_0
 cudatoolkit               10.1.243             h6bb024c_0    nvidia
 cudf                      0.12.0b191219          py37_375    rapidsai-nightly
 cudnn                     7.6.0                cuda10.1_0    nvidia
 cupy                      6.6.0            py37ha7c4746_1    conda-forge
 curl                      7.64.0               hbc83047_2
 cx_oracle                 7.3.0            py37h516909a_0    conda-forge
 cycler                    0.10.0                   py37_0
 cython                    0.29.14          py37he1b5a44_0    conda-forge
 cytoolz                   0.9.0.1          py37h14c3975_1
 dask                      2.8.1                      py_0    conda-forge
 dask-core                 2.8.1                      py_0    conda-forge
 dask-cuda                 0.12.0a191218           py37_36    rapidsai-nightly
 dask-cudf                 0.12.0b191219          py37_375    rapidsai-nightly
 dask-glm                  0.2.0                      py_1    conda-forge
 dask-ml                   1.1.1                      py_0    conda-forge
 dbus                      1.13.6               h746ee38_0
 decorator                 4.4.0                    py37_1
 defusedxml                0.5.0                    py37_1
 distributed               2.8.1                      py_0    conda-forge
 dlpack                    0.2                  he1b5a44_1    conda-forge
 docutils                  0.14                     py37_0
 double-conversion         3.1.5                he1b5a44_2    conda-forge
 entrypoints               0.3                      py37_0
 et_xmlfile                1.0.1                    py37_0
 expat                     2.2.6                he6710b0_0
 fastavro                  0.22.8           py37h516909a_0    conda-forge
 fastcache                 1.0.2            py37h14c3975_2
 fastrlock                 0.4             py37he1b5a44_1000    conda-forge
 filelock                  3.0.10                   py37_0
 flask                     1.0.2                    py37_1
 fontconfig                2.13.0               h9420a91_0
 freetype                  2.9.1                h8a8886c_1
 fribidi                   1.0.5                h7b6447c_0
 fsspec                    0.6.2                      py_0    conda-forge
 future                    0.17.1                   py37_0
 get_terminal_size         1.0.0                haa9412d_0
 gevent                    1.4.0            py37h7b6447c_0
 gflags                    2.2.2             he1b5a44_1002    conda-forge
 glib                      2.56.2               hd408876_0
 glob2                     0.6                      py37_1
 glog                      0.4.0                he1b5a44_1    conda-forge
 gmp                       6.1.2                h6c8ec71_1
 gmpy2                     2.0.8            py37h10f8cd9_2
 graphite2                 1.3.13               h23475e2_0
 greenlet                  0.4.15           py37h7b6447c_0
 grpc-cpp                  1.23.0               h18db393_0    conda-forge
 gst-plugins-base          1.14.0               hbbd80ab_1
 gstreamer                 1.14.0               hb453b48_1
 h5py                      2.9.0            py37h7918eee_0
 harfbuzz                  1.8.8                hffaf4a1_0
 hdf5                      1.10.4               hb1b8bf9_0
 heapdict                  1.0.0                    py37_2
 html5lib                  1.0.1                    py37_0
 icu                       58.2                 h9c2bf20_1
 idna                      2.8                      py37_0
 imageio                   2.5.0                    py37_0
 imagesize                 1.1.0                    py37_0
 importlib_metadata        1.3.0                    py37_0    conda-forge
 intel-openmp              2019.3                      199
 ipykernel                 5.1.0            py37h39e3cac_0
 ipython                   7.4.0            py37h39e3cac_0
 ipython_genutils          0.2.0                    py37_0
 ipywidgets                7.4.2                    py37_0
 isort                     4.3.16                   py37_0
 itsdangerous              1.1.0                    py37_0
 jbig                      2.1                  hdba287a_0
 jdcal                     1.4                      py37_0
 jedi                      0.13.3                   py37_0
 jeepney                   0.4                      py37_0
 jinja2                    2.10                     py37_0
 joblib                    0.14.1                     py_0    conda-forge
 jpeg                      9b                   h024ee3a_2
 jsonschema                3.0.1                    py37_0
 jupyter                   1.0.0                    py37_7
 jupyter_client            5.2.4                    py37_0
 jupyter_console           6.0.0                    py37_0
 jupyter_contrib_core      0.3.3                      py_2    conda-forge
 jupyter_contrib_nbextensions 0.5.1                    py37_0    conda-forge
 jupyter_core              4.4.0                    py37_0
 jupyter_highlight_selected_word 0.2.0                 py37_1000    conda-forge
 jupyter_kernel_gateway    2.4.0                      py_0    conda-forge
 jupyter_latex_envs        1.4.4                 py37_1000    conda-forge
 jupyter_nbextensions_configurator 0.4.1                    py37_0    conda-forge
 jupyterlab                0.35.4           py37hf63ae98_0
 jupyterlab_server         0.2.0                    py37_0
 keyring                   18.0.0                   py37_0
 kiwisolver                1.0.1            py37hf484d3e_0
 krb5                      1.16.1               h173b8e3_7
 lazy-object-proxy         1.3.1            py37h14c3975_2
 libarchive                3.3.3             hb44662c_1005    conda-forge
 libblas                   3.8.0               14_openblas    conda-forge
 libcblas                  3.8.0               14_openblas    conda-forge
 libclang                  8.0.0                hc9558a2_6    conda-forge
 libcudf                   0.12.0b191219      cuda10.1_375    rapidsai-nightly
 libcumlprims              0.12.0a191218        cuda10.1_0    rapidsai-nightly
 libcurl                   7.64.0               h20c2e04_2
 libedit                   3.1.20181209         hc058e9b_0
 libevent                  2.1.10               h72c5cf5_0    conda-forge
 libffi                    3.2.1                hd88cf55_4
 libgcc-ng                 8.2.0                hdf63c60_1
 libgfortran-ng            7.3.0                hdf63c60_0
 liblapack                 3.8.0               14_openblas    conda-forge
 liblapacke                3.8.0               14_openblas    conda-forge
 liblief                   0.9.0                h7725739_2
 libnvstrings              0.12.0b191219      cuda10.1_375    rapidsai-nightly
 libopenblas               0.3.7                h5ec1e0e_5    conda-forge
 libpng                    1.6.36               hbc83047_0
 libprotobuf               3.8.0                h8b12597_0    conda-forge
 librmm                    0.12.0a191218       cuda10.1_72    rapidsai-nightly
 libsodium                 1.0.16               h1bed415_0
 libssh2                   1.8.0                h1ba5d50_4
 libstdcxx-ng              8.2.0                hdf63c60_1
 libtiff                   4.0.9                he6b73bb_1    conda-forge
 libtool                   2.4.6                h7b6447c_5
 libuuid                   1.0.3                h1bed415_2
 libuv                     1.34.0               h516909a_0    conda-forge
 libxcb                    1.13                 h1bed415_1
 libxml2                   2.9.9                he19cac6_0
 libxslt                   1.1.33               h7d1a2b0_0
 llvmlite                  0.29.0           py37hfd453ef_1    conda-forge
 locket                    0.2.0                    py37_1
 lxml                      4.3.2            py37hefd8a0e_0
 lz4-c                     1.8.3             he1b5a44_1001    conda-forge
 lzo                       2.10                 h49e0be7_2
 markupsafe                1.1.1            py37h7b6447c_0
 matplotlib                3.0.3            py37h5429711_0
 mccabe                    0.6.1                    py37_1
 mistune                   0.8.4            py37h7b6447c_0
 mkl                       2019.3                      199
 mkl-service               2.0.2            py37h516909a_0    conda-forge
 mkl_fft                   1.0.13           py37h516909a_1    conda-forge
 mkl_random                1.0.4            py37hf2d7682_0    conda-forge
 more-itertools            6.0.0                    py37_0
 mpc                       1.1.0                h10f8cd9_1
 mpfr                      4.0.1                hdf1c602_3
 mpmath                    1.1.0                    py37_0
 msgpack-python            0.6.1            py37hfd86e86_1
 multipledispatch          0.6.0                    py37_0
 navigator-updater         0.2.1                    py37_0
 nbconvert                 5.4.1                    py37_3
 nbformat                  4.4.0                    py37_0
 nccl                      2.5.6.1              h51cf6c1_0    conda-forge
 ncurses                   6.1                  he6710b0_1
 networkx                  2.2                      py37_1
 nltk                      3.4                      py37_1
 nose                      1.3.7                    py37_2
 notebook                  5.7.8                    py37_0
 numba                     0.45.1           py37hb3f55d8_0    conda-forge
 numexpr                   2.7.0            py37hb3f55d8_0    conda-forge
 numpy                     1.17.3           py37h95a1406_0    conda-forge
 numpy-base                1.17.4           py37h2f8d375_0
 numpydoc                  0.8.0                    py37_0
 nvstrings                 0.12.0b191219          py37_375    rapidsai-nightly
 olefile                   0.46                     py37_0
 openpyxl                  2.6.1                    py37_1
 openssl                   1.1.1d               h516909a_0    conda-forge
 packaging                 19.0                     py37_0
 pandas                    0.24.2           py37he6710b0_0
 pandoc                    2.2.3.2                       0
 pandocfilters             1.4.2                    py37_1
 pango                     1.42.4               h049681c_0
 parquet-cpp               1.5.1                         2    conda-forge
 parso                     0.3.4                    py37_0
 partd                     0.3.10                   py37_1
 patchelf                  0.9                  he6710b0_3
 path.py                   11.5.0                   py37_0
 pathlib2                  2.3.3                    py37_0
 patsy                     0.5.1                    py37_0
 pcre                      8.43                 he6710b0_0
 pep8                      1.7.1                    py37_0
 pexpect                   4.6.0                    py37_0
 pickleshare               0.7.5                    py37_0
 pillow                    5.4.1            py37h34e0f95_0
 pip                       19.0.3                   py37_0
 pixman                    0.38.0               h7b6447c_0
 pkginfo                   1.5.0.1                  py37_0
 pluggy                    0.13.0                   py37_0    conda-forge
 ply                       3.11                     py37_0
 prometheus_client         0.6.0                    py37_0
 prompt_toolkit            2.0.9                    py37_0
 protobuf                  3.8.0            py37he1b5a44_2    conda-forge
 psutil                    5.6.1            py37h7b6447c_0
 ptyprocess                0.6.0                    py37_0
 py                        1.8.0                    py37_0
 py-lief                   0.9.0            py37h7725739_2
 pyarrow                   0.15.0           py37h8b68381_1    conda-forge
 pycodestyle               2.5.0                    py37_0
 pycosat                   0.6.3            py37h14c3975_0
 pycparser                 2.19                     py37_0
 pycrypto                  2.6.1            py37h14c3975_9
 pycurl                    7.43.0.2         py37h1ba5d50_0
 pyflakes                  2.1.1                    py37_0
 pygments                  2.3.1                    py37_0
 pylint                    2.3.1                    py37_0
 pynvml                    8.0.3                      py_0    conda-forge
 pyodbc                    4.0.26           py37he6710b0_0
 pyopenssl                 19.0.0                   py37_0
 pyparsing                 2.3.1                    py37_0
 pyqt                      5.9.2            py37h05f1152_2
 pyrsistent                0.14.11          py37h7b6447c_0
 pysocks                   1.6.8                    py37_0
 pytables                  3.5.1            py37h71ec239_0
 pytest                    5.3.2                    py37_0    conda-forge
 pytest-arraydiff          0.3              py37h39e3cac_0
 pytest-astropy            0.5.0                    py37_0
 pytest-doctestplus        0.3.0                    py37_0
 pytest-openfiles          0.3.2                    py37_0
 pytest-remotedata         0.3.1                    py37_0
 python                    3.7.3                h0371630_0
 python-dateutil           2.8.0                    py37_0
 python-libarchive-c       2.8                      py37_6
 pytz                      2018.9                   py37_0
 pywavelets                1.0.2            py37hdd07704_0
 pyyaml                    5.1              py37h7b6447c_0
 pyzmq                     18.0.0           py37he6710b0_0
 qt                        5.9.7                h5867ecd_1
 qtawesome                 0.5.7                    py37_1
 qtconsole                 4.4.3                    py37_0
 qtpy                      1.7.0                    py37_1
 re2                       2019.12.01           he1b5a44_0    conda-forge
 readline                  7.0                  h7b6447c_5
 requests                  2.21.0                   py37_0
 rhash                     1.3.6             h14c3975_1001    conda-forge
 rmm                       0.12.0a191218           py37_72    rapidsai-nightly
 rope                      0.12.0                   py37_0
 ruamel_yaml               0.15.46          py37h14c3975_0
 scikit-image              0.14.2           py37he6710b0_0
 scikit-learn              0.22             py37hcdab131_1    conda-forge
 scipy                     1.4.0            py37h921218d_0    conda-forge
 seaborn                   0.9.0                    py37_0
 secretstorage             3.1.1                    py37_0
 send2trash                1.5.0                    py37_0
 setuptools                40.8.0                   py37_0
 simplegeneric             0.8.1                    py37_2
 singledispatch            3.4.0.3                  py37_0
 sip                       4.19.8           py37hf484d3e_0
 six                       1.12.0                   py37_0
 snappy                    1.1.7                hbae5bb6_3
 snowballstemmer           1.2.1                    py37_0
 sortedcollections         1.1.2                    py37_0
 sortedcontainers          2.1.0                    py37_0
 soupsieve                 1.8                      py37_0
 sphinx                    1.8.5                    py37_0
 sphinxcontrib             1.0                      py37_1
 sphinxcontrib-websupport  1.1.0                    py37_1
 spyder                    3.3.3                    py37_0
 spyder-kernels            0.4.2                    py37_0
 sqlalchemy                1.3.1            py37h7b6447c_0
 sqlite                    3.27.2               h7b6447c_0
 statsmodels               0.10.2           py37hc1659b7_0    conda-forge
 sympy                     1.3                      py37_0
 tblib                     1.3.2                    py37_0
 terminado                 0.8.1                    py37_1
 testpath                  0.4.2                    py37_0
 thrift-cpp                0.12.0            hf3afdfd_1004    conda-forge
 tk                        8.6.8                hbc83047_0
 toolz                     0.9.0                    py37_0
 tornado                   6.0.2            py37h7b6447c_0
 tqdm                      4.31.1                   py37_1
 traitlets                 4.3.2                    py37_0
 umap-learn                0.3.10                   py37_0    conda-forge
 unicodecsv                0.14.1                   py37_0
 unixodbc                  2.3.7                h14c3975_0
 uriparser                 0.9.3                he1b5a44_1    conda-forge
 urllib3                   1.24.1                   py37_0
 wcwidth                   0.1.7                    py37_0
 webencodings              0.5.1                    py37_1
 werkzeug                  0.14.1                   py37_0
 wheel                     0.33.1                   py37_0
 widgetsnbextension        3.4.2                    py37_0
 wrapt                     1.11.1           py37h7b6447c_0
 wurlitzer                 1.0.2                    py37_0
 xlrd                      1.2.0                    py37_0
 xlsxwriter                1.1.5                    py37_0
 xlwt                      1.3.0                    py37_0
 xz                        5.2.4                h14c3975_4
 yaml                      0.1.7                had09818_2
 zeromq                    4.3.1                he6710b0_3
 zict                      0.1.4                    py37_0
 zipp                      0.6.0                      py_0    conda-forge
 zlib                      1.2.11               h7b6447c_3
 zstd                      1.4.0                h3b9ef0a_0    conda-forge

@vishalmehta1991
Contributor

@nikiforov-sm Sorry, I am not able to understand the issue here. Are you building cudf?

To build cuml you don't need to build cudf; you can use the cudf package from conda.

@nikiforov-sm

We have no errors in the build itself.
The current problem is with the script from the top of the page:

Traceback (most recent call last):
  File "", line 1, in 
  File "/opt/anaconda/lib/python3.7/site-packages/cuml/dask/ensemble/randomforestregressor.py", line 403, in predict
    raise_exception_from_futures(futures)
  File "/opt/anaconda/lib/python3.7/site-packages/cuml/dask/common/utils.py", line 131, in raise_exception_from_futures
    len(errs), len(futures), ", ".join(map(str, errs))
RuntimeError: 8 of 8 worker jobs failed: Exception occured! file=/opt/jupyter/cuml/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[15:35:18] /opt/jupyter/cuml/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!

Stack trace returned 10 entries:
[bt] (0) /opt/anaconda/lib/libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x17f) [0x7f173d565d3f]
[bt] (1) /opt/anaconda/lib/libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x39) [0x7f173d5670b9]
[bt] (2) /opt/anaconda/lib/libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x3a0d) [0x7f173d5830cd]
[bt] (3) /opt/anaconda/lib/libcuml++.so(TreeliteModelBuilderCommitModel+0x146) [0x7f173d562926]
[bt] (4) /opt/anaconda/lib/libcuml++.so(void ML::build_treelite_forest(void**, ML::RandomForestMetaData const*, int, int, std::vector >&)+0x2fa) [0x7f173d38a49a]
[bt] (5) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x1d328) [0x7f1749093328]
[bt] (6) /opt/anaconda/bin/python(_PyObject_FastCallDict+0x9f) [0x55ce6e2c59cf]
[bt] (7) /opt/anaconda/bin/python(_PyObject_Call_Prepend+0x63) [0x55ce6e2dcc43]
[bt] (8) /opt/anaconda/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x18072) [0x7f174908e072]
[bt] (9) /opt/anaconda/bin/python(_PyObject_FastCallKeywords+0x49b) [0x55ce6e327d2b] 

@vishalmehta1991
Contributor

Can you try without dask, like this? I was able to train at a depth of 30-50.

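import numpy as np
import cudf

from sklearn import model_selection
from sklearn.metrics import mean_squared_error
from cuml.ensemble import RandomForestRegressor as cumlRFR

max_depth = 30
n_trees = 100
n_streams = 10

# Read the CSV directly into a GPU DataFrame and drop the saved index column
df = cudf.read_csv('test.csv').drop('Unnamed: 0', axis=1)
X = df.drop(['C2'], axis=1).astype(np.float32)
# Regression target: keep it float32 (an int32 cast would truncate fractional values)
y = df['C2'].astype(np.float32)

X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.2)

# Single-GPU random forest, no dask involved
cuml_model = cumlRFR(max_depth=max_depth, n_estimators=n_trees,
                     n_streams=n_streams, n_bins=16, split_algo=0)
cuml_model.fit(X_train, y_train)
cuml_y_pred = cuml_model.predict(X_test, predict_model='GPU')
# The reported metric is a mean squared error, so lower is better
print("CuML MSE: ", mean_squared_error(y_test, cuml_y_pred))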
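
For a rough sense of how `max_depth` and `n_trees` bound model size, here is a minimal sketch of a generic upper bound on node count for binary decision trees. It is not cuML's actual memory accounting. The function names are hypothetical, and it assumes depth is counted in edges from the root, which may differ from cuML's convention.

```python
def max_nodes_per_tree(max_depth: int, n_samples: int) -> int:
    """Upper bound on the nodes of one binary decision tree.

    Assumes depth counts edges from the root (a lone root has depth 0).
    """
    # Every level completely filled: 1 + 2 + 4 + ... + 2**max_depth nodes.
    full_tree = 2 ** (max_depth + 1) - 1
    # Each leaf needs at least one training row, and a binary tree in which
    # every split has two children has (2 * leaves - 1) nodes in total.
    by_samples = 2 * n_samples - 1
    return min(full_tree, by_samples)


def max_nodes_in_forest(n_trees: int, max_depth: int, n_samples: int) -> int:
    """Upper bound on the nodes of a whole forest."""
    return n_trees * max_nodes_per_tree(max_depth, n_samples)


if __name__ == "__main__":
    # 8000 rows corresponds to an 80% training split of 10,000 rows.
    print(max_nodes_per_tree(max_depth=30, n_samples=8000))
    print(max_nodes_in_forest(n_trees=100, max_depth=30, n_samples=8000))
```

With only a few thousand rows, the sample-count bound is far smaller than the full-tree bound once the depth passes roughly 13, so the final tree size is limited by the data rather than by `max_depth`. Any buffers an implementation pre-allocates per depth level are a separate matter that this sketch does not model.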

@nikiforov-sm

@vishalmehta1991 Thank you.
Without dask it works fine.
How can we use dask now?

With dask we have errors:

distributed.nanny - ERROR - Failed while trying to start worker process: Could not serialize object of type RandomForestRegressor.
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 38, in dumps
    result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "cuml/ensemble/randomforestregressor.pyx", line 373, in cuml.ensemble.randomforestregressor.RandomForestRegressor.__getstate__
  File "cuml/ensemble/randomforestregressor.pyx", line 441, in cuml.ensemble.randomforestregressor.RandomForestRegressor._get_model_info
TypeError: an integer is required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 191, in serialize
    header, frames = dumps(x, context=context) if wants_context else dumps(x)
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 58, in pickle_dumps
    return {"serializer": "pickle"}, [pickle.dumps(x)]
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 51, in dumps
    return cloudpickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "/opt/anaconda/lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 952, in dumps
    cp.dump(obj)
  File "/opt/anaconda/lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 267, in dump
    return Pickler.dump(self, obj)
  File "/opt/anaconda/lib/python3.7/pickle.py", line 437, in dump
    self.save(obj)
  File "/opt/anaconda/lib/python3.7/pickle.py", line 524, in save
    rv = reduce(self.proto)
  File "cuml/ensemble/randomforestregressor.pyx", line 373, in cuml.ensemble.randomforestregressor.RandomForestRegressor.__getstate__
  File "cuml/ensemble/randomforestregressor.pyx", line 441, in cuml.ensemble.randomforestregressor.RandomForestRegressor._get_model_info
TypeError: an integer is required
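
The traceback shows the failure happening inside `__getstate__` while the object is being pickled for transfer to a worker. One way to narrow this down is to try the same pickle round trip locally, outside dask. The sketch below uses only the standard library. `pickle_roundtrip_error` and `BrokenState` are hypothetical names, and `BrokenState` merely imitates an object whose `__getstate__` raises; it is not the cuML class.

```python
import pickle


def pickle_roundtrip_error(obj):
    """Return None if obj survives a pickle round trip, else the exception."""
    try:
        pickle.loads(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))
    except Exception as exc:  # report any failure instead of raising it
        return exc
    return None


class BrokenState:
    """Hypothetical stand-in whose __getstate__ fails during pickling."""

    def __getstate__(self):
        raise TypeError("an integer is required")


if __name__ == "__main__":
    # A plain dict of parameters pickles without trouble.
    print(pickle_roundtrip_error({"max_depth": 30, "n_estimators": 100}))
    # The stand-in fails in __getstate__, as in the traceback.
    print(repr(pickle_roundtrip_error(BrokenState())))
```

Calling the helper on the actual model object in a plain Python session should show whether the `TypeError` is reproducible without any workers involved. That would point at the model's own serialization code rather than at the cluster setup.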

distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Restarting worker
ERROR:Task exception was never retrieved
future:  exception=TypeError('exceptions must derive from BaseException')>
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/asyncio/tasks.py", line 603, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 251, in start
    response = await self.instantiate()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 334, in instantiate
    result = await self.process.start()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 528, in start
    msg = await self._wait_until_connected(uid)
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 642, in _wait_until_connected
    raise msg
TypeError: exceptions must derive from BaseException
ERROR:Task exception was never retrieved
future:  exception=TypeError('exceptions must derive from BaseException')>
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/asyncio/tasks.py", line 603, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 251, in start
    response = await self.instantiate()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 334, in instantiate
    result = await self.process.start()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 528, in start
    msg = await self._wait_until_connected(uid)
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 642, in _wait_until_connected
    raise msg
TypeError: exceptions must derive from BaseException
ERROR:Task exception was never retrieved
future:  exception=TypeError('exceptions must derive from BaseException')>
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/asyncio/tasks.py", line 603, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 251, in start
    response = await self.instantiate()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 334, in instantiate
    result = await self.process.start()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 528, in start
    msg = await self._wait_until_connected(uid)
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 642, in _wait_until_connected
    raise msg
TypeError: exceptions must derive from BaseException
ERROR:Task exception was never retrieved
future:  exception=TypeError('exceptions must derive from BaseException')>
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/asyncio/tasks.py", line 603, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 251, in start
    response = await self.instantiate()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 334, in instantiate
    result = await self.process.start()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 528, in start
    msg = await self._wait_until_connected(uid)
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 642, in _wait_until_connected
    raise msg
TypeError: exceptions must derive from BaseException
ERROR:Task exception was never retrieved
future:  exception=TypeError('exceptions must derive from BaseException')>
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/asyncio/tasks.py", line 603, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 251, in start
    response = await self.instantiate()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 334, in instantiate
    result = await self.process.start()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 528, in start
    msg = await self._wait_until_connected(uid)
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 642, in _wait_until_connected
    raise msg
TypeError: exceptions must derive from BaseException
ERROR:Task exception was never retrieved
future:  exception=TypeError('exceptions must derive from BaseException')>
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/asyncio/tasks.py", line 603, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 251, in start
    response = await self.instantiate()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 334, in instantiate
    result = await self.process.start()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 528, in start
    msg = await self._wait_until_connected(uid)
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 642, in _wait_until_connected
    raise msg
TypeError: exceptions must derive from BaseException
ERROR:Task exception was never retrieved
future:  exception=TypeError('exceptions must derive from BaseException')>
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/asyncio/tasks.py", line 603, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 251, in start
    response = await self.instantiate()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 334, in instantiate
    result = await self.process.start()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 528, in start
    msg = await self._wait_until_connected(uid)
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 642, in _wait_until_connected
    raise msg
TypeError: exceptions must derive from BaseException
ERROR:Task exception was never retrieved
future:  exception=TypeError('exceptions must derive from BaseException')>
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/asyncio/tasks.py", line 603, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 251, in start
    response = await self.instantiate()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 334, in instantiate
    result = await self.process.start()
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 528, in start
    msg = await self._wait_until_connected(uid)
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/nanny.py", line 642, in _wait_until_connected
    raise msg
TypeError: exceptions must derive from BaseException
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Restarting worker

@nikiforov-sm

Hi,

After rebuilding from source, the following code:

from dask_cuda import LocalCUDACluster
from dask.distributed import Client
cluster = LocalCUDACluster(threads_per_worker=1)
if 'c' in globals():
    c.close()
c = Client(cluster)
c.close()

returns errors:

distributed.nanny - ERROR - Failed while trying to start worker process: Could not serialize object of type RandomForestRegressor.
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 38, in dumps
    result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "cuml/ensemble/randomforestregressor.pyx", line 373, in cuml.ensemble.randomforestregressor.RandomForestRegressor.__getstate__
  File "cuml/ensemble/randomforestregressor.pyx", line 441, in cuml.ensemble.randomforestregressor.RandomForestRegressor._get_model_info
TypeError: an integer is required
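
The traceback above ends inside `__getstate__`, the hook pickle calls to collect an object's state, so any exception raised there surfaces as a serialization failure. A minimal sketch of that mechanism follows, using a hypothetical `ToyModel` class and a hypothetical `n_cols` attribute that is still `None`. It is not cuML code and does not claim to identify the actual cause in `_get_model_info`.

```python
import operator
import pickle


class ToyModel:
    """Hypothetical stand-in for an estimator; this is not cuML code."""

    def __init__(self, n_cols=None):
        # Assumption: n_cols stays None until the model has been fitted.
        self.n_cols = n_cols

    def __getstate__(self):
        # pickle calls this hook to obtain the state it will serialize.
        state = self.__dict__.copy()
        # operator.index() raises TypeError for a non-integer such as None.
        state["n_cols"] = operator.index(self.n_cols)
        return state


def pickling_error(obj):
    """Return the TypeError raised while pickling obj, or None if there is none."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return exc
    return None


error = pickling_error(ToyModel())  # n_cols is None, so the hook fails
print(type(error).__name__)
fitted_state = ToyModel(n_cols=73).__getstate__()  # the hook succeeds with an int
print(fitted_state)
```

If the real failure follows the same pattern, the object being serialized would hold a non-integer where an integer is expected at the time it is pickled.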

Could you help us?

@nikiforov-sm

dask-cuda from rapidsai-nightly:

(base) root@nvidia-MLT:/opt/jupyter/cuml/python# conda list dask-cuda
# packages in environment at /opt/anaconda:
#
# Name                    Version                   Build  Channel
dask-cuda                 0.12.0a191218           py37_36    rapidsai-nightly

@vishalmehta1991
Contributor

@nikiforov-sm @oyilmaz-nvidia
I was able to run the Dask model like this:

# Shard the data across all workers
X_train_dask = dask_cudf.from_cudf(X_train, npartitions=n_partitions)
y_train_dask = dask_cudf.from_cudf(y_train, npartitions=n_partitions)
X_train_dask, y_train_dask = dask_utils.persist_across_workers(c, [X_train_dask, y_train_dask], workers=workers)

# Build and train the model
dask_model = daskRFR(max_depth=max_depth, n_estimators=n_trees,
                     n_streams=n_streams, n_bins=16, split_algo=0, split_criterion=2)

dask_model.fit(X_train_dask, y_train_dask)
cuml_y_pred_dask = dask_model.predict(X_test.as_matrix())  # X_test is a cudf dataframe

@nikiforov-sm

@vishalmehta1991
Is daskRFR an alias for RandomForestRegressor from cuml.dask.ensemble in your example?
Could you please provide the full script?

@vishalmehta1991
Contributor

vishalmehta1991 commented Jan 22, 2020

Yes, daskRFR is the RF regressor from cuml.dask.ensemble.
Here is the full script:

# Dask RF
from cuml.dask.ensemble import RandomForestRegressor as daskRFR
from cuml.dask.common import utils as dask_utils
from dask.distributed import Client, wait
from dask_cuda import LocalCUDACluster
import dask_cudf

# Start cluster
cluster = LocalCUDACluster(threads_per_worker=1, n_workers=n_partitions)
c = Client(cluster)
workers = c.has_what().keys()

# Shard the data across all workers
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y,
                                                                    test_size=0.2)
X_train_dask = dask_cudf.from_cudf(X_train, npartitions=n_partitions)
y_train_dask = dask_cudf.from_cudf(y_train, npartitions=n_partitions)
X_train_dask, y_train_dask = dask_utils.persist_across_workers(c, [X_train_dask, y_train_dask], workers=workers)

# Build and train the model
dask_model = daskRFR(max_depth=max_depth, n_estimators=n_trees,
                     n_streams=n_streams, n_bins=16, split_algo=0, split_criterion=2)

dask_model.fit(X_train_dask, y_train_dask)
cuml_y_pred_dask = dask_model.predict(X_test.as_matrix())  # X_test is a cudf dataframe

@nikiforov-sm

nikiforov-sm commented Jan 27, 2020

Hi, thank you for the update.
We get errors after these few lines:

from dask_cuda import LocalCUDACluster
from dask.distributed import Client
n_partitions = 8
cluster = LocalCUDACluster(threads_per_worker=1, n_workers=n_partitions)
c = Client(cluster)

distributed.nanny - WARNING - Restarting worker
distributed.nanny - ERROR - Failed while trying to start worker process: Could not serialize object of type RandomForestRegressor.
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 38, in dumps
    result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "cuml/ensemble/randomforestregressor.pyx", line 373, in cuml.ensemble.randomforestregressor.RandomForestRegressor.__getstate__
  File "cuml/ensemble/randomforestregressor.pyx", line 441, in cuml.ensemble.randomforestregressor.RandomForestRegressor._get_model_info
TypeError: an integer is required

@vishalmehta1991
Contributor

@nikiforov-sm did you try with the code-frozen 0.12 branch?
Also, this seems like a pickle error, @Salonijain27?

@nikiforov-sm

@vishalmehta1991 No, I didn't.
We installed the libraries only from the environment file in the repository, without changing any versions.

@rmccorm4
Author

rmccorm4 commented Jan 31, 2020

@vishalmehta1991 I'm trying to help get this resolved, but it doesn't seem to work for me on the 0.12 branch either. Can you provide detailed steps of what you're doing, if you're able to run this code successfully?


I tried building from CUML branch-0.12 per #1467 (comment), but I'm still hitting errors on the code from the original post: #1467 (comment) and also on Vishal's snippet here: #1467 (comment)

Building 0.12 branch

# Use RAPIDS container for easier reproducibility
nvidia-docker run -it -v `pwd`:/mnt --workdir=/mnt nvcr.io/nvidia/rapidsai/rapidsai:0.11-cuda10.0-runtime-ubuntu18.04
# Clone CUML source
git clone https://github.com/rapidsai/cuml
cd cuml
# Switch to 0.12 branch
git checkout branch-0.12
# Build dev env for 0.12 branch
conda env create --name cuml_dev --file /mnt/cuml/conda/environments/cuml_dev_cuda10.0.yml
conda activate cuml_dev
# Install cuml 0.12
conda install cuml

Verify version:

(cuml_dev) root@f99f97486476:/mnt# python -c "import cuml; print(cuml.__version__)"
0.12.0a+773.ge764252

Original Code (Error)

There's an error like this from each of the 8 workers:

(cuml_dev) $ python original.py
...
SKLearn accuracy:   0.0856196149694842
distributed.worker - WARNING -  Compute Failed
Function:  _predict
args:      (RandomForestRegressor(n_estimators=4, max_depth=20, handle=<cuml.common.handle.Handle object at 0x7f212eb9b770>, max_features='auto', n_bins=8, n_streams=8, split_algo=1, split_criterion=2, bootstrap=True, bootstrap_features=False, verbose=False, min_rows_per_node=2, rows_sample=1.0, max_leaves=-1, accuracy_metric='mse', min_impurity_decrease=0.0, quantile_per_tree=False, seed=0), array([[0.9405078 , 0.5626202 , 0.82645825, ..., 0.45711115, 0.80462799,
        0.92827166],
       [0.85446839, 0.53313684, 0.85503426, ..., 0.96904807, 0.66100485,
        0.93062759],
       [0.1721737 , 0.73070096, 0.23297057, ..., 0.57237385, 0.33928629,
        0.13889585],
       ...,
       [0.65332578, 0.74442212, 0.20840173, ..., 0.08406665, 0.33502194,
        0.94996344],
       [0.10918214, 0.75263554, 0.56264772, ..., 0.65330256, 0.0200657 ,
        0.02755087],
       [0.781481  , 0.73409352, 0.2241026 , ..., 0.55295024, 0.51877916,
        0.88945637]]), 0.507484466418739)
kwargs:    {}
Exception: RuntimeError("Exception occured! file=/conda/conda-bld/libcuml_1580378874315/work/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[19:30:28] /conda/conda-bld/libcuml_1580378874315/work/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!\n\nStack trace returned 10 entries:\n[bt] (0) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x1bc) [0x7f203f2794bc]\n[bt] (1) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x28) [0x7f203f27a818]\n[bt] (2) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x4136) [0x7f203f297fa6]\n[bt] (3) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(TreeliteModelBuilderCommitModel+0x13b) [0x7f203f27626b]\n[bt] (4) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(void ML::build_treelite_forest<float, float>(void**, ML::RandomForestMetaData<float, float> const*, int, int, std::vector<unsigned char, std::allocator<unsigned char> >&)+0x49f) [0x7f203f083c1f]\n[bt] (5) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x17c7f) [0x7f2138027c7f]\n[bt] (6) /opt/conda/envs/cuml_dev/bin/python(_PyObject_FastCallDict+0x9f) [0x556fafc50c6f]\n[bt] (7) /opt/conda/envs/cuml_dev/bin/python(_PyObject_Call_Prepend+0x63) [0x556fafc70313]\n[bt] (8) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x19388) [0x7f2138029388]\n[bt] (9) /opt/conda/envs/cuml_dev/bin/python(_PyObject_FastCallKeywords+0x49b) [0x556fafcbb85b]\n\n\n\nObtained 
28 stack frames\n#0 in /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f203ee5a32e]\n#1 in /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x80) [0x7f203ee5ae40]\n#2 in /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x909) [0x7f203f084089]\n#3 in /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x17c7f) [0x7f2138027c7f]\n#4 in /opt/conda/envs/cuml_dev/bin/python(_PyObject_FastCallDict+0x9f) [0x556fafc50c6f]\n#5 in /opt/conda/envs/cuml_dev/bin/python(_PyObject_Call_Prepend+0x63) [0x556fafc70313]\n#6 in /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x19388) [0x7f2138029388]\n#7 in /opt/conda/envs/cuml_dev/bin/python(_PyObject_FastCallKeywords+0x49b) [0x556fafcbb85b]\n#8 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x5379) [0x556fafd100b9]\n#9 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallDict+0x10b) [0x556fafc5079b]\n#10 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x1f4f) [0x556fafd0cc8f]\n#11 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallDict+0x10b) [0x556fafc5079b]\n#12 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x1f4f) [0x556fafd0cc8f]\n#13 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x556fafca2f7b]\n#14 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x556fafd0b3e0]\n#15 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallDict+0x10b) [0x556fafc5079b]\n#16 in 
/opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x1f4f) [0x556fafd0cc8f]\n#17 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x556fafca2f7b]\n#18 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x556fafd0b3e0]\n#19 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x556fafca2f7b]\n#20 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x556fafd0b3e0]\n#21 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallDict+0x10b) [0x556fafc5079b]\n#22 in /opt/conda/envs/cuml_dev/bin/python(_PyObject_Call_Prepend+0x63) [0x556fafc70313]\n#23 in /opt/conda/envs/cuml_dev/bin/python(PyObject_Call+0x6e) [0x556fafc6206e]\n#24 in /opt/conda/envs/cuml_dev/bin/python(+0x224917) [0x556fafd5f917]\n#25 in /opt/conda/envs/cuml_dev/bin/python(+0x1e3368) [0x556fafd1e368]\n#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f2177b786db]\n#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f21778a188f]\n")

Vishal's Code Snippet (error)

import cuml
import cudf
import dask_cudf
import numpy as np
import pandas as pd
from dask_cuda import LocalCUDACluster
from dask.distributed import Client, wait
from cuml.dask.common import utils as dask_utils
from cuml.dask.ensemble import RandomForestRegressor as daskRFR

if __name__ == '__main__':
    # Start cluster
    n_partitions = 4
    cluster = LocalCUDACluster(threads_per_worker=1, n_workers=n_partitions)
    c = Client(cluster)
    workers = c.has_what().keys()

    # Desired parameters
    max_depth = 20
    n_trees = 30
    rows, cols = 10000, 74
    n_streams = len(workers)

    # Generate fake data for example's sake
    x = np.random.random((rows, cols))
    df = pd.DataFrame(x, columns=["C{}".format(i) for i in range(cols)])
    X = df.drop(['C2'],1).astype(np.float, 32) #.to_numpy().astype(np.float, 32)
    y = pd.DataFrame(df['C2'].astype(np.float, 32))
    """
    File "/opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/preprocessing/model_selection.py", line 251, in train_test_split
        return X_train, X_test, y_train, y_test
    UnboundLocalError: local variable 'X_train' referenced before assignment
    """
    X = cudf.DataFrame.from_pandas(X)
    y = cudf.DataFrame.from_pandas(y)

    # Shard the data across all workers
    X_train, X_test, y_train, y_test = cuml.preprocessing.model_selection.train_test_split(X, y,
            test_size=0.2)
    X_train_dask = dask_cudf.from_cudf(X_train, npartitions=n_partitions)
    y_train_dask = dask_cudf.from_cudf(y_train, npartitions=n_partitions)
    X_train_dask, y_train_dask = dask_utils.persist_across_workers(c, [X_train_dask, y_train_dask], workers=workers)

    # Build and train the model
    dask_model = daskRFR(max_depth=max_depth, n_estimators=n_trees,
            n_streams=n_streams,n_bins=16,split_algo=0,split_criterion=2)

    dask_model.fit(X_train_dask, y_train_dask)
    cuml_y_pred_dask = dask_model.predict(X_test.as_matrix()) #X_test is a cudf dataframe

I also get similar errors on each worker when running a slightly modified version of Vishal's example above:

(cuml_dev) $ python vishal.py
...
distributed.worker - WARNING -  Compute Failed
Function:  _predict
args:      (RandomForestRegressor(n_estimators=8, max_depth=20, handle=<cuml.common.handle.Handle object at 0x7f6da56ddb90>, max_features='auto', n_bins=16, n_streams=4, split_algo=0, split_criterion=2, bootstrap=True, bootstrap_features=False, verbose=False, min_rows_per_node=2, rows_sample=1.0, max_leaves=-1, accuracy_metric='mse', min_impurity_decrease=0.0, quantile_per_tree=False, seed=0), array([[0.51445416, 0.46609572, 0.40908901, ..., 0.801352  , 0.8571766 ,
        0.97171996],
       [0.52920684, 0.57777807, 0.38054712, ..., 0.02989455, 0.84592101,
        0.51838237],
       [0.77971376, 0.39299739, 0.53437106, ..., 0.65130207, 0.33598852,
        0.42132736],
       ...,
       [0.34358309, 0.06082903, 0.33059765, ..., 0.82576507, 0.36605432,
        0.81006008],
       [0.17678472, 0.55156779, 0.33741966, ..., 0.66669902, 0.11728904,
        0.61993128],
       [0.6684218 , 0.1548672 , 0.14697938, ..., 0.68550973, 0.75773336,
        0.66218533]]), 0.47515081126142533)
kwargs:    {}
Exception: RuntimeError("Exception occured! file=/conda/conda-bld/libcuml_1580378874315/work/cpp/src/randomforest/randomforest.cu line=324: TREELITE FAIL: call='TreeliteModelBuilderCommitModel(model_builder, model)'. Reason:[19:47:54] /conda/conda-bld/libcuml_1580378874315/work/cpp/build/treelite/src/treelite/src/frontend/builder.cc:440: Impossible thing happened: model has no leaf node!\n\nStack trace returned 10 entries:\n[bt] (0) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(dmlc::StackTrace[abi:cxx11]()+0x1bc) [0x7f6e08dd94bc]\n[bt] (1) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x28) [0x7f6e08dda818]\n[bt] (2) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(treelite::frontend::ModelBuilder::CommitModel(treelite::Model*)+0x4136) [0x7f6e08df7fa6]\n[bt] (3) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(TreeliteModelBuilderCommitModel+0x13b) [0x7f6e08dd626b]\n[bt] (4) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(void ML::build_treelite_forest<float, float>(void**, ML::RandomForestMetaData<float, float> const*, int, int, std::vector<unsigned char, std::allocator<unsigned char> >&)+0x49f) [0x7f6e08be3c1f]\n[bt] (5) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x17c7f) [0x7f6dbd0a6c7f]\n[bt] (6) /opt/conda/envs/cuml_dev/bin/python(_PyObject_FastCallDict+0x9f) [0x561c9d537c6f]\n[bt] (7) /opt/conda/envs/cuml_dev/bin/python(_PyObject_Call_Prepend+0x63) [0x561c9d557313]\n[bt] (8) /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x19388) [0x7f6dbd0a8388]\n[bt] (9) /opt/conda/envs/cuml_dev/bin/python(_PyObject_FastCallKeywords+0x49b) [0x561c9d5a285b]\n\n\n\nObtained 
28 stack frames\n#0 in /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f6e089ba32e]\n#1 in /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(_ZN8MLCommon9ExceptionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x80) [0x7f6e089bae40]\n#2 in /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/common/../../../../libcuml++.so(_ZN2ML21build_treelite_forestIffEEvPPvPKNS_20RandomForestMetaDataIT_T0_EEiiRSt6vectorIhSaIhEE+0x909) [0x7f6e08be4089]\n#3 in /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x17c7f) [0x7f6dbd0a6c7f]\n#4 in /opt/conda/envs/cuml_dev/bin/python(_PyObject_FastCallDict+0x9f) [0x561c9d537c6f]\n#5 in /opt/conda/envs/cuml_dev/bin/python(_PyObject_Call_Prepend+0x63) [0x561c9d557313]\n#6 in /opt/conda/envs/cuml_dev/lib/python3.7/site-packages/cuml/ensemble/randomforestregressor.cpython-37m-x86_64-linux-gnu.so(+0x19388) [0x7f6dbd0a8388]\n#7 in /opt/conda/envs/cuml_dev/bin/python(_PyObject_FastCallKeywords+0x49b) [0x561c9d5a285b]\n#8 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x5379) [0x561c9d5f70b9]\n#9 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallDict+0x10b) [0x561c9d53779b]\n#10 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x1f4f) [0x561c9d5f3c8f]\n#11 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallDict+0x10b) [0x561c9d53779b]\n#12 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x1f4f) [0x561c9d5f3c8f]\n#13 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x561c9d589f7b]\n#14 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x561c9d5f23e0]\n#15 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallDict+0x10b) [0x561c9d53779b]\n#16 in 
/opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x1f4f) [0x561c9d5f3c8f]\n#17 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x561c9d589f7b]\n#18 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x561c9d5f23e0]\n#19 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallKeywords+0xfb) [0x561c9d589f7b]\n#20 in /opt/conda/envs/cuml_dev/bin/python(_PyEval_EvalFrameDefault+0x6a0) [0x561c9d5f23e0]\n#21 in /opt/conda/envs/cuml_dev/bin/python(_PyFunction_FastCallDict+0x10b) [0x561c9d53779b]\n#22 in /opt/conda/envs/cuml_dev/bin/python(_PyObject_Call_Prepend+0x63) [0x561c9d557313]\n#23 in /opt/conda/envs/cuml_dev/bin/python(PyObject_Call+0x6e) [0x561c9d54906e]\n#24 in /opt/conda/envs/cuml_dev/bin/python(+0x224917) [0x561c9d646917]\n#25 in /opt/conda/envs/cuml_dev/bin/python(+0x1e3368) [0x561c9d605368]\n#26 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f6e2bdb06db]\n#27 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f6e2bad988f]\n")
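
In both failing `_predict` calls above, the per-worker model reports fewer trees than the requested `n_trees = 30`: `n_estimators=4` with 8 workers and `n_estimators=8` with 4 workers. Those numbers are consistent with the forest being split evenly across workers and rounded up. The sketch below only reproduces that arithmetic. The helper name `trees_per_worker` is hypothetical, and the rounding rule is an inference from the logs, not confirmed cuML behavior.

```python
import math


def trees_per_worker(n_trees, n_workers):
    """Hypothetical helper: split a forest evenly across workers, rounding up."""
    return math.ceil(n_trees / n_workers)


print(trees_per_worker(30, 8))  # 4, matching n_estimators=4 in the first log
print(trees_per_worker(30, 4))  # 8, matching n_estimators=8 in the second log
```

Under this reading, both runs would build 32 trees in total rather than exactly 30.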

@vishalmehta1991
Contributor

vishalmehta1991 commented Jan 31, 2020

@rmccorm4 here is the full training code using Dask as well as single-GPU RF:
dask_example.pdf

@rmccorm4
Author

rmccorm4 commented Feb 4, 2020

@nikiforov-sm does the snippet above work for you? #1467 (comment)

If you're having trouble setting up env / building from source, using containers should make your life easier, for example the "Building 0.12 Branch" section of this comment: #1467 (comment)

Building 0.12 branch

# Use RAPIDS container for easier reproducibility
nvidia-docker run -it -v `pwd`:/mnt --workdir=/mnt nvcr.io/nvidia/rapidsai/rapidsai:0.11-cuda10.0-runtime-ubuntu18.04
# Clone CUML source
git clone https://github.com/rapidsai/cuml
cd cuml
# Switch to 0.12 branch
git checkout branch-0.12
# Build dev env for 0.12 branch
conda env create --name cuml_dev --file /mnt/cuml/conda/environments/cuml_dev_cuda10.0.yml
conda activate cuml_dev
# Install cuml 0.12
conda install cuml

This way you can try various configurations without messing up your host environment.

@nikiforov-sm

nikiforov-sm commented Feb 4, 2020

@rmccorm4 We get multiple errors on these lines:

cluster = LocalCUDACluster(threads_per_worker=1, n_workers=n_partitions)
c = Client(cluster)

like this:

distributed.nanny - WARNING - Restarting worker
distributed.nanny - ERROR - Failed while trying to start worker process: Could not serialize object of type RandomForestRegressor.
Traceback (most recent call last):
  File "/opt/anaconda/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 38, in dumps
    result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "cuml/ensemble/randomforestregressor.pyx", line 373, in cuml.ensemble.randomforestregressor.RandomForestRegressor.__getstate__
  File "cuml/ensemble/randomforestregressor.pyx", line 441, in cuml.ensemble.randomforestregressor.RandomForestRegressor._get_model_info
TypeError: an integer is required

@rmccorm4
Author

rmccorm4 commented Feb 4, 2020

@nikiforov-sm I didn't encounter this error when trying the scripts above. Can you reproduce this issue in a container?

@nikiforov-sm

@rmccorm4

Can you reproduce this issue in a container?

We can't reproduce this issue in a container :) Thank you!

We have successfully trained the model with max_depth <= 28.
With max_depth > 28, we get warnings like "distributed.nanny - WARNING - Restarting worker", and the script hangs without producing any training result.

@rmccorm4
Author

rmccorm4 commented Feb 6, 2020

@vishalmehta1991 said he's tested depths up to 50 here: #1467 (comment)

So what's the root cause of this gap: an upper bound of max_depth=28 here, as opposed to >= 50 there?

@vishalmehta1991
Contributor

@rmccorm4 I don't think it's an RF issue. It seems to me it's more of a Dask thing.

@nikiforov-sm

@rmccorm4 @vishalmehta1991 It looks like an OOM error during training.
I can train the model on a 180 GB dataset (36 features) with max_depth=19, but with max_depth=20 there are errors like:
RuntimeError: 12 of 16 worker jobs failed: RMM_ERROR_OUT_OF_MEMORY

If I use max_features='sqrt' to reduce the number of features considered, I can train the model with a larger max_depth (<= 30) without any errors.
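
To give a sense of how much 'sqrt' shrinks the work per split on the 36-feature dataset mentioned above, here is a small arithmetic sketch. It assumes 'sqrt' means roughly sqrt(n_features) candidate features per split, with the `max(1, int(sqrt(n)))` rounding borrowed from scikit-learn's convention. The helper name `sqrt_candidate_features` is hypothetical, and cuML's exact rounding may differ.

```python
import math


def sqrt_candidate_features(n_features):
    """Candidate features per split under the assumed 'sqrt' rule."""
    return max(1, int(math.sqrt(n_features)))


n_features = 36  # feature count quoted in the comment above
per_split = sqrt_candidate_features(n_features)
print(per_split)                # 6
print(n_features // per_split)  # 6, i.e. six times fewer features per split
```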

By the way, is there no API to save a trained RandomForestRegressor model, such as XXX.save_model(path)?

@vishalmehta1991
Contributor

There is pickle support for RF. Check the example here:
https://github.com/rapidsai/cuml/blob/branch-0.12/python/cuml/test/test_pickle.py

@nikiforov-sm

nikiforov-sm commented Feb 10, 2020

@vishalmehta1991 Does pickle work in version 0.12?

I have an error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in 
      5     with open(model_path, 'wb') as pf:
----> 6         pickle.dump(fitted, pf)
      7 except (TypeError, ValueError) as e:

TypeError: can't pickle dict_keys objects

@rmccorm4
Author

Hi @nikiforov-sm ,

Can you share the full script that's causing the failure above? i.e. What is the fitted variable?

Per Vishal's post above, it looks like the models can be pickled. It seems that your fitted variable is a dict_keys object, and maybe not a model object.

@nikiforov-sm

Hi @rmccorm4 ,

raw_df = dask_cudf.read_csv(train_csv,dtype=['float32']*ncols, partitions=10000000)
raw_df = raw_df.persist()

df = raw_df[x_cols + ['KPI_0']]
X_train_dask, y_train_dask = dask_utils.persist_across_workers(c, [df[x_cols], df['KPI_0']], workers=workers)

cuml_model = cumlDaskRF(
                        max_depth=30, n_estimators=100,
                        n_streams=1,
                        n_bins=16,split_algo=0,split_criterion=2
                       )
fitted = cuml_model.fit(X_train_dask, y_train_dask)
wait(cuml_model.rfs)

I've tested with both the fitted model (variable fitted) and the model object itself (variable cuml_model), as in the example https://github.com/rapidsai/cuml/blob/branch-0.12/python/cuml/test/test_pickle.py:

model_path='/data/cuml.model'

with open(model_path, 'wb') as pf:
    pickle.dump(cuml_model, pf)

raises:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in 
      1 with open(model_path, 'wb') as pf:
----> 2     pickle.dump(cuml_model, pf)

TypeError: can't pickle dict_keys objects

@Salonijain27
Contributor

We currently do not have the option to pickle dask RF models.
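
The `dict_keys` message in the traceback above can be reproduced without cuML. The scripts in this thread obtain the worker list with `c.has_what().keys()`, which returns a dictionary view, and views cannot be pickled. The sketch below assumes, without confirmation from the source, that an object reachable from the Dask model holds such a view. The worker addresses and the `state_*` dictionaries are hypothetical.

```python
import pickle

# Hypothetical worker addresses standing in for c.has_what()
has_what = {"tcp://127.0.0.1:40001": (), "tcp://127.0.0.1:40002": ()}

state_with_view = {"workers": has_what.keys()}        # dict_keys view
state_with_list = {"workers": list(has_what.keys())}  # plain list copy


def pickles_ok(obj):
    """Return True if obj can be pickled, False if pickle raises TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


print(pickles_ok(state_with_view))  # False
print(pickles_ok(state_with_list))  # True
```

Converting the view to a list makes that one attribute picklable. It would not by itself add pickle support for Dask RF models, which hold other distributed state as well.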

@nikiforov-sm

Thank you!
We have successfully trained a Dask cuML RF model with depth greater than 30!
We can close this issue.

@rmccorm4
Author

rmccorm4 commented Mar 13, 2020

I checked out the issue and I believe the problem was that, in 0.12, the option to build a sparse representation of the cuML RF in FIL was not available to the user, and the code would by default create a dense representation of the cuML forest in FIL. This would cause the system to run out of memory when max_depth > 16.
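
To illustrate why a dense representation becomes a problem at large depths, the sketch below compares node counts. It assumes a dense layout stores a complete binary tree (root at depth 0) for every tree, while a sparse layout stores only nodes that exist, which are bounded by the training rows because each leaf holds at least one row. The byte sizes per node are assumptions for illustration only, not FIL's actual layout, and the sketch uses the toy example's 10000 rows and 30 trees.

```python
def dense_nodes_per_tree(max_depth):
    """Nodes in a complete binary tree whose root is at depth 0."""
    return 2 ** (max_depth + 1) - 1


def sparse_nodes_upper_bound(max_depth, n_rows):
    """A tree with at most n_rows leaves has at most 2 * n_rows - 1 nodes."""
    return min(dense_nodes_per_tree(max_depth), 2 * n_rows - 1)


def forest_bytes(nodes_per_tree, n_trees, bytes_per_node):
    return nodes_per_tree * n_trees * bytes_per_node


DENSE_NODE_BYTES = 8    # assumption, for illustration only
SPARSE_NODE_BYTES = 16  # assumption, for illustration only
N_TREES, N_ROWS = 30, 10000  # values from the toy example in this issue

for depth in (16, 20, 28):
    dense = forest_bytes(dense_nodes_per_tree(depth), N_TREES, DENSE_NODE_BYTES)
    sparse = forest_bytes(sparse_nodes_upper_bound(depth, N_ROWS), N_TREES,
                          SPARSE_NODE_BYTES)
    print(f"max_depth={depth}: dense ~ {dense / 2**30:.2f} GiB, "
          f"sparse <= {sparse / 2**20:.1f} MiB")
```

The dense size doubles with every extra level regardless of the data, whereas the sparse bound stops growing once the tree can no longer split.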

This issue has been addressed and the fix has been merged into cuml-0.13. We now provide the user with the option of creating either a sparse or a dense representation of the cuML forest in FIL through the fil_sparse_format variable. By default, the sparse representation is used, provided the algo variable allows it: if algo=auto or algo=naive, the sparse representation is created; otherwise the dense representation is created in FIL.
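
The default rule described above can be paraphrased as a small decision function. This is only a restatement of the comment, not cuML's implementation: the function name `fil_representation` is hypothetical, `tree_reorg` is used as an example of an `algo` value other than `auto` or `naive`, and the handling of an explicit True/False is an assumption (the library may reject some combinations).

```python
def fil_representation(algo="auto", fil_sparse_format="auto"):
    """Hypothetical paraphrase of the rule in the comment above."""
    if fil_sparse_format == "auto":
        # Default: sparse only when algo is 'auto' or 'naive'.
        return "sparse" if algo in ("auto", "naive") else "dense"
    # Assumption: an explicit True/False is taken at face value here.
    return "sparse" if fil_sparse_format else "dense"


print(fil_representation())                   # sparse
print(fil_representation(algo="tree_reorg"))  # dense
print(fil_representation(fil_sparse_format=False))  # dense
```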

I ran the code that was provided in the file: repo.py and was able to successfully run it by using the nightly release of cuml.
In order to install cuml-0.13 nightly please run:

conda install -c rapidsai-nightly -c nvidia -c conda-forge \
    -c defaults cuml=0.13 python=3.7 cudatoolkit=10.0

FYI, I believe upgrading to cuML 0.13 as mentioned above was the solution, thanks @Salonijain27!
