Check failed: this->HistogramExists(nidx) #4330

Closed
pseudotensor opened this issue Apr 3, 2019 · 7 comments · Fixed by #4347

Comments

@pseudotensor
Contributor

v0.82:

fail.zip

$ python fail.py
Traceback (most recent call last):
  File "fail.py", line 24, in <module>
    model.fit(X, y, sample_weight=sample_weight, **kwargs)
  File "/home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/sklearn.py", line 406, in fit
    callbacks=callbacks)
  File "/home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/training.py", line 227, in train
    xgb_model=xgb_model, callbacks=callbacks)
  File "/home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/training.py", line 74, in _train_internal
    bst.update(dtrain, i, obj)
  File "/home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/core.py", line 1113, in update
    dtrain.handle))
  File "/home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/core.py", line 178, in _check_call
    raise XGBoostError(_LIB.XGBGetLastError())
xgboost.core.XGBoostError: b'[16:17:25] /root/repo/xgboost/src/tree/updater_gpu_hist.cu:1084: Exception in gpu_hist: [16:17:25] /root/repo/xgboost/src/tree/updater_gpu_hist.cu:339: Check failed: this->HistogramExists(nidx) \n\nStack trace returned 10 entries:\n[bt] (0) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(dmlc::StackTrace(unsigned long)+0x54) [0x7f1e715940d4]\n[bt] (1) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x1d) [0x7f1e7159486d]\n[bt] (2) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::tree::DeviceHistogram<xgboost::detail::GradientPairInternal<double> >::GetNodeHistogram(int)+0x1ec) [0x7f1e717f126c]\n[bt] (3) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::tree::DeviceShard<xgboost::detail::GradientPairInternal<double> >::EvaluateSplits(std::vector<int, std::allocator<int> >, xgboost::RegTree const&, xgboost::common::ColumnSampler*, std::vector<xgboost::tree::ValueConstraint, std::allocator<xgboost::tree::ValueConstraint> > const&, unsigned long)+0xc3a) [0x7f1e717fb8aa]\n[bt] (4) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::tree::GPUHistMakerSpecialised<xgboost::detail::GradientPairInternal<double> >::EvaluateSplits(std::vector<int, std::allocator<int> >, xgboost::RegTree*)+0xfc) [0x7f1e717fbfbc]\n[bt] (5) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::tree::GPUHistMakerSpecialised<xgboost::detail::GradientPairInternal<double> >::UpdateTree(xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::DMatrix*, xgboost::RegTree*)+0xe18) [0x7f1e71809cb8]\n[bt] (6) 
/home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::tree::GPUHistMakerSpecialised<xgboost::detail::GradientPairInternal<double> >::Update(xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::DMatrix*, std::vector<xgboost::RegTree*, std::allocator<xgboost::RegTree*> > const&)+0x117) [0x7f1e7180a457]\n[bt] (7) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::gbm::GBTree::BoostNewTrees(xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::DMatrix*, int, std::vector<std::unique_ptr<xgboost::RegTree, std::default_delete<xgboost::RegTree> >, std::allocator<std::unique_ptr<xgboost::RegTree, std::default_delete<xgboost::RegTree> > > >*)+0x781) [0x7f1e7160e171]\n[bt] (8) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::gbm::GBTree::DoBoost(xgboost::DMatrix*, xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::ObjFunction*)+0x8c3) [0x7f1e7160f3a3]\n[bt] (9) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::LearnerImpl::UpdateOneIter(int, xgboost::DMatrix*)+0x3b3) [0x7f1e7161de03]\n\n\n\n\nStack trace returned 10 entries:\n[bt] (0) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(dmlc::StackTrace(unsigned long)+0x54) [0x7f1e715940d4]\n[bt] (1) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x1d) [0x7f1e7159486d]\n[bt] (2) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::tree::GPUHistMakerSpecialised<xgboost::detail::GradientPairInternal<double> >::Update(xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::DMatrix*, std::vector<xgboost::RegTree*, std::allocator<xgboost::RegTree*> > const&)+0x3e0) 
[0x7f1e7180a720]\n[bt] (3) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::gbm::GBTree::BoostNewTrees(xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::DMatrix*, int, std::vector<std::unique_ptr<xgboost::RegTree, std::default_delete<xgboost::RegTree> >, std::allocator<std::unique_ptr<xgboost::RegTree, std::default_delete<xgboost::RegTree> > > >*)+0x781) [0x7f1e7160e171]\n[bt] (4) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::gbm::GBTree::DoBoost(xgboost::DMatrix*, xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::ObjFunction*)+0x8c3) [0x7f1e7160f3a3]\n[bt] (5) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(xgboost::LearnerImpl::UpdateOneIter(int, xgboost::DMatrix*)+0x3b3) [0x7f1e7161de03]\n[bt] (6) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/site-packages/xgboost/./lib/libxgboost.so(XGBoosterUpdateOneIter+0x35) [0x7f1e715a2d15]\n[bt] (7) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f1ef2ec1e40]\n[bt] (8) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x2eb) [0x7f1ef2ec18ab]\n[bt] (9) /home/jon/.pyenv/versions/3.6.4/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(_ctypes_callproc+0x2cf) [0x7f1ef3107d2f]\n\n'

@trivialfis
Member

Oops, the old histogram node is overwritten by histogram recycling.
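
To make the failure mode concrete, here is a toy Python sketch of a fixed-capacity histogram cache that recycles a slot when it is full. It is not XGBoost's code: the class `ToyHistogramCache`, its methods, and the oldest-first recycling policy are all assumptions made for illustration, and only the idea of a `HistogramExists` check is taken from the error message above.

```python
from collections import OrderedDict


class ToyHistogramCache:
    """Toy model of a fixed-capacity per-node histogram cache (hypothetical)."""

    def __init__(self, n_bins, max_nodes):
        self.n_bins = n_bins
        self.max_nodes = max_nodes
        self.nidx_map = OrderedDict()  # node index -> offset into self.data
        self.data = []                 # flat buffer holding all histograms

    def histogram_exists(self, nidx):
        return nidx in self.nidx_map

    def allocate_histogram(self, nidx):
        if self.histogram_exists(nidx):
            return
        if len(self.nidx_map) < self.max_nodes:
            # Room left: grow the flat buffer by one histogram.
            offset = len(self.data)
            self.data.extend([0.0] * self.n_bins)
        else:
            # Buffer full: recycle the slot of the oldest node (assumed policy).
            _, offset = self.nidx_map.popitem(last=False)
            self.data[offset:offset + self.n_bins] = [0.0] * self.n_bins
        self.nidx_map[nidx] = offset

    def get_node_histogram(self, nidx):
        # Mirrors the spirit of the failing check: the node must still be mapped.
        if not self.histogram_exists(nidx):
            raise KeyError(f"Check failed: histogram_exists({nidx})")
        offset = self.nidx_map[nidx]
        return self.data[offset:offset + self.n_bins]


cache = ToyHistogramCache(n_bins=4, max_nodes=2)
cache.allocate_histogram(0)  # root
cache.allocate_histogram(1)  # left child
cache.allocate_histogram(2)  # right child: recycles the root's slot

try:
    cache.get_node_histogram(0)  # root histogram was recycled
    failed = False
except KeyError:
    failed = True
```

In this sketch the root's histogram is gone by the time it is requested again, so the lookup fails the existence check. Whether the real updater evicts in this order is not stated in the thread.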

@pseudotensor
Contributor Author

This is happening quite often for us. Any plans to fix it?

@trivialfis
Member

Will try to fix it.

@pseudotensor
Contributor Author

Thanks!

@pseudotensor
Contributor Author

Need any additional help? Can you repro the problem ok with the pickle/script I gave?

@trivialfis
Member

trivialfis commented Apr 9, 2019

@pseudotensor While trying to solve this one, I ran into another issue in histogram allocation: the histogram memory map nidx_map_ has size 0 while the histogram data data_ is not empty. It will take some time. Thanks for the help.

@pseudotensor
Contributor Author

Thanks!

lock bot locked as resolved and limited conversation to collaborators on Jul 9, 2019