Possible bug between colsample_bynode and gpu_hist #8824
Comments
@RAMitchell Could you please take a look when you are available?
Same issue here. Which version would be the next-best one to use? The latest nightly suffers from the same problem.
Do you have a reproducible example for nightly?
OP's example gives me the same problem with this build: https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/release_1.7.0/xgboost-1.7.4%2B36ad160501251336bfe69b602acc37ab3ec32d69-py3-none-manylinux2014_x86_64.whl
@afogarty85 Can you use the nightly build from the development branch? https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/master/xgboost-2.0.0.dev0%2B4d665b3fb02ccb136d43853726d24ab11b5274ef-py3-none-manylinux2014_x86_64.whl
The 2.0.0 dev seems to be working for the toy example above and a slightly modified version below for categorical data:
Great! I'll go ahead and close this issue.
I am not sure how to replicate it with dummy data yet, but I am getting GPU crashes with
@afogarty85 Please open a new issue if you find problems with
I have the same error with gpu_hist and colsample_bynode on xgb version 1.7.4. Removing the colsample_bynode parameter fixed it for me. Alternatively, it seems updating xgb would also fix it. Thanks for the issue!
Should be fixed in the latest patch release, 1.7.6.
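For anyone who cannot upgrade right away, the workaround described in the comments above (dropping colsample_bynode when training with gpu_hist) could be wrapped in a small version guard. This is only a sketch: the helper names `parse_version`, `is_affected`, and `safe_params` are made up for illustration, and treating every version from 1.7.4 up to (but not including) 1.7.6 as affected is an assumption, since the thread only names 1.7.4 as broken and 1.7.6 as fixed.

```python
import re

# Version bounds taken from this thread; whether 1.7.5 is affected is an
# assumption (the thread does not mention it explicitly).
AFFECTED_FROM = (1, 7, 4)  # first release reported as failing
FIXED_IN = (1, 7, 6)       # patch release reported as containing the fix


def parse_version(version):
    """Return (major, minor, patch) from strings like '1.7.4+36ad160'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(group) for group in match.groups())


def is_affected(version, params):
    """True if this version/params combination matches the reported failure."""
    uses_gpu_hist = params.get("tree_method") == "gpu_hist"
    sets_bynode = "colsample_bynode" in params
    in_range = AFFECTED_FROM <= parse_version(version) < FIXED_IN
    return uses_gpu_hist and sets_bynode and in_range


def safe_params(version, params):
    """Return a copy of params, without colsample_bynode when affected."""
    if not is_affected(version, params):
        return dict(params)
    return {k: v for k, v in params.items() if k != "colsample_bynode"}
```

In practice this would be called as `safe_params(xgboost.__version__, params)` before training. Note that dropping the parameter changes the model (no per-node column sampling), so upgrading is the cleaner fix where possible.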
Introduction
It seems that setting colsample_bynode causes GPU training to fail. I've tested this on Linux and Windows machines. The same script was functional on tag 1.7.3, so I believe this was introduced during the 1.7.4 patch release.
Reproduction
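The original reproduction script is not preserved here. The sketch below is not that script; it only illustrates the kind of configuration the report describes, using random synthetic data. The function names, data shape, objective, sampling ratio of 0.5, and number of rounds are all assumptions chosen for illustration.

```python
def build_params(colsample_bynode=0.5):
    """Parameters combining gpu_hist with per-node column sampling."""
    return {
        "objective": "binary:logistic",        # assumed objective
        "tree_method": "gpu_hist",             # GPU training, as in the report
        "colsample_bynode": colsample_bynode,  # the setting under suspicion
    }


def run_training(params, n_rows=1000, n_cols=20, seed=0):
    """Train on random data; requires numpy, xgboost, and a CUDA GPU."""
    # Imported lazily so the parameter helper works without these packages.
    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_rows, n_cols))
    y = rng.integers(0, 2, size=n_rows)
    dtrain = xgb.DMatrix(X, label=y)
    return xgb.train(params, dtrain, num_boost_round=10)
```

Calling `run_training(build_params())` on a machine with a CUDA-capable GPU would exercise the combination in question. Whether this particular data triggers the failure on 1.7.4 has not been verified.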
Linux System Information
nvidia-smi
nvidia-smi -L
lscpu
Stack Trace
Thanks in advance!
Nandish Gupta
Senior AI Engineer at SolasAI