Slowdown with sparse data + bagging on versions 3+ #3637
Comments
@shiyu1994 can you help check the efficiency of bagging?
The synthesized dataset has only 1 multi-value feature group, which results in single-threaded execution of the code at Lines 789 to 801 in d1014ea.
PR #3720 should fix this.
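To illustrate the point above: this is a toy sketch, not LightGBM's code. It assumes the work is split into one task per feature group, so the number of busy threads can never exceed the number of groups. The function name `threads_used` and the sleep-based stand-in for real work are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def threads_used(num_groups, num_threads, work_seconds=0.05):
    """Run one task per feature group and count the distinct worker threads."""
    seen = set()
    lock = threading.Lock()

    def process_group(group_id):
        # Record which worker thread picked up this group.
        with lock:
            seen.add(threading.get_ident())
        time.sleep(work_seconds)  # stand-in for the real per-group work

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(process_group, range(num_groups)))
    return len(seen)


# With a single group there is only one task, so only one thread ever works,
# no matter how many threads are available.
print(threads_used(num_groups=1, num_threads=8))
```

Under this assumption, a dataset that collapses into a single group leaves every other thread idle, which would match the 1-1.5 cores reported in the issue.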
@shiyu1994 Can we close this issue?
Confirmed that the regression is gone on our real dataset. In fact we got quite a nice speedup! 😄 Thanks for the fix!
Really nice to hear that!
How are you using LightGBM?
LightGBM component: Python package
Environment info
Operating System: macOS 10.14.6 (also see it on Ubuntu 18.04, but didn't compile the master branch there)
CPU/GPU model: x86-64/No GPU
C++ compiler version:
Java version: None
CMake version:
Python version:
R version: None
LightGBM version or commit hash: 3.0, 3.1, and 44a6fb7ffa646b469fc10475b3526c61239682ac (latest master as of writing this)
Error message and / or logs
None
Reproducible example(s)
Running this script:
on 2.3.1 gives:
but on master gives:
We see similar slowdowns on our production data; I'll note that on 2.3.1 LightGBM easily saturates all cores, but on 3.1 it appears to get stuck using 1-1.5 cores.
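For anyone who wants to time this on their own machine, here is a minimal sketch of a comparable benchmark. It is not the script from this report: the data shape, density, bagging settings, and number of rounds are all assumptions chosen for illustration, and the helper names are hypothetical.

```python
import time

import numpy as np
import scipy.sparse as sp


def make_sparse_data(n_rows=100_000, n_cols=200, density=0.01, seed=0):
    """Build a random CSR matrix and binary labels (sizes are illustrative)."""
    rng = np.random.default_rng(seed)
    X = sp.random(n_rows, n_cols, density=density, format="csr", random_state=seed)
    y = rng.integers(0, 2, size=n_rows)
    return X, y


def bagging_params(use_bagging=True):
    """Return training parameters with or without bagging (values assumed)."""
    params = {"objective": "binary", "verbose": -1}
    if use_bagging:
        params.update({"bagging_fraction": 0.5, "bagging_freq": 1})
    return params


def time_training(X, y, params, num_boost_round=50):
    """Return wall-clock seconds spent in lgb.train for the given parameters."""
    import lightgbm as lgb  # imported here so the helpers above work without it

    train_set = lgb.Dataset(X, label=y)
    train_set.construct()  # build the dataset first so only training is timed
    start = time.perf_counter()
    lgb.train(params, train_set, num_boost_round=num_boost_round)
    return time.perf_counter() - start
```

Calling `time_training(X, y, bagging_params(True))` and `time_training(X, y, bagging_params(False))` on the same data under each installed LightGBM version should show whether the gap appears only when bagging is enabled.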