voting parallel thread not safe #1089
Comments
thanks @weidong8405347
I commented out this OpenMP section and it still does not work. Setting thread_num=1 works, but commenting out the OpenMP in findbestsplits and some other OpenMP sections still does not help.
@guolinke can you run voting parallel successfully? If you set thread_num=48 it crashes; if you set thread_num=8, it sometimes does not crash.
@weidong8405347 I just tried it, and everything is fine.
@weidong8405347 I just tried it on Ubuntu 14.04 with the same settings, and everything is still fine.
@guolinke maybe it is a system problem. The system is my company's internal version, and num_threads = 48.
@guolinke can you tell me which gcc version you use to compile LightGBM? Maybe something else in the environment causes the problem.
@guolinke my gcc version is 4.8.5 and my libc version is 2.17.
@weidong8405347 you can also try the latest version: https://github.com/Microsoft/LightGBM
@guolinke thank you, I will try the stable version. Recently I have been trying the latest version.
@guolinke I tried the stable version and it causes a new problem. Maybe my data is special.
Can you compile in debug mode and provide more information?
In debug mode, the output is:
@weidong8405347 it seems your errors are quite random. I saw 3 different errors, and most of them are in places that seem like they should be correct.
@guolinke thanks, I re-ran it and the errors change on every run. I also ran it on a different machine, but our machines are all the same, so the errors happen there too. I tried enable_sparse=false, but it still fails, and the error changed again.
@guolinke I tried the example and it runs normally. Maybe my data is special.
@guolinke I tried the lambda rank demo and it hits an error. The details: #3 0x00007fe8ff030503 in _int_free () from /lib64/libc.so.6
@weidong8405347 you can try the latest code.
Environment info
Operating System: linux
CPU: Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
C++/Python/R version: C++ gcc 4.8.5
Error Message:
[100:106236] Signal: Segmentation fault (11)
[100:106236] Signal code: Address not mapped (1)
[100:106236] Failing at address: 0xfffffffa0a283128
[100:106236] [ 0] /lib64/libpthread.so.0(+0xf370)[0x7f8fc12dc370]
[100:106236] [ 1] /lib64/libc.so.6(+0x8975d)[0x7f8fc0f9475d]
[100:106236] [ 2] ./bin/lightgbm[0x4841f8]
[100:106236] [ 3] /lib64/libgomp.so.1(+0xdde5)[0x7f8fc170cde5]
[100:106236] [ 4] /lib64/libpthread.so.0(+0x7dc5)[0x7f8fc12d4dc5]
[100:106236] [ 5] /lib64/libc.so.6(clone+0x6d)[0x7f8fc100274d]
[100:106236] *** End of error message ***
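The raw addresses in this backtrace can be turned into function names with `addr2line` once the binary is built with debug symbols. The following is only a sketch: it assumes the frames have the format shown above, that `./bin/lightgbm` is a non-PIE executable (so the absolute address in brackets can be used directly), and that for shared libraries the `+0x...` offset in parentheses is the right input. The helper name `addr2line_commands` is made up for this illustration.

```python
import re

# Backtrace frames copied from the error message above.
BACKTRACE = """\
[100:106236] [ 0] /lib64/libpthread.so.0(+0xf370)[0x7f8fc12dc370]
[100:106236] [ 1] /lib64/libc.so.6(+0x8975d)[0x7f8fc0f9475d]
[100:106236] [ 2] ./bin/lightgbm[0x4841f8]
[100:106236] [ 3] /lib64/libgomp.so.1(+0xdde5)[0x7f8fc170cde5]
[100:106236] [ 4] /lib64/libpthread.so.0(+0x7dc5)[0x7f8fc12d4dc5]
[100:106236] [ 5] /lib64/libc.so.6(clone+0x6d)[0x7f8fc100274d]
"""

# "[ N] module(optional symbol or +offset)[absolute address]"
FRAME_RE = re.compile(
    r"\[\s*(?P<index>\d+)\]\s+"
    r"(?P<module>[^\s(\[]+)"
    r"(?:\((?P<symbol>[^)]*)\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def parse_frames(backtrace):
    """Return one dict per backtrace line that matches the frame format."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append(match.groupdict())
    return frames


def addr2line_commands(backtrace):
    """Build addr2line command strings for frames with a usable address."""
    commands = []
    for frame in parse_frames(backtrace):
        symbol = frame["symbol"]
        if symbol is None:
            # Assumption: non-PIE executable, absolute address is valid.
            address = frame["address"]
        elif symbol.startswith("+"):
            # Shared library frame: use the offset inside the library.
            address = symbol[1:]
        else:
            # Frame already names a symbol (e.g. clone+0x6d); skip it.
            continue
        commands.append(
            "addr2line -f -C -e {} {}".format(frame["module"], address)
        )
    return commands


if __name__ == "__main__":
    for command in addr2line_commands(BACKTRACE):
        print(command)
```

Running the printed command for frame 2 against a debug build of `./bin/lightgbm` should show which function inside the OpenMP region was executing when the segmentation fault occurred.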
Reproducible examples
With num_threads = 1 it does not crash; using OpenMP seems to cause thread-unsafe behavior.
Steps to reproduce
1. Set num_threads > 1
2. Set tree_learner = voting
3. Run mpirun -np 2 lightgbm
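The steps above can be sketched as a config file plus a launch command. This is a guess at a minimal setup, not the reporter's actual configuration: the data file name `train.txt`, the config file name `train.conf`, the `task` line, and `num_machines = 2` (chosen to match `-np 2`) are assumptions; only `num_threads`, `tree_learner = voting`, and `mpirun -np 2` come from the report. The script only prints the text and does not launch anything.

```python
import shlex


def build_config(num_threads=48, tree_learner="voting",
                 num_machines=2, data="train.txt"):
    """Return the text of a LightGBM CLI config file as a string."""
    params = {
        "task": "train",               # assumption: a plain training run
        "data": data,                  # hypothetical training file name
        "tree_learner": tree_learner,  # from the report
        "num_threads": num_threads,    # from the report (> 1 triggers it)
        "num_machines": num_machines,  # assumption: matches mpirun -np
    }
    return "".join("{} = {}\n".format(k, v) for k, v in params.items())


def build_command(num_processes=2, binary="./bin/lightgbm",
                  config_path="train.conf"):
    """Return the mpirun invocation as an argument list."""
    return ["mpirun", "-np", str(num_processes), binary,
            "config={}".format(config_path)]


if __name__ == "__main__":
    print(build_config(), end="")
    print(" ".join(shlex.quote(part) for part in build_command()))
```

Changing `num_threads` to 1 in the generated config gives the variant that, according to the report, does not crash.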