Error when running a program that wraps the C-API on Hadoop #11183

Closed
Angus07 opened this issue Jun 5, 2018 · 5 comments

@Angus07

Angus07 commented Jun 5, 2018

OMP: Error #100: Fatal system error detected.
OMP: System error #22: Invalid argument
Thread [140464343873280] Forwarding seq_pool_fc1_l1,
*** Aborted at 1528170677 (unix time) try "date -d @1528170677" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGABRT (@0x1f90000a769) received by PID 42857 (TID 0x7fc06751f700) from PID 42857; stack trace: ***
@ 0x7fc096787160 (unknown)
@ 0x7fc08295e3f7 __GI_raise
@ 0x7fc08295f7d8 __GI_abort
@ 0x7fc095e70c53 __kmp_abort_process
@ 0x7fc095e5f2fb __kmp_fatal
@ 0x7fc095e20be2 KMPNativeAffinity::Mask::set_system_affinity()
@ 0x7fc095ea4396 __kmp_affinity_bind_thread
@ 0x7fc095e19700 _INTERNAL_26_______src_kmp_affinity_cpp_fbd1eedd::__kmp_affinity_create_x2apicid_map()
@ 0x7fc095e0ed60 _INTERNAL_26_______src_kmp_affinity_cpp_fbd1eedd::__kmp_aux_affinity_initialize()
@ 0x7fc095e0e566 __kmp_affinity_initialize
@ 0x7fc095e6fb32 __kmp_middle_initialize
@ 0x7fc095e59dae __kmp_api_omp_get_num_procs
@ 0x7fc08ced338d mkl_serv_domain_get_max_threads
@ 0x7fc08cec2ccc mkl_blas_sgemv
@ 0x7fc08cdaa014 mkl_blas_sgemm_host
@ 0x7fc08cd7f9b6 mkl_blas_sgemm
@ 0x7fc08cd021c3 sgemm
@ 0x7fc08ccc5681 cblas_sgemm
@ 0x7fc08383cc49 paddle::gemm<>()
@ 0x7fc083852b9b paddle::CpuMatrix::mul()
@ 0x7fc08385f459 paddle::CpuMatrix::mul()
@ 0x7fc0836602f5 paddle::FullyConnectedLayer::forward()
@ 0x7fc0836d6c2d paddle::NeuralNetwork::forward()
@ 0x7fc08355a0f6 paddle_gradient_machine_forward
@ 0x420e43 du::paddle::Classifier::predict()
@ 0x423870 du::paddle::TopicTagger::cal_tag()
@ 0x42448f du::paddle::TopicTagger::predict()
@ 0x408e35 tag()
@ 0x409a48 run_thread()
@ 0x7fc09677f1c3 start_thread
@ 0x7fc082a1012d __clone
@ 0x0 (unknown)

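The error output is shown above. The program runs fine locally, and I tested it there with the same data used on the cluster, but it fails as soon as it runs on the cluster.
When running on the cluster, I uploaded the relevant .so files, the binary, and the dictionaries.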
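
The trace above aborts inside `KMPNativeAffinity::Mask::set_system_affinity()` with system error 22 (`EINVAL`). On Linux, `sched_setaffinity` returns `EINVAL` when the requested mask contains no CPU the process is permitted to use, so one plausible cause (not confirmed in this thread) is that the cluster restricts the task to a subset of CPUs. A minimal sketch for comparing the two counts on a cluster node follows; the function names `allowed_cpu_count` and `online_cpu_count` are hypothetical helpers, not part of Paddle or MKL.

```cpp
// Linux-only sketch: how many CPUs may this process use vs. how many are online?
#include <sched.h>
#include <unistd.h>

// Number of CPUs in the calling process's affinity mask, or -1 on failure
// (for example, on machines with more CPUs than cpu_set_t can describe).
int allowed_cpu_count() {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        return -1;
    }
    return CPU_COUNT(&mask);
}

// Number of CPUs currently online, as reported by sysconf.
long online_cpu_count() {
    return sysconf(_SC_NPROCESSORS_ONLN);
}
```

If `allowed_cpu_count()` is smaller than `online_cpu_count()` inside the Hadoop task but equal to it locally, the difference in behavior between the two environments would be consistent with an affinity restriction.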

@jacquesqiao
Member

The machines on the cluster may not support MKL. Could you try a build without MKL?
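
Separately from switching to a non-MKL build, the abort in the trace happens while the Intel OpenMP runtime sets thread affinity, and that runtime reads the `KMP_AFFINITY` environment variable, whose `disabled` value turns the affinity interface off. The sketch below is an untested possibility for this case, not something proposed in the thread; `disable_openmp_affinity` is a hypothetical helper name.

```cpp
#include <cstdlib>

// Asks the Intel OpenMP runtime to skip thread-affinity setup.
// Assumption: this runs before the first MKL/OpenMP call in the process,
// since the runtime reads its environment when it initializes.
// Returns true if setenv succeeded.
bool disable_openmp_affinity() {
    // overwrite = 0 keeps any value already exported by the job environment.
    return setenv("KMP_AFFINITY", "disabled", 0) == 0;
}
```

It would be called once at the top of `main`, before any thread calls into the Paddle C-API. Exporting `KMP_AFFINITY=disabled` in the job's environment should have the same effect without a rebuild.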

@Angus07
Author

Angus07 commented Jun 5, 2018

CONFIGS('baidu/third-party/paddle-capi@paddle-capi-cpu-v0.2@git_tag',NeedOutput())
I built it with this configuration entry. How should I configure it if I don't want to use MKL?

@jacquesqiao jacquesqiao added the User (marks user questions) label Jun 5, 2018
@Angus07
Author

Angus07 commented Jun 5, 2018

I got an error when building the cpu_noavx_openblas version. How can I fix it?

err:/home/opt/gcc-4.8.2.bpkg-r4/gcc-4.8.2.bpkg-r4/sbin/../lib/gcc/x86_64-baidu-linux-gnu/4.8.2/../../../../x86_64-baidu-linux-gnu/bin/ld: bc_out/baidu/third-party/paddle-capi-noavx/output/lib/libpaddle_capi_engine.a(MathFunctions.cpp.o): undefined reference to symbol 'dlsym@@GLIBC_2.2.5'
/home/opt/gcc-4.8.2.bpkg-r4/gcc-4.8.2.bpkg-r4/sbin/../lib/gcc/x86_64-baidu-linux-gnu/4.8.2/../../../../x86_64-baidu-linux-gnu/bin/ld: note: 'dlsym@@GLIBC_2.2.5' is defined in DSO /opt/compiler/gcc-4.8.2/lib/libdl.so.2 so try adding it to the linker command line
/opt/compiler/gcc-4.8.2/lib/libdl.so.2: could not read symbols: Invalid operation
collect2: error: ld returned 1 exit status

@jacquesqiao
Member

@jacquesqiao jacquesqiao self-assigned this Jun 5, 2018
@shanyi15
Collaborator

Hello, this issue has not been updated in the past month, so we will close it today. If you still need to follow up on this question after it is closed, please feel free to reopen it, and we will get back to you within 24 hours. We apologize for any inconvenience caused by the closure, and thank you for your support of PaddlePaddle!
