nccl.h not found when compiling from source #8504

Closed
yu239-zz opened this issue Feb 23, 2018 · 8 comments · Fixed by #9833

@yu239-zz

yu239-zz commented Feb 23, 2018

See the reopened #5035.

@helinwang
Contributor

CC: @tonyyang-svail @dzhwinter could you take a look? We should probably support compiling without Docker.

@helinwang
Contributor

helinwang commented Feb 23, 2018

Hi @yu239, we switched to NCCL 2. Please install the dependency using sudo apt-get install libnccl2 libnccl-dev (more info is here: http://docs.nvidia.com/deeplearning/sdk/nccl-install-guide/index.html).

Unlike NCCL 1, NCCL 2 is closed source, so we cannot use cmake to download the source and compile it. Installing it manually with apt-get is probably the best solution.
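Before re-running the build, it can help to confirm whether nccl.h is visible at all. The following is a minimal sketch, not part of Paddle's build scripts: the helper name find_header and the default directory list are assumptions chosen for illustration, so adjust them to wherever your NCCL package installs its headers.

```python
import os

# Hypothetical default search locations; these are assumptions, not paths
# taken from Paddle's cmake files.
DEFAULT_INCLUDE_DIRS = [
    "/usr/include",
    "/usr/local/include",
    "/usr/local/cuda/include",
]


def find_header(name, include_dirs=DEFAULT_INCLUDE_DIRS):
    """Return the full path of the first matching header, or None."""
    for directory in include_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Calling find_header("nccl.h") returns a path if the header is in one of the listed directories and None otherwise; in the latter case, installing the development package or pointing the build at your NCCL include directory is the likely next step.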

@typhoonzero
Contributor

@helinwang Is it necessary to add a build option to switch off the NCCL dependency?

@dzhwinter
Contributor

dzhwinter commented Feb 24, 2018

NCCL 2 is no longer open source. NVIDIA provides a CUDA Docker image that already includes the PPA (the apt source URL), so inside the image we can install it with an apt install command. @helinwang
To build Paddle outside Docker, you need to install NCCL 2 manually, as you do with cuDNN. The NCCL 2 download link is https://developer.nvidia.com/nccl

@luotao1
Contributor

luotao1 commented Feb 24, 2018

Should we keep NCCL 1?

@wangkuiyi
Collaborator

wangkuiyi commented Feb 25, 2018

Does our codebase (develop branch) still depend on NCCL 1? If not, let us remove nccl.cmake. @luotao1

@luotao1
Contributor

luotao1 commented Feb 25, 2018

Our codebase still works well with NCCL 1, and @dzhwinter will update nccl.cmake later to make it compatible with both NCCL 1 and NCCL 2.

@dzhwinter
Contributor

In our codebase, we use NCCL as a DSO (dynamic shared object). This means we only need nccl.h to compile; we no longer depend on a static library.
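To illustrate what loading a library as a DSO means in practice, here is a rough sketch using Python's ctypes, which opens a shared library at run time in the same spirit. It is not Paddle's loader: the function name load_nccl and the candidate file names are assumptions for illustration, and the actual library name depends on how NCCL was installed.

```python
import ctypes

# Assumed shared-library names to try, newest first. These are guesses about
# a typical Linux install, not names taken from Paddle's source.
DEFAULT_CANDIDATES = ("libnccl.so.2", "libnccl.so.1", "libnccl.so")


def load_nccl(candidates=DEFAULT_CANDIDATES):
    """Try to open NCCL at run time; return the library handle or None."""
    for name in candidates:
        try:
            return ctypes.CDLL(name)
        except OSError:
            # The library is missing or not loadable; try the next name.
            continue
    return None
```

Because the library is resolved only when load_nccl is called, a program built this way needs the header at compile time but can still start, and report a clear error, on a machine where NCCL is absent.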

According to the NCCL installation guide https://docs.nvidia.com/deeplearning/sdk/pdf/NCCL-Installation-Guide.pdf, the version dependencies are as follows.

NCCL 2.1.4 (latest) -> CUDA 9.0 or higher
NCCL 2.1.2 -> CUDA 8.0
NCCL 1.x -> CUDA 7.0 or higher

To support multi-GPU on more platforms, we still need NCCL 1 for compatibility with older CUDA versions.
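The dependency list above can be written down as a small lookup. This is only a sketch encoding the three rows as stated in this comment; the function names are hypothetical, and the version pairs reflect the thread at the time, not current NVIDIA releases.

```python
def parse_cuda_version(text):
    """Turn a string such as '9.0' or '8' into a (major, minor) tuple."""
    parts = text.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def compatible_nccl(cuda_version):
    """List the NCCL versions from the comment's table that fit a CUDA version."""
    cuda = parse_cuda_version(cuda_version)
    versions = []
    if cuda >= (9, 0):
        versions.append("2.1.4")  # NCCL 2.1.4 -> CUDA 9.0 or higher
    if cuda == (8, 0):
        versions.append("2.1.2")  # NCCL 2.1.2 -> CUDA 8.0
    if cuda >= (7, 0):
        versions.append("1.x")    # NCCL 1.x -> CUDA 7.0 or higher
    return versions
```

For CUDA 7.5 the only entry returned is NCCL 1.x, which is the case the comment has in mind when arguing for keeping NCCL 1.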
