fix CI build #5848

Closed
jacquesqiao wants to merge 3 commits


@jacquesqiao jacquesqiao commented Nov 22, 2017

Fixes #5846

@jacquesqiao jacquesqiao changed the title fix nccl build [test ci] Nov 22, 2017
@@ -1,3 +1,3 @@
-cc_library(dynamic_loader SRCS dynamic_loader.cc DEPS glog gflags)
+cc_library(dynamic_loader SRCS dynamic_loader.cc DEPS glog gflags nccl)
Collaborator

I saw the following error messages in the TeamCity log:

[07:09:48]CMake Error at cmake/generic.cmake:189 (add_dependencies):
[07:09:48]  The dependency target "nccl" of target "dynamic_loader" does not exist.

It seems that the target nccl is not defined.

I also noticed this in nccl.cmake:

if(NOT WITH_GPU)
  return()
endif()

It seems that if WITH_GPU is off (or not defined), the nccl target is never defined.

I am wondering whether you might want to change cc_library(dynamic_loader here to nv_library(dynamic_loader?
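
As a sketch of one way to address this (not necessarily the change that was eventually merged), nccl could be added to the dependency list only when WITH_GPU is on, so that CPU-only configurations never reference a target that nccl.cmake did not define. DYNLOAD_DEPS is a hypothetical variable name introduced for this example; cc_library is the helper already used in the diff above.

```cmake
# Sketch only. DYNLOAD_DEPS is a hypothetical name, not taken from the repository.
set(DYNLOAD_DEPS glog gflags)

if(WITH_GPU)
  # nccl.cmake returns early when WITH_GPU is off, so the nccl target
  # exists only in GPU builds; depend on it only in that case.
  list(APPEND DYNLOAD_DEPS nccl)
endif()

# Same call as in the diff, with the dependency list built conditionally.
cc_library(dynamic_loader SRCS dynamic_loader.cc DEPS ${DYNLOAD_DEPS})
```

With this shape, a WITH_GPU=OFF configure would pass only glog and gflags to add_dependencies, which should avoid the "does not exist" error quoted above.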

Member Author

The CI now fails with:

	[ 14%] Building NVCC (Device) object CMakeFiles/warpctc.dir/src/warpctc_generated_reduce.cu.o
[09:28:41]	In file included from /paddle/paddle/platform/enforce.h:39:0,
[09:28:41]	from /paddle/paddle/platform/gpu_info.cc:19:
[09:28:41]	/paddle/paddle/platform/dynload/nccl.h:18:18: fatal error: nccl.h: No such file or directory
[09:28:41]	#include <nccl.h>
[09:28:41]	^
[09:28:41]	compilation terminated.
...
	In file included from /paddle/paddle/platform/enforce.h:39:0,
[09:28:41]	from /paddle/paddle/platform/dynload/dynamic_loader.cc:22:
[09:28:41]	/paddle/paddle/platform/dynload/nccl.h:18:18: fatal error: nccl.h: No such file or directory
[09:28:41]	#include <nccl.h>
[09:28:41]	^
[09:28:41]	compilation terminated.
[09:28:41]	

I think this is because some modules that depend on nccl are built before nccl is downloaded, so I tried adding nccl as a dependency. You are right that this is not correct yet; I am still working on it.

Member Author

The cause is that we changed the build order: we first build with WITH_GPU=OFF (nccl is not built) and then with WITH_GPU=ON (nccl is built). The second build reuses the cache from the first, so its dependencies are wrong, and some modules start building before nccl is downloaded.
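
To illustrate the failure mode described here, the following is a minimal sketch of how a configure step could detect that WITH_GPU differs from the value used the last time the same build directory was configured. It is an assumption for illustration only and is not what #5889 does; LAST_WITH_GPU is a hypothetical cache variable introduced for this example.

```cmake
# Sketch only. LAST_WITH_GPU is a hypothetical cache entry, not part of the project.
if(DEFINED LAST_WITH_GPU AND NOT "${LAST_WITH_GPU}" STREQUAL "${WITH_GPU}")
  # The build directory was previously configured with a different WITH_GPU
  # value, so cached state from that configure may no longer be valid.
  message(WARNING
    "WITH_GPU changed from '${LAST_WITH_GPU}' to '${WITH_GPU}'; "
    "consider using a clean build directory.")
endif()

# Remember the current value for the next configure run.
set(LAST_WITH_GPU "${WITH_GPU}" CACHE INTERNAL "WITH_GPU value at the last configure")
```

Using a separate build directory for each configuration would sidestep the problem entirely, at the cost of a full rebuild for each one.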

@jacquesqiao jacquesqiao changed the title [test ci] fix CI build Nov 23, 2017
@jacquesqiao
Member Author

Fixed in #5889.
