have cuda_library output RDC #125
This somehow allows cc_library to depend on an incomplete cuda_library. Is there a way to avoid it?
Can you please elaborate, or illustrate with an example? If
@@ -87,12 +91,19 @@ def _cuda_library_impl(ctx):
 pic_lib = pic_libs,
 objects = objects,
 pic_objects = pic_objects,
+rdc_objects = rdc_objects,
Are you missing rdc_pic_objects here?
Done.
cuda/private/rules/cuda_library.bzl
@@ -104,7 +115,7 @@ cuda_library = rule(
 "hdrs": attr.label_list(allow_files = ALLOW_CUDA_HDRS),
 "deps": attr.label_list(providers = [[CcInfo], [CudaInfo]]),
 "alwayslink": attr.bool(default = False),
-"rdc": attr.bool(default = False, doc = "whether to perform relocateable device code linking, otherwise, normal device link."),
+"rdc": attr.bool(default = False, doc = "Whether to produce and consume relocateable device code. Transitive deps that contain device code must all either be cuda_objects or cuda_library that has rdc = True. If False, all device code must be in the same translation unit. May have performance implications. See https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#using-separate-compilation-in-cuda."),
Could you please break this line up a bit? It is way too long.
Done.
Indeed. The only limitation left is that this cannot support circular object dependencies, because of the mandatory device link. A circular dependency is really stupid in itself, so I don't think users will generally need it.
Backlog, as a memo here: Originally, it was modeled as
For the non-rdc and non-dlto case, the device executable is already linked automatically. This is something like Thus, allow
This allows a `cuda_library` that was built with `rdc=True` to be depended upon by another such library. This is convenient as such a library can be consumed by either another `cuda_library` OR a `cc_library`.

Note: if we have this, I'm not sure why we need a separate cuda_objects rule anymore. Perhaps it can be removed in a later PR.

Change-Id: I1014d28a0ab3a9c76b788821211b13c4a9956d2a
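As a rough illustration of the dependency shape this commit message describes, a BUILD file might look like the sketch below. The target and file names (`kernels`, `solver`, `app` and their sources) are hypothetical and not taken from the PR, and the `load` label assumes the rules are available under the repository name `@rules_cuda`.

```starlark
# Assumption: the repository is registered as @rules_cuda.
load("@rules_cuda//cuda:defs.bzl", "cuda_library")

# Hypothetical leaf library built with relocatable device code.
cuda_library(
    name = "kernels",
    srcs = ["kernels.cu"],
    hdrs = ["kernels.cuh"],
    rdc = True,
)

# Another rdc library that depends on the first one.
cuda_library(
    name = "solver",
    srcs = ["solver.cu"],
    hdrs = ["solver.cuh"],
    rdc = True,
    deps = [":kernels"],
)

# A plain C++ consumer of the rdc library. Newer Bazel versions may
# require loading cc_library from rules_cc first.
cc_library(
    name = "app",
    srcs = ["app.cc"],
    deps = [":solver"],
)
```

The point of the sketch is only that the same `:solver` target appears in the `deps` of both kinds of rule; whether every combination links correctly is exactly what the rest of this conversation is about.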
Thanks for the improvement.
@cloudhan would you be OK with me deleting cuda_objects? It seems useless now.
No
Why?
Compiling the newly added example in #125 with `_wrapper_device_link` and rdc causes an artifact name conflict.
```
cuda_library(
    name = "b",
    srcs = ["b.cu"],
    hdrs = ["b.cuh"],
    rdc = True,
)
```
The reason is that our compile step always compiles to <target_output_dir>/_objs:
cuda_library(name="b") --> the artifact base name is `b` during `compile`, and `compile` produces the following for these srcs:
"b.cu" ------------------> artifact `<target_output_dir>/_objs/b.rdc.o`
`fatbin.cu` ------rdc----> artifact `<target_output_dir>/_objs/b.rdc.o`
After this PR:
"b.cu" ------------------> artifact `<target_output_dir>/_objs/b.rdc.o`
`fatbin.cu` ------rdc----> artifact `<target_output_dir>/_objs/_dlink/b.rdc.o`
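The conflict above can be modeled with a small sketch, written in the Python-compatible subset of Starlark. The helpers `object_path` and `find_conflicts` are hypothetical and are not functions from rules_cuda; they only mirror the path layout quoted in the comment, under the assumption that the output name is built from the target output directory, `_objs`, an optional subdirectory, and the base name.

```python
def object_path(target_output_dir, basename, rdc = False, subdir = ""):
    """Hypothetical model of the declared output path for one compiled object."""
    suffix = ".rdc.o" if rdc else ".o"
    parts = [target_output_dir, "_objs"]
    if subdir:
        parts.append(subdir)
    parts.append(basename + suffix)
    return "/".join(parts)

def find_conflicts(paths):
    """Returns the sorted list of paths that are declared more than once."""
    counts = {}
    for p in paths:
        counts[p] = counts.get(p, 0) + 1
    return sorted([p for p, n in counts.items() if n > 1])

out = "<target_output_dir>"

# Before: b.cu and the device-link wrapper source both use base name "b".
before = [
    object_path(out, "b", rdc = True),
    object_path(out, "b", rdc = True),
]

# After: the device-link wrapper object goes into a "_dlink" subdirectory.
after = [
    object_path(out, "b", rdc = True),
    object_path(out, "b", rdc = True, subdir = "_dlink"),
]

print(find_conflicts(before))
print(find_conflicts(after))
```

Placing the device-link output in its own subdirectory keeps both objects under the same base name while guaranteeing that the two declared paths differ.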
OK, so when building a shared library, we inevitably
So this will easily be misused and is unfixable under some conditions. I will revert this PR.
Can you add an example that illustrates the problem?
1 is not very important, because those incomplete objects are not exposed by rules_cuda/cuda/private/rules/cuda_library.bzl Lines 38 to 48 in 22a46e6
Let's consider issue 162: building the shared library will cause
And
@garymm #164 (comment) This should address all the issues we are facing. Let me fix this.