This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
An algorithm (namely transform, though others can too) fails in a shared-library build. With static linking (CMake's BUILD_SHARED_LIBS set to OFF), everything works. The behavior does not depend on whether a __host__ __device__ lambda or a functor with a __host__ __device__ operator() is passed.
Error message:
terminate called after throwing an instance of 'thrust::system::system_error'
what(): parallel_for failed: cudaErrorInvalidDeviceFunction: invalid device function
Settings: CMAKE_CUDA_HOST_COMPILER=clang++, CMAKE_CUDA_ARCHITECTURES=86, CMAKE_CUDA_FLAGS=--expt-relaxed-constexpr --extended-lambda, CMAKE_CUDA_RUNTIME_LIBRARY is STATIC or SHARED to match the BUILD_SHARED_LIBS value, CMAKE_CUDA_RESOLVE_DEVICE_SYMBOLS is OFF, CMAKE_CUDA_SEPARABLE_COMPILATION is ON, CMAKE_POSITION_INDEPENDENT_CODE is ON. nvidia-smi reports CUDA 11.7. Thrust is configured with thrust_create_target(Thrust FROM_OPTIONS), with HOST_SYSTEM=CPP and DEVICE_SYSTEM=CUDA.
The problem arose when I moved the function setTriangles from a .cu file to a .cuh header and made it a function template. From that point on it was also used from a second .so file. I suspect this caused CUDA instrumentation code to be generated in both .so files, but something goes wrong: either one of the .so files does not load its initialization code, or both load it and they somehow conflict.
How can this be fixed?
During debugging, some runs (without a rebuild) fail with the error and some succeed. How does the CUDA instrumentation code work? Is there lazy PTX/cubin loading of some kind? Could it happen too late?
Is there a way to trace CUDA initialization?
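One way to narrow this down, offered only as a rough sketch and not as an official tracing facility: ask the runtime whether a kernel compiled into a given library is actually registered and loadable for the current device. cudaFuncGetAttributes is a real CUDA runtime call; probe_kernel and kernel_is_loadable are hypothetical names introduced for illustration. The probe only checks kernels owned by the library it is compiled into, not Thrust's internal kernels.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical empty kernel, compiled into the library under suspicion.
__global__ void probe_kernel() {}

// Returns true if the runtime can resolve probe_kernel for the current device.
// A failure such as cudaErrorInvalidDeviceFunction suggests that this library's
// device code was not registered or has no image for this GPU.
bool kernel_is_loadable() {
    cudaFuncAttributes attr{};
    cudaError_t err = cudaFuncGetAttributes(&attr, probe_kernel);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "probe failed: %s (%s)\n",
                     cudaGetErrorName(err), cudaGetErrorString(err));
        cudaGetLastError();  // clear the error so later calls are not affected
        return false;
    }
    return true;
}
```

Calling such a probe from each .so right after it is loaded may show which library loses its device code.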
Sorry for the late reply, I'm just catching up on github notifs today.
This is a known issue when using Thrust and CUB from shared libraries. See #1401 for more info and some workarounds. The "official" workaround is to use the macros in this header to wrap Thrust/CUB in a unique namespace per-library, but some users in #1401 have also reported compiler flags that worked for their situation.
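As a minimal sketch of the namespace-wrapping workaround, assuming a Thrust version that provides the THRUST_CUB_WRAPPED_NAMESPACE macro: each shared library defines a different wrapper name before any Thrust or CUB header is included, so its Thrust symbols do not collide with those of other libraries. The names mylib_a, Doubler and doubled_on_device are hypothetical and chosen only for this example.

```cuda
// Assumption: this definition is seen before ANY Thrust or CUB header in the
// translation unit. "mylib_a" is a hypothetical, per-library unique name.
#define THRUST_CUB_WRAPPED_NAMESPACE mylib_a

#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <vector>

// Functor with a __host__ __device__ operator(), as in the report above.
struct Doubler {
    __host__ __device__ int operator()(int x) const { return 2 * x; }
};

// Doubles every element on the device and copies the result back to the host.
std::vector<int> doubled_on_device(const std::vector<int>& input) {
    // With the wrapper defined, Thrust lives in mylib_a::thrust.
    namespace th = mylib_a::thrust;
    th::device_vector<int> d(input.begin(), input.end());
    th::transform(d.begin(), d.end(), d.begin(), Doubler{});
    std::vector<int> out(input.size());
    th::copy(d.begin(), d.end(), out.begin());
    return out;
}
```

In a CMake project the definition would more likely be set per target, for example with target_compile_definitions(<target> PRIVATE THRUST_CUB_WRAPPED_NAMESPACE=<name>), using a different name for each .so so that every translation unit in a library sees the same value.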