asan: problem calling NVIDIA CUDA libraries #629
Comments
I've recently added a flag protect_shadow_gap exactly for this purpose. Note that the cuda driver is closed source and no one knows what it does, why it maps a large chunk at a fixed address, or what will happen with that mapping at run-time. So, you are in a warranty-void zone :(
What an (awesome) coincidence; thanks for the swift reply! My test case now works, I'll be testing it out on my more complex CUDA back-end soon. Let's see whether hell really breaks loose 😄
FTR, the allocations done in cuInit() are probably related to CUDA's unified virtual address space.
As you suspected, disabling … However, it seems that the CUDA allocations are very much fixed: always 0x200000000-0xd00000000. I tried altering the shadow offset (…). Seeing how there's …

Then I hooked …

Any pointers? Thanks!
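(For reference, the hook mentioned above presumably targeted mmap; the original code is not shown in this thread, so the following is only an illustrative sketch of such an LD_PRELOAD interposer, useful for seeing where libcuda places its fixed-address reservations.)

```c
// mmap_spy.c -- hypothetical LD_PRELOAD interposer that logs fixed-address
// mappings, e.g. to observe where libcuda places its large reservations.
// Build: clang -shared -fPIC mmap_spy.c -o mmap_spy.so -ldl
// Use:   LD_PRELOAD=./mmap_spy.so ./a.out
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <sys/mman.h>

void *mmap(void *addr, size_t len, int prot, int flags, int fd, off_t off) {
  static void *(*real_mmap)(void *, size_t, int, int, int, off_t) = NULL;
  if (!real_mmap)
    real_mmap =
        (void *(*)(void *, size_t, int, int, int, off_t))dlsym(RTLD_NEXT, "mmap");
  void *res = real_mmap(addr, len, prot, flags, fd, off);
  if (flags & MAP_FIXED)  // log only the driver-style fixed-address requests
    fprintf(stderr, "mmap(MAP_FIXED) %p..%p\n", addr, (char *)addr + len);
  return res;
}
```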
Try ASAN_OPTIONS=protect_shadow_gap=0:replace_intrin=0:detect_leaks=0.
Here is my test (derived from http://llvm.org/docs/CompileCudaWithLLVM.html).

axpy.cu:
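(The code block did not survive the copy. Based on the linked LLVM document, axpy.cu presumably looked roughly like the following; the launcher wrapper and launch geometry are guesses.)

```cpp
// axpy.cu -- hypothetical reconstruction based on the example in
// http://llvm.org/docs/CompileCudaWithLLVM.html; the wrapper is an assumption.
#include <cuda_runtime.h>

__global__ void axpy(float a, float* x, float* y) {
  y[threadIdx.x] = a * x[threadIdx.x];
}

// Host-side wrapper so the kernel can be launched from plain C++ code.
extern "C" void launch_axpy(float a, float* device_x, float* device_y, int n) {
  axpy<<<1, n>>>(a, device_x, device_y);
}
```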
axpy_main.cc:
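(Likewise missing; axpy_main.cc was presumably the host side of the same example, along these lines, with names and sizes as guesses.)

```cpp
// axpy_main.cc -- hypothetical host driver for the axpy kernel above.
#include <cstdio>
#include <cuda_runtime.h>

extern "C" void launch_axpy(float a, float* device_x, float* device_y, int n);

int main() {
  const int kDataLen = 4;
  float a = 2.0f;
  float host_x[kDataLen] = {1.0f, 2.0f, 3.0f, 4.0f};
  float host_y[kDataLen];

  // Allocate device buffers and copy the input over.
  float *device_x, *device_y;
  cudaMalloc((void**)&device_x, kDataLen * sizeof(float));
  cudaMalloc((void**)&device_y, kDataLen * sizeof(float));
  cudaMemcpy(device_x, host_x, kDataLen * sizeof(float), cudaMemcpyHostToDevice);

  launch_axpy(a, device_x, device_y, kDataLen);

  // Copy the result back and print it.
  cudaDeviceSynchronize();
  cudaMemcpy(host_y, device_y, kDataLen * sizeof(float), cudaMemcpyDeviceToHost);
  for (int i = 0; i < kDataLen; ++i)
    printf("y[%d] = %f\n", i, host_y[i]);

  cudaFree(device_x);
  cudaFree(device_y);
  return 0;
}
```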
Build:
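(The build commands are also missing; a plausible reconstruction, with the GPU architecture and CUDA install path as assumptions:)

```sh
clang++ -fsanitize=address --cuda-gpu-arch=sm_35 -c axpy.cu -o axpy.o
clang++ -fsanitize=address -c axpy_main.cc -o axpy_main.o
clang++ -fsanitize=address axpy.o axpy_main.o -o axpy \
    -L/usr/local/cuda/lib64 -lcudart_static -ldl -lrt -pthread
```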
Run:
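(Presumably the run line combined the ASAN_OPTIONS suggested above, e.g.:)

```sh
ASAN_OPTIONS=protect_shadow_gap=0:replace_intrin=0:detect_leaks=0 ./axpy
```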
W/o protect_shadow_gap=0, cuInit() returns 2.

W/o replace_intrin=0 (which disables checks inside the memset interceptor) I get this:

This failure should be easy to fix, but for now let's use replace_intrin=0.

W/o detect_leaks=0 I get this at the end:
Now I'd like to have a test that does something more interesting and fails somewhere else.
Concerning your test-case, I'm seeing mostly identical behaviour, except for the case without …

That said, shouldn't my two attempted workarounds have succeeded (changing the shadow offset, and forcing a split shadow allocation by mmapping the memory)? I'm willing to have a closer look at both, if it's unexpected behaviour of course.

FTR, it seems like device pointers returned by …
My libcuda is 346.96. Looks like they've introduced a leak :)

Instrumenting the device (GPU) code with asan is most likely not going to work today. Your attempted workarounds might have worked; need to look closer why they didn't. Apparently, the large blob at 0x200000000 is used by cuda's allocator as an arena.
This may be fixed in the next clang release, or at least there are LLVM contributors interested in this issue: https://reviews.llvm.org/D24640 (sanitizer skipped on NVPTX)
I've also made …
I am not getting any errors with kcc's test program/options, but the following short test program is giving a SEGV with the same options:
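(The program itself is missing from this copy. Judging from the reply below, it presumably allocated device memory with cudaMalloc and then touched that pointer from host code, roughly like this:)

```cpp
// segv_test.cu -- hypothetical reconstruction: writing to a cudaMalloc'ed
// (device) pointer from host code, which faults on the host side.
#include <cstring>
#include <cuda_runtime.h>

int main() {
  char* buf = nullptr;
  cudaMalloc((void**)&buf, 1 << 20);  // device memory, not host-accessible
  memset(buf, 0, 1 << 20);            // host-side write -> SEGV
  cudaFree(buf);
  return 0;
}
```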
Compilation flags:
Error:
Has there been any progress on this front?
I think you actually wanted to call cudaMallocHost, not cudaMalloc. After I replaced the latter with the former, your example ran fine with both clang-3.9 and clang 6.0.
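(In other words, something along these lines; a sketch, not the original poster's code. Pinned host memory returned by cudaMallocHost is host-accessible, so the host-side write no longer faults.)

```cpp
// Fixed variant: allocate page-locked host memory instead of device memory.
#include <cstring>
#include <cuda_runtime.h>

int main() {
  char* buf = nullptr;
  cudaMallocHost((void**)&buf, 1 << 20);  // pinned host memory
  memset(buf, 0, 1 << 20);                // OK: ordinary host memory
  cudaFreeHost(buf);
  return 0;
}
```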
I personally managed to successfully run under asan a binary computing a heavy TensorFlow graph on a GPU using clang-6.0 and …
Hello, why doesn't asan provide an environment variable or a run-time flag (rather than a compile-time one) that would let the end user decide where the shadow memory shall be?
There is now a compile-time flag to enable a "dynamic shadow". The default mode (on Linux) still uses the fixed shadow offset 0x7fff8000.
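(For the record, the flag in question is presumably the LLVM option below; the spelling is an assumption, check your clang version:)

```sh
clang++ -fsanitize=address -mllvm -asan-force-dynamic-shadow=1 test.cc -o test
```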
The driver I am using is doing an mmap of 500 MBytes at 0x3fffff000. This hits the default shadow memory. I tried using the force-dynamic-shadow flag with clang 6 but it doesn't work; shadow memory is still at 0x7fff8000. My idea is that if I can move the shadow memory outside my mmap address range, it would be safer than disabling the shadow memory protection.
CUDA has its own memory allocation routines, so I would assume it's for its own heap.
We already do this in ont_core_cpp since it was more of an issue when talking to the CUDA API, but since torch talks to CUDA for us it gets to decide how to handle the issue and it looks like it just prints a warning and continues on assuming no CUDA support. This is fine for end users, but for automated testing it means we were skipping tests without realising. See google/sanitizers#629 for more details.
I'm having an issue using ASAN with the NVIDIA CUDA libraries on an x86_64 system: calling cuInit returns a bogus value. Given this small test-case (compiled with clang -lcuda):
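(The original test-case block is missing from this copy; presumably it was a minimal driver-API program along these lines:)

```c
// Hypothetical reconstruction of the missing test-case: initialize the CUDA
// driver API and print the return value of cuInit().
#include <stdio.h>
#include <cuda.h>

int main(void) {
  CUresult res = cuInit(0);  // expected to return CUDA_SUCCESS (0)
  printf("cuInit() returned %d\n", (int)res);
  return res == CUDA_SUCCESS ? 0 : 1;
}
```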
When using -fsanitize=address, this call returns 2 (an impossible return value for this call) instead of the expected 0. I tried tweaking almost every ASAN option, to no avail.

A verbose log:
System specs: clang 3.5.0 from the Debian repositories, on an x86_64 system running Linux 3.16.0 with NVIDIA drivers 340.65 (corresponding with CUDA 6.0). On an otherwise identical 32-bit system, the error does not occur.
Also tried with llvm/clang/compiler-rt from svn trunk, same issue.