nvcc crashes when compiling some simple tests #2005

Closed
pazner opened this issue Aug 18, 2020 · 8 comments · Fixed by #2027

Comments

@pazner
Contributor

pazner commented Aug 18, 2020

nvcc crashes when compiling some simple tests. This looks like it may be a problem with nvcc, but a workaround or fix in Catch would be great.

Reproduction steps

File test.cu:

#define CATCH_CONFIG_MAIN
#include "catch.hpp"

std::string f() { return "123"; }

TEST_CASE("A")
{
   REQUIRE(f() == std::string("123"));
}

Compiling with nvcc --std=c++11 test.cu results in:

nvcc error   : 'cicc' died due to signal 11 (Invalid memory reference)
nvcc error   : 'cicc' core dumped

Platform information:

  • OS: RHEL 7.6
  • Compiler+version: Cuda compilation tools, release 10.1, V10.1.168, tested with host compilers GCC v8.3.1 and GCC v4.8.5
  • Catch version: v2.13.0
@pazner
Contributor Author

pazner commented Aug 18, 2020

Some possible workarounds that don't trigger this behavior include:

auto x = f();
auto y = std::string("123");
REQUIRE(x == y);

and

auto x = f();
REQUIRE(x == "123");

On the other hand, this does not work:

auto x = f();
REQUIRE(x == std::string("123"));

pazner added a commit to mfem/mfem that referenced this issue Aug 18, 2020
Certain REQUIRE statements were causing nvcc (specifically cicc) to
crash when compiling.

This can be temporarily worked around by introducing temporary
variables.

See catchorg/Catch2#2005
@griwes

griwes commented Aug 19, 2020

Is this reproducible on a non-ancient version of NVCC? I'd be happy to file an internal bug against the compiler if that's the case and someone can reduce the test case to a minimum ;)

@pazner
Contributor Author

pazner commented Aug 19, 2020

Is this reproducible on a non-ancient version of NVCC? I'd be happy to file an internal bug against the compiler if that's the case and someone can reduce the test case to a minimum ;)

I can reproduce it with nvcc 11.0, V11.0.194 (CUDA 11.0.2). The following simple test reproduces it:

#define CATCH_CONFIG_MAIN
#include "catch.hpp"
#include <string>
TEST_CASE()
{
   REQUIRE(std::string("x") == "x");
}

I think this could probably be made even simpler (e.g. using something simpler than std::string), but I don't entirely understand what's going on.

@horenmar
Member

If the nvidia team provides a reasonable workaround I am willing to merge it, but I am not debugging a segfault in nvcc.

@pazner At a glance, the compiler does not like rvalues in REQUIRE. There are two more things to try in order to reduce the bug.

  1. See if the segfault happens without linking, and don't define CATCH_CONFIG_MAIN in the file. If yes, then that's a lot less code that needs to be further pruned down.
  2. Try it with a much simpler type than full std::string.

@pazner
Contributor Author

pazner commented Aug 19, 2020

If the nvidia team provides a reasonable workaround I am willing to merge it, but I am not debugging a segfault in nvcc.

Yes, of course that makes sense. Thanks.

@pazner At a glance, the compiler does not like rvalues in REQUIRE. There are two more things to try in order to reduce the bug.

  1. See if the segfault happens without linking, and don't define CATCH_CONFIG_MAIN in the file. If yes, then that's a lot less code that needs to be further pruned down.

It still segfaults without linking and without CATCH_CONFIG_MAIN.

  2. Try it with a much simpler type than full std::string.

This is the simplest example I have found so far:

#include "catch.hpp"
struct A
{
   int x;
   A() : x(10) { }
};
struct B
{
   A *a;
   B(A *a_) : a(a_) { }
};
TEST_CASE()
{
   REQUIRE(B(new A).a->x == 10); // nvcc segfaults
}

Interestingly, the following two tests don't trigger the segfault:

REQUIRE(A().x == 10); // works
A a;
REQUIRE(B(&a).a->x == 10); // works

@griwes, do you think the above example is sufficiently minimal to file an internal bug report? Also I'd be happy to hear if you know of or hear of any workarounds.
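
As a sanity check that the three expressions are ordinary, valid C++, the following Catch-free sketch evaluates each of them with a host compiler. The helper function names are hypothetical and exist only for this illustration; `A` and `B` are copied from the example above.

```cpp
struct A
{
   int x;
   A() : x(10) { }
};
struct B
{
   A *a;
   B(A *a_) : a(a_) { }
};

// Same expression as the REQUIRE that crashes cicc: a new-expression
// passed to B's constructor. The allocation is deliberately leaked,
// exactly as in the test above.
inline int value_from_new() { return B(new A).a->x; }

// The two variants that compile fine under nvcc.
inline int value_from_temporary() { return A().x; }
inline int value_from_local()
{
   A a;
   return B(&a).a->x;
}
```

All three helpers compute the same value, so the crashing and non-crashing cases differ only in how the expression is written, not in what it evaluates to.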

@griwes

griwes commented Aug 21, 2020

It looks sufficiently minimal (together with the information about it being reproducible without catch's main) for me to run creduce on it once I'm back from vacation early next month and then file a bug, yes ;) (The cases that don't crash cicc are super helpful for creducing it, thanks!)

@griwes

griwes commented Sep 12, 2020

Minimized, with creduce, to:

struct A {
  int x;
  A();
};
struct B {
  A *a;
  B(A *);
};
void d() { A a; __builtin_constant_p(EXPR); }

Segfaults with -DEXPR="B(new A).a->x", works with -DEXPR="A().x" and -DEXPR="B(&a).a->x". Taking a guess, I further minimized it to

struct A {
  A();
};
void d() { __builtin_constant_p(new A); }

which is a really nice reproducer, if I say so myself ;> Filed as an internal bug 3123443, "nvcc segfaults when encountering a call to __builtin_constant_p with an argument that involves a new-expression trying to invoke a user-provided constructor".
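
For readers unfamiliar with the builtin: `__builtin_constant_p` is a GCC/Clang extension that reports whether the compiler can prove its argument is a compile-time constant. The sketch below is a host-only illustration (GCC or Clang, no nvcc involved) with the same shape as the reproducer. The `constructed` counter and the helper `literal_is_constant` are additions for this illustration, and the constructor is defined here so that the snippet links.

```cpp
struct A {
  static int constructed;  // counts constructor calls (illustration only)
  A() { ++constructed; }   // user-provided constructor, as in the reproducer
};
int A::constructed = 0;

// A literal is trivially a compile-time constant, so the builtin
// reports a nonzero value here.
inline int literal_is_constant() { return __builtin_constant_p(42); }

// Same shape as the reduced reproducer. A host compiler is expected to
// accept this; the cast to void only silences unused-value warnings.
inline void d() { (void)__builtin_constant_p(new A); }
```

On GCC and Clang an argument with side effects is not evaluated, so calling `d()` should leave `A::constructed` untouched. The crash reported above happens inside cicc at compile time, not at run time.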

@horenmar
Member

If you open up a PR to deactivate the use of __builtin_constant_p for nvcc, I'll merge it.
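
A minimal sketch of what such a guard could look like, assuming the builtin is reached through a wrapper macro. The macro name `SKETCH_IGNORE_BUT_WARN` and the helper `equals_ten` are hypothetical and not taken from the Catch2 sources; `__CUDACC__` is the macro nvcc defines when compiling CUDA source files.

```cpp
// Use the builtin only on GCC-compatible compilers that are not nvcc.
#if defined(__GNUC__) && !defined(__CUDACC__)
#  define SKETCH_IGNORE_BUT_WARN(...) (void)__builtin_constant_p(__VA_ARGS__)
#else
// Under nvcc (or an unknown compiler) the macro expands to nothing,
// so cicc never sees the problematic call.
#  define SKETCH_IGNORE_BUT_WARN(...)
#endif

// Hypothetical caller: the macro lets the compiler look at the
// expression, then the expression is evaluated normally.
inline bool equals_ten(int value)
{
   SKETCH_IGNORE_BUT_WARN(value == 10);
   return value == 10;
}
```

With a guard of this kind the behavior of the calling code stays the same on every compiler; only the extra compile-time look at the expression is skipped under nvcc.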
