New ExecutionEngine/OrcLazy tests segfault on 32-bit x86 #50636

Closed
llvmbot opened this issue Jul 31, 2021 · 15 comments
Labels
bugzilla Issues migrated from bugzilla

Comments

@llvmbot
Member

llvmbot commented Jul 31, 2021

Bugzilla Link 51292
Resolution FIXED
Resolved on Oct 11, 2021 20:29
Version trunk
OS Linux
Blocks #51489
Reporter LLVM Bugzilla Contributor
CC @weliveindetail,@tstellar
Fixed by commit(s) c5ab55f c8905f1

Extended Description

I get the following test failures when building LLVM for 32-bit x86 via -DLLVM_BUILD_32_BITS=ON. I've tried bisecting them and they seem to have failed since their introduction.

FAIL: LLVM :: ExecutionEngine/OrcLazy/debug-descriptor-elf-minimal.ll (21 of 23)
******************** TEST 'LLVM :: ExecutionEngine/OrcLazy/debug-descriptor-elf-minimal.ll' FAILED ********************
Script:

: 'RUN: at line 1'; /srv/p2p/tmp/llvm32/bin/lli --jit-kind=orc-lazy --per-module-lazy --jit-linker=rtdyld --generate=__dump_jit_debug_descriptor /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-descriptor-elf-minimal.ll | /srv/p2p/tmp/llvm32/bin/FileCheck /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-descriptor-elf-minimal.ll
: 'RUN: at line 4'; /srv/p2p/tmp/llvm32/bin/lli --jit-kind=orc-lazy --per-module-lazy --jit-linker=jitlink --generate=__dump_jit_debug_descriptor /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-descriptor-elf-minimal.ll | /srv/p2p/tmp/llvm32/bin/FileCheck /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-descriptor-elf-minimal.ll

Exit Code: 2

Command Output (stderr):

PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: /srv/p2p/tmp/llvm32/bin/lli --jit-kind=orc-lazy --per-module-lazy --jit-linker=rtdyld --generate=__dump_jit_debug_descriptor /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-descriptor-elf-minimal.ll
#0 0xfffffffff555eba7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/mgorny/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:569:3
#1 0xfffffffff555ed0f PrintStackTraceSignalHandler(void*) /home/mgorny/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1
#2 0xfffffffff555cbd9 llvm::sys::RunSignalHandlers() /home/mgorny/git/llvm-project/llvm/lib/Support/Signals.cpp:97:20
#3 0xfffffffff555cd63 SignalHandler(int) /home/mgorny/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
#4 0xfffffffff7fbe560 (linux-gate.so.1+0x560)
#5 0xfffffffff713aff8
FileCheck error: '' is empty.
FileCheck command line: /srv/p2p/tmp/llvm32/bin/FileCheck /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-descriptor-elf-minimal.ll

--


FAIL: LLVM :: ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll (22 of 23)
******************** TEST 'LLVM :: ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll' FAILED ********************
Script:

: 'RUN: at line 3'; /srv/p2p/tmp/llvm32/bin/lli --jit-kind=orc-lazy --per-module-lazy --jit-linker=rtdyld --generate=__dump_jit_debug_objects /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll | /srv/p2p/tmp/llvm32/bin/llvm-dwarfdump --diff - | /srv/p2p/tmp/llvm32/bin/FileCheck /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll
: 'RUN: at line 6'; /srv/p2p/tmp/llvm32/bin/lli --jit-kind=orc-lazy --per-module-lazy --jit-linker=jitlink --generate=__dump_jit_debug_objects /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll | /srv/p2p/tmp/llvm32/bin/llvm-dwarfdump --diff - | /srv/p2p/tmp/llvm32/bin/FileCheck /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll
: 'RUN: at line 37'; /srv/p2p/tmp/llvm32/bin/lli --jit-kind=orc-lazy --per-module-lazy --jit-linker=rtdyld --generate=__dump_jit_debug_objects /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll | /srv/p2p/tmp/llvm32/bin/llvm-objdump --section-headers - | /srv/p2p/tmp/llvm32/bin/FileCheck --check-prefix=CHECK_LOAD_ADDR /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll
: 'RUN: at line 41'; /srv/p2p/tmp/llvm32/bin/lli --jit-kind=orc-lazy --per-module-lazy --jit-linker=jitlink --generate=__dump_jit_debug_objects /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll | /srv/p2p/tmp/llvm32/bin/llvm-objdump --section-headers - | /srv/p2p/tmp/llvm32/bin/FileCheck --check-prefix=CHECK_LOAD_ADDR /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll

Exit Code: 2

Command Output (stderr):

PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: /srv/p2p/tmp/llvm32/bin/lli --jit-kind=orc-lazy --per-module-lazy --jit-linker=rtdyld --generate=__dump_jit_debug_objects /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll
#0 0xfffffffff54a2ba7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/mgorny/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:569:3
#1 0xfffffffff54a2d0f PrintStackTraceSignalHandler(void*) /home/mgorny/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1
#2 0xfffffffff54a0bd9 llvm::sys::RunSignalHandlers() /home/mgorny/git/llvm-project/llvm/lib/Support/Signals.cpp:97:20
#3 0xfffffffff54a0d63 SignalHandler(int) /home/mgorny/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
#4 0xfffffffff7f02560 (linux-gate.so.1+0x560)
#5 0xfffffffff707eff8
error: -: The file was not recognized as a valid object file
FileCheck error: '' is empty.
FileCheck command line: /srv/p2p/tmp/llvm32/bin/FileCheck /home/mgorny/git/llvm-project/llvm/test/ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll

--

@llvmbot
Member Author

llvmbot commented Jul 31, 2021

Unfortunately, I wasn't able to get a better backtrace. I suspect the crash is happening on some wrong EIP.

(gdb) bt
#0 0xf705c008 in ?? ()
Backtrace stopped: Cannot access memory at address 0x5d

@weliveindetail
Contributor

I can reproduce it locally and will have a closer look soon.

@weliveindetail
Contributor

In the failing examples, lli emits call-through trampolines for the x86_64 ABI where it is supposed to use i386 in 32-bit processes. In createLocalLazyCallThroughManager() [1] the ABI is selected based on the target triple, and it doesn't respect the bitness of the running process.

In the specific case here, the loaded IR file declares the triple explicitly [2]. This forces emission of ELF objects even on non-native platforms (the tested feature only works with ELF). The triple passes x86_64 as the architecture. Target selection fails if we try to pass an unknown architecture instead.

In cases where the triple is omitted, lli attempts to detect it using sys::getProcessTriple() [3] which basically returns the LLVM_HOST_TRIPLE. We would get the required i386 arch only on 32-bit native hosts and it would continue to fail for the majority of cases where we run a 32-bit process on a 64-bit host.

Process triple detection is an issue that came up from time to time. To the best of my knowledge the host triple is the best we can provide "for the moment". While I agree that we'll need to challenge this at some point, I don't think it's the solution here.

Instead, I wonder if it's acceptable to pass on the CMake LLVM_BUILD_32_BITS option as a C++ preprocessor flag and use it to switch the ABI setting in ORC. I'll prepare a review.

[1] https://github.com/llvm/llvm-project/blob/e78bf49a58ed0bec/llvm/lib/ExecutionEngine/Orc/LazyReexports.cpp#L106
[2] https://github.com/llvm/llvm-project/blob/e78bf49a58ed0bec/llvm/test/ExecutionEngine/OrcLazy/debug-descriptor-elf-minimal.ll#L13
[3] https://github.com/llvm/llvm-project/blob/e78bf49a58ed0bec/llvm/lib/ExecutionEngine/Orc/JITTargetMachineBuilder.cpp#L27

@llvmbot
Member Author

llvmbot commented Aug 5, 2021

> Instead, I wonder if it's acceptable to pass on the CMake LLVM_BUILD_32_BITS option as a C++ preprocessor flag and use it to switch the ABI setting in ORC. I'll prepare a review.

This isn't going to work for us. We have native tooling to force -m32 when building for 32-bit multilib. I've pointed LLVM_BUILD_32_BITS out as an easy reproducer.

Given that the whole LLVM is built explicitly as 32-bit executables, shouldn't it default to the 32-bit ABI?

@weliveindetail
Contributor

> This isn't going to work for us. We have native tooling to force -m32 when building for 32-bit multilib. I've pointed LLVM_BUILD_32_BITS out as an easy reproducer.

Well, that's fair. But we need some way to either tell ORC to use the i386 ABI in this case or to disable the tests (we can always add equivalent tests that report i386 in their triple).

> Given that the whole LLVM is built explicitly as 32-bit executables, shouldn't it default to the 32-bit ABI?

The question is how ORC determines that at runtime. Currently it uses the host triple and that causes this bug (64-bit hosts will still report x86_64 in 32-bit builds).

Of course, we could check the pointer size and make a special case for i386 vs. x86_64 in ORC. I believe there are more potential differences between host triple and process triple. It feels more hacky to me than the LLVM_BUILD_32_BITS solution.

For the moment, here is the patch I mentioned:
https://reviews.llvm.org/D107569
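
For illustration only, the pointer-size special case mentioned above could look roughly like the sketch below. `Arch` and `archForCurrentProcess` are hypothetical names, not ORC APIs, and this is neither the LLVM_BUILD_32_BITS approach nor existing ORC behaviour. It only models the i386 vs. x86_64 case discussed here.

```cpp
#include <cstddef>

// Simplified stand-in for the architecture field of a target triple
// (assumption: only the cases from this thread are modeled).
enum class Arch { I386, X86_64, Other };

// Hypothetical helper: if the triple says x86_64 but the running process
// uses 4-byte pointers, fall back to the i386 ABI. Anything else is
// returned unchanged.
constexpr Arch archForCurrentProcess(Arch FromTriple,
                                     std::size_t PointerBytes = sizeof(void *)) {
  return (FromTriple == Arch::X86_64 && PointerBytes == 4) ? Arch::I386
                                                           : FromTriple;
}
```

The pointer size is a parameter so that the 32-bit case can be exercised from a 64-bit build. As noted above, this handles just one of the possible differences between host triple and process triple.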

@llvmbot
Member Author

llvmbot commented Aug 5, 2021

I've already commented on the patch. I think using i386 will have the same effect.

@llvmbot
Member Author

llvmbot commented Aug 6, 2021

When building via our ebuild, we get:

LLVM_HOST_TRIPLE:STRING=i686-pc-linux-gnu

So I'm afraid that's "not it".

Does changing LLVM_HOST_TRIPLE actually make it work for you?

@weliveindetail
Contributor

> Does changing LLVM_HOST_TRIPLE actually make it work for you?

The segfault I ran into (and I guess it's the one you report here) happened due to an invalid opcode in the x86_64 trampoline. Changing LLVM_HOST_TRIPLE fixes that by writing a trampoline with valid 32-bit instructions. When returning from the trampoline code, however, I run into a new segfault. My current guess is that there is something wrong with the trampoline code. I didn't have time to debug it, but in a way the code coverage report is telling:

http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/ExecutionEngine/Orc/OrcABISupport.cpp.html#L404

> When building via our ebuild, we get:
>
> LLVM_HOST_TRIPLE:STRING=i686-pc-linux-gnu
>
> So I'm afraid that's "not it".

We have no ABI support for this in ORC right now, but we're only emitting trampolines and jump stubs, so maybe i386 is compatible?

I see three distinct tasks here:
(1) Disable the two tests for the GDB JIT interface on non-x86_64 hosts as well as for non-64-bit builds.
(2) Extend ABI support in ORC (like i686 here).
(3) Handle 32-bit processes on 64-bit hosts and revisit process- vs. host-triple issues.

I think I can do (1) today. It will be the quick fix to get the test suite green. From next week on I am working on a client project. I won't find the time to work on (2) or (3) anytime soon, but I am happy to try and answer questions and review patches.

@weliveindetail
Contributor

> When building via our ebuild, we get:
>
> LLVM_HOST_TRIPLE:STRING=i686-pc-linux-gnu

Interesting, then these tests shouldn't run at all right?
https://github.com/llvm/llvm-project/blob/dbce6a8d9d7c78e6/llvm/test/ExecutionEngine/OrcLazy/lit.local.cfg#L3

@llvmbot
Member Author

llvmbot commented Aug 6, 2021

> > When building via our ebuild, we get:
> >
> > LLVM_HOST_TRIPLE:STRING=i686-pc-linux-gnu
>
> Interesting, then these tests shouldn't run at all right?
> https://github.com/llvm/llvm-project/blob/dbce6a8d9d7c78e6/llvm/test/ExecutionEngine/OrcLazy/lit.local.cfg#L3

if config.root.host_arch not in ['i386', ...

so they should be run on i386. Am I missing something?

I think that the first priority is to "fix" the test failures on i386. If you think it's reasonable to just mark these tests UNSUPPORTED, I'm happy with that. However, if these tests merely indicate that something can actually segfault at runtime, I'd prefer seeing ORC fixed at least to error out properly.

@weliveindetail
Contributor

> so they should be run on i386. Am I missing something?

You said you have i686 right?

> I think that the first priority is to "fix" the test failures on i386. If you think it's reasonable to just mark these tests UNSUPPORTED, I'm happy with that.

Yes, the debug plugin and tests are made for 64-bit on x86_64: https://reviews.llvm.org/D107640

> However, if these tests merely indicate that something can actually segfault at runtime, I'd prefer seeing ORC fixed at least to error out properly.

I agree that this would be an important improvement. Looks like the JITLink debug plugin doesn't handle it correctly at this point. I will take care of it once I find the time.
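
As a rough sketch of what "error out properly" could mean in the simplest case, the pointer width implied by the triple can be compared with the pointer width of the running process before any code is emitted. All names below are hypothetical and only the two architectures from this thread are modeled. It is not how ORC or the JITLink debug plugin behave today.

```cpp
#include <cstddef>

// Assumption: only the two architectures discussed in this thread.
enum class TripleArch { I386, X86_64 };

// Pointer width in bits implied by the triple's architecture.
constexpr unsigned pointerBits(TripleArch A) {
  return A == TripleArch::X86_64 ? 64u : 32u;
}

// Hypothetical pre-flight check: true when code generated for `A` matches
// the pointer width of the process that is going to execute it.
constexpr bool canExecuteInProcess(TripleArch A,
                                   std::size_t ProcessPointerBytes = sizeof(void *)) {
  return pointerBits(A) == ProcessPointerBytes * 8;
}
```

A caller could report an error and exit when `canExecuteInProcess` returns false, instead of jumping into a trampoline built for the wrong ABI.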

@llvmbot
Member Author

llvmbot commented Aug 6, 2021

> > so they should be run on i386. Am I missing something?
>
> You said you have i686 right?

I dare say in this context all i*86 are equivalent.
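
As a sketch of that equivalence, a check that accepts any spelling from i386 through i986 could be written as below. `isI86Name` is a hypothetical helper for illustration. It is not what lit.local.cfg does, and the range of accepted digits is an assumption.

```cpp
// Hypothetical helper: true for "i386", "i486", ..., "i986" and nothing else.
// The short-circuiting && chain stops at the first mismatch, so it never
// reads past the terminating '\0' of a shorter string.
constexpr bool isI86Name(const char *S) {
  return S[0] == 'i' && S[1] >= '3' && S[1] <= '9' && S[2] == '8' &&
         S[3] == '6' && S[4] == '\0';
}
```

With a check like this, a host reporting i686 would be treated the same as one reporting i386.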

@llvmbot
Member Author

llvmbot commented Aug 27, 2021

@tstellar, could you backport c5ab55f to 13.0.0, please?

@tstellar
Collaborator

tstellar commented Sep 2, 2021

Merged: c8905f1

@tstellar
Collaborator

mentioned in issue #51489

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021
This issue was closed.