AArch64 OrcJIT segfault #1000

Open
gmarkall opened this issue Oct 10, 2023 · 4 comments

@gmarkall
Member

The conda-forge AArch64 build of llvmlite 0.41 is failing tests with a segfault. See conda-forge/llvmlite-feedstock#74

There seems to be an issue in one of the OrcJIT tests which, as far as I am aware, reproduces only on AArch64:

test_add_ir_module (llvmlite.tests.test_binding.TestOrcLLJIT) ...
==10660== Conditional jump or move depends on uninitialised value(s)
==10660==    at 0x9662550: llvm::AArch64TargetMachine::AArch64TargetMachine(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, llvm::Optional<llvm::Reloc::Model>, llvm::Optional<llvm::CodeModel::Model>, llvm::CodeGenOpt::Level, bool, bool) (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libLLVM-14.so)
==10660==    by 0x9662993: llvm::RegisterTargetMachine<llvm::AArch64leTargetMachine>::Allocator(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, llvm::Optional<llvm::Reloc::Model>, llvm::Optional<llvm::CodeModel::Model>, llvm::CodeGenOpt::Level, bool) (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libLLVM-14.so)
==10660==    by 0x93F166F: llvm::orc::JITTargetMachineBuilder::createTargetMachine() (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libLLVM-14.so)
==10660==    by 0x940917F: llvm::orc::LLJIT::LLJIT(llvm::orc::LLJITBuilderState&, llvm::Error&) (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libLLVM-14.so)
==10660==    by 0x66C629B: LLVMPY_CreateLLJITCompiler (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/python3.10/site-packages/llvmlite/binding/libllvmlite.so)
==10660==    by 0x5ED2EBF: ffi_call_SYSV (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libffi.so.8.1.0)
==10660==    by 0x5ED2437: ffi_call_int (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libffi.so.8.1.0)
==10660==    by 0x5EAC4F7: _ctypes_callproc (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so)
==10660==    by 0x5EA55BF: PyCFuncPtr_call (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so)
==10660==    by 0x1840E7: _PyObject_Call (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==10660==    by 0x16EB0B: _PyEval_EvalFrameDefault (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==10660==    by 0x21F44F: _PyEval_Vector (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==10660== 
==10660== Conditional jump or move depends on uninitialised value(s)
==10660==    at 0x9662550: llvm::AArch64TargetMachine::AArch64TargetMachine(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, llvm::Optional<llvm::Reloc::Model>, llvm::Optional<llvm::CodeModel::Model>, llvm::CodeGenOpt::Level, bool, bool) (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libLLVM-14.so)
==10660==    by 0x9662993: llvm::RegisterTargetMachine<llvm::AArch64leTargetMachine>::Allocator(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, llvm::Optional<llvm::Reloc::Model>, llvm::Optional<llvm::CodeModel::Model>, llvm::CodeGenOpt::Level, bool) (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libLLVM-14.so)
==10660==    by 0x93F166F: llvm::orc::JITTargetMachineBuilder::createTargetMachine() (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libLLVM-14.so)
==10660==    by 0x940381B: llvm::orc::LLJIT::createCompileFunction(llvm::orc::LLJITBuilderState&, llvm::orc::JITTargetMachineBuilder) (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libLLVM-14.so)
==10660==    by 0x9408CF3: llvm::orc::LLJIT::LLJIT(llvm::orc::LLJITBuilderState&, llvm::Error&) (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libLLVM-14.so)
==10660==    by 0x66C629B: LLVMPY_CreateLLJITCompiler (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/python3.10/site-packages/llvmlite/binding/libllvmlite.so)
==10660==    by 0x5ED2EBF: ffi_call_SYSV (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libffi.so.8.1.0)
==10660==    by 0x5ED2437: ffi_call_int (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/libffi.so.8.1.0)
==10660==    by 0x5EAC4F7: _ctypes_callproc (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so)
==10660==    by 0x5EA55BF: PyCFuncPtr_call (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so)
==10660==    by 0x1840E7: _PyObject_Call (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==10660==    by 0x16EB0B: _PyEval_EvalFrameDefault (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==10660== 

However, the actual segfault in the tests on conda-forge occurs here:

test_global_ctors_dtors (llvmlite.tests.test_binding.TestOrcLLJIT) ... Fatal Python error: Segmentation fault
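
One way to pin down which test brings the interpreter down is to run each test ID in its own child process and inspect the exit status. The following is only a minimal sketch, assuming a POSIX platform where a process killed by a signal reports a negative return code; the helper names `run_isolated` and `died_from_segfault` are hypothetical and not part of llvmlite.

```python
import signal
import subprocess
import sys


def run_isolated(test_id):
    """Run a single unittest ID in a child interpreter and return its exit code.

    "-X faulthandler" asks the child to dump a Python traceback on a fatal signal.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-m", "unittest", test_id, "-v"],
        capture_output=True,
        text=True,
    )
    return proc.returncode


def died_from_segfault(returncode):
    """On POSIX, subprocess reports death by signal N as a return code of -N."""
    return returncode == -signal.SIGSEGV
```

For example, `died_from_segfault(run_isolated("llvmlite.tests.test_binding.TestOrcLLJIT.test_global_ctors_dtors"))` would be expected to return True on an affected machine, while the parent process survives to run the next test.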
@gmarkall
Member Author

With my local build and a debug build of LLVM, the only valgrind error I'm seeing is:

$ valgrind python -m unittest llvmlite.tests.test_binding.TestGlobalConstructors.test_emit_assembly -v

test_emit_assembly (llvmlite.tests.test_binding.TestGlobalConstructors)
Test TargetMachineRef.emit_assembly() ... ==8877== Conditional jump or move depends on uninitialised value(s)
==8877==    at 0x19A288: PyLong_FromLong (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x64E057F: _ctypes_callproc (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so)
==8877==    by 0x64D95BF: PyCFuncPtr_call (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so)
==8877==    by 0x1840E7: _PyObject_Call (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x16EB0B: _PyEval_EvalFrameDefault (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x21F44F: _PyEval_Vector (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x1845A7: _PyObject_FastCallDictTstate (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x1848EB: _PyObject_Call_Prepend (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x1D643F: slot_tp_call (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x1843BF: _PyObject_MakeTpCall (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x17423F: _PyEval_EvalFrameDefault (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x21F44F: _PyEval_Vector (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877== 
==8877== Use of uninitialised value of size 8
==8877==    at 0x19A360: PyLong_FromLong (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x64E057F: _ctypes_callproc (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so)
==8877==    by 0x64D95BF: PyCFuncPtr_call (in /home/gmarkall/mambaforge/envs/llvm041debug/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so)
==8877==    by 0x1840E7: _PyObject_Call (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x16EB0B: _PyEval_EvalFrameDefault (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x21F44F: _PyEval_Vector (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x1845A7: _PyObject_FastCallDictTstate (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x1848EB: _PyObject_Call_Prepend (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x1D643F: slot_tp_call (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x1843BF: _PyObject_MakeTpCall (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x17423F: _PyEval_EvalFrameDefault (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877==    by 0x21F44F: _PyEval_Vector (in /home/gmarkall/mambaforge/envs/llvm041debug/bin/python3.10)
==8877== 

@esc esc transferred this issue from numba/numba Oct 12, 2023
gmarkall added a commit to gmarkall/llvmlite that referenced this issue Oct 12, 2023
Since OrcJIT is still experimental, and there seem to be some issues on
platforms we don't regularly test on (e.g. Issue numba#1000), it seems
prudent to simply skip the OrcJIT tests on non-x86 platforms.
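
A skip of this kind could look roughly like the sketch below. It is not the code from the referenced commit: the `is_x86` helper, the set of machine names, and the placeholder test class are all assumptions made for illustration.

```python
import platform
import unittest

# Assumption: typical values of platform.machine() on x86 hosts.
X86_MACHINES = {"x86_64", "amd64", "i386", "i686"}


def is_x86(machine=None):
    """Return True if the given (or current) machine name looks like x86."""
    name = platform.machine() if machine is None else machine
    return name.lower() in X86_MACHINES


# Hypothetical test class; the real suite lives in llvmlite.tests.test_binding.
@unittest.skipUnless(is_x86(), "OrcJIT tests are only run on x86 (see issue #1000)")
class TestOrcLLJITSketch(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)
```

Applying the decorator at class level skips every test in the class on AArch64, riscv64 and other non-x86 machines, so the runner reports them as skipped rather than crashing.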
@gmarkall
Member Author

gmarkall commented Oct 12, 2023

See also #1002.

Update: Fortunately it was not necessary to remove OrcJIT support as proposed in #1002.

Xunop added a commit to Xunop/archriscv-packages that referenced this issue Oct 19, 2023
OrcJIT is still in the experimental stage, and there are some issues
where the OrcJIT test fails when tested on non-x86 platforms (e.g. Issue [numba#1000](numba/llvmlite#1000)),
so skip writing this test on non-x86 platforms.
Xunop added a commit to Xunop/archriscv-packages that referenced this issue Oct 19, 2023
Skip OrcJIT tests on riscv64.

OrcJIT is still in the experimental stage, and there are some issues
where the OrcJIT test fails when tested on non-x86 platforms (e.g. Issue [numba#1000](numba/llvmlite#1000)),
so skip writing this test on non-x86 platforms.
felixonmars pushed a commit to felixonmars/archriscv-packages that referenced this issue Oct 19, 2023
Skip OrcJIT tests on riscv64.

OrcJIT is still in the experimental stage, and there are some issues
where the OrcJIT test fails when tested on non-x86 platforms (e.g. Issue [numba#1000](numba/llvmlite#1000)),
so skip writing this test on non-x86 platforms.
@gmarkall gmarkall added the bug label Oct 26, 2023
@dlee992
Contributor

dlee992 commented Aug 15, 2024

Ha, interesting.

I do have a Mac with M-series chips.

I think I could reproduce this bug by running this OrcJIT test on my machine.

➜  ~ uname -a
Darwin dalis-MacBook-Pro.local 23.6.0 Darwin Kernel Version 23.6.0: Mon Jul 29 21:13:04 PDT 2024; root:xnu-10063.141.2~1/RELEASE_ARM64_T6020 arm64

Or maybe we should simply retest on LLVM 15+; it may already be resolved in a newer version.
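
Before retesting, it may help to confirm which LLVM the installed llvmlite was built against. A minimal sketch follows, assuming `llvmlite.binding.llvm_version_info` is available as a version tuple; the `should_retest` helper and the threshold of 15 are illustrative and simply mirror the "LLVM 15+" suggestion.

```python
def should_retest(llvm_version, minimum_major=15):
    """Return True if the LLVM major version is at least minimum_major."""
    return tuple(llvm_version)[0] >= minimum_major


try:
    import llvmlite.binding as llvm
    installed = llvm.llvm_version_info  # e.g. (14, 0, 6)
except (ImportError, OSError):
    installed = None  # llvmlite is not installed or failed to load

if installed is not None:
    print("LLVM", installed, "retest on this build:", should_retest(installed))
```

The traces earlier in this thread come from libLLVM-14, so that build would report that it is below the threshold.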

@gmarkall
Member Author

It's certainly worth a try!
