[Aarch64] SelectionDAG asserts with "include/llvm/CodeGen/MachineOperand.h:557: int64_t llvm::MachineOperand::getImm() const: Assertion `isImm() && "Wrong MachineOperand accessor"' failed (MLIR test on NVIDIA GH2000) #96394

Closed
brnorris03 opened this issue Jun 22, 2024 · 21 comments

@brnorris03

brnorris03 commented Jun 22, 2024

The check-mlir target fails on an NVIDIA GH2000 server (https://www.nvidia.com/en-us/data-center/grace-hopper-superchip/; Armv9 CPU). My LLVM (commit ecf2a53) configuration, which works fine elsewhere:

cmake -G Ninja ../llvm \
   -DLLVM_ENABLE_PROJECTS=mlir \
   -DLLVM_TARGETS_TO_BUILD="host" \
   -DCMAKE_BUILD_TYPE=Release \
   -DLLVM_ENABLE_ASSERTIONS=ON \
   -DLLVM_ENABLE_RTTI=ON \
   -DLLVM_ENABLE_LIBEDIT=OFF

Output log and cmake cache are attached.

err.txt
CMakeCache.txt

@sjarus
Contributor

sjarus commented Jun 24, 2024

@banach-space do you have any idea what's happening here?

@joker-eph
Collaborator

This is something that should be reproducible anywhere if you can provide the LLVM IR.

@brnorris03
Author

Here is the LLVM IR for one of the failing tests

mlir-opt mlir/test/mlir-cpu-runner/async.mlir -pass-pipeline="builtin.module(async-to-async-runtime,func.func(async-runtime-ref-counting,async-runtime-ref-counting-opt),convert-async-to-llvm,func.func(convert-linalg-to-loops,convert-scf-to-cf),finalize-memref-to-llvm,func.func(convert-arith-to-llvm),convert-func-to-llvm,reconcile-unrealized-casts)"

async.llvm.mlir.txt

@banach-space
Contributor

banach-space commented Jun 25, 2024

I am afk ATM, so replying briefly.

Thanks for reporting this! The MLIR tests are run by this Aarch64 LLVM buildbot:

The buildbot is “green” which suggests that these errors might be specific to your platform. (Note: I am unable to verify that these specific tests are indeed run - I will double check when I am back at my desk later this week).

Now, it looks like all tests are hitting an assert triggered by this

# |  #9 0x0000be726770e15c llvm::AArch64GenSubtargetInfo::resolveSchedClass(unsigned int, llvm::MachineInstr const*, llvm::TargetSchedModel const*) const (/proj/work/bnorris/upstream/llvm-project/build/bin/mlir-cpu-runner+0x100e15c)

This feels like either an LLVM backend bug or JIT misconfiguration. Could you try generating an LLVM IR file for one of the failing tests (just use mlir-translate to translate the mlir-opt output)? Next, try “running” that with LLVM’s lli (instead of mlir-cpu-runner). If that also crashes then it’s likely an LLVM issue. Otherwise, I’d look into mlir-cpu-runner.

HTH,
Andrzej

@brnorris03
Author

brnorris03 commented Jun 25, 2024 via email

@brnorris03
Author

The LLVM IR for the above-mentioned tests is attached here.
math-polynomial-approx.ll.txt
async.ll.txt

@banach-space
Contributor

Quick question - why do you want to use llc? I mean this:

✗ llc math-polynomial-approx.ll

:) In my previous reply I meant lli rather than llc.

As for the errors that you are seeing:

llc: error: llc: math-polynomial-approx.ll:1979:77: error: unterminated attribute group

This will be something daft. I would try running opt before llc (it’s been a while since I’ve played with these, so I might be wrong).

As for this:

PromoteIntegerResult #0: t54: i8,ch = llvm.coro.suspend t51, TargetConstant:i64<58>, t42, Constant:i1<0>

… most likely a missing target and/or feature flag when invoking llc.

I should be able to help more once I’ve got access to something I can compile code with 😅

@joker-eph
Collaborator

joker-eph commented Jun 25, 2024

lli will use the JIT infrastructure to compile and execute. The error is in the backend during code generation, so the minimal unit-test is using llc which is exclusively running the backend in isolation.

(opt may be needed in some cases, like when coroutines are used indeed: we should attach here the IR suitable as input to llc, this is likely what we need for a unit-test with the bug fix in the LLVM backend)

@llvmbot
Member

llvmbot commented Jun 25, 2024

@llvm/issue-subscribers-backend-aarch64

Author: Boyana Norris (brnorris03)


@joker-eph changed the title on Jun 25, 2024
from: mlir-cpu-runner tests (check-mlir target) fail on NVIDIA GH2000
to: [Aarch64] SelectionDAG asserts with "include/llvm/CodeGen/MachineOperand.h:557: int64_t llvm::MachineOperand::getImm() const: Assertion `isImm() && "Wrong MachineOperand accessor"' failed (MLIR test on NVIDIA GH2000)
@brnorris03
Author

brnorris03 commented Jun 25, 2024

Actually, my bad -- /proj/work/bnorris/upstream/llvm-project/build/bin/llc math-polynomial-approx.ll is fine -- I accidentally ran with the wrong llc with that file, apologies! So while the lit test fails, I can manually compile math-polynomial-approx.mlir and run it with lli.

/proj/work/bnorris/upstream/llvm-project/build/bin/lli --entry-function=main --dlopen=/proj/work/bnorris/upstream/llvm-project/build/lib/libmlir_c_runner_utils.so math-polynomial-approx.ll

The async.mlir test still fails when I run with lli.

/proj/work/bnorris/upstream/llvm-project/build/bin/lli --entry-function=main --dlopen=/proj/work/bnorris/upstream/llvm-project/build/lib/libmlir_c_runner_utils.so async.ll                
PromoteIntegerResult #0: t54: i8,ch = llvm.coro.suspend t51, TargetConstant:i64<58>, t42, Constant:i1<0>

LLVM ERROR: Do not know how to promote this operator!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /proj/work/bnorris/upstream/llvm-project/build/bin/lli --entry-function=main --dlopen=/proj/work/bnorris/upstream/llvm-project/build/lib/libmlir_c_runner_utils.so async.ll
1.      Running pass 'Function Pass Manager' on module 'async.ll'.
2.      Running pass 'AArch64 Instruction Selection' on function '@async_execute_fn'
 #0 0x0000b7f10616c874 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x146c874)
 #1 0x0000b7f10616a0c0 llvm::sys::RunSignalHandlers() (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x146a0c0)
 #2 0x0000b7f10616a218 SignalHandler(int) Signals.cpp:0:0
 #3 0x0000e289f5c709d0 (linux-vdso.so.1+0x9d0)
 #4 0x0000e289f57cf200 __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #5 0x0000e289f578a67c gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #6 0x0000e289f5777130 abort ./stdlib/abort.c:81:7
 #7 0x0000b7f1060c7db8 (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x13c7db8)
 #8 0x0000b7f1060c7e44 (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x13c7e44)
 #9 0x0000b7f10606b6d0 llvm::DAGTypeLegalizer::PromoteIntegerResult(llvm::SDNode*, unsigned int) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x136b6d0)
#10 0x0000b7f105f0e6a0 llvm::DAGTypeLegalizer::run() (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x120e6a0)
#11 0x0000b7f105f0f1a0 llvm::SelectionDAG::LegalizeTypes() (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x120f1a0)
#12 0x0000b7f105ded594 llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x10ed594)
#13 0x0000b7f105df0d50 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x10f0d50)
#14 0x0000b7f105df20e4 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x10f20e4)
#15 0x0000b7f105de531c llvm::SelectionDAGISelLegacy::runOnMachineFunction(llvm::MachineFunction&) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x10e531c)
#16 0x0000b7f1053d30dc llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (.part.0) MachineFunctionPass.cpp:0:0
#17 0x0000b7f1059b5038 llvm::FPPassManager::runOnFunction(llvm::Function&) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xcb5038)
#18 0x0000b7f1059b5190 llvm::FPPassManager::runOnModule(llvm::Module&) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xcb5190)
#19 0x0000b7f1059b5d50 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xcb5d50)
#20 0x0000b7f105c04228 llvm::orc::SimpleCompiler::operator()(llvm::Module&) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf04228)
#21 0x0000b7f105c04794 llvm::orc::ConcurrentIRCompiler::operator()(llvm::Module&) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf04794)
#22 0x0000b7f105c727bc llvm::orc::IRCompileLayer::emit(std::unique_ptr<llvm::orc::MaterializationResponsibility, std::default_delete<llvm::orc::MaterializationResponsibility>>, llvm::orc::ThreadSafeModule) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf727bc)
#23 0x0000b7f105d07098 llvm::orc::IRTransformLayer::emit(std::unique_ptr<llvm::orc::MaterializationResponsibility, std::default_delete<llvm::orc::MaterializationResponsibility>>, llvm::orc::ThreadSafeModule) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x1007098)
#24 0x0000b7f105d07098 llvm::orc::IRTransformLayer::emit(std::unique_ptr<llvm::orc::MaterializationResponsibility, std::default_delete<llvm::orc::MaterializationResponsibility>>, llvm::orc::ThreadSafeModule) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x1007098)
#25 0x0000b7f105c7d8d0 llvm::orc::BasicIRLayerMaterializationUnit::materialize(std::unique_ptr<llvm::orc::MaterializationResponsibility, std::default_delete<llvm::orc::MaterializationResponsibility>>) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf7d8d0)
#26 0x0000b7f105c23a84 llvm::orc::MaterializationTask::run() (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf23a84)
#27 0x0000b7f105c08f28 llvm::orc::ExecutionSession::dispatchTask(std::unique_ptr<llvm::orc::Task, std::default_delete<llvm::orc::Task>>) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf08f28)
#28 0x0000b7f105c24cd8 llvm::orc::ExecutionSession::dispatchOutstandingMUs() (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf24cd8)
#29 0x0000b7f105c2f19c llvm::orc::ExecutionSession::OL_completeLookup(std::unique_ptr<llvm::orc::InProgressLookupState, std::default_delete<llvm::orc::InProgressLookupState>>, std::shared_ptr<llvm::orc::AsynchronousSymbolQuery>, std::function<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>>> const&)>) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf2f19c)
#30 0x0000b7f105c30430 llvm::orc::InProgressFullLookupState::complete(std::unique_ptr<llvm::orc::InProgressLookupState, std::default_delete<llvm::orc::InProgressLookupState>>) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf30430)
#31 0x0000b7f105c1941c llvm::orc::ExecutionSession::OL_applyQueryPhase1(std::unique_ptr<llvm::orc::InProgressLookupState, std::default_delete<llvm::orc::InProgressLookupState>>, llvm::Error) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf1941c)
#32 0x0000b7f105c250c0 llvm::orc::ExecutionSession::lookup(llvm::orc::LookupKind, std::vector<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags>, std::allocator<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags>>> const&, llvm::orc::SymbolLookupSet, llvm::orc::SymbolState, llvm::unique_function<void (llvm::Expected<llvm::DenseMap<llvm::orc::SymbolStringPtr, llvm::orc::ExecutorSymbolDef, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>, llvm::detail::DenseMapPair<llvm::orc::SymbolStringPtr, llvm::orc::ExecutorSymbolDef>>>)>, std::function<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>>> const&)>) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf250c0)
#33 0x0000b7f105c25388 llvm::orc::ExecutionSession::lookup(std::vector<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags>, std::allocator<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags>>> const&, llvm::orc::SymbolLookupSet, llvm::orc::LookupKind, llvm::orc::SymbolState, std::function<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>>> const&)>) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf25388)
#34 0x0000b7f105c257a4 llvm::orc::ExecutionSession::lookup(std::vector<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags>, std::allocator<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags>>> const&, llvm::orc::SymbolStringPtr, llvm::orc::SymbolState) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf257a4)
#35 0x0000b7f105c87710 llvm::orc::LLJIT::lookupLinkerMangled(llvm::orc::JITDylib&, llvm::orc::SymbolStringPtr) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0xf87710)
#36 0x0000b7f10527b238 llvm::orc::LLJIT::lookup(llvm::orc::JITDylib&, llvm::StringRef) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x57b238)
#37 0x0000b7f105287b3c runOrcJIT(char const*) (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x587b3c)
#38 0x0000b7f105219d20 main (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x519d20)
#39 0x0000e289f57773fc __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
#40 0x0000e289f57774cc call_init ./csu/../csu/libc-start.c:128:20
#41 0x0000e289f57774cc __libc_start_main ./csu/../csu/libc-start.c:379:5
#42 0x0000b7f105275570 _start (/proj/work/bnorris/upstream/llvm-project/build/bin/lli+0x575570)
[1]    3067275 abort (core dumped)  /proj/work/bnorris/upstream/llvm-project/build/bin/lli --entry-function=main 

@banach-space
Contributor

lli will use the JIT infrastructure to compile and execute

Yes, and it's possible that ExecutionEngine is misconfiguring the backend (i.e. there could be a bug in how JIT is used). I wouldn't exclude this possibility just yet.

The error is in the backend during code generation, so the minimal unit-test is using llc which is exclusively running the backend in isolation.

I think that there are multiple errors here.

The async.mlir test still fails when I run with lli.

But this time the failure is different to the crash that you saw with mlir-cpu-runner, right? (i.e. the crash that you reported originally)

Also, from what you are saying, "math-polynomial-approx" works fine with lli, but crashes with mlir-cpu-runner?

I'm reading/typing this from a tablet, so apologies if I missed sth.

@joker-eph
Collaborator

joker-eph commented Jun 25, 2024

Yes, and it's possible that ExecutionEngine is misconfiguring the backend (i.e. there could be a bug in how JIT is used). I wouldn't exclude this possibility just yet.

Regardless: this kind of assertion has to be reproducible with llc.

I think that there are multiple errors here.

What do you mean? Is there more than the backend assert here? (If so, the issue should likely be split, right?)

@banach-space
Contributor

I finally have access to a keyboard and was able to verify a couple of things.

The reported failures are all from the "mlir/test/mlir-cpu-runner" sub-dir.

  1. These tests work fine for me locally. I checked on AWS G3 and Apple Silicon M3 (identical build flags as originally reported)
  2. These tests also work fine when run on the Buildbot (clang-aarch64-sve-vla - check ninja check 1 logs)

Since I am not able to reproduce these failures, I'll need some "remote" help @brnorris03 😅 In particular, I will need a bit of help understanding your system. Could you run clang -print-effective-triple? I'm not sure where to look just yet, but this could provide a good hint.

Now, there are 3 threads of discussion that I want to reply to/comment on. Details below :)

1. Replying to Mehdi

What do you mean? There is more than the backend assert here?

Yes - please check the original post (i.e. the "err.txt" file).

The original assertion

# | mlir-cpu-runner: /proj/work/bnorris/upstream/llvm-project/llvm/include/llvm/CodeGen/MachineOperand.h:557: int64_t llvm::MachineOperand::getImm() const: Assertion `isImm() && "Wrong MachineOperand accessor"' failed.

Where the original assertion is hit

# |  #9 0x0000beedcd25e15c llvm::AArch64GenSubtargetInfo::resolveSchedClass(unsigned int, llvm::MachineInstr const*, llvm::TargetSchedModel const*) const (/proj/work/bnorris/upstream/llvm-project/build/bin/mlir-cpu-runner+0x100e15c)

I believe that this is what we should be investigating here, rather than:

PromoteIntegerResult #0: t54: i8,ch = llvm.coro.suspend t51, TargetConstant:i64<58>, t42, Constant:i1<0>

2. SelectionDAG assertions

I don't claim to be a backend expert, but from a quick investigation those "backend asserts" (e.g. ^^^) seem to be caused by these intrinsics:

declare token @llvm.coro.id(i32, ptr readnone, ptr nocapture readonly, ptr) #1
declare i64 @llvm.coro.size.i64() #2
declare i64 @llvm.coro.align.i64() #2
declare ptr @llvm.coro.begin(token, ptr writeonly) #3
declare token @llvm.coro.save(ptr) #4
declare i8 @llvm.coro.suspend(token, i1) #3
declare ptr @llvm.coro.free(token, ptr nocapture readonly) #5
declare i1 @llvm.coro.end(ptr, i1, token) #3
declare void @llvm.coro.resume(ptr)

IIUC, these need to be "lowered" first (e.g. via -O2) before passing the input to llc. In fact, this is what Mehdi pointed out earlier:

opt may be needed in some cases, like when coroutines are used indeed

TL;DR: These specific assertions can be "fixed" by passing e.g. -O2 to opt - @brnorris03 - could you confirm?

3. Original assertions

Going back to the original assertions ...

@brnorris03 For all the failing tests, could you try:

mlir-opt <flags> file.mlir | mlir-translate --mlir-to-llvmir | opt -O2 | lli

If that works then we should compare lli and mlir-cpu-runner specifically. Sadly, since I'm unable to reproduce these failures, I won't be able to try this myself.

@joker-eph
Collaborator

joker-eph commented Jun 26, 2024

What do you mean? There is more than the backend assert here?
Yes - please check the original post (i.e. the "err.txt" file).

But... that is the backend assert I'm referring to! There is nothing else in the err.txt file, is there?
(you may have noticed I changed the title of the issue to reflect this).

The thing you mention:

PromoteIntegerResult #0: t54: i8,ch = llvm.coro.suspend t51, TargetConstant:i64<58>, t42, Constant:i1<0>

Is not a bug (I believe it's also not technically an assert, by the way), nor is it related to this issue: it is not in the original err.txt and only showed up when someone tried to use llc without running opt, which is mandatory here. So it was user error in the reproduction, and I ignored it.

@joker-eph
Collaborator

joker-eph commented Jun 26, 2024

For debugging this further, I would run the crashing case with mlir-cpu-runner with -print-after-all --print-module-scope.
The last printed module before the assertion should be a good reproducer for llc (and it should include the triple/datalayout in the IR)

For example, right at the SelectionDAG point, an empty main() input would display:

*** IR Dump After Safe Stack instrumentation pass (safe-stack) *** (function: _mlir_main)
; ModuleID = 'LLVMDialectModule'
source_filename = "LLVMDialectModule"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @main() {
  ret void
}

define void @_mlir_main(ptr %0) {
  call void @main()
  ret void
}

!llvm.module.flags = !{!0}

!0 = !{i32 2, !"Debug Info Version", i32 3}
# *** IR Dump After X86 DAG->DAG Instruction Selection (x86-isel) ***:
# Machine code for function _mlir_main: IsSSA, TracksLiveness

bb.0 (%ir-block.1):

I expect this issue would lead to a crash before the "IR Dump After X86 DAG->DAG ..." stage, and leave us with the IR input to the backend.

@banach-space
Contributor

banach-space commented Jun 27, 2024

What do you mean? There is more than the backend assert here?
Yes - please check the original post (i.e. the "err.txt" file).

But... that is the backend assert I'm referring to! There is nothing else in the err.txt file, is there? (you may have noticed I changed the title of the issue to reflect this).

Apologies. You changed the subject after this reply and I incorrectly assumed that you were referring to the error from that reply (rather than the original msg). My bad, didn't notice on my small screen.

The thing you mention:

PromoteIntegerResult #0: t54: i8,ch = llvm.coro.suspend t51, TargetConstant:i64<58>, t42, Constant:i1<0>

Is not a bug

Agreed, we are on the same page.

I would run the crashing case with mlir-cpu-runner with -print-after-all --print-module-scope

+1 @brnorris03, please could you try this and report back? Btw, that is a great suggestion, thanks Mehdi! (I didn't really register that that's available)

Btw, I've realised that I only have access to Armv8 machines, whereas you are using Armv9 - sorry for not noticing earlier. Based on https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip?ncid=no-ncid, that's Neoverse V2. I checked AArch64GenSubtargetInfo::resolveSchedClass (*) in my build directory (it's "tablegen"-ed) and Neoverse V2 is already implemented:

    if (SchedModel->getProcessorID() == 17) { // NeoverseV2Model
      if ((
            AArch64_AM::getShiftType(MI->getOperand(3).getImm()) == AArch64_AM::LSL
            && (
              AArch64_AM::getShiftValue(MI->getOperand(3).getImm()) == 0
              || AArch64_AM::getShiftValue(MI->getOperand(3).getImm()) == 1
              || AArch64_AM::getShiftValue(MI->getOperand(3).getImm()) == 2
              || AArch64_AM::getShiftValue(MI->getOperand(3).getImm()) == 3
              || AArch64_AM::getShiftValue(MI->getOperand(3).getImm()) == 4
            )
          ))
        return 1565; // V2Write_1cyc_1I_ReadI_ReadISReg
      return 1566; // V2Write_2cyc_1M_ReadI_ReadISReg
    }

Perhaps something else is not yet implemented for Neoverse V2 🤔

(*) The backtrace that you shared suggests that that's what might be "broken".
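
For readers following along, here is a rough sketch of what the generated condition above appears to test. It assumes the shifter immediate packs the shift type into bits 8:6 and the shift amount into bits 5:0; that layout, and every name below (ShiftType, encodeShift, decodeShiftType, decodeShiftValue, picksFirstSchedClass), is a hypothetical stand-in for illustration rather than the LLVM API itself.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the AArch64_AM helpers. The bit layout
// (type in bits 8:6, amount in bits 5:0) is an assumption of this sketch.
enum class ShiftType : uint32_t {
  LSL = 0, LSR = 1, ASR = 2, ROR = 3, MSL = 4, Invalid = 7
};

// Pack a shift type and amount into one immediate (assumed layout).
inline uint32_t encodeShift(ShiftType type, uint32_t amount) {
  return (static_cast<uint32_t>(type) << 6) | (amount & 0x3f);
}

// Extract the shift type; values above 4 are treated as invalid here.
inline ShiftType decodeShiftType(uint32_t imm) {
  uint32_t bits = (imm >> 6) & 0x7;
  return bits <= 4 ? static_cast<ShiftType>(bits) : ShiftType::Invalid;
}

// Extract the 6-bit shift amount.
inline uint32_t decodeShiftValue(uint32_t imm) { return imm & 0x3f; }

// Same shape as the generated check: LSL with an amount of 0..4 selects
// the first scheduling class, anything else selects the second.
inline bool picksFirstSchedClass(uint32_t shifterImm) {
  return decodeShiftType(shifterImm) == ShiftType::LSL &&
         decodeShiftValue(shifterImm) <= 4;
}
```

Every clause in the generated code calls getImm() on the operand, so a check of this shape only holds up when that operand really is an immediate.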

@brnorris03
Author

Thank you all for investigating this! Regarding the clang triple question -- I actually only have gcc 12 available on that machine, which is what I was using for building this. Do I have to use clang?

I am traveling today, but will generate mlir-cpu-runner with -print-after-all --print-module-scope traces as soon as I arrive.

@joker-eph
Collaborator

  • I actually only have gcc 12 available on that machine, which is what I was using for building this. Do I have to use clang?

This is unrelated: the triple will be computed by mlir-cpu-runner internally before emitting LLVM IR, it'll be recorded there and visible in the trace you'll produce.

@brnorris03
Author

This is unrelated: the triple will be computed by mlir-cpu-runner internally before emitting LLVM IR, it'll be recorded there and visible in the trace you'll produce.

Well, it's somewhat related to the request to run clang -print-effective-triple (above), which does require clang. Anyway, the target triple is in the CMakeCache.txt attached in the original comment: LLVM_HOST_TRIPLE:STRING=aarch64-unknown-linux-gnu.

I generated the trace for math-polynomial-approx.mlir with:

/proj/work/bnorris/upstream/llvm-project/build-mlir/bin/mlir-opt \
  /proj/work/bnorris/upstream/llvm-project/mlir/test/mlir-cpu-runner/math-polynomial-approx.mlir \
  -pass-pipeline="builtin.module(func.func(test-math-polynomial-approximation,convert-arith-to-llvm),convert-vector-to-scf,convert-scf-to-cf,convert-cf-to-llvm,convert-vector-to-llvm,func.func(convert-math-to-llvm),convert-func-to-llvm,reconcile-unrealized-casts)"  \
  | /proj/work/bnorris/upstream/llvm-project/build-mlir/bin/mlir-cpu-runner   \
   -print-after-all --print-module-scope  -e main -entry-point-result=void -O0 \
   -shared-libs=/proj/work/bnorris/upstream/llvm-project/build-mlir/lib/libmlir_c_runner_utils.so \
  -shared-libs=/proj/work/bnorris/upstream/llvm-project/build-mlir/lib/libmlir_runner_utils.so \
     >& mlir-cpu-runner.trace.txt

The trace is too large to attach as a gz file (and github won't let me attach the bz2 compressed version), so I put it in this temporary repo: https://github.com/brnorris03/misc/blob/main/mlir-cpu-runner.trace.txt.bz2

Thank you!

david-arm added a commit to david-arm/llvm-project that referenced this issue Jun 28, 2024
The NeoverseZeroMove predicate assumes that the first operand
is always an immediate, which isn't always true. For example,
it could be a stack offset, etc. This patch fixes that by
checking if the operand is an immediate first.
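
The fix described in this commit message amounts to checking the operand kind before calling the accessor. A minimal sketch of that pattern follows, using a hypothetical MockOperand in place of llvm::MachineOperand; the real NeoverseZeroMove predicate is not reproduced here, and the helper names isZeroImmUnguarded and isZeroImmGuarded are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::MachineOperand. It models only the two
// operand kinds needed to illustrate the problem.
struct MockOperand {
  enum Kind { Immediate, FrameIndex } kind;
  int64_t value;

  bool isImm() const { return kind == Immediate; }

  // Mirrors the accessor contract behind the reported assertion.
  int64_t getImm() const {
    assert(isImm() && "Wrong MachineOperand accessor");
    return value;
  }
};

// Unguarded form: trips the assertion when the operand is not an
// immediate (for example a stack offset / frame index).
inline bool isZeroImmUnguarded(const MockOperand &op) {
  return op.getImm() == 0;
}

// Guarded form: check the kind first, so a non-immediate operand simply
// does not match instead of asserting.
inline bool isZeroImmGuarded(const MockOperand &op) {
  return op.isImm() && op.getImm() == 0;
}
```

With assertions enabled (as in the -DLLVM_ENABLE_ASSERTIONS=ON build from the original report), the unguarded form aborts on a non-immediate operand, while the guarded form returns false.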
david-arm added a commit that referenced this issue Jul 1, 2024
@david-arm
Contributor

This should be fixed by #97047 now I hope!

@brnorris03
Author

This should be fixed by #97047 now I hope!

I can confirm that it works on our GH2000.

lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this issue Jul 3, 2024
kbluck pushed a commit to kbluck/llvm-project that referenced this issue Jul 6, 2024