Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ArgumentPromotion creates an illegal type argument for AArch64 SVE VLS #69147

Closed
kawashima-fj opened this issue Oct 16, 2023 · 9 comments · Fixed by #70034 or llvm/llvm-project-release-prs#749
Labels
backend:AArch64 llvm:ir SVE ARM Scalable Vector Extensions

Comments

@kawashima-fj
Copy link
Member

kawashima-fj commented Oct 16, 2023

This issue is derived from #63025.

Problem

Compiling the following ACLE program with -O3 -march=armv8.2-a+sve -msve-vector-bits=256 for Linux/AArch64 crashes. This problem is specific to AArch64 SVE VLS (a.k.a. fixed-length).

#include <arm_sve.h>

typedef svfloat32_t svfloat32_256_t __attribute__((arm_sve_vector_bits(256)));

__attribute__((noinline))
static void callee(svfloat32_256_t *src, svfloat32_256_t *dst) {
    *dst = *src;
}

void caller(svfloat32_256_t *src, svfloat32_256_t *dst) {
    callee(src, dst);
}
$ clang -S --target=aarch64-unknwon-linux-gnu --sysroot=/opt/aarch64-none-linux-gnu/libc -O3 -march=armv8.2-a+sve -msve-vector-bits=256 -fno-crash-diagnostics test.c
clang: /src/llvm/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:7014: void analyzeCallOperands(const llvm::AArch64TargetLowering&, const llvm::AArch64Subtarget*, const llvm::TargetLowering::CallLoweringInfo&, llvm::CCState&): Assertion `!Res && "Call operand has unhandled type"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: clang -S --target=aarch64-unknwon-linux-gnu --sysroot=/opt/aarch64-none-linux-gnu/libc -O3 -march=armv8.2-a+sve -msve-vector-bits=256 -fno-crash-diagnostics test.c
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'test.c'.
4.      Running pass 'AArch64 Instruction Selection' on function '@caller'
 #0 0x00007f41724fa010 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/llvm/bin/../lib/libLLVMSupport.so.18git+0x248010)
 #1 0x00007f41724f741f llvm::sys::RunSignalHandlers() (/opt/llvm/bin/../lib/libLLVMSupport.so.18git+0x24541f)
 #2 0x00007f41723fba68 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007f4171e92520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #4 0x00007f4171ee69fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #5 0x00007f4171ee69fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #6 0x00007f4171ee69fc pthread_kill ./nptl/pthread_kill.c:89:10
 #7 0x00007f4171e92476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #8 0x00007f4171e787f3 abort ./stdlib/abort.c:81:7
 #9 0x00007f4171e7871b _nl_load_domain ./intl/loadmsgcat.c:1177:9
#10 0x00007f4171e89e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
#11 0x00007f417730dbbf analyzeCallOperands(llvm::AArch64TargetLowering const&, llvm::AArch64Subtarget const*, llvm::TargetLowering::CallLoweringInfo const&, llvm::CCState&) AArch64ISelLowering.cpp:0:0
#12 0x00007f417739f9aa llvm::AArch64TargetLowering::LowerCall(llvm::TargetLowering::CallLoweringInfo&, llvm::SmallVectorImpl<llvm::SDValue>&) const (/opt/llvm/bin/../lib/libLLVMAArch64CodeGen.so.18git+0x6a49aa)
#13 0x00007f41717eb86b llvm::TargetLowering::LowerCallTo(llvm::TargetLowering::CallLoweringInfo&) const (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x35686b)
#14 0x00007f41717f8686 llvm::SelectionDAGBuilder::lowerInvokable(llvm::TargetLowering::CallLoweringInfo&, llvm::BasicBlock const*) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x363686)
#15 0x00007f4171818e5e llvm::SelectionDAGBuilder::LowerCallTo(llvm::CallBase const&, llvm::SDValue, bool, bool, llvm::BasicBlock const*) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x383e5e)
#16 0x00007f4171805ddd llvm::SelectionDAGBuilder::visitCall(llvm::CallInst const&) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x370ddd)
#17 0x00007f4171839a9e llvm::SelectionDAGBuilder::visit(llvm::Instruction const&) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x3a4a9e)
#18 0x00007f41718ca5b8 llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, true>, llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, true>, bool&) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x4355b8)
#19 0x00007f41718cb2a8 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x4362a8)
#20 0x00007f41718cd114 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (.part.0) SelectionDAGISel.cpp:0:0
#21 0x00007f4175a9bf07 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (.part.0) MachineFunctionPass.cpp:0:0
#22 0x00007f417312488e llvm::FPPassManager::runOnFunction(llvm::Function&) (/opt/llvm/bin/../lib/libLLVMCore.so.18git+0x49888e)
#23 0x00007f4173124ad9 llvm::FPPassManager::runOnModule(llvm::Module&) (/opt/llvm/bin/../lib/libLLVMCore.so.18git+0x498ad9)
#24 0x00007f4173125415 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/llvm/bin/../lib/libLLVMCore.so.18git+0x499415)
#25 0x00007f41761bb078 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) (/opt/llvm/bin/../lib/libclangCodeGen.so.18git+0x2d6078)
#26 0x00007f4176668c3a clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) CodeGenAction.cpp:0:0
#27 0x00007f416f7515f9 clang::ParseAST(clang::Sema&, bool, bool) (/opt/llvm/bin/../lib/../lib/libclangParse.so.18git+0x685f9)
#28 0x00007f4174cbdca9 clang::FrontendAction::Execute() (/opt/llvm/bin/../lib/libclangFrontend.so.18git+0x199ca9)
#29 0x00007f4174c2eb15 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/llvm/bin/../lib/libclangFrontend.so.18git+0x10ab15)
#30 0x00007f4176bb4e75 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/llvm/bin/../lib/libclangFrontendTool.so.18git+0x5e75)
#31 0x00005576f0c83e59 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/llvm/bin/clang-18+0x19e59)
#32 0x00005576f0c7b693 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#33 0x00007f417490c34d void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
#34 0x00007f41723fbf30 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/opt/llvm/bin/../lib/libLLVMSupport.so.18git+0x149f30)
#35 0x00007f417490cbce clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
#36 0x00007f41748cd29a clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/opt/llvm/bin/../lib/libclangDriver.so.18git+0xff29a)
#37 0x00007f41748cdd6d clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/opt/llvm/bin/../lib/libclangDriver.so.18git+0xffd6d)
#38 0x00007f41748dcd34 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/opt/llvm/bin/../lib/libclangDriver.so.18git+0x10ed34)
#39 0x00005576f0c80bd0 clang_main(int, char**, llvm::ToolContext const&) (/opt/llvm/bin/clang-18+0x16bd0)
#40 0x00005576f0c911a3 main (/opt/llvm/bin/clang-18+0x271a3)
#41 0x00007f4171e79d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#42 0x00007f4171e79e40 call_init ./csu/../csu/libc-start.c:128:20
#43 0x00007f4171e79e40 __libc_start_main ./csu/../csu/libc-start.c:379:5
#44 0x00005576f0c7a2d5 _start (/opt/llvm/bin/clang-18+0x102d5)
clang: error: clang frontend command failed with exit code 134 (use -v to see invocation)

Environment

The latest main branch (commit 97c9f9a) and at least 13.0.1, 14.0.6, 15.0.7, 16.0.6, 17.0.1 have this problem.

Cause

As described in #63025, the root cause is that ArgumentPromotion promotes an <8 x float> pointer argument to an <8 x float> value argument. There is no ABI for SVE VLS so we cannot legalize vector arguments which are larger then 128 bits.

Before ArgumentPromotionPass:

; *** IR Dump Before ArgumentPromotionPass on (callee) ***
; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknwon-linux-gnu"

; Function Attrs: nounwind uwtable vscale_range(2,2)
define dso_local void @caller(ptr noundef %src, ptr noundef %dst) local_unnamed_addr #0 {
entry:
  call fastcc void @callee(ptr noundef %src, ptr noundef %dst)
  ret void
}

; Function Attrs: noinline nounwind uwtable vscale_range(2,2)
define internal fastcc void @callee(ptr nocapture noundef readonly %src, ptr nocapture noundef writeonly %dst) unnamed_addr #1 {
entry:
  %0 = load <8 x float>, ptr %src, align 16, !tbaa !6
  store <8 x float> %0, ptr %dst, align 16, !tbaa !6
  ret void
}

attributes #0 = { nounwind uwtable vscale_range(2,2) "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+outline-atomics,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a,-fmv" }
attributes #1 = { noinline nounwind uwtable vscale_range(2,2) "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+outline-atomics,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a,-fmv" }

!llvm.module.flags = !{!0, !1, !2, !3, !4}
!llvm.ident = !{!5}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 8, !"PIC Level", i32 2}
!2 = !{i32 7, !"PIE Level", i32 2}
!3 = !{i32 7, !"uwtable", i32 2}
!4 = !{i32 7, !"frame-pointer", i32 1}
!5 = !{!"clang version 18.0.0"}
!6 = !{!7, !7, i64 0}
!7 = !{!"omnipotent char", !8, i64 0}
!8 = !{!"Simple C/C++ TBAA"}

After ArgumentPromotionPass:

; *** IR Dump After ArgumentPromotionPass on (callee) ***
; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknwon-linux-gnu"

; Function Attrs: nounwind uwtable vscale_range(2,2)
define dso_local void @caller(ptr noundef %src, ptr noundef %dst) local_unnamed_addr #0 {
entry:
  %src.val = load <8 x float>, ptr %src, align 16, !tbaa !6
  call fastcc void @callee(<8 x float> %src.val, ptr noundef %dst)
  ret void
}

; Function Attrs: noinline nounwind uwtable vscale_range(2,2)
define internal fastcc void @callee(<8 x float> %src.0.val, ptr nocapture noundef writeonly %dst) unnamed_addr #1 {
entry:
  store <8 x float> %src.0.val, ptr %dst, align 16, !tbaa !6
  ret void
}

attributes #0 = { nounwind uwtable vscale_range(2,2) "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+outline-atomics,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a,-fmv" }
attributes #1 = { noinline nounwind uwtable vscale_range(2,2) "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+outline-atomics,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a,-fmv" }

!llvm.module.flags = !{!0, !1, !2, !3, !4}
!llvm.ident = !{!5}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 8, !"PIC Level", i32 2}
!2 = !{i32 7, !"PIE Level", i32 2}
!3 = !{i32 7, !"uwtable", i32 2}
!4 = !{i32 7, !"frame-pointer", i32 1}
!5 = !{!"clang version 18.0.0"}
!6 = !{!7, !7, i64 0}
!7 = !{!"omnipotent char", !8, i64 0}
!8 = !{!"Simple C/C++ TBAA"}

Diff:

@@ -1,4 +1,4 @@
-; *** IR Dump Before ArgumentPromotionPass on (callee) ***
+; *** IR Dump After ArgumentPromotionPass on (callee) ***
 ; ModuleID = 'test.c'
 source_filename = "test.c"
 target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
@@ -7,15 +7,15 @@
 ; Function Attrs: nounwind uwtable vscale_range(2,2)
 define dso_local void @caller(ptr noundef %src, ptr noundef %dst) local_unnamed_addr #0 {
 entry:
-  call fastcc void @callee(ptr noundef %src, ptr noundef %dst)
+  %src.val = load <8 x float>, ptr %src, align 16, !tbaa !6
+  call fastcc void @callee(<8 x float> %src.val, ptr noundef %dst)
   ret void
 }
 
 ; Function Attrs: noinline nounwind uwtable vscale_range(2,2)
-define internal fastcc void @callee(ptr nocapture noundef readonly %src, ptr nocapture noundef writeonly %dst) unnamed_addr #1 {
+define internal fastcc void @callee(<8 x float> %src.0.val, ptr nocapture noundef writeonly %dst) unnamed_addr #1 {
 entry:
-  %0 = load <8 x float>, ptr %src, align 16, !tbaa !6
-  store <8 x float> %0, ptr %dst, align 16, !tbaa !6
+  store <8 x float> %src.0.val, ptr %dst, align 16, !tbaa !6
   ret void
 }
 

Possible Fix

Given that the AArch64 backend won't support vector arguments which are VLS larger then 128 bits, we have two options.

  1. Don't promote such arguments.
  2. Promote such arguments to scalable vectors and "bitcast" appropriately (as @paulwalker-arm said).

ArgumentPromotion has a hook to check ABI using TargetTransformInfo. X86 and PPC use this hook to prevent illegal argument promotions. So I'd like to take a similar approach for 1. above.

Are there any comments? If no objection, I'll post a PR.

@kawashima-fj kawashima-fj added backend:AArch64 SVE ARM Scalable Vector Extensions llvm:ir labels Oct 16, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 16, 2023

@llvm/issue-subscribers-backend-aarch64

Author: KAWASHIMA Takahiro (kawashima-fj)

# ArgumentPromotion creates an illegal type argument for AArch64 SVE VLS

This issue is derived from #63025.

Problem

Compiling the following ACLE program with -O3 -march=armv8.2-a+sve -msve-vector-bits=256 for Linux/AArch64 crashes. This problem is specific to AArch64 SVE VLS (a.k.a. fixed-length).

#include &lt;arm_sve.h&gt;

typedef svfloat32_t svfloat32_256_t __attribute__((arm_sve_vector_bits(256)));

__attribute__((noinline))
static void callee(svfloat32_256_t *src, svfloat32_256_t *dst) {
    *dst = *src;
}

void caller(svfloat32_256_t *src, svfloat32_256_t *dst) {
    callee(src, dst);
}
$ clang -S --target=aarch64-unknwon-linux-gnu --sysroot=/opt/aarch64-none-linux-gnu/libc -O3 -march=armv8.2-a+sve -msve-vector-bits=256 -fno-crash-diagnostics test.c
clang: /src/llvm/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:7014: void analyzeCallOperands(const llvm::AArch64TargetLowering&amp;, const llvm::AArch64Subtarget*, const llvm::TargetLowering::CallLoweringInfo&amp;, llvm::CCState&amp;): Assertion `!Res &amp;&amp; "Call operand has unhandled type"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: clang -S --target=aarch64-unknwon-linux-gnu --sysroot=/opt/aarch64-none-linux-gnu/libc -O3 -march=armv8.2-a+sve -msve-vector-bits=256 -fno-crash-diagnostics test.c
1.      &lt;eof&gt; parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'test.c'.
4.      Running pass 'AArch64 Instruction Selection' on function '@<!-- -->caller'
 #<!-- -->0 0x00007f41724fa010 llvm::sys::PrintStackTrace(llvm::raw_ostream&amp;, int) (/opt/llvm/bin/../lib/libLLVMSupport.so.18git+0x248010)
 #<!-- -->1 0x00007f41724f741f llvm::sys::RunSignalHandlers() (/opt/llvm/bin/../lib/libLLVMSupport.so.18git+0x24541f)
 #<!-- -->2 0x00007f41723fba68 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #<!-- -->3 0x00007f4171e92520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #<!-- -->4 0x00007f4171ee69fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #<!-- -->5 0x00007f4171ee69fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #<!-- -->6 0x00007f4171ee69fc pthread_kill ./nptl/pthread_kill.c:89:10
 #<!-- -->7 0x00007f4171e92476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #<!-- -->8 0x00007f4171e787f3 abort ./stdlib/abort.c:81:7
 #<!-- -->9 0x00007f4171e7871b _nl_load_domain ./intl/loadmsgcat.c:1177:9
#<!-- -->10 0x00007f4171e89e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
#<!-- -->11 0x00007f417730dbbf analyzeCallOperands(llvm::AArch64TargetLowering const&amp;, llvm::AArch64Subtarget const*, llvm::TargetLowering::CallLoweringInfo const&amp;, llvm::CCState&amp;) AArch64ISelLowering.cpp:0:0
#<!-- -->12 0x00007f417739f9aa llvm::AArch64TargetLowering::LowerCall(llvm::TargetLowering::CallLoweringInfo&amp;, llvm::SmallVectorImpl&lt;llvm::SDValue&gt;&amp;) const (/opt/llvm/bin/../lib/libLLVMAArch64CodeGen.so.18git+0x6a49aa)
#<!-- -->13 0x00007f41717eb86b llvm::TargetLowering::LowerCallTo(llvm::TargetLowering::CallLoweringInfo&amp;) const (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x35686b)
#<!-- -->14 0x00007f41717f8686 llvm::SelectionDAGBuilder::lowerInvokable(llvm::TargetLowering::CallLoweringInfo&amp;, llvm::BasicBlock const*) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x363686)
#<!-- -->15 0x00007f4171818e5e llvm::SelectionDAGBuilder::LowerCallTo(llvm::CallBase const&amp;, llvm::SDValue, bool, bool, llvm::BasicBlock const*) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x383e5e)
#<!-- -->16 0x00007f4171805ddd llvm::SelectionDAGBuilder::visitCall(llvm::CallInst const&amp;) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x370ddd)
#<!-- -->17 0x00007f4171839a9e llvm::SelectionDAGBuilder::visit(llvm::Instruction const&amp;) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x3a4a9e)
#<!-- -->18 0x00007f41718ca5b8 llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator&lt;llvm::ilist_detail::node_options&lt;llvm::Instruction, true, false, void&gt;, false, true&gt;, llvm::ilist_iterator&lt;llvm::ilist_detail::node_options&lt;llvm::Instruction, true, false, void&gt;, false, true&gt;, bool&amp;) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x4355b8)
#<!-- -->19 0x00007f41718cb2a8 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&amp;) (/opt/llvm/bin/../lib/../lib/libLLVMSelectionDAG.so.18git+0x4362a8)
#<!-- -->20 0x00007f41718cd114 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&amp;) (.part.0) SelectionDAGISel.cpp:0:0
#<!-- -->21 0x00007f4175a9bf07 llvm::MachineFunctionPass::runOnFunction(llvm::Function&amp;) (.part.0) MachineFunctionPass.cpp:0:0
#<!-- -->22 0x00007f417312488e llvm::FPPassManager::runOnFunction(llvm::Function&amp;) (/opt/llvm/bin/../lib/libLLVMCore.so.18git+0x49888e)
#<!-- -->23 0x00007f4173124ad9 llvm::FPPassManager::runOnModule(llvm::Module&amp;) (/opt/llvm/bin/../lib/libLLVMCore.so.18git+0x498ad9)
#<!-- -->24 0x00007f4173125415 llvm::legacy::PassManagerImpl::run(llvm::Module&amp;) (/opt/llvm/bin/../lib/libLLVMCore.so.18git+0x499415)
#<!-- -->25 0x00007f41761bb078 clang::EmitBackendOutput(clang::DiagnosticsEngine&amp;, clang::HeaderSearchOptions const&amp;, clang::CodeGenOptions const&amp;, clang::TargetOptions const&amp;, clang::LangOptions const&amp;, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr&lt;llvm::vfs::FileSystem&gt;, std::unique_ptr&lt;llvm::raw_pwrite_stream, std::default_delete&lt;llvm::raw_pwrite_stream&gt;&gt;) (/opt/llvm/bin/../lib/libclangCodeGen.so.18git+0x2d6078)
#<!-- -->26 0x00007f4176668c3a clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&amp;) CodeGenAction.cpp:0:0
#<!-- -->27 0x00007f416f7515f9 clang::ParseAST(clang::Sema&amp;, bool, bool) (/opt/llvm/bin/../lib/../lib/libclangParse.so.18git+0x685f9)
#<!-- -->28 0x00007f4174cbdca9 clang::FrontendAction::Execute() (/opt/llvm/bin/../lib/libclangFrontend.so.18git+0x199ca9)
#<!-- -->29 0x00007f4174c2eb15 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&amp;) (/opt/llvm/bin/../lib/libclangFrontend.so.18git+0x10ab15)
#<!-- -->30 0x00007f4176bb4e75 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/llvm/bin/../lib/libclangFrontendTool.so.18git+0x5e75)
#<!-- -->31 0x00005576f0c83e59 cc1_main(llvm::ArrayRef&lt;char const*&gt;, char const*, void*) (/opt/llvm/bin/clang-18+0x19e59)
#<!-- -->32 0x00005576f0c7b693 ExecuteCC1Tool(llvm::SmallVectorImpl&lt;char const*&gt;&amp;, llvm::ToolContext const&amp;) driver.cpp:0:0
#<!-- -->33 0x00007f417490c34d void llvm::function_ref&lt;void ()&gt;::callback_fn&lt;clang::driver::CC1Command::Execute(llvm::ArrayRef&lt;std::optional&lt;llvm::StringRef&gt;&gt;, std::__cxx11::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt;&gt;*, bool*) const::'lambda'()&gt;(long) Job.cpp:0:0
#<!-- -->34 0x00007f41723fbf30 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref&lt;void ()&gt;) (/opt/llvm/bin/../lib/libLLVMSupport.so.18git+0x149f30)
#<!-- -->35 0x00007f417490cbce clang::driver::CC1Command::Execute(llvm::ArrayRef&lt;std::optional&lt;llvm::StringRef&gt;&gt;, std::__cxx11::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt;&gt;*, bool*) const (.part.0) Job.cpp:0:0
#<!-- -->36 0x00007f41748cd29a clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&amp;, clang::driver::Command const*&amp;, bool) const (/opt/llvm/bin/../lib/libclangDriver.so.18git+0xff29a)
#<!-- -->37 0x00007f41748cdd6d clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&amp;, llvm::SmallVectorImpl&lt;std::pair&lt;int, clang::driver::Command const*&gt;&gt;&amp;, bool) const (/opt/llvm/bin/../lib/libclangDriver.so.18git+0xffd6d)
#<!-- -->38 0x00007f41748dcd34 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&amp;, llvm::SmallVectorImpl&lt;std::pair&lt;int, clang::driver::Command const*&gt;&gt;&amp;) (/opt/llvm/bin/../lib/libclangDriver.so.18git+0x10ed34)
#<!-- -->39 0x00005576f0c80bd0 clang_main(int, char**, llvm::ToolContext const&amp;) (/opt/llvm/bin/clang-18+0x16bd0)
#<!-- -->40 0x00005576f0c911a3 main (/opt/llvm/bin/clang-18+0x271a3)
#<!-- -->41 0x00007f4171e79d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#<!-- -->42 0x00007f4171e79e40 call_init ./csu/../csu/libc-start.c:128:20
#<!-- -->43 0x00007f4171e79e40 __libc_start_main ./csu/../csu/libc-start.c:379:5
#<!-- -->44 0x00005576f0c7a2d5 _start (/opt/llvm/bin/clang-18+0x102d5)
clang: error: clang frontend command failed with exit code 134 (use -v to see invocation)

Environment

The latest main branch (commit 97c9f9a) and at least 13.0.1, 14.0.6, 15.0.7, 16.0.6, 17.0.1 have this problem.

Cause

As described in #63025, the root cause is that ArgumentPromotion promotes an &lt;8 x float&gt; pointer argument to an &lt;8 x float&gt; value argument. There is no ABI for SVE VLS so we cannot legalize vector arguments which are larger then 128 bits.

Before ArgumentPromotionPass:

; *** IR Dump Before ArgumentPromotionPass on (callee) ***
; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknwon-linux-gnu"

; Function Attrs: nounwind uwtable vscale_range(2,2)
define dso_local void @<!-- -->caller(ptr noundef %src, ptr noundef %dst) local_unnamed_addr #<!-- -->0 {
entry:
  call fastcc void @<!-- -->callee(ptr noundef %src, ptr noundef %dst)
  ret void
}

; Function Attrs: noinline nounwind uwtable vscale_range(2,2)
define internal fastcc void @<!-- -->callee(ptr nocapture noundef readonly %src, ptr nocapture noundef writeonly %dst) unnamed_addr #<!-- -->1 {
entry:
  %0 = load &lt;8 x float&gt;, ptr %src, align 16, !tbaa !6
  store &lt;8 x float&gt; %0, ptr %dst, align 16, !tbaa !6
  ret void
}

attributes #<!-- -->0 = { nounwind uwtable vscale_range(2,2) "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+outline-atomics,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a,-fmv" }
attributes #<!-- -->1 = { noinline nounwind uwtable vscale_range(2,2) "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+outline-atomics,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a,-fmv" }

!llvm.module.flags = !{!0, !1, !2, !3, !4}
!llvm.ident = !{!5}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 8, !"PIC Level", i32 2}
!2 = !{i32 7, !"PIE Level", i32 2}
!3 = !{i32 7, !"uwtable", i32 2}
!4 = !{i32 7, !"frame-pointer", i32 1}
!5 = !{!"clang version 18.0.0"}
!6 = !{!7, !7, i64 0}
!7 = !{!"omnipotent char", !8, i64 0}
!8 = !{!"Simple C/C++ TBAA"}

After ArgumentPromotionPass:

; *** IR Dump After ArgumentPromotionPass on (callee) ***
; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknwon-linux-gnu"

; Function Attrs: nounwind uwtable vscale_range(2,2)
define dso_local void @<!-- -->caller(ptr noundef %src, ptr noundef %dst) local_unnamed_addr #<!-- -->0 {
entry:
  %src.val = load &lt;8 x float&gt;, ptr %src, align 16, !tbaa !6
  call fastcc void @<!-- -->callee(&lt;8 x float&gt; %src.val, ptr noundef %dst)
  ret void
}

; Function Attrs: noinline nounwind uwtable vscale_range(2,2)
define internal fastcc void @<!-- -->callee(&lt;8 x float&gt; %src.0.val, ptr nocapture noundef writeonly %dst) unnamed_addr #<!-- -->1 {
entry:
  store &lt;8 x float&gt; %src.0.val, ptr %dst, align 16, !tbaa !6
  ret void
}

attributes #<!-- -->0 = { nounwind uwtable vscale_range(2,2) "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+outline-atomics,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a,-fmv" }
attributes #<!-- -->1 = { noinline nounwind uwtable vscale_range(2,2) "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+outline-atomics,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a,-fmv" }

!llvm.module.flags = !{!0, !1, !2, !3, !4}
!llvm.ident = !{!5}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 8, !"PIC Level", i32 2}
!2 = !{i32 7, !"PIE Level", i32 2}
!3 = !{i32 7, !"uwtable", i32 2}
!4 = !{i32 7, !"frame-pointer", i32 1}
!5 = !{!"clang version 18.0.0"}
!6 = !{!7, !7, i64 0}
!7 = !{!"omnipotent char", !8, i64 0}
!8 = !{!"Simple C/C++ TBAA"}

Diff:

@@ -1,4 +1,4 @@
-; *** IR Dump Before ArgumentPromotionPass on (callee) ***
+; *** IR Dump After ArgumentPromotionPass on (callee) ***
 ; ModuleID = 'test.c'
 source_filename = "test.c"
 target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
@@ -7,15 +7,15 @@
 ; Function Attrs: nounwind uwtable vscale_range(2,2)
 define dso_local void @<!-- -->caller(ptr noundef %src, ptr noundef %dst) local_unnamed_addr #<!-- -->0 {
 entry:
-  call fastcc void @<!-- -->callee(ptr noundef %src, ptr noundef %dst)
+  %src.val = load &lt;8 x float&gt;, ptr %src, align 16, !tbaa !6
+  call fastcc void @<!-- -->callee(&lt;8 x float&gt; %src.val, ptr noundef %dst)
   ret void
 }
 
 ; Function Attrs: noinline nounwind uwtable vscale_range(2,2)
-define internal fastcc void @<!-- -->callee(ptr nocapture noundef readonly %src, ptr nocapture noundef writeonly %dst) unnamed_addr #<!-- -->1 {
+define internal fastcc void @<!-- -->callee(&lt;8 x float&gt; %src.0.val, ptr nocapture noundef writeonly %dst) unnamed_addr #<!-- -->1 {
 entry:
-  %0 = load &lt;8 x float&gt;, ptr %src, align 16, !tbaa !6
-  store &lt;8 x float&gt; %0, ptr %dst, align 16, !tbaa !6
+  store &lt;8 x float&gt; %src.0.val, ptr %dst, align 16, !tbaa !6
   ret void
 }
 

Possible Fix

Given that the AArch64 backend won't support vector arguments which are VLS larger then 128 bits, we have two options.

  1. Don't promote such arguments.
  2. Promote such arguments to scalable vectors and "bitcast" appropriately (as @paulwalker-arm said).

ArgumentPromotion has a hook to check ABI using TargetTransformInfo. X86 and PPC use this hook to prevent illegal argument promotions. So I'd like to take a similar approach for 1. above.

Are there any comments? If no objection, I'll post a PR.

@XChy
Copy link
Member

XChy commented Oct 16, 2023

I'm an expert in backend. But I think it may be related to #68953 where SLPVectorizer crashed. Could you take a look at it?

@kawashima-fj
Copy link
Member Author

@XChy I took a look at #68953 a bit (see #68953 (comment)) but unfortunately I think it is not related to this issue.

@XChy
Copy link
Member

XChy commented Oct 17, 2023

@XChy I took a look at #68953 a bit (see #68953 (comment)) but unfortunately I think it is not related to this issue.

That's OK. Thanks for your reduction!

@paulwalker-arm
Copy link
Collaborator

paulwalker-arm commented Oct 23, 2023

I have no objections. Implementing areTypesABICompatible for AArch64 seems sensible to me.

@kawashima-fj: You mentioned creating a PR. Do you have a timeframe? I'd quite like to get this fixed so there's time to back port it into LLVM-17. Does that match your timeline or should I find somebody else to run with a fix?

@kawashima-fj
Copy link
Member Author

@paulwalker-arm The patch is almost ready. I'm now running tests. I'll submit a PR today (in Japan timezone) hopefully.

kawashima-fj added a commit to kawashima-fj/pr-llvm-project that referenced this issue Oct 26, 2023
This patch prevents argument promotion from promoting pointers to
fixed-length vector types larger than 128 bits like `<8 x float>`
into the values of the pointees.

Such vector types are used for SVE VLS but there is no ABI for
SVE VLS arguments and the backend cannot lower such value arguments.

Fixes llvm#69147
kawashima-fj added a commit that referenced this issue Oct 26, 2023
…70034)

This patch prevents argument promotion from promoting pointers to
fixed-length vector types larger than 128 bits like `<8 x float>` into
the values of the pointees.

Such vector types are used for SVE VLS but there is no ABI for SVE VLS
arguments and the backend cannot lower such value arguments.

Fixes #69147
@kawashima-fj kawashima-fj added this to the LLVM 17.0.X Release milestone Oct 26, 2023
@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status Oct 26, 2023
@kawashima-fj
Copy link
Member Author

/cherry-pick 926173c

@llvmbot
Copy link
Member

llvmbot commented Oct 26, 2023

/branch llvm/llvm-project-release-prs/issue69147

@llvmbot
Copy link
Member

llvmbot commented Oct 26, 2023

/pull-request llvm/llvm-project-release-prs#749

zahiraam pushed a commit to zahiraam/llvm-project that referenced this issue Oct 26, 2023
…lvm#70034)

This patch prevents argument promotion from promoting pointers to
fixed-length vector types larger than 128 bits like `<8 x float>` into
the values of the pointees.

Such vector types are used for SVE VLS but there is no ABI for SVE VLS
arguments and the backend cannot lower such value arguments.

Fixes llvm#69147
@tru tru moved this from Needs Triage to Needs Review in LLVM Release Status Oct 27, 2023
tru pushed a commit that referenced this issue Oct 27, 2023
…70034)

This patch prevents argument promotion from promoting pointers to
fixed-length vector types larger than 128 bits like `<8 x float>` into
the values of the pointees.

Such vector types are used for SVE VLS but there is no ABI for SVE VLS
arguments and the backend cannot lower such value arguments.

Fixes #69147

(cherry picked from commit 926173c)
@tru tru moved this from Needs Review to Done in LLVM Release Status Oct 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 llvm:ir SVE ARM Scalable Vector Extensions
Projects
4 participants