[AArch64] Implement intrinsics for SME2 FSCALE #100128
Conversation
@llvm/pr-subscribers-clang @llvm/pr-subscribers-backend-aarch64
Author: None (Lukacma)
Changes: This patch implements the FSCALE intrinsics, in their single- and multi-vector forms (cf. ARM-software/acle#323).
Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
Patch is 72.34 KiB, truncated to 20.00 KiB below; full version: https://github.com/llvm/llvm-project/pull/100128.diff
5 Files Affected:
diff --git a/clang/include/clang/Basic/arm_sve.td b/clang/include/clang/Basic/arm_sve.td
index 94c093d891156..2fb93e091aa5c 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -2377,6 +2377,16 @@ let SVETargetGuard = InvalidMode, SMETargetGuard = "sme2" in {
def SVUUNPK_X4 : SInst<"svunpk_{d}[_{3}_x4]", "42.h", "UsUiUl", MergeNone, "aarch64_sve_uunpk_x4", [IsStreaming], []>;
}
+//
+// Multi-vector scaling
+//
+let SVETargetGuard = InvalidMode, SMETargetGuard = "sme2,fp8" in {
+ def FSCALE_SINGLE_X2 : Inst<"svscale[_single_{d}_x2]", "22x", "fhd", MergeNone, "aarch64_sme_fp8_scale_single_x2", [IsStreaming],[]>;
+ def FSCALE_SINGLE_X4 : Inst<"svscale[_single_{d}_x4]", "44x", "fhd", MergeNone, "aarch64_sme_fp8_scale_single_x4", [IsStreaming],[]>;
+ def FSCALE_X2 : Inst<"svscale[_{d}_x2]", "222.x", "fhd", MergeNone, "aarch64_sme_fp8_scale_x2", [IsStreaming],[]>;
+ def FSCALE_X4 : Inst<"svscale[_{d}_x4]", "444.x", "fhd", MergeNone, "aarch64_sme_fp8_scale_x4", [IsStreaming],[]>;
+}
+
let SVETargetGuard = "sve2p1", SMETargetGuard = "sme2" in {
// == BFloat16 multiply-subtract ==
def SVBFMLSLB : SInst<"svbfmlslb[_{d}]", "dd$$", "f", MergeNone, "aarch64_sve_bfmlslb", [IsOverloadNone, VerifyRuntimeMode], []>;
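Read against the tests in the next file, the defs above should surface in arm_sme.h as declarations roughly like the following. This is a sketch, not the generated header: the trailing `x` in the prototype strings is taken to denote a signed-integer vector with the same element count as the float type, and the multi-vector `zm` tuple types are inferred from the ACLE proposal rather than from TableGen documentation.

```c
// Approximate ACLE declarations for the f16 case; the "fhd" type string
// in the defs above yields matching f32 (svint32_t) and f64 (svint64_t)
// variants as well.
svfloat16x2_t svscale_single_f16_x2(svfloat16x2_t zd, svint16_t zm) __arm_streaming;
svfloat16x4_t svscale_single_f16_x4(svfloat16x4_t zd, svint16_t zm) __arm_streaming;
svfloat16x2_t svscale_f16_x2(svfloat16x2_t zd, svint16x2_t zm) __arm_streaming;
svfloat16x4_t svscale_f16_x4(svfloat16x4_t zd, svint16x4_t zm) __arm_streaming;
```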
diff --git a/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c b/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
new file mode 100644
index 0000000000000..b733e772ba307
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
@@ -0,0 +1,452 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: aarch64-registered-target
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S -passes=mem2reg,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -passes=mem2reg,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S -passes=mem2reg,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -passes=mem2reg,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+#include <arm_sme.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1,A2_UNUSED) A1
+#else
+#define SVE_ACLE_FUNC(A1,A2) A1##A2
+#endif
+
+
+// Single x2
+// CHECK-LABEL: @test_svscale_single_f16_x2(
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1]], i64 8)
+// CHECK-NEXT: [[TMP2:%.*]] = tail call { <vscale x 8 x half>, <vscale x 8 x half> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv8f16(<vscale x 8 x half> [[TMP0]], <vscale x 8 x half> [[TMP1]], <vscale x 8 x i16> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP3:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half> } [[TMP2]], 0
+// CHECK-NEXT: [[TMP4:%.*]] = tail call <vscale x 16 x half> @llvm.vector.insert.nxv16f16.nxv8f16(<vscale x 16 x half> poison, <vscale x 8 x half> [[TMP3]], i64 0)
+// CHECK-NEXT: [[TMP5:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half> } [[TMP2]], 1
+// CHECK-NEXT: [[TMP6:%.*]] = tail call <vscale x 16 x half> @llvm.vector.insert.nxv16f16.nxv8f16(<vscale x 16 x half> [[TMP4]], <vscale x 8 x half> [[TMP5]], i64 8)
+// CHECK-NEXT: ret <vscale x 16 x half> [[TMP6]]
+//
+// CPP-CHECK-LABEL: @_Z26test_svscale_single_f16_x213svfloat16x2_tu11__SVInt16_t(
+// CPP-CHECK-NEXT: entry:
+// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
+// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1]], i64 8)
+// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call { <vscale x 8 x half>, <vscale x 8 x half> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv8f16(<vscale x 8 x half> [[TMP0]], <vscale x 8 x half> [[TMP1]], <vscale x 8 x i16> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP3:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half> } [[TMP2]], 0
+// CPP-CHECK-NEXT: [[TMP4:%.*]] = tail call <vscale x 16 x half> @llvm.vector.insert.nxv16f16.nxv8f16(<vscale x 16 x half> poison, <vscale x 8 x half> [[TMP3]], i64 0)
+// CPP-CHECK-NEXT: [[TMP5:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half> } [[TMP2]], 1
+// CPP-CHECK-NEXT: [[TMP6:%.*]] = tail call <vscale x 16 x half> @llvm.vector.insert.nxv16f16.nxv8f16(<vscale x 16 x half> [[TMP4]], <vscale x 8 x half> [[TMP5]], i64 8)
+// CPP-CHECK-NEXT: ret <vscale x 16 x half> [[TMP6]]
+//
+svfloat16x2_t test_svscale_single_f16_x2(svfloat16x2_t op1, svint16_t op2) __arm_streaming
+{
+ return SVE_ACLE_FUNC(svscale,_single_f16_x2)(op1, op2);
+}
+
+// CHECK-LABEL: @test_svscale_single_f32_x2(
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1:%.*]], i64 0)
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1]], i64 4)
+// CHECK-NEXT: [[TMP2:%.*]] = tail call { <vscale x 4 x float>, <vscale x 4 x float> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv4f32(<vscale x 4 x float> [[TMP0]], <vscale x 4 x float> [[TMP1]], <vscale x 4 x i32> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP3:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float> } [[TMP2]], 0
+// CHECK-NEXT: [[TMP4:%.*]] = tail call <vscale x 8 x float> @llvm.vector.insert.nxv8f32.nxv4f32(<vscale x 8 x float> poison, <vscale x 4 x float> [[TMP3]], i64 0)
+// CHECK-NEXT: [[TMP5:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float> } [[TMP2]], 1
+// CHECK-NEXT: [[TMP6:%.*]] = tail call <vscale x 8 x float> @llvm.vector.insert.nxv8f32.nxv4f32(<vscale x 8 x float> [[TMP4]], <vscale x 4 x float> [[TMP5]], i64 4)
+// CHECK-NEXT: ret <vscale x 8 x float> [[TMP6]]
+//
+// CPP-CHECK-LABEL: @_Z26test_svscale_single_f32_x213svfloat32x2_tu11__SVInt32_t(
+// CPP-CHECK-NEXT: entry:
+// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1:%.*]], i64 0)
+// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1]], i64 4)
+// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call { <vscale x 4 x float>, <vscale x 4 x float> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv4f32(<vscale x 4 x float> [[TMP0]], <vscale x 4 x float> [[TMP1]], <vscale x 4 x i32> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP3:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float> } [[TMP2]], 0
+// CPP-CHECK-NEXT: [[TMP4:%.*]] = tail call <vscale x 8 x float> @llvm.vector.insert.nxv8f32.nxv4f32(<vscale x 8 x float> poison, <vscale x 4 x float> [[TMP3]], i64 0)
+// CPP-CHECK-NEXT: [[TMP5:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float> } [[TMP2]], 1
+// CPP-CHECK-NEXT: [[TMP6:%.*]] = tail call <vscale x 8 x float> @llvm.vector.insert.nxv8f32.nxv4f32(<vscale x 8 x float> [[TMP4]], <vscale x 4 x float> [[TMP5]], i64 4)
+// CPP-CHECK-NEXT: ret <vscale x 8 x float> [[TMP6]]
+//
+svfloat32x2_t test_svscale_single_f32_x2(svfloat32x2_t op1, svint32_t op2) __arm_streaming
+{
+ return SVE_ACLE_FUNC(svscale,_single_f32_x2)(op1, op2);
+}
+
+// CHECK-LABEL: @test_svscale_single_f64_x2(
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1:%.*]], i64 0)
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1]], i64 2)
+// CHECK-NEXT: [[TMP2:%.*]] = tail call { <vscale x 2 x double>, <vscale x 2 x double> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv2f64(<vscale x 2 x double> [[TMP0]], <vscale x 2 x double> [[TMP1]], <vscale x 2 x i64> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP3:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[TMP2]], 0
+// CHECK-NEXT: [[TMP4:%.*]] = tail call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> poison, <vscale x 2 x double> [[TMP3]], i64 0)
+// CHECK-NEXT: [[TMP5:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[TMP2]], 1
+// CHECK-NEXT: [[TMP6:%.*]] = tail call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> [[TMP4]], <vscale x 2 x double> [[TMP5]], i64 2)
+// CHECK-NEXT: ret <vscale x 4 x double> [[TMP6]]
+//
+// CPP-CHECK-LABEL: @_Z26test_svscale_single_f64_x213svfloat64x2_tu11__SVInt64_t(
+// CPP-CHECK-NEXT: entry:
+// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1:%.*]], i64 0)
+// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1]], i64 2)
+// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call { <vscale x 2 x double>, <vscale x 2 x double> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv2f64(<vscale x 2 x double> [[TMP0]], <vscale x 2 x double> [[TMP1]], <vscale x 2 x i64> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP3:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[TMP2]], 0
+// CPP-CHECK-NEXT: [[TMP4:%.*]] = tail call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> poison, <vscale x 2 x double> [[TMP3]], i64 0)
+// CPP-CHECK-NEXT: [[TMP5:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[TMP2]], 1
+// CPP-CHECK-NEXT: [[TMP6:%.*]] = tail call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> [[TMP4]], <vscale x 2 x double> [[TMP5]], i64 2)
+// CPP-CHECK-NEXT: ret <vscale x 4 x double> [[TMP6]]
+//
+svfloat64x2_t test_svscale_single_f64_x2(svfloat64x2_t op1, svint64_t op2) __arm_streaming
+{
+ return SVE_ACLE_FUNC(svscale,_single_f64_x2)(op1, op2);
+}
+
+// Single x4
+// CHECK-LABEL: @test_svscale_single_f16_x4(
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1:%.*]], i64 0)
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 8)
+// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 16)
+// CHECK-NEXT: [[TMP3:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 24)
+// CHECK-NEXT: [[TMP4:%.*]] = tail call { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } @llvm.aarch64.sme.fp8.scale.single.x4.nxv8f16(<vscale x 8 x half> [[TMP0]], <vscale x 8 x half> [[TMP1]], <vscale x 8 x half> [[TMP2]], <vscale x 8 x half> [[TMP3]], <vscale x 8 x i16> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP5:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 0
+// CHECK-NEXT: [[TMP6:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> poison, <vscale x 8 x half> [[TMP5]], i64 0)
+// CHECK-NEXT: [[TMP7:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 1
+// CHECK-NEXT: [[TMP8:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP6]], <vscale x 8 x half> [[TMP7]], i64 8)
+// CHECK-NEXT: [[TMP9:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 2
+// CHECK-NEXT: [[TMP10:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP8]], <vscale x 8 x half> [[TMP9]], i64 16)
+// CHECK-NEXT: [[TMP11:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 3
+// CHECK-NEXT: [[TMP12:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP10]], <vscale x 8 x half> [[TMP11]], i64 24)
+// CHECK-NEXT: ret <vscale x 32 x half> [[TMP12]]
+//
+// CPP-CHECK-LABEL: @_Z26test_svscale_single_f16_x413svfloat16x4_tu11__SVInt16_t(
+// CPP-CHECK-NEXT: entry:
+// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1:%.*]], i64 0)
+// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 8)
+// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 16)
+// CPP-CHECK-NEXT: [[TMP3:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 24)
+// CPP-CHECK-NEXT: [[TMP4:%.*]] = tail call { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } @llvm.aarch64.sme.fp8.scale.single.x4.nxv8f16(<vscale x 8 x half> [[TMP0]], <vscale x 8 x half> [[TMP1]], <vscale x 8 x half> [[TMP2]], <vscale x 8 x half> [[TMP3]], <vscale x 8 x i16> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP5:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 0
+// CPP-CHECK-NEXT: [[TMP6:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> poison, <vscale x 8 x half> [[TMP5]], i64 0)
+// CPP-CHECK-NEXT: [[TMP7:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 1
+// CPP-CHECK-NEXT: [[TMP8:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP6]], <vscale x 8 x half> [[TMP7]], i64 8)
+// CPP-CHECK-NEXT: [[TMP9:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 2
+// CPP-CHECK-NEXT: [[TMP10:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP8]], <vscale x 8 x half> [[TMP9]], i64 16)
+// CPP-CHECK-NEXT: [[TMP11:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 3
+// CPP-CHECK-NEXT: [[TMP12:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP10]], <vscale x 8 x half> [[TMP11]], i64 24)
+// CPP-CHECK-NEXT: ret <vscale x 32 x half> [[TMP12]]
+//
+svfloat16x4_t test_svscale_single_f16_x4(svfloat16x4_t op1, svint16_t op2) __arm_streaming
+{
+ return SVE_ACLE_FUNC(svscale,_single_f16_x4)(op1, op2);
+}
+
+// CHECK-LABEL: @test_svscale_single_f32_x4(
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1:%.*]], i64 0)
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 4)
+// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 8)
+// CHECK-NEXT: [[TMP3:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 12)
+// CHECK-NEXT: [[TMP4:%.*]] = tail call { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } @llvm.aarch64.sme.fp8.scale.single.x4.nxv4f32(<vscale x 4 x float> [[TMP0]], <vscale x 4 x float> [[TMP1]], <vscale x 4 x float> [[TMP2]], <vscale x 4 x float> [[TMP3]], <vscale x 4 x i32> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP5:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 0
+// CHECK-NEXT: [[TMP6:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> poison, <vscale x 4 x float> [[TMP5]], i64 0)
+// CHECK-NEXT: [[TMP7:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 1
+// CHECK-NEXT: [[TMP8:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP6]], <vscale x 4 x float> [[TMP7]], i64 4)
+// CHECK-NEXT: [[TMP9:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 2
+// CHECK-NEXT: [[TMP10:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP8]], <vscale x 4 x float> [[TMP9]], i64 8)
+// CHECK-NEXT: [[TMP11:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 3
+// CHECK-NEXT: [[TMP12:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP10]], <vscale x 4 x float> [[TMP11]], i64 12)
+// CHECK-NEXT: ret <vscale x 16 x float> [[TMP12]]
+//
+// CPP-CHECK-LABEL: @_Z26test_svscale_single_f32_x413svfloat32x4_tu11__SVInt32_t(
+// CPP-CHECK-NEXT: entry:
+// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1:%.*]], i64 0)
+// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 4)
+// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 8)
+// CPP-CHECK-NEXT: [[TMP3:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 12)
+// CPP-CHECK-NEXT: [[TMP4:%.*]] = tail call { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } @llvm.aarch64.sme.fp8.scale.single.x4.nxv4f32(<vscale x 4 x float> [[TMP0]], <vscale x 4 x float> [[TMP1]], <vscale x 4 x float> [[TMP2]], <vscale x 4 x float> [[TMP3]], <vscale x 4 x i32> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP5:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 0
+// CPP-CHECK-NEXT: [[TMP6:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> poison, <vscale x 4 x float> [[TMP5]], i64 0)
+// CPP-CHECK-NEXT: [[TMP7:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 1
+// CPP-CHECK-NEXT: [[TMP8:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP6]], <vscale x 4 x float> [[TMP7]], i64 4)
+// CPP-CHECK-NEXT: [[TMP9:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 2
+// CPP-CHECK-NEXT: [[TMP10:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP8]], <vscale x 4 x float> [[TMP9]], i64 8)
+// CPP-CHECK-NEXT: [[TMP11:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 3
+// CPP-CHECK-NEXT: [[TMP12:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP10]], <vscale x 4 x float> [...
[truncated]
Thanks, LGTM.
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/6495
@Lukacma, the test you added, CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c, is failing on at least two bots; can you take a look, and revert if you need time to investigate?
There is a fix for it ready in #109999. Just waiting for the build to finish before merging.
The test is broken on trunk. It won't get more broken if you merge the fix. So please merge, and if it helps, great.
Thanks for merging the fix! At least on one of my bots, it helped and it's green again :) |
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/10045
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/56/builds/8373
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/60/builds/8620
This patch implements these intrinsics: FSCALE (single and multi).

```
// Variants are also available for:
// [_single_f32_x2], [_single_f64_x2],
// [_single_f16_x4], [_single_f32_x4], [_single_f64_x4]
svfloat16x2_t svscale[_single_f16_x2](svfloat16x2_t zd, svint16_t zm) __arm_streaming;

// Variants are also available for:
// [_f32_x2], [_f64_x2],
// [_f16_x4], [_f32_x4], [_f64_x4]
svfloat16x2_t svscale[_f16_x2](svfloat16x2_t zd, svint16x2_t zm) __arm_streaming;
```

(cf. ARM-software/acle#323)

Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
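As a usage illustration, here is a minimal sketch of calling the single and multi forms from streaming code, assuming a toolchain with SME2 and FP8 enabled; the function names are hypothetical, and the signatures follow the tests in this patch.

```c
#include <arm_sme.h>

// Single form: one vector of integer exponents scales every vector in
// the tuple (here, both halves of an f16 pair), as FSCALE does.
svfloat16x2_t scale_pair(svfloat16x2_t vals, svint16_t exps) __arm_streaming {
  return svscale_single_f16_x2(vals, exps);
}

// Multi form: each vector in the tuple is scaled by the corresponding
// exponent vector from a matching tuple.
svfloat16x2_t scale_pair_multi(svfloat16x2_t vals, svint16x2_t exps) __arm_streaming {
  return svscale_f16_x2(vals, exps);
}
```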