Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64] Implement intrinsics for SME2 FSCALE #100128

Merged
merged 5 commits into from
Sep 25, 2024
Merged

Conversation

Lukacma
Copy link
Contributor

@Lukacma Lukacma commented Jul 23, 2024

This patch implements these intrinsics:

FSCALE SINGLE AND MULTI

  // Variants are also available for:
  // [_single_f32_x2], [_single_f64_x2],
  // [_single_f16_x4], [_single_f32_x4], [_single_f64_x4]
  svfloat16x2_t svscale[_single_f16_x2](svfloat16x2_t zd, svfloat16_t zm) __arm_streaming;

  // Variants are also available for:
  //  [_f32_x2], [_f64_x2],
  //  [_f16_x4], [_f32_x4], [_f64_x4]
  svfloat16x2_t svscale[_f16_x2](svfloat16x2_t zd, svfloat16x2_t zm) __arm_streaming

(cf. ARM-software/acle#323)

Co-authored-by: Caroline Concatto caroline.concatto@arm.com

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AArch64 clang:frontend Language frontend issues, e.g. anything involving "Sema" llvm:ir labels Jul 23, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 23, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-aarch64

Author: None (Lukacma)

Changes

This patch implements these intrinsics:

FSCALE SINGLE AND MULTI

  // Variants are also available for:
  // [_single_f32_x2], [_single_f64_x2],
  // [_single_f16_x4], [_single_f32_x4], [_single_f64_x4]
  svfloat16x2_t svscale[_single_f16_x2](svfloat16x2_t zd, svfloat16_t zm) __arm_streaming;

  // Variants are also available for:
  //  [_f32_x2], [_f64_x2],
  //  [_f16_x4], [_f32_x4], [_f64_x4]
  svfloat16x2_t svscale[_f16_x2](svfloat16x2_t zd, svfloat16x2_t zm) __arm_streaming

(cf. ARM-software/acle#323)

Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>


Patch is 72.34 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/100128.diff

5 Files Affected:

  • (modified) clang/include/clang/Basic/arm_sve.td (+10)
  • (added) clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c (+452)
  • (modified) llvm/include/llvm/IR/IntrinsicsAArch64.td (+25)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp (+28)
  • (added) llvm/test/CodeGen/AArch64/sme2-intrinsics-fscale.ll (+186)
diff --git a/clang/include/clang/Basic/arm_sve.td b/clang/include/clang/Basic/arm_sve.td
index 94c093d891156..2fb93e091aa5c 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -2377,6 +2377,16 @@ let SVETargetGuard = InvalidMode, SMETargetGuard = "sme2" in {
   def SVUUNPK_X4 : SInst<"svunpk_{d}[_{3}_x4]", "42.h", "UsUiUl", MergeNone, "aarch64_sve_uunpk_x4", [IsStreaming], []>;
 }
 
+//
+// Multi-vector scaling
+//
+let  SVETargetGuard = InvalidMode, SMETargetGuard = "sme2,fp8" in {
+  def FSCALE_SINGLE_X2 : Inst<"svscale[_single_{d}_x2]", "22x", "fhd", MergeNone, "aarch64_sme_fp8_scale_single_x2", [IsStreaming],[]>;
+  def FSCALE_SINGLE_X4 : Inst<"svscale[_single_{d}_x4]", "44x", "fhd", MergeNone, "aarch64_sme_fp8_scale_single_x4", [IsStreaming],[]>;
+  def FSCALE_X2 : Inst<"svscale[_{d}_x2]", "222.x", "fhd", MergeNone, "aarch64_sme_fp8_scale_x2", [IsStreaming],[]>;
+  def FSCALE_X4 : Inst<"svscale[_{d}_x4]", "444.x", "fhd", MergeNone, "aarch64_sme_fp8_scale_x4", [IsStreaming],[]>;
+}
+
 let SVETargetGuard = "sve2p1", SMETargetGuard = "sme2" in {
 // == BFloat16 multiply-subtract ==
   def SVBFMLSLB : SInst<"svbfmlslb[_{d}]", "dd$$", "f", MergeNone, "aarch64_sve_bfmlslb", [IsOverloadNone, VerifyRuntimeMode], []>;
diff --git a/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c b/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
new file mode 100644
index 0000000000000..b733e772ba307
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
@@ -0,0 +1,452 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: aarch64-registered-target
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S -passes=mem2reg,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -passes=mem2reg,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S -passes=mem2reg,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -passes=mem2reg,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+#include <arm_sme.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1,A2_UNUSED) A1
+#else
+#define SVE_ACLE_FUNC(A1,A2) A1##A2
+#endif
+
+
+// Single x2
+// CHECK-LABEL: @test_svscale_single_f16_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
+// CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1]], i64 8)
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call { <vscale x 8 x half>, <vscale x 8 x half> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv8f16(<vscale x 8 x half> [[TMP0]], <vscale x 8 x half> [[TMP1]], <vscale x 8 x i16> [[OP2:%.*]])
+// CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half> } [[TMP2]], 0
+// CHECK-NEXT:    [[TMP4:%.*]] = tail call <vscale x 16 x half> @llvm.vector.insert.nxv16f16.nxv8f16(<vscale x 16 x half> poison, <vscale x 8 x half> [[TMP3]], i64 0)
+// CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half> } [[TMP2]], 1
+// CHECK-NEXT:    [[TMP6:%.*]] = tail call <vscale x 16 x half> @llvm.vector.insert.nxv16f16.nxv8f16(<vscale x 16 x half> [[TMP4]], <vscale x 8 x half> [[TMP5]], i64 8)
+// CHECK-NEXT:    ret <vscale x 16 x half> [[TMP6]]
+//
+// CPP-CHECK-LABEL: @_Z26test_svscale_single_f16_x213svfloat16x2_tu11__SVInt16_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1]], i64 8)
+// CPP-CHECK-NEXT:    [[TMP2:%.*]] = tail call { <vscale x 8 x half>, <vscale x 8 x half> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv8f16(<vscale x 8 x half> [[TMP0]], <vscale x 8 x half> [[TMP1]], <vscale x 8 x i16> [[OP2:%.*]])
+// CPP-CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half> } [[TMP2]], 0
+// CPP-CHECK-NEXT:    [[TMP4:%.*]] = tail call <vscale x 16 x half> @llvm.vector.insert.nxv16f16.nxv8f16(<vscale x 16 x half> poison, <vscale x 8 x half> [[TMP3]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half> } [[TMP2]], 1
+// CPP-CHECK-NEXT:    [[TMP6:%.*]] = tail call <vscale x 16 x half> @llvm.vector.insert.nxv16f16.nxv8f16(<vscale x 16 x half> [[TMP4]], <vscale x 8 x half> [[TMP5]], i64 8)
+// CPP-CHECK-NEXT:    ret <vscale x 16 x half> [[TMP6]]
+//
+svfloat16x2_t test_svscale_single_f16_x2(svfloat16x2_t op1, svint16_t op2) __arm_streaming
+{
+  return SVE_ACLE_FUNC(svscale,_single_f16_x2)(op1, op2);
+}
+
+// CHECK-LABEL: @test_svscale_single_f32_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1:%.*]], i64 0)
+// CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1]], i64 4)
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call { <vscale x 4 x float>, <vscale x 4 x float> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv4f32(<vscale x 4 x float> [[TMP0]], <vscale x 4 x float> [[TMP1]], <vscale x 4 x i32> [[OP2:%.*]])
+// CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float> } [[TMP2]], 0
+// CHECK-NEXT:    [[TMP4:%.*]] = tail call <vscale x 8 x float> @llvm.vector.insert.nxv8f32.nxv4f32(<vscale x 8 x float> poison, <vscale x 4 x float> [[TMP3]], i64 0)
+// CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float> } [[TMP2]], 1
+// CHECK-NEXT:    [[TMP6:%.*]] = tail call <vscale x 8 x float> @llvm.vector.insert.nxv8f32.nxv4f32(<vscale x 8 x float> [[TMP4]], <vscale x 4 x float> [[TMP5]], i64 4)
+// CHECK-NEXT:    ret <vscale x 8 x float> [[TMP6]]
+//
+// CPP-CHECK-LABEL: @_Z26test_svscale_single_f32_x213svfloat32x2_tu11__SVInt32_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1:%.*]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1]], i64 4)
+// CPP-CHECK-NEXT:    [[TMP2:%.*]] = tail call { <vscale x 4 x float>, <vscale x 4 x float> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv4f32(<vscale x 4 x float> [[TMP0]], <vscale x 4 x float> [[TMP1]], <vscale x 4 x i32> [[OP2:%.*]])
+// CPP-CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float> } [[TMP2]], 0
+// CPP-CHECK-NEXT:    [[TMP4:%.*]] = tail call <vscale x 8 x float> @llvm.vector.insert.nxv8f32.nxv4f32(<vscale x 8 x float> poison, <vscale x 4 x float> [[TMP3]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float> } [[TMP2]], 1
+// CPP-CHECK-NEXT:    [[TMP6:%.*]] = tail call <vscale x 8 x float> @llvm.vector.insert.nxv8f32.nxv4f32(<vscale x 8 x float> [[TMP4]], <vscale x 4 x float> [[TMP5]], i64 4)
+// CPP-CHECK-NEXT:    ret <vscale x 8 x float> [[TMP6]]
+//
+svfloat32x2_t test_svscale_single_f32_x2(svfloat32x2_t op1, svint32_t op2) __arm_streaming
+{
+  return SVE_ACLE_FUNC(svscale,_single_f32_x2)(op1, op2);
+}
+
+// CHECK-LABEL: @test_svscale_single_f64_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1:%.*]], i64 0)
+// CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1]], i64 2)
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call { <vscale x 2 x double>, <vscale x 2 x double> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv2f64(<vscale x 2 x double> [[TMP0]], <vscale x 2 x double> [[TMP1]], <vscale x 2 x i64> [[OP2:%.*]])
+// CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[TMP2]], 0
+// CHECK-NEXT:    [[TMP4:%.*]] = tail call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> poison, <vscale x 2 x double> [[TMP3]], i64 0)
+// CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[TMP2]], 1
+// CHECK-NEXT:    [[TMP6:%.*]] = tail call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> [[TMP4]], <vscale x 2 x double> [[TMP5]], i64 2)
+// CHECK-NEXT:    ret <vscale x 4 x double> [[TMP6]]
+//
+// CPP-CHECK-LABEL: @_Z26test_svscale_single_f64_x213svfloat64x2_tu11__SVInt64_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1:%.*]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1]], i64 2)
+// CPP-CHECK-NEXT:    [[TMP2:%.*]] = tail call { <vscale x 2 x double>, <vscale x 2 x double> } @llvm.aarch64.sme.fp8.scale.single.x2.nxv2f64(<vscale x 2 x double> [[TMP0]], <vscale x 2 x double> [[TMP1]], <vscale x 2 x i64> [[OP2:%.*]])
+// CPP-CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[TMP2]], 0
+// CPP-CHECK-NEXT:    [[TMP4:%.*]] = tail call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> poison, <vscale x 2 x double> [[TMP3]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[TMP2]], 1
+// CPP-CHECK-NEXT:    [[TMP6:%.*]] = tail call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> [[TMP4]], <vscale x 2 x double> [[TMP5]], i64 2)
+// CPP-CHECK-NEXT:    ret <vscale x 4 x double> [[TMP6]]
+//
+svfloat64x2_t test_svscale_single_f64_x2(svfloat64x2_t op1, svint64_t op2) __arm_streaming
+{
+  return SVE_ACLE_FUNC(svscale,_single_f64_x2)(op1, op2);
+}
+
+// Single x4
+// CHECK-LABEL: @test_svscale_single_f16_x4(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1:%.*]], i64 0)
+// CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 8)
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 16)
+// CHECK-NEXT:    [[TMP3:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 24)
+// CHECK-NEXT:    [[TMP4:%.*]] = tail call { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } @llvm.aarch64.sme.fp8.scale.single.x4.nxv8f16(<vscale x 8 x half> [[TMP0]], <vscale x 8 x half> [[TMP1]], <vscale x 8 x half> [[TMP2]], <vscale x 8 x half> [[TMP3]], <vscale x 8 x i16> [[OP2:%.*]])
+// CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 0
+// CHECK-NEXT:    [[TMP6:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> poison, <vscale x 8 x half> [[TMP5]], i64 0)
+// CHECK-NEXT:    [[TMP7:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 1
+// CHECK-NEXT:    [[TMP8:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP6]], <vscale x 8 x half> [[TMP7]], i64 8)
+// CHECK-NEXT:    [[TMP9:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 2
+// CHECK-NEXT:    [[TMP10:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP8]], <vscale x 8 x half> [[TMP9]], i64 16)
+// CHECK-NEXT:    [[TMP11:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 3
+// CHECK-NEXT:    [[TMP12:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP10]], <vscale x 8 x half> [[TMP11]], i64 24)
+// CHECK-NEXT:    ret <vscale x 32 x half> [[TMP12]]
+//
+// CPP-CHECK-LABEL: @_Z26test_svscale_single_f16_x413svfloat16x4_tu11__SVInt16_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1:%.*]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 8)
+// CPP-CHECK-NEXT:    [[TMP2:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 16)
+// CPP-CHECK-NEXT:    [[TMP3:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1]], i64 24)
+// CPP-CHECK-NEXT:    [[TMP4:%.*]] = tail call { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } @llvm.aarch64.sme.fp8.scale.single.x4.nxv8f16(<vscale x 8 x half> [[TMP0]], <vscale x 8 x half> [[TMP1]], <vscale x 8 x half> [[TMP2]], <vscale x 8 x half> [[TMP3]], <vscale x 8 x i16> [[OP2:%.*]])
+// CPP-CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 0
+// CPP-CHECK-NEXT:    [[TMP6:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> poison, <vscale x 8 x half> [[TMP5]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP7:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 1
+// CPP-CHECK-NEXT:    [[TMP8:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP6]], <vscale x 8 x half> [[TMP7]], i64 8)
+// CPP-CHECK-NEXT:    [[TMP9:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 2
+// CPP-CHECK-NEXT:    [[TMP10:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP8]], <vscale x 8 x half> [[TMP9]], i64 16)
+// CPP-CHECK-NEXT:    [[TMP11:%.*]] = extractvalue { <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x half> } [[TMP4]], 3
+// CPP-CHECK-NEXT:    [[TMP12:%.*]] = tail call <vscale x 32 x half> @llvm.vector.insert.nxv32f16.nxv8f16(<vscale x 32 x half> [[TMP10]], <vscale x 8 x half> [[TMP11]], i64 24)
+// CPP-CHECK-NEXT:    ret <vscale x 32 x half> [[TMP12]]
+//
+svfloat16x4_t test_svscale_single_f16_x4(svfloat16x4_t op1, svint16_t op2) __arm_streaming
+{
+  return SVE_ACLE_FUNC(svscale,_single_f16_x4)(op1, op2);
+}
+
+// CHECK-LABEL: @test_svscale_single_f32_x4(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1:%.*]], i64 0)
+// CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 4)
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 8)
+// CHECK-NEXT:    [[TMP3:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 12)
+// CHECK-NEXT:    [[TMP4:%.*]] = tail call { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } @llvm.aarch64.sme.fp8.scale.single.x4.nxv4f32(<vscale x 4 x float> [[TMP0]], <vscale x 4 x float> [[TMP1]], <vscale x 4 x float> [[TMP2]], <vscale x 4 x float> [[TMP3]], <vscale x 4 x i32> [[OP2:%.*]])
+// CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 0
+// CHECK-NEXT:    [[TMP6:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> poison, <vscale x 4 x float> [[TMP5]], i64 0)
+// CHECK-NEXT:    [[TMP7:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 1
+// CHECK-NEXT:    [[TMP8:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP6]], <vscale x 4 x float> [[TMP7]], i64 4)
+// CHECK-NEXT:    [[TMP9:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 2
+// CHECK-NEXT:    [[TMP10:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP8]], <vscale x 4 x float> [[TMP9]], i64 8)
+// CHECK-NEXT:    [[TMP11:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 3
+// CHECK-NEXT:    [[TMP12:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP10]], <vscale x 4 x float> [[TMP11]], i64 12)
+// CHECK-NEXT:    ret <vscale x 16 x float> [[TMP12]]
+//
+// CPP-CHECK-LABEL: @_Z26test_svscale_single_f32_x413svfloat32x4_tu11__SVInt32_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1:%.*]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 4)
+// CPP-CHECK-NEXT:    [[TMP2:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 8)
+// CPP-CHECK-NEXT:    [[TMP3:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1]], i64 12)
+// CPP-CHECK-NEXT:    [[TMP4:%.*]] = tail call { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } @llvm.aarch64.sme.fp8.scale.single.x4.nxv4f32(<vscale x 4 x float> [[TMP0]], <vscale x 4 x float> [[TMP1]], <vscale x 4 x float> [[TMP2]], <vscale x 4 x float> [[TMP3]], <vscale x 4 x i32> [[OP2:%.*]])
+// CPP-CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 0
+// CPP-CHECK-NEXT:    [[TMP6:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> poison, <vscale x 4 x float> [[TMP5]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP7:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 1
+// CPP-CHECK-NEXT:    [[TMP8:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP6]], <vscale x 4 x float> [[TMP7]], i64 4)
+// CPP-CHECK-NEXT:    [[TMP9:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 2
+// CPP-CHECK-NEXT:    [[TMP10:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP8]], <vscale x 4 x float> [[TMP9]], i64 8)
+// CPP-CHECK-NEXT:    [[TMP11:%.*]] = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } [[TMP4]], 3
+// CPP-CHECK-NEXT:    [[TMP12:%.*]] = tail call <vscale x 16 x float> @llvm.vector.insert.nxv16f32.nxv4f32(<vscale x 16 x float> [[TMP10]], <vscale x 4 x float> [...
[truncated]

Copy link
Contributor

@SpencerAbson SpencerAbson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM.

@Lukacma Lukacma merged commit 02c138f into llvm:main Sep 25, 2024
5 of 8 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 25, 2024

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-4 while building clang,llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/6495

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 3: /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/clang -cc1 -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/build/lib/clang/20/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c | /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/opt -S -passes=mem2reg,tailcallelim | /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/clang -cc1 -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/build/lib/clang/20/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/opt -S -passes=mem2reg,tailcallelim
�[1m/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:21:16: �[0m�[0;1;31merror: �[0m�[1mCHECK-NEXT: expected string not found in input
�[0m// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
�[0;1;32m               ^
�[0m�[1m<stdin>:8:7: �[0m�[0;1;30mnote: �[0m�[1mscanning from here
�[0mentry:
�[0;1;32m      ^
�[0m�[1m/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:48:16: �[0m�[0;1;31merror: �[0m�[1mCHECK-NEXT: expected string not found in input
�[0m// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1:%.*]], i64 0)
�[0;1;32m               ^
�[0m�[1m<stdin>:22:7: �[0m�[0;1;30mnote: �[0m�[1mscanning from here
�[0mentry:
�[0;1;32m      ^
�[0m�[1m/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:75:16: �[0m�[0;1;31merror: �[0m�[1mCHECK-NEXT: expected string not found in input
�[0m// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1:%.*]], i64 0)
�[0;1;32m               ^
�[0m�[1m<stdin>:36:7: �[0m�[0;1;30mnote: �[0m�[1mscanning from here
�[0mentry:
�[0;1;32m      ^
�[0m�[1m/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:103:16: �[0m�[0;1;31merror: �[0m�[1mCHECK-NEXT: expected string not found in input
�[0m// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1:%.*]], i64 0)
�[0;1;32m               ^
�[0m�[1m<stdin>:50:7: �[0m�[0;1;30mnote: �[0m�[1mscanning from here
�[0mentry:
�[0;1;32m      ^
�[0m�[1m/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:142:16: �[0m�[0;1;31merror: �[0m�[1mCHECK-NEXT: expected string not found in input
�[0m// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1:%.*]], i64 0)
�[0;1;32m               ^
�[0m�[1m<stdin>:68:7: �[0m�[0;1;30mnote: �[0m�[1mscanning from here
�[0mentry:
�[0;1;32m      ^
�[0m�[1m/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:181:16: �[0m�[0;1;31merror: �[0m�[1mCHECK-NEXT: expected string not found in input
�[0m// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv8f64(<vscale x 8 x double> [[OP1:%.*]], i64 0)
�[0;1;32m               ^
�[0m�[1m<stdin>:86:7: �[0m�[0;1;30mnote: �[0m�[1mscanning from here
�[0mentry:
�[0;1;32m      ^
�[0m�[1m/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:221:16: �[0m�[0;1;31merror: �[0m�[1mCHECK-NEXT: expected string not found in input
�[0m// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
�[0;1;32m               ^
�[0m�[1m<stdin>:104:7: �[0m�[0;1;30mnote: �[0m�[1mscanning from here
�[0mentry:
...

@dyung
Copy link
Collaborator

dyung commented Sep 25, 2024

@Lukacma the test you added, CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c, is failing on at least 2 bots, can you take a look and revert if you need time to investigate?

@Lukacma
Copy link
Contributor Author

Lukacma commented Sep 25, 2024

There is a fix for it ready in #109999. Just waiting for the build to finish before merging.

@nico
Copy link
Contributor

nico commented Sep 25, 2024

The test is broken on trunk. It won't get more broken if you merge the fix. So please merge, and if it helps, great.

@nico
Copy link
Contributor

nico commented Sep 25, 2024

Thanks for merging the fix! At least on one of my bots, it helped and it's green again :)

@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 25, 2024

LLVM Buildbot has detected a new failure on builder premerge-monolithic-linux running on premerge-linux-1 while building clang,llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/10045

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 3: /build/buildbot/premerge-monolithic-linux/build/bin/clang -cc1 -internal-isystem /build/buildbot/premerge-monolithic-linux/build/lib/clang/20/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - /build/buildbot/premerge-monolithic-linux/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c | /build/buildbot/premerge-monolithic-linux/build/bin/opt -S -passes=mem2reg,tailcallelim | /build/buildbot/premerge-monolithic-linux/build/bin/FileCheck /build/buildbot/premerge-monolithic-linux/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
+ /build/buildbot/premerge-monolithic-linux/build/bin/clang -cc1 -internal-isystem /build/buildbot/premerge-monolithic-linux/build/lib/clang/20/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - /build/buildbot/premerge-monolithic-linux/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
+ /build/buildbot/premerge-monolithic-linux/build/bin/opt -S -passes=mem2reg,tailcallelim
+ /build/buildbot/premerge-monolithic-linux/build/bin/FileCheck /build/buildbot/premerge-monolithic-linux/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
/build/buildbot/premerge-monolithic-linux/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:21:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
               ^
<stdin>:8:7: note: scanning from here
entry:
      ^
/build/buildbot/premerge-monolithic-linux/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:48:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1:%.*]], i64 0)
               ^
<stdin>:22:7: note: scanning from here
entry:
      ^
/build/buildbot/premerge-monolithic-linux/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:75:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1:%.*]], i64 0)
               ^
<stdin>:36:7: note: scanning from here
entry:
      ^
/build/buildbot/premerge-monolithic-linux/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:103:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1:%.*]], i64 0)
               ^
<stdin>:50:7: note: scanning from here
entry:
      ^
/build/buildbot/premerge-monolithic-linux/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:142:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1:%.*]], i64 0)
               ^
<stdin>:68:7: note: scanning from here
entry:
      ^
/build/buildbot/premerge-monolithic-linux/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:181:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv8f64(<vscale x 8 x double> [[OP1:%.*]], i64 0)
               ^
<stdin>:86:7: note: scanning from here
entry:
      ^
/build/buildbot/premerge-monolithic-linux/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:221:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
               ^
<stdin>:104:7: note: scanning from here
entry:
...

@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 26, 2024

LLVM Buildbot has detected a new failure on builder clang-x86_64-debian-fast running on gribozavr4 while building clang,llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/56/builds/8373

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 3: /b/1/clang-x86_64-debian-fast/llvm.obj/bin/clang -cc1 -internal-isystem /b/1/clang-x86_64-debian-fast/llvm.obj/lib/clang/20/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - /b/1/clang-x86_64-debian-fast/llvm.src/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c | /b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt -S -passes=mem2reg,tailcallelim | /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/clang -cc1 -internal-isystem /b/1/clang-x86_64-debian-fast/llvm.obj/lib/clang/20/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - /b/1/clang-x86_64-debian-fast/llvm.src/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt -S -passes=mem2reg,tailcallelim
/b/1/clang-x86_64-debian-fast/llvm.src/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:21:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
               ^
<stdin>:8:7: note: scanning from here
entry:
      ^
/b/1/clang-x86_64-debian-fast/llvm.src/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:48:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1:%.*]], i64 0)
               ^
<stdin>:22:7: note: scanning from here
entry:
      ^
/b/1/clang-x86_64-debian-fast/llvm.src/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:75:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1:%.*]], i64 0)
               ^
<stdin>:36:7: note: scanning from here
entry:
      ^
/b/1/clang-x86_64-debian-fast/llvm.src/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:103:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1:%.*]], i64 0)
               ^
<stdin>:50:7: note: scanning from here
entry:
      ^
/b/1/clang-x86_64-debian-fast/llvm.src/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:142:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1:%.*]], i64 0)
               ^
<stdin>:68:7: note: scanning from here
entry:
      ^
/b/1/clang-x86_64-debian-fast/llvm.src/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:181:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv8f64(<vscale x 8 x double> [[OP1:%.*]], i64 0)
               ^
<stdin>:86:7: note: scanning from here
entry:
      ^
/b/1/clang-x86_64-debian-fast/llvm.src/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:221:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
               ^
<stdin>:104:7: note: scanning from here
entry:
...

@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 26, 2024

LLVM Buildbot has detected a new failure on builder llvm-x86_64-debian-dylib running on gribozavr4 while building clang,llvm at step 6 "test-build-unified-tree-check-clang".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/60/builds/8620

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-clang) failure: test (failure)
******************** TEST 'Clang :: CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 3: /b/1/llvm-x86_64-debian-dylib/build/bin/clang -cc1 -internal-isystem /b/1/llvm-x86_64-debian-dylib/build/lib/clang/20/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - /b/1/llvm-x86_64-debian-dylib/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c | /b/1/llvm-x86_64-debian-dylib/build/bin/opt -S -passes=mem2reg,tailcallelim | /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
+ /b/1/llvm-x86_64-debian-dylib/build/bin/opt -S -passes=mem2reg,tailcallelim
+ /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
+ /b/1/llvm-x86_64-debian-dylib/build/bin/clang -cc1 -internal-isystem /b/1/llvm-x86_64-debian-dylib/build/lib/clang/20/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2 -target-feature +fp8 -disable-O0-optnone -Werror -Wall -emit-llvm -o - /b/1/llvm-x86_64-debian-dylib/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c
/b/1/llvm-x86_64-debian-dylib/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:21:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
               ^
<stdin>:8:7: note: scanning from here
entry:
      ^
/b/1/llvm-x86_64-debian-dylib/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:48:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv8f32(<vscale x 8 x float> [[OP1:%.*]], i64 0)
               ^
<stdin>:22:7: note: scanning from here
entry:
      ^
/b/1/llvm-x86_64-debian-dylib/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:75:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv4f64(<vscale x 4 x double> [[OP1:%.*]], i64 0)
               ^
<stdin>:36:7: note: scanning from here
entry:
      ^
/b/1/llvm-x86_64-debian-dylib/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:103:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv32f16(<vscale x 32 x half> [[OP1:%.*]], i64 0)
               ^
<stdin>:50:7: note: scanning from here
entry:
      ^
/b/1/llvm-x86_64-debian-dylib/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:142:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.vector.extract.nxv4f32.nxv16f32(<vscale x 16 x float> [[OP1:%.*]], i64 0)
               ^
<stdin>:68:7: note: scanning from here
entry:
      ^
/b/1/llvm-x86_64-debian-dylib/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:181:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.vector.extract.nxv2f64.nxv8f64(<vscale x 8 x double> [[OP1:%.*]], i64 0)
               ^
<stdin>:86:7: note: scanning from here
entry:
      ^
/b/1/llvm-x86_64-debian-dylib/llvm-project/clang/test/CodeGen/aarch64-fp8-intrinsics/acle_sme2_fp8_scale.c:221:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.vector.extract.nxv8f16.nxv16f16(<vscale x 16 x half> [[OP1:%.*]], i64 0)
               ^
<stdin>:104:7: note: scanning from here
entry:
...

augusto2112 pushed a commit to augusto2112/llvm-project that referenced this pull request Sep 26, 2024
This patch implements these intrinsics:

FSCALE SINGLE AND MULTI
``` 
  // Variants are also available for:
  // [_single_f32_x2], [_single_f64_x2],
  // [_single_f16_x4], [_single_f32_x4], [_single_f64_x4]
  svfloat16x2_t svscale[_single_f16_x2](svfloat16x2_t zd, svfloat16_t zm) __arm_streaming;

  // Variants are also available for:
  //  [_f32_x2], [_f64_x2],
  //  [_f16_x4], [_f32_x4], [_f64_x4]
  svfloat16x2_t svscale[_f16_x2](svfloat16x2_t zd, svfloat16x2_t zm) __arm_streaming

```
(cf. ARM-software/acle#323)

Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Oct 4, 2024
This patch implements these intrinsics:

FSCALE SINGLE AND MULTI
``` 
  // Variants are also available for:
  // [_single_f32_x2], [_single_f64_x2],
  // [_single_f16_x4], [_single_f32_x4], [_single_f64_x4]
  svfloat16x2_t svscale[_single_f16_x2](svfloat16x2_t zd, svfloat16_t zm) __arm_streaming;

  // Variants are also available for:
  //  [_f32_x2], [_f64_x2],
  //  [_f16_x4], [_f32_x4], [_f64_x4]
  svfloat16x2_t svscale[_f16_x2](svfloat16x2_t zd, svfloat16x2_t zm) __arm_streaming

```
(cf. ARM-software/acle#323)

Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category llvm:ir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants