[RISCV] Defer forming x0,x0 vsetvlis till after insertion
Stacked on llvm#96200

Currently, emitVSETVLIs tries to detect when the VL doesn't change between two vsetvlis and, if so, inserts a VL-preserving vsetvli x0,x0 then and there.
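
For reference, the x0,x0 form is the VL-preserving variant of vsetvli: it writes a new vtype while leaving the current VL untouched, so it can only be used where the VL provably wouldn't change anyway. A minimal sketch of the two forms, with operands chosen purely for illustration:

  vsetivli zero, 2, e16, mf2, ta, ma    # sets VL = 2 and a new vtype explicitly
  vsetvli  zero, zero, e16, m1, ta, ma  # x0,x0 form: sets a new vtype, keeps the current VL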

Doing it in situ has some drawbacks:

- We lose information about what the VL is, which can prevent doLocalPostpass from coalescing some vsetvlis further down the line
- We have to explicitly handle x0,x0-form vsetvlis in coalesceVSETVLIs, whereas we don't in the top-down passes
- This prevents us from sharing the VSETVLIInfo compatibility logic between the two, which is why we have canMutatePriorConfig

This patch changes emitVSETVLIs to just emit regular vsetvlis, and adds a separate pass after coalesceVSETVLIs to convert vsetvlis to x0,x0 when possible.

By removing the edge cases needed to handle x0,x0s, we can unify how we check vsetvli compatibility between coalesceVSETVLIs and emitVSETVLIs, and remove the duplicated logic in areCompatibleVTYPEs and canMutatePriorConfig.

Note that when converting to x0,x0, we reuse the block data computed by the dataflow analysis even though the conversion takes place after coalesceVSETVLIs. This turns out to be fine, since coalesceVSETVLIs never changes a block's exit state (only the local state within the block), so the entry states stay the same too.
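
To make the new conversion step concrete, here is a standalone sketch of the idea (hypothetical types and helpers, not the code in this patch): starting from the block's entry VL computed by the dataflow analysis, any vsetvli whose resulting VL equals the VL already in effect can be rewritten into the x0,x0 form.

  // Standalone sketch only -- not the patch's actual code.
  #include <cstdint>
  #include <cstdio>
  #include <optional>
  #include <vector>

  struct VSetVLI {
    uint64_t AVL;        // requested vector length (assumed <= VLMAX here)
    bool IsX0X0 = false; // set once converted to the VL-preserving form
  };

  // Hypothetical model of the VL a vsetvli produces; the real pass also has to
  // reason about vtype/VLMAX, but this sketch assumes AVL <= VLMAX so VL == AVL.
  static uint64_t resultingVL(const VSetVLI &V) { return V.AVL; }

  // EntryVL is the block's entry state, reused from the dataflow analysis.
  static void convertToX0X0(std::vector<VSetVLI> &Block,
                            std::optional<uint64_t> EntryVL) {
    std::optional<uint64_t> CurVL = EntryVL;
    for (VSetVLI &V : Block) {
      uint64_t NewVL = resultingVL(V);
      if (CurVL && *CurVL == NewVL)
        V.IsX0X0 = true; // VL is unchanged: only vtype needs to be written
      CurVL = NewVL;
    }
  }

  int main() {
    std::vector<VSetVLI> Block = {{2}, {2}, {4}};
    convertToX0X0(Block, std::nullopt);
    for (const VSetVLI &V : Block)
      std::printf("AVL=%llu x0,x0=%d\n",
                  static_cast<unsigned long long>(V.AVL), int(V.IsX0X0));
    return 0;
  }

In this toy example only the second vsetvli is converted, since it is the only one whose VL is provably the same as the VL already in effect.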
lukel97 committed Jun 20, 2024
1 parent c9b4345 commit d9cd801
Showing 8 changed files with 352 additions and 577 deletions.
339 changes: 155 additions & 184 deletions llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp

Large diffs are not rendered by default.

54 changes: 18 additions & 36 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-expandload-fp.ll
@@ -51,9 +51,8 @@ define <2 x half> @expandload_v2f16(ptr %base, <2 x half> %src0, <2 x i1> %mask)
; RV32-NEXT: beqz a1, .LBB1_2
; RV32-NEXT: .LBB1_4: # %cond.load1
; RV32-NEXT: flh fa5, 0(a0)
; RV32-NEXT: vsetvli zero, zero, e16, m2, ta, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vsetivli zero, 2, e16, mf4, ta, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vslideup.vi v8, v9, 1
; RV32-NEXT: ret
;
@@ -77,9 +76,8 @@ define <2 x half> @expandload_v2f16(ptr %base, <2 x half> %src0, <2 x i1> %mask)
; RV64-NEXT: beqz a1, .LBB1_2
; RV64-NEXT: .LBB1_4: # %cond.load1
; RV64-NEXT: flh fa5, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e16, m2, ta, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vsetivli zero, 2, e16, mf4, ta, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vslideup.vi v8, v9, 1
; RV64-NEXT: ret
%res = call <2 x half> @llvm.masked.expandload.v2f16(ptr align 2 %base, <2 x i1> %mask, <2 x half> %src0)
@@ -114,9 +112,8 @@ define <4 x half> @expandload_v4f16(ptr %base, <4 x half> %src0, <4 x i1> %mask)
; RV32-NEXT: beqz a2, .LBB2_2
; RV32-NEXT: .LBB2_6: # %cond.load1
; RV32-NEXT: flh fa5, 0(a0)
; RV32-NEXT: vsetvli zero, zero, e16, m2, ta, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vsetivli zero, 2, e16, mf2, tu, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vslideup.vi v8, v9, 1
; RV32-NEXT: addi a0, a0, 2
; RV32-NEXT: andi a2, a1, 4
@@ -162,9 +159,8 @@ define <4 x half> @expandload_v4f16(ptr %base, <4 x half> %src0, <4 x i1> %mask)
; RV64-NEXT: beqz a2, .LBB2_2
; RV64-NEXT: .LBB2_6: # %cond.load1
; RV64-NEXT: flh fa5, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e16, m2, ta, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vsetivli zero, 2, e16, mf2, tu, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vslideup.vi v8, v9, 1
; RV64-NEXT: addi a0, a0, 2
; RV64-NEXT: andi a2, a1, 4
@@ -227,9 +223,8 @@ define <8 x half> @expandload_v8f16(ptr %base, <8 x half> %src0, <8 x i1> %mask)
; RV32-NEXT: beqz a2, .LBB3_2
; RV32-NEXT: .LBB3_10: # %cond.load1
; RV32-NEXT: flh fa5, 0(a0)
; RV32-NEXT: vsetvli zero, zero, e16, m2, ta, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vsetivli zero, 2, e16, m1, tu, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vslideup.vi v8, v9, 1
; RV32-NEXT: addi a0, a0, 2
; RV32-NEXT: andi a2, a1, 4
@@ -319,9 +314,8 @@ define <8 x half> @expandload_v8f16(ptr %base, <8 x half> %src0, <8 x i1> %mask)
; RV64-NEXT: beqz a2, .LBB3_2
; RV64-NEXT: .LBB3_10: # %cond.load1
; RV64-NEXT: flh fa5, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e16, m2, ta, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vsetivli zero, 2, e16, m1, tu, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vslideup.vi v8, v9, 1
; RV64-NEXT: addi a0, a0, 2
; RV64-NEXT: andi a2, a1, 4
@@ -425,9 +419,8 @@ define <2 x float> @expandload_v2f32(ptr %base, <2 x float> %src0, <2 x i1> %mas
; RV32-NEXT: beqz a1, .LBB5_2
; RV32-NEXT: .LBB5_4: # %cond.load1
; RV32-NEXT: flw fa5, 0(a0)
; RV32-NEXT: vsetvli zero, zero, e32, m4, ta, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vsetivli zero, 2, e32, mf2, ta, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vslideup.vi v8, v9, 1
; RV32-NEXT: ret
;
@@ -451,9 +444,8 @@ define <2 x float> @expandload_v2f32(ptr %base, <2 x float> %src0, <2 x i1> %mas
; RV64-NEXT: beqz a1, .LBB5_2
; RV64-NEXT: .LBB5_4: # %cond.load1
; RV64-NEXT: flw fa5, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e32, m4, ta, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vsetivli zero, 2, e32, mf2, ta, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vslideup.vi v8, v9, 1
; RV64-NEXT: ret
%res = call <2 x float> @llvm.masked.expandload.v2f32(ptr align 4 %base, <2 x i1> %mask, <2 x float> %src0)
@@ -488,9 +480,8 @@ define <4 x float> @expandload_v4f32(ptr %base, <4 x float> %src0, <4 x i1> %mas
; RV32-NEXT: beqz a2, .LBB6_2
; RV32-NEXT: .LBB6_6: # %cond.load1
; RV32-NEXT: flw fa5, 0(a0)
; RV32-NEXT: vsetvli zero, zero, e32, m4, ta, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vsetivli zero, 2, e32, m1, tu, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vslideup.vi v8, v9, 1
; RV32-NEXT: addi a0, a0, 4
; RV32-NEXT: andi a2, a1, 4
@@ -536,9 +527,8 @@ define <4 x float> @expandload_v4f32(ptr %base, <4 x float> %src0, <4 x i1> %mas
; RV64-NEXT: beqz a2, .LBB6_2
; RV64-NEXT: .LBB6_6: # %cond.load1
; RV64-NEXT: flw fa5, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e32, m4, ta, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vsetivli zero, 2, e32, m1, tu, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vslideup.vi v8, v9, 1
; RV64-NEXT: addi a0, a0, 4
; RV64-NEXT: andi a2, a1, 4
@@ -601,9 +591,8 @@ define <8 x float> @expandload_v8f32(ptr %base, <8 x float> %src0, <8 x i1> %mas
; RV32-NEXT: beqz a2, .LBB7_2
; RV32-NEXT: .LBB7_10: # %cond.load1
; RV32-NEXT: flw fa5, 0(a0)
; RV32-NEXT: vsetvli zero, zero, e32, m4, ta, ma
; RV32-NEXT: vfmv.s.f v10, fa5
; RV32-NEXT: vsetivli zero, 2, e32, m1, tu, ma
; RV32-NEXT: vfmv.s.f v10, fa5
; RV32-NEXT: vslideup.vi v8, v10, 1
; RV32-NEXT: addi a0, a0, 4
; RV32-NEXT: andi a2, a1, 4
@@ -693,9 +682,8 @@ define <8 x float> @expandload_v8f32(ptr %base, <8 x float> %src0, <8 x i1> %mas
; RV64-NEXT: beqz a2, .LBB7_2
; RV64-NEXT: .LBB7_10: # %cond.load1
; RV64-NEXT: flw fa5, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e32, m4, ta, ma
; RV64-NEXT: vfmv.s.f v10, fa5
; RV64-NEXT: vsetivli zero, 2, e32, m1, tu, ma
; RV64-NEXT: vfmv.s.f v10, fa5
; RV64-NEXT: vslideup.vi v8, v10, 1
; RV64-NEXT: addi a0, a0, 4
; RV64-NEXT: andi a2, a1, 4
@@ -799,9 +787,8 @@ define <2 x double> @expandload_v2f64(ptr %base, <2 x double> %src0, <2 x i1> %m
; RV32-NEXT: beqz a1, .LBB9_2
; RV32-NEXT: .LBB9_4: # %cond.load1
; RV32-NEXT: fld fa5, 0(a0)
; RV32-NEXT: vsetvli zero, zero, e64, m8, ta, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vsetivli zero, 2, e64, m1, ta, ma
; RV32-NEXT: vfmv.s.f v9, fa5
; RV32-NEXT: vslideup.vi v8, v9, 1
; RV32-NEXT: ret
;
@@ -825,9 +812,8 @@ define <2 x double> @expandload_v2f64(ptr %base, <2 x double> %src0, <2 x i1> %m
; RV64-NEXT: beqz a1, .LBB9_2
; RV64-NEXT: .LBB9_4: # %cond.load1
; RV64-NEXT: fld fa5, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e64, m8, ta, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vsetivli zero, 2, e64, m1, ta, ma
; RV64-NEXT: vfmv.s.f v9, fa5
; RV64-NEXT: vslideup.vi v8, v9, 1
; RV64-NEXT: ret
%res = call <2 x double> @llvm.masked.expandload.v2f64(ptr align 8 %base, <2 x i1> %mask, <2 x double> %src0)
@@ -862,9 +848,8 @@ define <4 x double> @expandload_v4f64(ptr %base, <4 x double> %src0, <4 x i1> %m
; RV32-NEXT: beqz a2, .LBB10_2
; RV32-NEXT: .LBB10_6: # %cond.load1
; RV32-NEXT: fld fa5, 0(a0)
; RV32-NEXT: vsetvli zero, zero, e64, m8, ta, ma
; RV32-NEXT: vfmv.s.f v10, fa5
; RV32-NEXT: vsetivli zero, 2, e64, m1, tu, ma
; RV32-NEXT: vfmv.s.f v10, fa5
; RV32-NEXT: vslideup.vi v8, v10, 1
; RV32-NEXT: addi a0, a0, 8
; RV32-NEXT: andi a2, a1, 4
@@ -910,9 +895,8 @@ define <4 x double> @expandload_v4f64(ptr %base, <4 x double> %src0, <4 x i1> %m
; RV64-NEXT: beqz a2, .LBB10_2
; RV64-NEXT: .LBB10_6: # %cond.load1
; RV64-NEXT: fld fa5, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e64, m8, ta, ma
; RV64-NEXT: vfmv.s.f v10, fa5
; RV64-NEXT: vsetivli zero, 2, e64, m1, tu, ma
; RV64-NEXT: vfmv.s.f v10, fa5
; RV64-NEXT: vslideup.vi v8, v10, 1
; RV64-NEXT: addi a0, a0, 8
; RV64-NEXT: andi a2, a1, 4
@@ -975,9 +959,8 @@ define <8 x double> @expandload_v8f64(ptr %base, <8 x double> %src0, <8 x i1> %m
; RV32-NEXT: beqz a2, .LBB11_2
; RV32-NEXT: .LBB11_10: # %cond.load1
; RV32-NEXT: fld fa5, 0(a0)
; RV32-NEXT: vsetvli zero, zero, e64, m8, ta, ma
; RV32-NEXT: vfmv.s.f v12, fa5
; RV32-NEXT: vsetivli zero, 2, e64, m1, tu, ma
; RV32-NEXT: vfmv.s.f v12, fa5
; RV32-NEXT: vslideup.vi v8, v12, 1
; RV32-NEXT: addi a0, a0, 8
; RV32-NEXT: andi a2, a1, 4
@@ -1067,9 +1050,8 @@ define <8 x double> @expandload_v8f64(ptr %base, <8 x double> %src0, <8 x i1> %m
; RV64-NEXT: beqz a2, .LBB11_2
; RV64-NEXT: .LBB11_10: # %cond.load1
; RV64-NEXT: fld fa5, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e64, m8, ta, ma
; RV64-NEXT: vfmv.s.f v12, fa5
; RV64-NEXT: vsetivli zero, 2, e64, m1, tu, ma
; RV64-NEXT: vfmv.s.f v12, fa5
; RV64-NEXT: vslideup.vi v8, v12, 1
; RV64-NEXT: addi a0, a0, 8
; RV64-NEXT: andi a2, a1, 4
36 changes: 12 additions & 24 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-expandload-int.ll
@@ -40,9 +40,8 @@ define <2 x i8> @expandload_v2i8(ptr %base, <2 x i8> %src0, <2 x i1> %mask) {
; CHECK-NEXT: beqz a1, .LBB1_2
; CHECK-NEXT: .LBB1_4: # %cond.load1
; CHECK-NEXT: lbu a0, 0(a0)
; CHECK-NEXT: vsetvli zero, zero, e8, m1, ta, ma
; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vsetivli zero, 2, e8, mf8, ta, ma
; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vslideup.vi v8, v9, 1
; CHECK-NEXT: ret
%res = call <2 x i8> @llvm.masked.expandload.v2i8(ptr %base, <2 x i1> %mask, <2 x i8> %src0)
@@ -77,9 +76,8 @@ define <4 x i8> @expandload_v4i8(ptr %base, <4 x i8> %src0, <4 x i1> %mask) {
; CHECK-NEXT: beqz a2, .LBB2_2
; CHECK-NEXT: .LBB2_6: # %cond.load1
; CHECK-NEXT: lbu a2, 0(a0)
; CHECK-NEXT: vsetvli zero, zero, e8, m1, ta, ma
; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vsetivli zero, 2, e8, mf4, tu, ma
; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vslideup.vi v8, v9, 1
; CHECK-NEXT: addi a0, a0, 1
; CHECK-NEXT: andi a2, a1, 4
@@ -142,9 +140,8 @@ define <8 x i8> @expandload_v8i8(ptr %base, <8 x i8> %src0, <8 x i1> %mask) {
; CHECK-NEXT: beqz a2, .LBB3_2
; CHECK-NEXT: .LBB3_10: # %cond.load1
; CHECK-NEXT: lbu a2, 0(a0)
; CHECK-NEXT: vsetvli zero, zero, e8, m1, ta, ma
; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vsetivli zero, 2, e8, mf2, tu, ma
; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vslideup.vi v8, v9, 1
; CHECK-NEXT: addi a0, a0, 1
; CHECK-NEXT: andi a2, a1, 4
@@ -237,9 +234,8 @@ define <2 x i16> @expandload_v2i16(ptr %base, <2 x i16> %src0, <2 x i1> %mask) {
; CHECK-NEXT: beqz a1, .LBB5_2
; CHECK-NEXT: .LBB5_4: # %cond.load1
; CHECK-NEXT: lh a0, 0(a0)
; CHECK-NEXT: vsetvli zero, zero, e16, m2, ta, ma
; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vsetivli zero, 2, e16, mf4, ta, ma
; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vslideup.vi v8, v9, 1
; CHECK-NEXT: ret
%res = call <2 x i16> @llvm.masked.expandload.v2i16(ptr align 2 %base, <2 x i1> %mask, <2 x i16> %src0)
@@ -274,9 +270,8 @@ define <4 x i16> @expandload_v4i16(ptr %base, <4 x i16> %src0, <4 x i1> %mask) {
; CHECK-NEXT: beqz a2, .LBB6_2
; CHECK-NEXT: .LBB6_6: # %cond.load1
; CHECK-NEXT: lh a2, 0(a0)
; CHECK-NEXT: vsetvli zero, zero, e16, m2, ta, ma
; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vsetivli zero, 2, e16, mf2, tu, ma
; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vslideup.vi v8, v9, 1
; CHECK-NEXT: addi a0, a0, 2
; CHECK-NEXT: andi a2, a1, 4
@@ -339,9 +334,8 @@ define <8 x i16> @expandload_v8i16(ptr %base, <8 x i16> %src0, <8 x i1> %mask) {
; CHECK-NEXT: beqz a2, .LBB7_2
; CHECK-NEXT: .LBB7_10: # %cond.load1
; CHECK-NEXT: lh a2, 0(a0)
; CHECK-NEXT: vsetvli zero, zero, e16, m2, ta, ma
; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vsetivli zero, 2, e16, m1, tu, ma
; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vslideup.vi v8, v9, 1
; CHECK-NEXT: addi a0, a0, 2
; CHECK-NEXT: andi a2, a1, 4
@@ -434,9 +428,8 @@ define <2 x i32> @expandload_v2i32(ptr %base, <2 x i32> %src0, <2 x i1> %mask) {
; CHECK-NEXT: beqz a1, .LBB9_2
; CHECK-NEXT: .LBB9_4: # %cond.load1
; CHECK-NEXT: lw a0, 0(a0)
; CHECK-NEXT: vsetvli zero, zero, e32, m4, ta, ma
; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vsetivli zero, 2, e32, mf2, ta, ma
; CHECK-NEXT: vmv.s.x v9, a0
; CHECK-NEXT: vslideup.vi v8, v9, 1
; CHECK-NEXT: ret
%res = call <2 x i32> @llvm.masked.expandload.v2i32(ptr align 4 %base, <2 x i1> %mask, <2 x i32> %src0)
@@ -471,9 +464,8 @@ define <4 x i32> @expandload_v4i32(ptr %base, <4 x i32> %src0, <4 x i1> %mask) {
; CHECK-NEXT: beqz a2, .LBB10_2
; CHECK-NEXT: .LBB10_6: # %cond.load1
; CHECK-NEXT: lw a2, 0(a0)
; CHECK-NEXT: vsetvli zero, zero, e32, m4, ta, ma
; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vsetivli zero, 2, e32, m1, tu, ma
; CHECK-NEXT: vmv.s.x v9, a2
; CHECK-NEXT: vslideup.vi v8, v9, 1
; CHECK-NEXT: addi a0, a0, 4
; CHECK-NEXT: andi a2, a1, 4
@@ -536,9 +528,8 @@ define <8 x i32> @expandload_v8i32(ptr %base, <8 x i32> %src0, <8 x i1> %mask) {
; CHECK-NEXT: beqz a2, .LBB11_2
; CHECK-NEXT: .LBB11_10: # %cond.load1
; CHECK-NEXT: lw a2, 0(a0)
; CHECK-NEXT: vsetvli zero, zero, e32, m4, ta, ma
; CHECK-NEXT: vmv.s.x v10, a2
; CHECK-NEXT: vsetivli zero, 2, e32, m1, tu, ma
; CHECK-NEXT: vmv.s.x v10, a2
; CHECK-NEXT: vslideup.vi v8, v10, 1
; CHECK-NEXT: addi a0, a0, 4
; CHECK-NEXT: andi a2, a1, 4
@@ -680,9 +671,8 @@ define <2 x i64> @expandload_v2i64(ptr %base, <2 x i64> %src0, <2 x i1> %mask) {
; RV64-NEXT: beqz a1, .LBB13_2
; RV64-NEXT: .LBB13_4: # %cond.load1
; RV64-NEXT: ld a0, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e64, m8, ta, ma
; RV64-NEXT: vmv.s.x v9, a0
; RV64-NEXT: vsetivli zero, 2, e64, m1, ta, ma
; RV64-NEXT: vmv.s.x v9, a0
; RV64-NEXT: vslideup.vi v8, v9, 1
; RV64-NEXT: ret
%res = call <2 x i64> @llvm.masked.expandload.v2i64(ptr align 8 %base, <2 x i1> %mask, <2 x i64> %src0)
@@ -775,9 +765,8 @@ define <4 x i64> @expandload_v4i64(ptr %base, <4 x i64> %src0, <4 x i1> %mask) {
; RV64-NEXT: beqz a2, .LBB14_2
; RV64-NEXT: .LBB14_6: # %cond.load1
; RV64-NEXT: ld a2, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e64, m8, ta, ma
; RV64-NEXT: vmv.s.x v10, a2
; RV64-NEXT: vsetivli zero, 2, e64, m1, tu, ma
; RV64-NEXT: vmv.s.x v10, a2
; RV64-NEXT: vslideup.vi v8, v10, 1
; RV64-NEXT: addi a0, a0, 8
; RV64-NEXT: andi a2, a1, 4
@@ -954,9 +943,8 @@ define <8 x i64> @expandload_v8i64(ptr %base, <8 x i64> %src0, <8 x i1> %mask) {
; RV64-NEXT: beqz a2, .LBB15_2
; RV64-NEXT: .LBB15_10: # %cond.load1
; RV64-NEXT: ld a2, 0(a0)
; RV64-NEXT: vsetvli zero, zero, e64, m8, ta, ma
; RV64-NEXT: vmv.s.x v12, a2
; RV64-NEXT: vsetivli zero, 2, e64, m1, tu, ma
; RV64-NEXT: vmv.s.x v12, a2
; RV64-NEXT: vslideup.vi v8, v12, 1
; RV64-NEXT: addi a0, a0, 8
; RV64-NEXT: andi a2, a1, 4
2 changes: 1 addition & 1 deletion llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-buildvec.ll
@@ -39,7 +39,7 @@ define <4 x float> @hang_when_merging_stores_after_legalization(<8 x float> %x,
; CHECK-NEXT: vmul.vx v14, v12, a0
; CHECK-NEXT: vsetvli zero, zero, e32, m2, ta, ma
; CHECK-NEXT: vrgatherei16.vv v12, v8, v14
; CHECK-NEXT: vsetivli zero, 8, e16, m1, ta, ma
; CHECK-NEXT: vsetvli zero, zero, e16, m1, ta, ma
; CHECK-NEXT: vmv.v.i v0, 12
; CHECK-NEXT: vadd.vi v8, v14, -14
; CHECK-NEXT: vsetvli zero, zero, e32, m2, ta, mu