-
Notifications
You must be signed in to change notification settings - Fork 12.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISCV] Don't forward AVL in VSETVLIInfo if it would clobber other definitions #97264
[RISCV] Don't forward AVL in VSETVLIInfo if it would clobber other definitions #97264
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-risc-v Author: Luke Lau (lukel97) ChangesThis fixes a crash found when compiling OpenBLAS with -mllvm -verify-machineinstrs. When we "forward" the AVL from the output of a vsetvli, we might have to extend the LiveInterval of the AVL down to where insert the new vsetvli. Most of the time we are able to extend the LiveInterval because there's only one val num (definition) for the register. But PHI elimination can assign multiple values to the same register, in which case we end up clobbering a different val num when extending:
Here there's no way to extend the %avl from the vsetvli since %avl is redefined, i.e. we have two val nums. This fixes it by only forwarding it when we have exactly one val num, where it should be safe to extend it. Full diff: https://github.com/llvm/llvm-project/pull/97264.diff 3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index bf693344b070a..cacfdebb6d906 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -955,6 +955,12 @@ void RISCVInsertVSETVLI::forwardVSETVLIAVL(VSETVLIInfo &Info) const {
VSETVLIInfo DefInstrInfo = getInfoForVSETVLI(*DefMI);
if (!DefInstrInfo.hasSameVLMAX(Info))
return;
+ // If the AVL is a register with multiple definitions, don't forward it. We
+ // might not be able to extend its LiveInterval without clobbering other val
+ // nums.
+ if (DefInstrInfo.hasAVLReg() &&
+ !LIS->getInterval(DefInstrInfo.getAVLReg()).containsOneValue())
+ return;
Info.setAVL(DefInstrInfo);
}
@@ -1155,15 +1161,22 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
.addImm(Info.encodeVTYPE());
if (LIS) {
LIS->InsertMachineInstrInMaps(*MI);
- // Normally the AVL's live range will already extend past the inserted
+ // Normally the AVL's LiveInterval will already extend past the inserted
// vsetvli because the pseudos below will already use the AVL. But this
// isn't always the case, e.g. PseudoVMV_X_S doesn't have an AVL operand or
// we've taken the AVL from the VL output of another vsetvli.
LiveInterval &LI = LIS->getInterval(AVLReg);
// Need to get non-const VNInfo
VNInfo *VNI = LI.getValNumInfo(Info.getAVLVNInfo()->id);
- LI.addSegment(LiveInterval::Segment(
- VNI->def, LIS->getInstructionIndex(*MI).getRegSlot(), VNI));
+
+ LiveInterval::Segment Segment(
+ VNI->def, LIS->getInstructionIndex(*MI).getRegSlot(), VNI);
+ // We should never end up extending the interval over a different value.
+ assert(all_of(LI.segments, [&Segment](LiveInterval::Segment S) {
+ return !S.containsInterval(Segment.start, Segment.end) ||
+ S.valno == S.valno;
+ }));
+ LI.addSegment(Segment);
}
}
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
index 68e8f5dd0a406..94a7536cee664 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
@@ -1062,3 +1062,39 @@ exit:
%c = call <vscale x 2 x i32> @llvm.riscv.vadd.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i32> %a, <vscale x 2 x i32> %d, i64 %vl)
ret <vscale x 2 x i32> %c
}
+
+; Check that we don't forward an AVL if we wouldn't be able to extend its
+; LiveInterval without clobbering other val nos.
+define <vscale x 4 x i32> @unforwardable_avl(i64 %n, <vscale x 4 x i32> %v, i1 %cmp) {
+; CHECK-LABEL: unforwardable_avl:
+; CHECK: # %bb.0: # %entry
+; CHECK-NEXT: vsetvli a2, a0, e32, m2, ta, ma
+; CHECK-NEXT: andi a1, a1, 1
+; CHECK-NEXT: .LBB25_1: # %for.body
+; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: addi a0, a0, 1
+; CHECK-NEXT: bnez a1, .LBB25_1
+; CHECK-NEXT: # %bb.2: # %for.cond.cleanup
+; CHECK-NEXT: vsetvli a0, zero, e32, m2, ta, ma
+; CHECK-NEXT: vadd.vv v10, v8, v8
+; CHECK-NEXT: vsetvli zero, a2, e32, m2, ta, ma
+; CHECK-NEXT: vadd.vv v8, v10, v8
+; CHECK-NEXT: ret
+entry:
+ %0 = tail call i64 @llvm.riscv.vsetvli.i64(i64 %n, i64 2, i64 1)
+ br label %for.body
+
+for.body:
+ ; Use %n in a PHI here so its virtual register is assigned to a second time here.
+ %1 = phi i64 [ %3, %for.body ], [ %n, %entry ]
+ %2 = tail call i64 @llvm.riscv.vsetvli.i64(i64 %1, i64 0, i64 0)
+ %3 = add i64 %1, 1
+ br i1 %cmp, label %for.body, label %for.cond.cleanup
+
+for.cond.cleanup:
+ %4 = tail call <vscale x 4 x i32> @llvm.riscv.vadd.nxv2f32.nxv2f32.i64(<vscale x 4 x i32> undef, <vscale x 4 x i32> %v, <vscale x 4 x i32> %v, i64 -1)
+ ; VL toggle needed here: If the %n AVL was forwarded here we wouldn't be able
+ ; to extend it's LiveInterval because it would clobber the assignment at %1.
+ %5 = tail call <vscale x 4 x i32> @llvm.riscv.vadd.nxv2f32.nxv2f32.i64(<vscale x 4 x i32> undef, <vscale x 4 x i32> %4, <vscale x 4 x i32> %v, i64 %0)
+ ret <vscale x 4 x i32> %5
+}
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir
index 4091d1711b584..ee10c9987be49 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir
@@ -134,6 +134,10 @@
ret void
}
+ define void @unforwardable_avl() {
+ ret void
+ }
+
declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32>)
declare <vscale x 1 x i64> @llvm.riscv.vadd.nxv1i64.nxv1i64.i64(<vscale x 1 x i64>, <vscale x 1 x i64>, <vscale x 1 x i64>, i64) #1
@@ -990,3 +994,43 @@ body: |
%x:gpr = PseudoVMV_X_S undef $noreg, 6
PseudoBR %bb.1
...
+---
+name: unforwardable_avl
+tracksRegLiveness: true
+body: |
+ ; CHECK-LABEL: name: unforwardable_avl
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: liveins: $x10, $v8m2
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: %avl:gprnox0 = COPY $x10
+ ; CHECK-NEXT: %outvl:gprnox0 = PseudoVSETVLI %avl, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1:
+ ; CHECK-NEXT: successors: %bb.2(0x80000000)
+ ; CHECK-NEXT: liveins: $v8m2
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: dead %avl:gprnox0 = ADDI %avl, 1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2:
+ ; CHECK-NEXT: liveins: $v8m2
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: dead [[PseudoVSETVLIX0_:%[0-9]+]]:gpr = PseudoVSETVLIX0 killed $x0, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
+ ; CHECK-NEXT: renamable $v10m2 = PseudoVADD_VV_M2 undef renamable $v10m2, renamable $v8m2, renamable $v8m2, -1, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
+ ; CHECK-NEXT: dead $x0 = PseudoVSETVLI %outvl, 209 /* e32, m2, ta, ma */, implicit-def $vl, implicit-def $vtype
+ ; CHECK-NEXT: renamable $v8m2 = PseudoVADD_VV_M2 undef renamable $v8m2, killed renamable $v10m2, renamable $v8m2, $noreg, 5 /* e32 */, 0 /* tu, mu */, implicit $vl, implicit $vtype
+ ; CHECK-NEXT: PseudoRET implicit $v8m2
+ bb.0:
+ liveins: $x10, $v8m2
+ %avl:gprnox0 = COPY $x10
+ %outvl:gprnox0 = PseudoVSETVLI %avl:gprnox0, 209, implicit-def dead $vl, implicit-def dead $vtype
+
+ bb.1:
+ liveins: $v8m2
+ %avl:gprnox0 = ADDI %avl:gprnox0, 1
+
+ bb.2:
+ liveins: $v8m2
+ renamable $v10m2 = PseudoVADD_VV_M2 undef renamable $v10m2, renamable $v8m2, renamable $v8m2, -1, 5, 0
+ renamable $v8m2 = PseudoVADD_VV_M2 undef renamable $v8m2, killed renamable $v10m2, killed renamable $v8m2, %outvl:gprnox0, 5, 0
+ PseudoRET implicit $v8m2
|
…TVLI In llvm#96200 we handled extending AVL LiveIntervals across basic blocks, which fixed a crash in a test case in 133ab9a. This was done by manually adding a single segment to the LiveInterval to extend it from AVL def -> inserted vsetvli, but in hindsight this was too simple and fails to handle cases where the vsetlvi is located before the AVL def. This patch fixes this by using LiveIntervals::extendToIndices instead which can handle these cases. (The crash that this fixes is separate from the crash in llvm#97264)
…TVLI (#97512) In #96200 we handled extending AVL LiveIntervals across basic blocks, which fixed a crash in a test case in 133ab9a. This was done by manually adding a single segment to the LiveInterval to extend it from AVL def -> inserted vsetvli, but in hindsight this was too simple and fails to handle cases where the vsetlvi is located before the AVL def. This patch fixes this by using LiveIntervals::extendToIndices instead which can handle these cases. (The crash that this fixes is separate from the crash in #97264)
b0ee183
to
06adeab
Compare
…TVLI (llvm#97512) In llvm#96200 we handled extending AVL LiveIntervals across basic blocks, which fixed a crash in a test case in 133ab9a. This was done by manually adding a single segment to the LiveInterval to extend it from AVL def -> inserted vsetvli, but in hindsight this was too simple and fails to handle cases where the vsetlvi is located before the AVL def. This patch fixes this by using LiveIntervals::extendToIndices instead which can handle these cases. (The crash that this fixes is separate from the crash in llvm#97264)
…finitions This fixes a crash found when compiling OpenBLAS with -mllvm -verify-machineinstrs. When we "forward" the AVL from the output of a vsetvli, we might have to extend the LiveInterval of the AVL down to where insert the new vsetvli. Most of the time we are able to extend the LiveInterval because there's only one val num (definition) for the register. But PHI elimination can assign multiple values to the same register, in which case we end up clobbering a different val num when extending: %x = PseudoVSETVLI %avl, ... %avl = ADDI ... %v = PseudoVADD ..., avl=%x ; %avl is forwarded to PseudoVADD: %x = PseudoVSETVLI %avl, ... %avl = ADDI ... %v = PseudoVADD ..., avl=%avl Here there's no way to extend the %avl from the vsetvli since %avl is redefined, i.e. we have two val nums. This fixes it by only forwarding it when we have exactly one val num, where it should be safe to extend it.
A regression, but only since llvm#96200. This just brings us back to square one.
06adeab
to
5d4745a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/174/builds/1146 Here is the relevant piece of the build log for the reference:
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/66/builds/1046 Here is the relevant piece of the build log for the reference:
|
…TVLI (llvm#97512) In llvm#96200 we handled extending AVL LiveIntervals across basic blocks, which fixed a crash in a test case in 133ab9a. This was done by manually adding a single segment to the LiveInterval to extend it from AVL def -> inserted vsetvli, but in hindsight this was too simple and fails to handle cases where the vsetlvi is located before the AVL def. This patch fixes this by using LiveIntervals::extendToIndices instead which can handle these cases. (The crash that this fixes is separate from the crash in llvm#97264)
…finitions (llvm#97264) This fixes a crash found when compiling OpenBLAS with -mllvm -verify-machineinstrs. When we "forward" the AVL from the output of a vsetvli, we might have to extend the LiveInterval of the AVL to where insert the new vsetvli. Most of the time we are able to extend the LiveInterval because there's only one val num (definition) for the register. But PHI elimination can assign multiple values to the same register, in which case we end up clobbering a different val num when extending: %x = PseudoVSETVLI %avl, ... %avl = ADDI ... %v = PseudoVADD ..., avl=%x ; %avl is forwarded to PseudoVADD: %x = PseudoVSETVLI %avl, ... %avl = ADDI ... %v = PseudoVADD ..., avl=%avl Here there's no way to extend the %avl from the vsetvli since %avl is redefined, i.e. we have two val nums. This fixes it by only forwarding it when we have exactly one val num, where it should be safe to extend it.
If I'm following the logic of this patch, don't we need to adjust the equally-zero reasoning in transferBefore as well? That extends the live range of an AVL too doesn't it? If so, I think we're also missing an assert in insertVSETVLI that the AVL we're trying to extend has a single def. |
As another possible approach to this problem... could we create a new virtual register for the forwarded AVL, use a COPY from the existing range (within it's existing live range), and then a use of the new virtreg at the use point? (Thinking specifically in the case we can't extend the existing use.) This would avoid the need to detect the "illegal to forward" case early. |
When we coalesce and delete a vsetvli, we should shrink the LiveInterval of its AVL register now that there is one less use. This fixes a -verify-machineinstrs assertion in an MIR test case I found while investigating llvm#97264 (comment). I couldn't recreate this at the LLVM IR level, seemingly because RISCVInsertVSETVLI will just avoid inserting extra vsetvlis that don't need coalesced.
If we're inserting a vsetvli that uses a register AVL, then the AVL register should either: a) Be already live at the vsetvli with the expected value b) Not be live at the vsetvli, but have exactly one value that can be extended safely: see #97264
I've added an assert in 5ab755c. The AVL can have multiple defs but in all these cases register is already live at the point (and so extending it is a noop I believe). I think you're right about the equally-zero reasoning though, but I can't seem to get it to trigger the above assertion. Here's my thinking of the different cases that can occur when inserting a vsetvli with a register AVL: The AVL doesn't need extended:
The AVL register might need extended:
I think the latter case is mostly working by coincidence. #89089 defers the x0,x0 form and would expose the issue. I'll have a think and see if I can get anywhere with your COPY idea. |
I've submitted a PR for this in #98342, this turned out to be a good idea. I'd rather we're able to handle non-extendible values via a fallback, than have to conservatively check for extendibility in every place we mutate the AVL, which is more likely to have cases fall through the cracks. |
…unk (#98286) Most of the time when we coalesce and delete a vsetvli, we shrink the LiveInterval of its AVL register now that there is one less use. However there's one edge case we were missing where if we have two vsetvlis with no users of vl or vtype in between, we coalesced a vsetvli without shrinking it's AVL. This fixes it by shrinking the LiveInterval whenever we delete a vsetvli, and also makes the LiveIntervals consistent in-situ by not removing the use before shrinking. This fixes a -verify-machineinstrs assertion in an MIR test case I found while investigating #97264 (comment). I couldn't recreate this at the LLVM IR level, seemingly because RISCVInsertVSETVLI will just avoid inserting extra vsetvlis that don't need coalesced.
If we're inserting a vsetvli that uses a register AVL, then the AVL register should either: a) Be already live at the vsetvli with the expected value b) Not be live at the vsetvli, but have exactly one value that can be extended safely: see llvm#97264
…unk (llvm#98286) Most of the time when we coalesce and delete a vsetvli, we shrink the LiveInterval of its AVL register now that there is one less use. However there's one edge case we were missing where if we have two vsetvlis with no users of vl or vtype in between, we coalesced a vsetvli without shrinking it's AVL. This fixes it by shrinking the LiveInterval whenever we delete a vsetvli, and also makes the LiveIntervals consistent in-situ by not removing the use before shrinking. This fixes a -verify-machineinstrs assertion in an MIR test case I found while investigating llvm#97264 (comment). I couldn't recreate this at the LLVM IR level, seemingly because RISCVInsertVSETVLI will just avoid inserting extra vsetvlis that don't need coalesced.
This fixes a crash found when compiling OpenBLAS with -mllvm -verify-machineinstrs.
When we "forward" the AVL from the output of a vsetvli, we might have to extend the LiveInterval of the AVL to where insert the new vsetvli.
Most of the time we are able to extend the LiveInterval because there's only one val num (definition) for the register. But PHI elimination can assign multiple values to the same register, in which case we end up clobbering a different val num when extending:
Here there's no way to extend the %avl from the vsetvli since %avl is redefined, i.e. we have two val nums.
This fixes it by only forwarding it when we have exactly one val num, where it should be safe to extend it.