[BasicAA] Handle scalable type sizes with constant offsets #80445
Conversation
@llvm/pr-subscribers-llvm-analysis

Author: David Green (davemgreen)

Changes

This is a separate but related issue to #69152, which was attempting to improve AA with scalable dependency distances. This patch attempts to improve the case where there are scalable accesses with a constant offset between them. We happened to get a report of such a case recently where, so long as the vscale_range is known, the maximum size of the access can be assessed and better aliasing results can be returned.

The upper bound of the vscale_range, along with the known-min part of the TypeSize, is used to prove that Off >= CR.upper * LSize. It does not currently try to produce PartialAlias results from the lower bound of the vscale_range.

Full diff: https://github.com/llvm/llvm-project/pull/80445.diff

3 Files Affected:
- llvm/lib/Analysis/BasicAliasAnalysis.cpp
- llvm/test/Analysis/AliasSet/memloc-vscale.ll
- llvm/test/Analysis/BasicAA/vscale.ll
diff --git a/llvm/lib/Analysis/BasicAliasAnalysis.cpp b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
index 1028b52a79123..a4c2eb20addb8 100644
--- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
@@ -1113,10 +1113,6 @@ AliasResult BasicAAResult::aliasGEP(
return BaseAlias;
}
- // Bail on analysing scalable LocationSize
- if (V1Size.isScalable() || V2Size.isScalable())
- return AliasResult::MayAlias;
-
// If there is a constant difference between the pointers, but the difference
// is less than the size of the associated memory object, then we know
// that the objects are partially overlapping. If the difference is
@@ -1146,24 +1142,37 @@ AliasResult BasicAAResult::aliasGEP(
if (!VLeftSize.hasValue())
return AliasResult::MayAlias;
- const uint64_t LSize = VLeftSize.getValue();
- if (Off.ult(LSize)) {
- // Conservatively drop processing if a phi was visited and/or offset is
- // too big.
- AliasResult AR = AliasResult::PartialAlias;
- if (VRightSize.hasValue() && Off.ule(INT32_MAX) &&
- (Off + VRightSize.getValue()).ule(LSize)) {
- // Memory referenced by right pointer is nested. Save the offset in
- // cache. Note that originally offset estimated as GEP1-V2, but
- // AliasResult contains the shift that represents GEP1+Offset=V2.
- AR.setOffset(-Off.getSExtValue());
- AR.swap(Swapped);
+ const TypeSize LSize = VLeftSize.getValue();
+ if (!LSize.isScalable()) {
+ if (Off.ult(LSize)) {
+ // Conservatively drop processing if a phi was visited and/or offset is
+ // too big.
+ AliasResult AR = AliasResult::PartialAlias;
+ if (VRightSize.hasValue() && !VRightSize.isScalable() &&
+ Off.ule(INT32_MAX) && (Off + VRightSize.getValue()).ule(LSize)) {
+ // Memory referenced by right pointer is nested. Save the offset in
+ // cache. Note that originally offset estimated as GEP1-V2, but
+ // AliasResult contains the shift that represents GEP1+Offset=V2.
+ AR.setOffset(-Off.getSExtValue());
+ AR.swap(Swapped);
+ }
+ return AR;
}
- return AR;
+ return AliasResult::NoAlias;
+ } else {
+ // We can use getVScaleRange to prove that Off >= (CR.upper * LSize).
+ ConstantRange CR = getVScaleRange(&F, Off.getBitWidth());
+ APInt UpperRange = CR.getUnsignedMax().umul_sat(
+ APInt(Off.getBitWidth(), LSize.getKnownMinValue()));
+ if (Off.uge(UpperRange))
+ return AliasResult::NoAlias;
}
- return AliasResult::NoAlias;
}
+ // Bail on analysing scalable LocationSize
+ if (V1Size.isScalable() || V2Size.isScalable())
+ return AliasResult::MayAlias;
+
+ // We need to know both access sizes for all the following heuristics.
if (!V1Size.hasValue() || !V2Size.hasValue())
return AliasResult::MayAlias;
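To make the new scalable branch above concrete, here is a minimal standalone sketch of the same check, with plain uint64_t standing in for APInt and ConstantRange; umulSat and scalableAccessCannotReach are illustrative names for this sketch, not LLVM APIs. With vscale_range(Lo, Hi), an access whose TypeSize has a known-min size of KnownMinSize bytes spans at most Hi * KnownMinSize bytes, so a constant offset at or beyond that bound cannot overlap it.

```cpp
#include <cstdint>
#include <limits>

// Saturating unsigned multiply, mirroring what APInt::umul_sat does in the
// patch; saturation keeps a huge vscale upper bound from wrapping to a small
// value and wrongly proving NoAlias.
static uint64_t umulSat(uint64_t A, uint64_t B) {
  if (B != 0 && A > std::numeric_limits<uint64_t>::max() / B)
    return std::numeric_limits<uint64_t>::max();
  return A * B;
}

// Returns true when a scalable access of KnownMinSize * vscale bytes provably
// cannot reach a pointer Off bytes past its base, given vscale <= VScaleMax.
static bool scalableAccessCannotReach(uint64_t Off, uint64_t KnownMinSize,
                                      uint64_t VScaleMax) {
  uint64_t UpperRange = umulSat(VScaleMax, KnownMinSize);
  return Off >= UpperRange; // Off >= CR.upper * LSize implies NoAlias.
}
```

When no vscale_range attribute is present, getVScaleRange returns a full range, the multiply saturates, and the Off >= UpperRange test essentially never succeeds, so the query falls through to the scalable-size bail-out kept above.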
diff --git a/llvm/test/Analysis/AliasSet/memloc-vscale.ll b/llvm/test/Analysis/AliasSet/memloc-vscale.ll
index 1c7ca79c8db11..6b41604637405 100644
--- a/llvm/test/Analysis/AliasSet/memloc-vscale.ll
+++ b/llvm/test/Analysis/AliasSet/memloc-vscale.ll
@@ -34,7 +34,8 @@ define void @ss2(ptr %p) {
ret void
}
; CHECK-LABEL: Alias sets for function 'son':
-; CHECK: AliasSet[{{.*}}, 2] may alias, Mod Memory locations: (ptr %g, LocationSize::precise(vscale x 16)), (ptr %p, LocationSize::precise(8))
+; CHECK: AliasSet[{{.*}}, 1] must alias, Mod Memory locations: (ptr %g, LocationSize::precise(vscale x 16))
+; CHECK: AliasSet[{{.*}}, 1] must alias, Mod Memory locations: (ptr %p, LocationSize::precise(8))
define void @son(ptr %p) {
%g = getelementptr i8, ptr %p, i64 8
store <vscale x 2 x i64> zeroinitializer, ptr %g, align 2
diff --git a/llvm/test/Analysis/BasicAA/vscale.ll b/llvm/test/Analysis/BasicAA/vscale.ll
index 0d6d8fea392bb..339e4400abc0f 100644
--- a/llvm/test/Analysis/BasicAA/vscale.ll
+++ b/llvm/test/Analysis/BasicAA/vscale.ll
@@ -117,6 +117,36 @@ define void @gep_different_base_const_offset(ptr noalias %p1, ptr noalias %p2) {
ret void
}
+; getelementptr @llvm.vscale tests
+; CHECK-LABEL: gep_llvm_vscale_no_alias
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %gep1, <vscale x 4 x i32>* %gep2
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %gep1, <vscale x 4 x i32>* %gep3
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %gep2, <vscale x 4 x i32>* %gep3
+define void @gep_llvm_vscale_no_alias(ptr %p) {
+ %t1 = tail call i64 @llvm.vscale.i64()
+ %t2 = shl nuw nsw i64 %t1, 3
+ %gep1 = getelementptr i32, ptr %p, i64 %t2
+ %gep2 = getelementptr <vscale x 4 x i32>, ptr %p, i64 1
+ %gep3 = getelementptr <vscale x 4 x i32>, ptr %p, i64 2
+ load <vscale x 4 x i32>, ptr %gep1
+ load <vscale x 4 x i32>, ptr %gep2
+ load <vscale x 4 x i32>, ptr %gep3
+ ret void
+}
+
+declare i64 @llvm.vscale.i64()
+
+; CHECK-LABEL: gep_llvm_vscale_squared_may_alias
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %gep1, <vscale x 4 x i32>* %gep2
+define void @gep_llvm_vscale_squared_may_alias(ptr %p) {
+ %t1 = tail call i64 @llvm.vscale.i64()
+ %gep1 = getelementptr <vscale x 4 x i32>, ptr %p, i64 %t1
+ %gep2 = getelementptr i32, ptr %p, i64 1
+ load <vscale x 4 x i32>, ptr %gep1
+ load <vscale x 4 x i32>, ptr %gep2
+ ret void
+}
+
; getelementptr + bitcast
; CHECK-LABEL: gep_bitcast_1
@@ -153,6 +183,132 @@ define void @gep_bitcast_2(ptr %p) {
ret void
}
+; negative offset tests
+
+; CHECK-LABEL: gep_neg_notscalable
+; CHECK-DAG: MayAlias: <4 x i32>* %p, <4 x i32>* %vm16
+; CHECK-DAG: NoAlias: <4 x i32>* %m16, <4 x i32>* %p
+; CHECK-DAG: MayAlias: <4 x i32>* %m16, <4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <4 x i32>* %p, <4 x i32>* %vm16m16
+; CHECK-DAG: NoAlias: <4 x i32>* %vm16, <4 x i32>* %vm16m16
+; CHECK-DAG: MayAlias: <4 x i32>* %m16, <4 x i32>* %vm16m16
+; CHECK-DAG: MayAlias: <4 x i32>* %m16pv16, <4 x i32>* %p
+; CHECK-DAG: MayAlias: <4 x i32>* %m16pv16, <4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <4 x i32>* %m16, <4 x i32>* %m16pv16
+; CHECK-DAG: MayAlias: <4 x i32>* %m16pv16, <4 x i32>* %vm16m16
+define void @gep_neg_notscalable(ptr %p) vscale_range(1,16) {
+ %vm16 = getelementptr <vscale x 4 x i32>, ptr %p, i64 -1
+ %m16 = getelementptr <4 x i32>, ptr %p, i64 -1
+ %vm16m16 = getelementptr <4 x i32>, ptr %vm16, i64 -1
+ %m16pv16 = getelementptr <vscale x 4 x i32>, ptr %m16, i64 1
+ load <4 x i32>, ptr %p
+ load <4 x i32>, ptr %vm16
+ load <4 x i32>, ptr %m16
+ load <4 x i32>, ptr %vm16m16
+ load <4 x i32>, ptr %m16pv16
+ ret void
+}
+
+; CHECK-LABEL: gep_neg_scalable
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %vm16, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %m16pv16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16m16
+define void @gep_neg_scalable(ptr %p) vscale_range(1,16) {
+ %vm16 = getelementptr <vscale x 4 x i32>, ptr %p, i64 -1
+ %m16 = getelementptr <4 x i32>, ptr %p, i64 -1
+ %vm16m16 = getelementptr <4 x i32>, ptr %vm16, i64 -1
+ %m16pv16 = getelementptr <vscale x 4 x i32>, ptr %vm16, i64 1
+ load <vscale x 4 x i32>, ptr %p
+ load <vscale x 4 x i32>, ptr %vm16
+ load <vscale x 4 x i32>, ptr %m16
+ load <vscale x 4 x i32>, ptr %vm16m16
+ load <vscale x 4 x i32>, ptr %m16pv16
+ ret void
+}
+
+; CHECK-LABEL: gep_pos_notscalable
+; CHECK-DAG: MayAlias: <4 x i32>* %p, <4 x i32>* %vm16
+; CHECK-DAG: NoAlias: <4 x i32>* %m16, <4 x i32>* %p
+; CHECK-DAG: MayAlias: <4 x i32>* %m16, <4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <4 x i32>* %p, <4 x i32>* %vm16m16
+; CHECK-DAG: NoAlias: <4 x i32>* %vm16, <4 x i32>* %vm16m16
+; CHECK-DAG: MayAlias: <4 x i32>* %m16, <4 x i32>* %vm16m16
+; CHECK-DAG: MayAlias: <4 x i32>* %m16pv16, <4 x i32>* %p
+; CHECK-DAG: MayAlias: <4 x i32>* %m16pv16, <4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <4 x i32>* %m16, <4 x i32>* %m16pv16
+; CHECK-DAG: MayAlias: <4 x i32>* %m16pv16, <4 x i32>* %vm16m16
+define void @gep_pos_notscalable(ptr %p) vscale_range(1,16) {
+ %vm16 = getelementptr <vscale x 4 x i32>, ptr %p, i64 1
+ %m16 = getelementptr <4 x i32>, ptr %p, i64 1
+ %vm16m16 = getelementptr <4 x i32>, ptr %vm16, i64 1
+ %m16pv16 = getelementptr <vscale x 4 x i32>, ptr %vm16, i64 -1
+ load <4 x i32>, ptr %p
+ load <4 x i32>, ptr %vm16
+ load <4 x i32>, ptr %m16
+ load <4 x i32>, ptr %vm16m16
+ load <4 x i32>, ptr %m16pv16
+ ret void
+}
+
+; CHECK-LABEL: gep_pos_scalable
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %vm16, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %m16pv16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16m16
+define void @gep_pos_scalable(ptr %p) vscale_range(1,16) {
+ %vm16 = getelementptr <vscale x 4 x i32>, ptr %p, i64 1
+ %m16 = getelementptr <4 x i32>, ptr %p, i64 1
+ %vm16m16 = getelementptr <4 x i32>, ptr %vm16, i64 1
+ %m16pv16 = getelementptr <vscale x 4 x i32>, ptr %vm16, i64 -1
+ load <vscale x 4 x i32>, ptr %p
+ load <vscale x 4 x i32>, ptr %vm16
+ load <vscale x 4 x i32>, ptr %m16
+ load <vscale x 4 x i32>, ptr %vm16m16
+ load <vscale x 4 x i32>, ptr %m16pv16
+ ret void
+}
+
+; CHECK-LABEL: v1v2types
+; CHECK-DAG: MustAlias: <4 x i32>* %p, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %p, <4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <4 x i32>* %p, <4 x i32>* %vm16
+; CHECK-DAG: MustAlias: <4 x i32>* %vm16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %m16, <4 x i32>* %vm16
+; CHECK-DAG: NoAlias: <4 x i32>* %m16, <vscale x 4 x i32>* %p
+; CHECK-DAG: NoAlias: <4 x i32>* %m16, <4 x i32>* %p
+; CHECK-DAG: MayAlias: <4 x i32>* %m16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG: MayAlias: <4 x i32>* %m16, <4 x i32>* %vm16
+; CHECK-DAG: MustAlias: <4 x i32>* %m16, <vscale x 4 x i32>* %m16
+define void @v1v2types(ptr %p) vscale_range(1,16) {
+ %vm16 = getelementptr <vscale x 4 x i32>, ptr %p, i64 -1
+ %m16 = getelementptr <4 x i32>, ptr %p, i64 -1
+ load <vscale x 4 x i32>, ptr %p
+ load <4 x i32>, ptr %p
+ load <vscale x 4 x i32>, ptr %vm16
+ load <4 x i32>, ptr %vm16
+ load <vscale x 4 x i32>, ptr %m16
+ load <4 x i32>, ptr %m16
+ ret void
+}
+
; getelementptr recursion
; CHECK-LABEL: gep_recursion_level_1
@@ -269,3 +425,51 @@ define void @gep_recursion_max_lookup_depth_reached(ptr %a, ptr %p) {
load i32, ptr %gep_rec_6
ret void
}
+
+; CHECK-LABEL: gep_2048
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %off255, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %noff255, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %noff255, <vscale x 4 x i32>* %off255
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %off256, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %off255, <vscale x 4 x i32>* %off256
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %noff255, <vscale x 4 x i32>* %off256
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %noff256, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %noff256, <vscale x 4 x i32>* %off255
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %noff255, <vscale x 4 x i32>* %noff256
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %noff256, <vscale x 4 x i32>* %off256
+define void @gep_2048(ptr %p) {
+ %off255 = getelementptr i8, ptr %p, i64 255
+ %noff255 = getelementptr i8, ptr %p, i64 -255
+ %off256 = getelementptr i8, ptr %p, i64 256
+ %noff256 = getelementptr i8, ptr %p, i64 -256
+ load <vscale x 4 x i32>, ptr %p
+ load <vscale x 4 x i32>, ptr %off255
+ load <vscale x 4 x i32>, ptr %noff255
+ load <vscale x 4 x i32>, ptr %off256
+ load <vscale x 4 x i32>, ptr %noff256
+ ret void
+}
+
+; CHECK-LABEL: gep_2048_vscalerange
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %off255, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %noff255, <vscale x 4 x i32>* %p
+; CHECK-DAG: NoAlias: <vscale x 4 x i32>* %noff255, <vscale x 4 x i32>* %off255
+; CHECK-DAG: NoAlias: <vscale x 4 x i32>* %off256, <vscale x 4 x i32>* %p
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %off255, <vscale x 4 x i32>* %off256
+; CHECK-DAG: NoAlias: <vscale x 4 x i32>* %noff255, <vscale x 4 x i32>* %off256
+; CHECK-DAG: NoAlias: <vscale x 4 x i32>* %noff256, <vscale x 4 x i32>* %p
+; CHECK-DAG: NoAlias: <vscale x 4 x i32>* %noff256, <vscale x 4 x i32>* %off255
+; CHECK-DAG: MayAlias: <vscale x 4 x i32>* %noff255, <vscale x 4 x i32>* %noff256
+; CHECK-DAG: NoAlias: <vscale x 4 x i32>* %noff256, <vscale x 4 x i32>* %off256
+define void @gep_2048_vscalerange(ptr %p) vscale_range(1,16) {
+ %off255 = getelementptr i8, ptr %p, i64 255
+ %noff255 = getelementptr i8, ptr %p, i64 -255
+ %off256 = getelementptr i8, ptr %p, i64 256
+ %noff256 = getelementptr i8, ptr %p, i64 -256
+ load <vscale x 4 x i32>, ptr %p
+ load <vscale x 4 x i32>, ptr %off255
+ load <vscale x 4 x i32>, ptr %noff255
+ load <vscale x 4 x i32>, ptr %off256
+ load <vscale x 4 x i32>, ptr %noff256
+ ret void
+}
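As a worked example of the numbers the last two tests rely on (plain byte arithmetic, not the LLVM API): with vscale_range(1,16) and <vscale x 4 x i32> accesses whose known-min size is 16 bytes, an access spans at most 16 * 16 = 256 bytes, so offsets of 256 bytes or more prove NoAlias while 255 stays MayAlias.

```cpp
#include <cassert>
#include <cstdint>

int main() {
  // Numbers taken from @gep_2048_vscalerange: vscale_range(1,16) and
  // <vscale x 4 x i32> accesses with a known-min size of 16 bytes.
  const uint64_t VScaleMax = 16, KnownMinSize = 16;
  const uint64_t UpperRange = VScaleMax * KnownMinSize; // 256 bytes
  assert(256 >= UpperRange);    // %off256/%noff256 vs %p: NoAlias
  assert(!(255 >= UpperRange)); // %off255/%noff255 vs %p: still MayAlias
  return 0;
}
```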
LGTM
Force-pushed from 4356c6e to 7805c3e
This commit adds support for the `vscale_range()` LLVM function attribute to be generated for SVE and SME targets. Some LLVM optimisation passes make use of the `vscale_range()` function attribute when scalable vectors are present (e.g. BasicAA llvm/llvm-project/pull/80445), so we include it alongside the "target_cpu" and "target-features" attributes.
This is a separate but related issue to #69152, which was attempting to improve
AA with scalable dependency distances. This patch attempts to improve the case
where there are scalable accesses with a constant offset between them. We
happened to get a report of such a case recently where, so long as the
vscale_range is known, the maximum size of the access can be assessed and
better aliasing results can be returned.

The upper bound of the vscale_range, along with the known-min part of the
TypeSize, is used to prove that Off >= CR.upper * LSize. It does not currently
try to produce PartialAlias results from the lower bound of the vscale_range.
It also has the added benefit of allowing better alias analysis when the RHS
of the two values is scalable but the LHS is not, which can then be treated
like any other aliasing query.