Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang][hlfir] do not consider local temps as conflicting in assignment #113330

Merged
merged 1 commit into from
Oct 23, 2024

Conversation

jeanPerier
Copy link
Contributor

Last patch required to avoid creating a temporary for the LHS when dealing with x([a,b]) = y.

The code dealing with "ordered assignments" (where, forall, user and vector subscripted assignments) is saving the evaluated RHS/LHS and masks if they have write effects because this write effects should not be evaluated when they affect entities that may be written to in other contexts after the evaluation and before the re-evaluation.

But when dealing with write to storage allocated in the region for the expression being evluated, there is no problem to re-evaluate the write: it has no effect outside of the expression evaluation that owns the allocation.

In the case of x([a,b]) = y, the temporary is created for the vector subscript. Raising the HLFIR abstraction for simple array constructors may be a good idea, but local temps are created in other contexts, so this fix is more generic.

@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Oct 22, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 22, 2024

@llvm/pr-subscribers-flang-fir-hlfir

Author: None (jeanPerier)

Changes

Last patch required to avoid creating a temporary for the LHS when dealing with x([a,b]) = y.

The code dealing with "ordered assignments" (where, forall, user and vector subscripted assignments) is saving the evaluated RHS/LHS and masks if they have write effects because this write effects should not be evaluated when they affect entities that may be written to in other contexts after the evaluation and before the re-evaluation.

But when dealing with write to storage allocated in the region for the expression being evluated, there is no problem to re-evaluate the write: it has no effect outside of the expression evaluation that owns the allocation.

In the case of x([a,b]) = y, the temporary is created for the vector subscript. Raising the HLFIR abstraction for simple array constructors may be a good idea, but local temps are created in other contexts, so this fix is more generic.


Full diff: https://github.com/llvm/llvm-project/pull/113330.diff

3 Files Affected:

  • (modified) flang/lib/Optimizer/HLFIR/Transforms/ScheduleOrderedAssignments.cpp (+24-8)
  • (added) flang/test/HLFIR/order_assignments/vector-subscripts-scheduling.fir (+31)
  • (modified) flang/test/HLFIR/order_assignments/where-scheduling.f90 (+3)
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/ScheduleOrderedAssignments.cpp b/flang/lib/Optimizer/HLFIR/Transforms/ScheduleOrderedAssignments.cpp
index efe59b8ef988e6..5971b5b9d76a0e 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/ScheduleOrderedAssignments.cpp
+++ b/flang/lib/Optimizer/HLFIR/Transforms/ScheduleOrderedAssignments.cpp
@@ -347,12 +347,23 @@ conflict(llvm::ArrayRef<mlir::MemoryEffects::EffectInstance> effectsA,
          anyRAWorWAW(effectsB, effectsA, aliasAnalysis);
 }
 
-/// Could there be any write effects in "effects"?
+/// Could there be any write effects in "effects" affecting memory storages
+/// that are not local to the current region.
 static bool
-anyWrite(llvm::ArrayRef<mlir::MemoryEffects::EffectInstance> effects) {
+anyNonLocalWrite(llvm::ArrayRef<mlir::MemoryEffects::EffectInstance> effects,
+                 mlir::Region &region) {
   return llvm::any_of(
-      effects, [](const mlir::MemoryEffects::EffectInstance &effect) {
-        return mlir::isa<mlir::MemoryEffects::Write>(effect.getEffect());
+      effects, [&region](const mlir::MemoryEffects::EffectInstance &effect) {
+        if (mlir::isa<mlir::MemoryEffects::Write>(effect.getEffect())) {
+          if (mlir::Value v = effect.getValue()) {
+            v = getStorageSource(v);
+            if (v.getDefiningOp<fir::AllocaOp>() ||
+                v.getDefiningOp<fir::AllocMemOp>())
+              return !region.isAncestor(v.getParentRegion());
+          }
+          return true;
+        }
+        return false;
       });
 }
 
@@ -393,9 +404,13 @@ void Scheduler::saveEvaluationIfConflict(mlir::Region &yieldRegion,
     if (entity && hlfir::isFortranVariableType(entity->get().getType()))
       effects.emplace_back(mlir::MemoryEffects::Read::get(), entity);
   }
-  if (!leafRegionsMayOnlyRead && anyWrite(effects)) {
-    // Region with write effect must be executed only once: save it the first
-    // time it is encountered.
+  if (!leafRegionsMayOnlyRead && anyNonLocalWrite(effects, yieldRegion)) {
+    // Region with write effect must be executed only once (unless all writes
+    // affect storages allocated inside the region): save it the first time it
+    // is encountered.
+    LLVM_DEBUG(llvm::dbgs()
+                   << "saving eval because write effect prevents re-evaluation"
+                   << "\n";);
     saveEvaluation(yieldRegion, effects, /*anyWrite=*/true);
   } else if (conflict(effects, assignEffects)) {
     // Region that conflicts with the current assignments must be fully
@@ -411,7 +426,8 @@ void Scheduler::saveEvaluationIfConflict(mlir::Region &yieldRegion,
     // For example, a WHERE mask might be written by the masked assignment
     // evaluations, and it has to be saved in this case:
     //   where (mask) r = f() ! function f modifies mask
-    saveEvaluation(yieldRegion, effects, anyWrite(effects));
+    saveEvaluation(yieldRegion, effects,
+                   anyNonLocalWrite(effects, yieldRegion));
   } else {
     // Can be executed while doing the assignment.
     independentEvaluationEffects.append(effects.begin(), effects.end());
diff --git a/flang/test/HLFIR/order_assignments/vector-subscripts-scheduling.fir b/flang/test/HLFIR/order_assignments/vector-subscripts-scheduling.fir
new file mode 100644
index 00000000000000..b76492efe2a562
--- /dev/null
+++ b/flang/test/HLFIR/order_assignments/vector-subscripts-scheduling.fir
@@ -0,0 +1,31 @@
+// Test local alloca and store inside hlfir.region_assign do not trigger the
+// creation of a temporary for the LHS.
+
+// RUN: fir-opt -o - -lower-hlfir-ordered-assignments --debug-only=flang-ordered-assignment -flang-dbg-order-assignment-schedule-only %s 2>&1 | FileCheck %s
+// REQUIRES: asserts
+
+func.func @simple(%arg0: !fir.ref<!fir.array<100x10xf32>> , %arg1: !fir.ref<!fir.array<10xi64>> , %arg2: !fir.ref<!fir.array<10xf32>>, %i: i64, %f: f32) {
+  %c10 = arith.constant 10 : index
+  %c100 = arith.constant 100 : index
+  %0 = fir.shape %c100, %c10 : (index, index) -> !fir.shape<2>
+  %1:2 = hlfir.declare %arg0(%0) {uniq_name = "_QFsimpleEx"} : (!fir.ref<!fir.array<100x10xf32>>, !fir.shape<2>) -> (!fir.ref<!fir.array<100x10xf32>>, !fir.ref<!fir.array<100x10xf32>>)
+  %2 = fir.shape %c10 : (index) -> !fir.shape<1>
+  %3:2 = hlfir.declare %arg1(%2) {uniq_name = "y"} : (!fir.ref<!fir.array<10xi64>>, !fir.shape<1>) -> (!fir.ref<!fir.array<10xi64>>, !fir.ref<!fir.array<10xi64>>)
+  hlfir.region_assign {
+    hlfir.yield %f : f32
+  } to {
+    %local_temp = fir.alloca i64
+    fir.store %i to %local_temp : !fir.ref<i64>
+    %icopy = fir.load %local_temp : !fir.ref<i64>
+    hlfir.elemental_addr %2 : !fir.shape<1> {
+    ^bb0(%arg3: index):
+      %5 = hlfir.designate %3#0 (%arg3)  : (!fir.ref<!fir.array<10xi64>>, index) -> !fir.ref<i64>
+      %6 = fir.load %5 : !fir.ref<i64>
+      %7 = hlfir.designate %1#0 (%icopy, %6)  : (!fir.ref<!fir.array<100x10xf32>>, i64, i64) -> !fir.ref<f32>
+      hlfir.yield %7 : !fir.ref<f32>
+    }
+  }
+  return
+}
+
+// CHECK: run 1 evaluate: region_assign
diff --git a/flang/test/HLFIR/order_assignments/where-scheduling.f90 b/flang/test/HLFIR/order_assignments/where-scheduling.f90
index ab87ae92de5799..3010476d4a188f 100644
--- a/flang/test/HLFIR/order_assignments/where-scheduling.f90
+++ b/flang/test/HLFIR/order_assignments/where-scheduling.f90
@@ -135,6 +135,7 @@ end function f
 !CHECK-NEXT: run 2 evaluate: where/region_assign1
 !CHECK-LABEL: ------------ scheduling where in _QPonly_once ------------
 !CHECK-NEXT: unknown effect: %{{[0-9]+}} = llvm.intr.stacksave : !llvm.ptr
+!CHECK-NEXT: saving eval because write effect prevents re-evaluation
 !CHECK-NEXT: run 1 save  (w): where/mask
 !CHECK-NEXT: run 2 evaluate: where/region_assign1
 !CHECK-NEXT: run 3 evaluate: where/region_assign2
@@ -180,6 +181,7 @@ end function f
 !CHECK-NEXT: conflict: R/W: %{{.*}} = hlfir.declare %{{.*}} {uniq_name = "_QFwhere_construct_unknown_conflictEmask"} : (!fir.box<!fir.array<?x!fir.logical<4>>>, !fir.dscope) -> (!fir.box<!fir.array<?x!fir.logical<4>>>, !fir.box<!fir.array<?x!fir.logical<4>>>) W:<unknown>
 !CHECK-NEXT: run 1 save    : where/mask
 !CHECK-NEXT: unknown effect: %{{.*}} = fir.call @_QPf() fastmath<contract> : () -> f32
+!CHECK-NEXT: saving eval because write effect prevents re-evaluation
 !CHECK-NEXT: run 2 save  (w): where/region_assign1/rhs
 !CHECK-NEXT: run 3 evaluate: where/region_assign1
 !CHECK-NEXT: ------------ scheduling where in _QPelsewhere_construct_unknown_conflict ------------
@@ -190,5 +192,6 @@ end function f
 !CHECK-NEXT: conflict: R/W: %{{.*}} = hlfir.declare %{{.*}} {uniq_name = "_QFelsewhere_construct_unknown_conflictEmask2"} : (!fir.box<!fir.array<?x!fir.logical<4>>>, !fir.dscope) -> (!fir.box<!fir.array<?x!fir.logical<4>>>, !fir.box<!fir.array<?x!fir.logical<4>>>) W:<unknown>
 !CHECK-NEXT: run 2 save    : where/elsewhere1/mask
 !CHECK-NEXT: unknown effect: %{{.*}} = fir.call @_QPf() fastmath<contract> : () -> f32
+!CHECK-NEXT: saving eval because write effect prevents re-evaluation
 !CHECK-NEXT: run 3 save  (w): where/elsewhere1/region_assign1/rhs
 !CHECK-NEXT: run 4 evaluate: where/elsewhere1/region_assign1

Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the optimization!

Copy link
Contributor

@ashermancinelli ashermancinelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jeanPerier jeanPerier merged commit a59f712 into llvm:main Oct 23, 2024
11 checks passed
@jeanPerier
Copy link
Contributor Author

Thanks for the reviews, there should be a visible speed-up on polyhedron test_fpu2 benchmark with these patches.

@frobtech frobtech mentioned this pull request Oct 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants