Skip to content

Commit

Permalink
egraphs: fix accidental remat of call. (#5726) (#5728)
Browse files Browse the repository at this point in the history
In the provided test case in #5716, the result of a call was then
added to 0. We have a rewrite rule that sets the remat-bit on any add
of a value and a constant, because these frequently appear (e.g. from
address offset calculations) and this can frequently reduce register
pressure (one long-lived base vs. many long-lived base+offset values).
Separately, we have an algebraic rule that `x+0` rewrites to `x`.

The result of this was that we had an eclass with the remat bit set on
the add, but the add was also union'd into the call. We pick the
latter during extraction, because it's cheaper not to do the add at
all; but we still get the remat bit, and try to remat a call (!),
which blows up later.

This PR fixes the logic to look up the "best value" for a value (i.e.,
whatever extraction determined), and look up the remat bit on *that*
node, not the canonical node.

(Why did the canonical node become the iadd and not the call? Because
the former had a lower value-number, as an accident of IR
construction; we don't impose any requirements on the input CLIF's
value-number ordering, and I don't think this breaks any of the
important acyclic properties, even though there is technically a
dependence from a lower-numbered to a higher-numbered node. In essence
one can think of them as having "virtual numbers" in any true
topologically-sorted order, and the only place the actual integer
indices matter should be in choosing the "canonical ID", which is just
used for dedup'ing, modulo this bug.)

Fixes #5716.
  • Loading branch information
cfallin authored Feb 7, 2023
1 parent 2d23331 commit 42ec4c7
Show file tree
Hide file tree
Showing 2 changed files with 51 additions and 10 deletions.
20 changes: 10 additions & 10 deletions cranelift/codegen/src/egraph/elaborate.rs
Original file line number Diff line number Diff line change
Expand Up @@ -278,6 +278,14 @@ impl<'a> Elaborator<'a> {
before
);

// Get the best option; we use `value` (latest
// value) here so we have a full view of the
// eclass.
trace!("looking up best value for {}", value);
let (_, best_value) = self.value_to_best_value[value];
debug_assert_ne!(best_value, Value::reserved_value());
trace!("elaborate: value {} -> best {}", value, best_value);

let remat = if let Some(elab_val) =
self.value_to_elaborated_value.get(&canonical_value)
{
Expand All @@ -287,7 +295,7 @@ impl<'a> Elaborator<'a> {
// from another block. If so, ignore the hit
// and recompute below.
let remat = elab_val.in_block != self.cur_block
&& self.remat_values.contains(&canonical_value);
&& self.remat_values.contains(&best_value);
if !remat {
trace!("elaborate: value {} -> {:?}", value, elab_val);
self.stats.elaborate_memoize_hit += 1;
Expand All @@ -304,20 +312,12 @@ impl<'a> Elaborator<'a> {
// Value not available; but still look up
// whether it's been flagged for remat because
// this affects placement.
let remat = self.remat_values.contains(&canonical_value);
let remat = self.remat_values.contains(&best_value);
trace!(" -> not present in map; remat = {}", remat);
remat
};
self.stats.elaborate_memoize_miss += 1;

// Get the best option; we use `value` (latest
// value) here so we have a full view of the
// eclass.
trace!("looking up best value for {}", value);
let (_, best_value) = self.value_to_best_value[value];
debug_assert_ne!(best_value, Value::reserved_value());
trace!("elaborate: value {} -> best {}", value, best_value,);

// Now resolve the value to its definition to see
// how we can compute it.
let (inst, result_idx) = match self.func.dfg.value_def(best_value) {
Expand Down
41 changes: 41 additions & 0 deletions cranelift/filetests/filetests/egraph/issue-5716.clif
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
;; This tests that a call does not get rematerialized, even if a remat flag is
;; set on a different node in its eclass.
;;
;; Below, `v97` is an add of `v238` (the call's first return value) and a
;; constant 0; a mid-end rule rewrites this to just `v238` (i.e., `v97` is unioned
;; in). Separately, a rule states that an add of a value and a constant always
;; gets rematerialized at use. When `v97` is used in a later block, it would have
;; rematerialized the add; except, if we instead use the result of the call
;; directly, we should *not* remat the call. If we do, a compile error results
;; later.

test compile
set opt_level=speed_and_size
target aarch64

function u0:33() system_v {
ss0 = explicit_slot 32
sig0 = (i64, i64, i64, i64, i64) -> i64, i64 system_v
fn0 = colocated u0:0 sig0
jt0 = jump_table [block36, block38]
block0:
v80 = iconst.i32 0
v91 = iconst.i64 0
v92 = iconst.i64 0
v96 = iconst.i64 0
v235 = iconst.i64 0
v236 = iconst.i64 0
v237 = iconst.i64 0
v238, v239 = call fn0(v236, v237, v91, v92, v235) ; v236 = 0, v237 = 0, v91 = 0, v92 = 0, v235 = 0
v97 = iadd v238, v96 ; v96 = 0
br_table v80, block37, jt0 ; v80 = 0
block36:
trap user0
block37:
trap unreachable
block38:
v98 = load.i8 notrap v97
v99 = fcvt_from_uint.f64 v98
stack_store v99, ss0
trap user0
}

0 comments on commit 42ec4c7

Please sign in to comment.