-
Notifications
You must be signed in to change notification settings - Fork 747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement more TypeGeneralizing transfer functions #6118
Conversation
Current dependencies on/for this PR: This stack of pull requests is managed by Graphite. |
// ) | ||
// | ||
// To fix this, we need to insert an extra basic block encompassing the | ||
// liminal space between where the br_if determines it should take the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
liminal
Good thing I studied for the SATs, so I know that word 😄
src/passes/TypeGeneralizing.cpp
Outdated
void visitMemoryFill(MemoryFill* curr) { WASM_UNREACHABLE("TODO"); } | ||
|
||
void visitGlobalSet(GlobalSet* curr) { | ||
if (auto type = wasm.getGlobal(curr->name)->type; type.isRef()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel strongly, but these long lines always seem less clear to me. I would write it as two lines myself.
} | ||
// The new requirement for the call target. | ||
push(Type(targetReq, Nullable)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite a lot of code for CallRef - can it really not use handleCall?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, handleCall
pops the result type if it is a reference and does nothing with it, but we need it here to determine how much we can generalize the function reference.
|
||
void visitRefTest(RefTest* curr) { | ||
// Do not require anything of the input. | ||
push(Type::none); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't the spec require that the type be compatible, i.e., that there is a possibility for the test to succeed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, such a requirement only exists for br_on_cast
and br_on_cast_fail
, which have input type annotations. ref.cast
and ref.test
can always cast from the top type of the hierarchy.
src/passes/TypeGeneralizing.cpp
Outdated
void visitArrayInitData(ArrayInitData* curr) { | ||
auto type = curr->ref->type.getHeapType(); | ||
if (type.isBottom()) { | ||
// This will be emitted as unreahcalbe. Do not require anything of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This will be emitted as unreahcalbe. Do not require anything of the | |
// This will be emitted as unreachable. Do not require anything of the |
src/passes/TypeGeneralizing.cpp
Outdated
void visitArrayInitElem(ArrayInitElem* curr) { | ||
auto type = curr->ref->type.getHeapType(); | ||
if (type.isBottom()) { | ||
// This will be emitted as unreahcalbe. Do not require anything of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This will be emitted as unreahcalbe. Do not require anything of the | |
// This will be emitted as unreachable. Do not require anything of the |
;; CHECK-NEXT: (local.get $var) | ||
;; CHECK-NEXT: ) | ||
;; CHECK-NEXT: ) | ||
(func $call (param $x eqref) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already a test for a call on line 144, before this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will rewrite that previous test to not use a call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, no, the trick with those previous tests is that the call gets DCEd out, so actually they show nothing about how calls are analyzed. I could rewrite them to use tuple.make
instead, but they're much simpler with the call, so I think it would be better to keep them as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sgtm. Maybe add a comment explaining that in those call usages?
;; CHECK-NEXT: (i32.const 0) | ||
;; CHECK-NEXT: ) | ||
;; CHECK-NEXT: ) | ||
(func $call_indirect (param $x eqref) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(func $call_indirect (param $x eqref) | |
(func $call_indirect |
(local $arg i31ref) | ||
;; Require that typeof($f) <: $mid and that typeof($arg) <: eqref. In | ||
;; principle we could generalize $f up to $top, but then we wouldn't be able | ||
;; to generalize $arg at all. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it better to partially generalize both rather than fully generalize $f
and leave $arg
as is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could construct situations in which generalizing the function reference is more beneficial and we could construct situations in which generalizing the parameters is more beneficial. It's unclear which will be more important in practice, so we should definitely experiment with it. For now, splitting the difference seemed like a good conservative starting point.
src/passes/TypeGeneralizing.cpp
Outdated
// Propagate the requirement on our output to our input if we can, and | ||
// otherwise keep the input type unchanged. Unfortunately we cannot fall | ||
// back to requiring nothing of our input because that would not be | ||
// monotonic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow this. No matter how much we generalize the input, we can generalize the cast in lockstep, can't we? I don't see a situation where we end up with the input "too generalized".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Monotonicity here means that the tighter the constraint on the output to the cast, the tighter the constraint we must propagate to the input to the cast. If the constraint on the output gets tighter such that we realize we won't be able to remove the cast, we cannot then loosen the constraint on the input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To check if I follow, does "tighter" means "more refined" here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, even with that I can't seem to follow this. Isn't monotonicity a property of each location? That is, as we perform work, we must change the type of this cast monotonically, and separately the type of the child. Concrete example:
(use-A ;; uses type $A
(ref.cast $B
(block
(ref.cast $C
..
The outer cast (of $B
) is informed that we only need an $A
, so we monotonically change its requirements from the initial top type ("no requirement") to $A. We can still do nothing at all for the block that is its child; it began as top and remains as top, which is also monotonic.
Continuing on, since the block has top type (no requirements reach it) so does the inner cast, so we can modify the casts like this:
(use-A ;; uses type $A
(ref.cast $A
(block
(ref.cast null any
..
Then OptimizeInstructions can remove (edit: fixed this part) the inner cast immediately, and perhaps after other optimizations the outer cast can end up removable as well (unless the source of the data there really does come from a place that needs a cast to $A
).
Did I miss a place where monotonicity was lost here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, monotonicity is a property of the transfer function operating on the single lattice encompassing all program locations. If the input to the transfer function gets higher up the lattice because our constraint on one program location becomes tighter, then if the transfer function responded by loosening the constraint on another program location, then the transfer function would not be monotonic.
Your example doesn't do anything wrong as far as I can tell.
The non-monotonic implementation this comment is warning against is this:
auto req = pop();
if (req != curr->ref->type && Type::isSubType(req, curr->ref->type) {
// We will not be able to remove this cast anyway, so we are free to
// generalize the input.
push(Type::none);
} else {
// We still might be able to remove the cast, but only if we don't generalize
// the input any further than the output.
push(req);
}
In this implementation, as the popped requirement on the output req
gets more refined, the pushed requirement on the input can relax from req
to Type::none
. That implementation might allow more previous casts to be optimized out, but since it is not monotonic, it might also cause the analysis to infinitely loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that code makes it clearer for me. To make sure I follow, that would be non-monotonic in the sense I mentioned, correct? Those pushes could go from pushing increasingly refined types $A, $B, $C
, going downwards more and more, to suddenly pushing none
which is at the very top.
Now that I (hopefully) follow that, I'm still not clear on the code below. As I wrote above, I would just push none
unconditionally here. It seems like you push the LUB in an attempt to avoid a situation where you lose the ability to remove a cast, but I can't see how that can occur: if in fact we become not-refined-enough for a cast, that suggests to me that we removed another cast somewhere else, to achieve that.
Also, it seems worrying to me to try to find some tricky compromise in between always generalizing and sometimes generalizing. It seems simpler to always generalize, period, and then if necessary to do some sort of followup operation to improve things if we need to (but as I said in the previous paragraph, I don't see why we'd need even that).
Is there a simple situation I'm missing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sounds like you're following correctly 👍
Here's a trivial example input where unconditionally pushing none would inhibit a cast optimization:
(func $foo (param $i i32) (result i31ref)
(local $x i31ref)
(ref.cast i31ref
(local.get $x)
)
)
If the cast pushed Type::none
as its input requirement, then we would end up generalizing $x
to anyref
and would no longer be able to optimize out the cast.
A less trivial example might involve a struct.get that feeds into a cast. If we allow the cast input to be generalized too much, then the struct reference we pass to the struct.get might become general enough that the accessed field is no longer a subtype of the cast's required output, so we wouldn't be able to eliminate the cast any more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trivial example is correct, but I'm not sure it's what we should aim to solve. --type-generalizing --local-refining
turns that local into nullref
anyhow.
The less trivial example is more plausible, but it could easily go the other way - if we generalize the struct reference enough we might well remove a cast there, as I said before. I don't have data on this, but my intuition is that this is a zero-sum game (an attempt to improve in one place worsens in another).
The attempt to strike a middle ground is also more complicated. Unless I'm mistaken, we don't have good data pointing to either option, so I'm in favor of the simple one, that is, to maximally generalize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, fair enough. This is a policy decision similar to the one we make for ref.cast. My intuition is that it is better to try to optimize the cast we know about than the one we don't know about, but that may very well turn out to be incorrect. I'll simplify the code here and add a TODO to experiment with it.
FWIW, an advanced version of this analysis could find not just a single constraint for each program location, but rather a set of constraints on program locations for each cast that would allow that cast to be optimized out. We could then feed that system of constraints to a SMT solver to maximize the number of eliminated casts. I doubt this is something we would want to spend time on, but it might make for an interesting intern project some day.
8e50214
to
48074cd
Compare
PTAL at just the "simplify cast constraints" commit for the latest changes. |
;; CHECK-NEXT: ) | ||
(func $ref-cast-more-limited (result nullref) | ||
(local $var i31ref) | ||
;; Do not constraint $var. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
;; Do not constraint $var. | |
;; Do not constrain $var. |
// arguments. Because function parameters are contravariant, generalizing | ||
// the function type inhibits generalizing the arguments and vice versa. | ||
// Attempt to split the difference by generalizing the function type only as | ||
// much as we can without imposing stronger requirements on the arguments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little like RefCast, this seems somewhat speculative to me. Picking one of the two other options would be simpler. I'd pick function parameters myself. However, unlike RefCast I don't have an intuition here about whether the complexity is worth it or not, or even between the options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the differences in code complexity between the options are too large in the grand scheme of things. Let's leave this as-is and make sure to experiment with different choices on real-world programs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
} | ||
auto generalized = generalizeStructType(type, curr->index); | ||
push(Type(generalized, Nullable)); | ||
push(generalized.getStruct().fields[curr->index].type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing is popped here at all - how does this work? This seems to only read from static locations (that are not modified during the analysis).
I would expect to see that changes to the reference affect the value, and changes to the value affect the reference. Can that not work, or is there a TODO here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, struct.set
has no return value, so there is nothing to pop. Since mutable struct fields are invariant, it's always valid to generalize the type of the struct we are setting as long as it still has a field at the correct index; that field will have the same type in the generalized struct type as in the original struct type. We can also generalize the value being set up to the type of the field being assigned to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks. What I was missing is that this is a write, so any supertype with the field is ok to use, and so there aren't really any "dynamic" constraints here. That is, the more the ref is generalized does not help the value generalize, or vice versa.
Finish the transfer functions for all expressions except for string instructions, exception handling instructions, tuple instructions, and branch instructions that carry values. The latter require more work in the CFG builder because dropping the extra stack values happens after the branch but before the target block.
Finish the transfer functions for all expressions except for string
instructions, exception handling instructions, tuple instructions, and branch
instructions that carry values. The latter require more work in the CFG builder
because dropping the extra stack values happens after the branch but before the
target block.