JIT: Relax reg-optionality validity checks #81614

Merged
merged 6 commits into from
Feb 13, 2023

Conversation

jakobbotsch
Member

@jakobbotsch jakobbotsch commented Feb 3, 2023

This PR does two things:

  1. Rename the existing IsSafeToContainMem to IsInvariantInRange, and use IsInvariantInRange in the places where we currently use IsSafeToContainMem as a general-purpose "can move" check. IsSafeToContainMem remains and just forwards to IsInvariantInRange; we still use it for the cases where we're actually containing a load. This is a follow-up to "Add lowering support for conditional nodes" #71705 (comment) that I promised a long time ago.
  2. Relax reg-optionality checking. The checks we have today are much more stringent than they need to be, doing a full (and expensive) invariance check. Let me copy/paste my comment from the new IsSafeToMarkRegOptional about the checking we actually need to do:
//    Unlike containment, reg-optionality can only rarely introduce new
//    conflicts, because reg-optionality mostly does not cause the child node
//    to be evaluated at a new point in time:
//
//    1. For LIR edges (i.e. anything that isn't GT_LCL_VAR) reg-optionality
//       indicates that if the edge was spilled to a temp at its def, the parent
//       node can use it directly from its spill location without reloading it
//       into a register first. This is always safe as spill temps cannot
//       interfere.
//
//       For example, an indirection can be marked reg-optional even if there
//       is interference between it and its parent; the indirection will still
//       be evaluated at its original position, but if the value is spilled to
//       stack, then reg-optionality can allow using the value from the spill
//       location directly.
//
//    2. For GT_LCL_VAR reg-optionality indicates that the node can use the
//       local directly from its home location. IR invariants guarantee that the
//       local is not defined between its LIR location and the parent node (see
//       CheckLclVarSemanticsHelper). That means the only case where it could
//       interfere is due to it being address exposed. So this is the only unsafe
//       case.
//
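
The two cases in the comment above can be sketched with a toy model. This is only an illustration: `Node`, `Oper`, the `addressExposed` flag and `IsSafeToMarkRegOptionalSketch` are hypothetical stand-ins, not the JIT's real `GenTree` or local-variable descriptors.

```cpp
// Hypothetical stand-ins for the JIT's node kinds; not the real GenTree.
enum Oper
{
    GT_LCL_VAR,
    GT_LCL_FLD,
    GT_IND
};

struct Node
{
    Oper oper;
    bool addressExposed; // assumption: only meaningful for local nodes here

    bool OperIs(Oper o) const
    {
        return oper == o;
    }
};

// Sketch of the decision described in the comment; the parent node is not
// needed in this simplified model.
bool IsSafeToMarkRegOptionalSketch(const Node& child)
{
    if (!child.OperIs(GT_LCL_VAR))
    {
        // Case 1: an LIR edge. The child is still evaluated at its original
        // position; the parent may only read the value from a spill temp.
        return true;
    }

    // Case 2: the parent may read the local from its home location. That is
    // only unsafe if the local could be modified through an alias.
    return !child.addressExposed;
}
```

In this model an indirection is always accepted, while a `GT_LCL_VAR` is accepted only when it is not address exposed.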

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 3, 2023
@ghost ghost assigned jakobbotsch Feb 3, 2023
@ghost

ghost commented Feb 3, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

null

Author: jakobbotsch
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@jakobbotsch
Member Author

/azp run Fuzzlyn

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jakobbotsch
Member Author

/azp run Fuzzlyn

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jakobbotsch
Member Author

This will conflict with #81267 so will wait for that one to get in first. It should still be ready for review though, cc @dotnet/jit-contrib PTAL @BruceForstall @kunalspathak

//
bool Lowering::IsSafeToMarkRegOptional(GenTree* parentNode, GenTree* childNode) const
{
if (!childNode->OperIs(GT_LCL_VAR))
Contributor

It is a rather subtle difference between LCL_VAR and LCL_FLD that LCL_FLDs, if marked reg optional (which they won't be under normal circumstances), will use spill temps. Something that would be good to write down explicitly I think.

Member Author

I can add it with the "indirection" example above? I suppose it's the exact same example.

Contributor

Yes, just mentioning it would be good.

Member

@BruceForstall BruceForstall left a comment

LGTM

// true if it is safe to make childNode a contained memory operand.
// Returns:
// True if 'node' can be evaluated at any point between its current
// location and 'parentNode' without giving a different result; otherwise
Member

Suggested change
- // location and 'parentNode' without giving a different result; otherwise
+ // location and 'endExclusive' without giving a different result; otherwise

@jakobbotsch
Member Author

The Fuzzlyn failure was #75442.

@jakobbotsch jakobbotsch marked this pull request as ready for review February 9, 2023 13:42
@jakobbotsch
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, Fuzzlyn, runtime-coreclr jitstressregs, runtime-coreclr libraries-jitstressregs

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

{
assert((node != nullptr) && (endExclusive != nullptr));

Member

Is there any invariant case we should handle here? That is, the prior handles the linear order being node, endExclusive. Should this handle node, ignoreNode, endExclusive?

Member Author

I'll add an additional assert about this
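
One plausible reading of the ordering discussed here (node, then ignoreNode, then endExclusive) can be sketched as follows. Everything in this block is an assumption made for illustration: `RangeNode`, the index-based linear order and `IsInvariantInRangeSketch` are hypothetical, and the real check works on LIR nodes and alias sets rather than single locals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a node in linear order: it may read one local and
// write one local (-1 means "none").
struct RangeNode
{
    int readsLcl;
    int writesLcl;
};

// Returns true if the node at index 'node' reads nothing that is written by
// the nodes strictly between 'node' and 'endExclusive', skipping 'ignoreNode'.
bool IsInvariantInRangeSketch(const std::vector<RangeNode>& lir,
                              std::size_t                   node,
                              std::size_t                   endExclusive,
                              std::size_t                   ignoreNode)
{
    // Assumed preconditions: 'node' precedes 'endExclusive', and 'ignoreNode'
    // lies strictly between them in linear order.
    assert((node < endExclusive) && (endExclusive <= lir.size()));
    assert((node < ignoreNode) && (ignoreNode < endExclusive));

    for (std::size_t i = node + 1; i < endExclusive; i++)
    {
        if (i == ignoreNode)
        {
            continue; // this node's effects are deliberately not considered
        }

        if ((lir[i].writesLcl != -1) && (lir[i].writesLcl == lir[node].readsLcl))
        {
            return false; // an intervening write could change the value read
        }
    }

    return true;
}
```

The second assert is the kind of ordering check the question above is about; whether the actual assert takes this form is not stated in the thread.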

// interfere is due to it being address exposed. So this is the only unsafe
// case.
//
bool Lowering::IsSafeToMarkRegOptional(GenTree* parentNode, GenTree* childNode) const
Member

Is any of this optimization that should also apply to IsSafeToContainMem?

That is, iirc there is a "pessimization" for the InterferesWith check today with regard to locals, in that any write is considered interfering, even if it's to another local and therefore cannot impact the local being read.

However, based on https://github.com/dotnet/runtime/blob/main/docs/design/specs/Memory-model.md I believe we are allowed to reorder such a non-volatile read of a local assuming there are no volatile writes.

Member

@tannergooding tannergooding Feb 9, 2023

Maybe what I'm remembering is just the fact that we couldn't mark it reg-optional, and the fix is exactly what is being handled here, not something in the general IsSafeToContainMem.

Member Author

@jakobbotsch jakobbotsch Feb 9, 2023

I don't think most of this applies to IsSafeToContainMem; the problem there is really that it causes evaluation to happen at a new point in time, which brings along with it all of the extra checking. W.r.t. the locals, from looking at AliasSet::InterferesWith it looks like we are pretty precise:

//------------------------------------------------------------------------
// AliasSet::InterferesWith:
// Returns true if the reads and writes in this alias set interfere
// with the given alias set.
//
// Two alias sets interfere under any of the following conditions:
// - Both sets write to any addressable location (e.g. the heap,
// address-exposed locals)
// - One set reads any addressable location and the other set writes
// any addressable location
// - Both sets write to the same lclVar
// - One set writes to a lclVar that is read by the other set
//
// Arguments:
// other - The other alias set.
//
bool AliasSet::InterferesWith(const AliasSet& other) const
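
The four conditions listed in that header comment can be illustrated with a toy alias set. This is a minimal sketch under stated assumptions: `AliasSetSketch` and its fields are hypothetical and far simpler than the JIT's actual `AliasSet`.

```cpp
#include <set>

// Hypothetical, simplified alias set: tracked locals are plain numbers and
// "addressable" covers the heap and address-exposed locals.
struct AliasSetSketch
{
    std::set<unsigned> lclVarReads;
    std::set<unsigned> lclVarWrites;
    bool               readsAddressable  = false;
    bool               writesAddressable = false;

    bool InterferesWith(const AliasSetSketch& other) const
    {
        // Both sets write addressable memory, or one reads it while the
        // other writes it.
        if (writesAddressable && (other.readsAddressable || other.writesAddressable))
        {
            return true;
        }
        if (readsAddressable && other.writesAddressable)
        {
            return true;
        }

        // This set writes a local that the other set reads or writes.
        for (unsigned lcl : lclVarWrites)
        {
            if ((other.lclVarWrites.count(lcl) != 0) || (other.lclVarReads.count(lcl) != 0))
            {
                return true;
            }
        }

        // The other set writes a local that this set reads.
        for (unsigned lcl : other.lclVarWrites)
        {
            if (lclVarReads.count(lcl) != 0)
            {
                return true;
            }
        }

        return false;
    }
};
```

Under this model a read of one local does not interfere with a write to a different local, which is the precision being pointed out here.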

Member

@SingleAccretion managed to find the discussion we had some time back on Discord.

The general scenario I was thinking about is: #76273 (comment)

Many diffs are cases like the following:

- add     x0, x0, w26, UXTW
- ldrb    w0, [x0]
+ ldrb    w0, [x0, w26, UXTW #2]

These seem to mostly be due to ADDEX not being properly handled in many places while ADD always is.


There are a few small regressions such as:

+ lsl     w0, w0, #8
  ldr     x1, [fp, #0xA8]	// [V192 tmp170]
  ldrb    w1, [x1, #0x01]
- add     w1, w1, w0,  LSL #8
+ add     w1, w0, w1

Notably we're checking if childNode->gtGetOp1() (the value to be shifted) is contained, the original doesn't (but it shouldn't be anyways since we need a register for it).

We're also checking IsSafeToContainMem(parentNode, childNode) whereas the original wasn't. I expect this is the "actual" reason for the "larger diff", since we always do "strict" checking and we don't allow reordering across anything with a "side effect". That being said, there is no actual interdependence between these two, so it is technically "safe" anyway.

@jakobbotsch
Member Author

jakobbotsch commented Feb 9, 2023

The jitstress/jitstressregs failures are all present in the previous rolling run too.

I think there's something wrong with arm64-OSX exception handling in general on current main, I'm seeing a lot of Fuzzlyn failures too. According to those the runtime is crashing after printing "Unhandled exception", just like those failures in jitstress/jitstressregs. (Note that Fuzzlyn catches all exceptions, so this is unexpected.)
@janvorli is this a known issue?

Edit: the symptom is the same as in #81869

@tannergooding
Member

outerloop commit range between last success and first failure: f01d5a0...215839e

@janvorli
Member

janvorli commented Feb 9, 2023

Sigh, I guess I know what's wrong. My change to prevent unwinding through the bottom of the stack on Alpine is incorrect for secondary threads.

@jakobbotsch
Member Author

/azp run Fuzzlyn

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jakobbotsch
Member Author

I can reproduce the problematic codegen in the macOS fuzzlyn example on main too. Will open an issue and investigate that separately.

@jakobbotsch jakobbotsch merged commit faab048 into dotnet:main Feb 13, 2023
@jakobbotsch jakobbotsch deleted the relax-reg-optionality branch February 13, 2023 12:45
@ghost ghost locked as resolved and limited conversation to collaborators Mar 15, 2023
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

5 participants