Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a simpler representation for array addresses after morph #64581

Merged
merged 14 commits into from
Apr 1, 2022

Conversation

SingleAccretion
Copy link
Contributor

@SingleAccretion SingleAccretion commented Jan 31, 2022

This deletes: the ArrayInfo map, maintenance of non-standard constant index field sequences, maintenance of index labeling flags, GT_ADDR in array addresses post-morph, special INDs that cannot be folded away, and finally, pseudo-fields.

This adds: a new node, GT_ARR_ADDR, to replace all the functionality provided by the above features.

Contributes to #11057. There will be a second change that deletes GT_INDEX, instead importing GT_INDEX_ADDR directly (with an IND/OBJ on top if necessary).

We're expecting a small set of diffs. Almost all will be caused by different costing by gtSetEvalOrder, causing arg sorting / CSE / tail duplication to make different decisions. The rest are due to better COMMA handling in optComputeLoopSideEffectsOfBlock.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 31, 2022
@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Jan 31, 2022
@ghost
Copy link

ghost commented Jan 31, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

This deletes: the ArrayInfo map, the maintenance of non-standard constant index field sequences as well as maintenance of index labeling flags, and removes the need for GT_ADDR to represent array addresses post-morph. It also deletes pseudo-fields.

This adds: a new node, GT_ARR_ADDR, to replace all the functionality provided by the above features.

Contributes to #11057. There will be a second change that deletes GT_INDEX, instead importing GT_INDEX_ADDR (with an IND/OBJ on top if necessary).

We're expected a small set of diffs. Almost all will be caused by different costing by gtSetEvalOrder, causing arg sorting / CSE / tail duplication to make different decisions. The rest are due to better COMMA handling in optComputeLoopSideEffectsOfBlock.

Author: SingleAccretion
Assignees: -
Labels:

area-CodeGen-coreclr, community-contribution

Milestone: -

@SingleAccretion SingleAccretion marked this pull request as ready for review February 1, 2022 18:22
@SingleAccretion
Copy link
Contributor Author

CI failures are #60154 and emscripten-core/emscripten#16164.

Diffs.

@dotnet/jit-contrib

@BruceForstall
Copy link
Member

BruceForstall commented Feb 1, 2022

@SingleAccretion It's clear you have an idea of a large set of related changes that you plan to make over time as part of a long-term plan. E.g., your reference above to removing GT_INDEX. I think it would be worthwhile to identify all these in an issue, maybe just the "remove GT_ADDR" node issue #11057, updated and "renewed" with a checkbox list of work. Or maybe a new issue. That would provide a place to get design feedback before you actually do the work.

@SingleAccretion
Copy link
Contributor Author

@BruceForstall Thank you for the request, I've summarized my current thinking in #11057 (comment). I believe most changes in this area will be fairly straightforward (as in, don't have a design consideration attached to them), this array change is perhaps the exception.

  • So why do I think ARR_ADDR is the "best" way to replace the ADDRs in arrays?

First of all, ARR_ADDR already exists - it's ADDR(IND) where the IND has the ArrayInfo attached to it. ADDR(IND) is required in all but the simplest case of a GT_INDEX of a primitive type. Practically all other cases (ldelema / ldelem<struct>) end up with the ADDR. So from the IR size standpoint we're even or a little better off (this change actually measures negative in memory consumption when CG-ing CoreLib).

From the maintainability point of view, I believe that the explicit nature of the annotation node helps prevent mistakes in its handling - it is notable how IND with ArrayInfo has to be remembered to be special cased "everywhere". It also makes the pattern matching simpler.

The downside is that ARR_ADDR, being a NOP, must (now) be "skipped" in some places. In making this change, I was to find out in how many, and it turns out the count is 1: gtSetEvalOrder. That seems acceptable.

@JulieLeeMSFT
Copy link
Member

Ping @BruceForstall for review.

@BruceForstall
Copy link
Member

I plan to look at this again soon.

We only need to update the side effects, and
gtUpdateNodeSideEffects already takes care of that.

Will avoid copying GTF_NO_CSE unnecessarily.

No diffs.
This preserves the non-faultness annotation for the parent
indir when ADDR(IND(ARR_ADDR)) -> ARR_ADDR folding happends.
We don't need the field sequence or the complex parsing.

A few diffs because previos code didn't handle COMMAs.
(The new code doesn't either, but they are skipped
automatically by the optComputeLoopSideEffectsOfBlock).
Move it to GenTreeArrAddr, stop using ArrayInfo.
FldSeq no loger expected or needed. We will be
attaching it to ARR_ADDR directly in the future.
And associated code. No longer needed, and we
will use a different mechanism for ARR_ADDR.

No diffs.
No longer needed...
@BruceForstall
Copy link
Member

We're expecting a small set of diffs. Almost all will be caused by different costing by gtSetEvalOrder, causing arg sorting / CSE / tail duplication to make different decisions. The rest are due to better COMMA handling in optComputeLoopSideEffectsOfBlock.

Would it be possible to make this a zero diff change, possibly with some "scaffolding", and then make a small(er) follow-up change that removes that scaffolding and has diffs?

@SingleAccretion
Copy link
Contributor Author

Would it be possible to make this a zero diff change, possibly with some "scaffolding", and then make a small(er) follow-up change that removes that scaffolding and has diffs?

I tried my best to achieve that (the way I make most changes, including this one, is by writing the code and then eliminating all diffs from the result), but ultimately quirking gtSetEvalOrder proved hard to the point I "gave up".

The COMMA handling could be quirked, but not sure how much value that has given we'll still have diffs from costs.

I can look into it again if you think that's valuable enough.

@BruceForstall
Copy link
Member

I can look into it again if you think that's valuable enough.

No, the diffs look small enough so it's not worth it.

Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

This is an impressive simplification. I love getting rid of "side table" data structures that are outside the IR.

Thanks for the contribution!

@BruceForstall BruceForstall merged commit 779de9b into dotnet:main Apr 1, 2022
@BruceForstall
Copy link
Member

fyi, it looks like this change leads to (temporary) cases of:

ISSUE: <ASSERT> #209701 C:\gh\runtime\src\coreclr\jit\ee_il_dll.cpp (1556) - Assertion failed 'cchStrLen == cchResultStrLen' in '<>c__DisplayClass74_0:<TestVectorB128>b__6():System.Object:this' during 'Do value numbering' (IL size 31; hash 0xa2d0111f; FullOpts)

in SPMI replay. This is because GenTreeArrAddr::ParseArrayAddress() calls typGetObjLayout() in cases where a layout wasn't previously created, for a System.Numerics.Vector1[System.Boolean]type. SoeeGetShortClassName` first asks for the length of a name and doesn't find it, so uses "hackishClassName", then asks for the actual name, and finds a real name.

This will get cleared up when another SPMI collection is done.

@SingleAccretion SingleAccretion deleted the Arr-Addr branch April 4, 2022 11:28
@ghost ghost locked as resolved and limited conversation to collaborators May 4, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants