Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-revert "Fold overflow operations in value numbering" #51440

Conversation

SingleAccretion
Copy link
Contributor

@SingleAccretion SingleAccretion commented Apr 17, 2021

This PR reintroduces changes from #50450, except for folding of the helpers for casts - both the VN part and morph part have been left out.

I have manually verified that test cases from #51381, #51346 and #51380 are no longer reproducible with this version of the change, but a full run of the legs that run these tests would be required before this can be merged.

I have decided to reintroduce one part of #50450 first instead of going straight for the removal of folding for problematic cases because I think the changes in #50450 around gtFoldExprConst & CheckedOps will be helpful when addressing that issue.

The code has been left as-is expect for some folding of formatting fixups into their respective commits, two /*printResult*/ comments, and moving commits disabling the test on Mono and adding vartypesdef.h earlier in the commit chain.

Draft pending green regular CI and verification that diffs are as expected.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 17, 2021
@SingleAccretion SingleAccretion force-pushed the Fold-Overflow-Operations-In-Value-Numbering-No-Helpers branch 2 times, most recently from fba8d71 to 892ff63 Compare April 17, 2021 17:25
@SingleAccretion
Copy link
Contributor Author

Diffs are as expected, no changes with respect to #50450 modulo expected (now not folded) cases in tests.

@SingleAccretion
Copy link
Contributor Author

Failures look unrelated, so switching as ready for review.

@BruceForstall
Copy link
Member

@dotnet/jit-contrib PTAL

@AndyAyersMS
Copy link
Member

@sandreenko can you review this?

@sandreenko sandreenko self-requested a review May 10, 2021 17:38
@AndyAyersMS
Copy link
Member

@sandreeko let me know if you can't get to this soon....

@SingleAccretion SingleAccretion force-pushed the Fold-Overflow-Operations-In-Value-Numbering-No-Helpers branch from 892ff63 to 99a9829 Compare May 22, 2021 11:40
@SingleAccretion
Copy link
Contributor Author

Rebased to resolve a conflict.

Co-authored-by: Anton Lapounov <anton.lapounov@microsoft.com>
@SingleAccretion SingleAccretion force-pushed the Fold-Overflow-Operations-In-Value-Numbering-No-Helpers branch from ada5b61 to 610b612 Compare May 24, 2021 23:35
@sandreenko
Copy link
Contributor

/azp run runtime-coreclr jitstress, runtime-coreclr outerloop, runtime-libraries-coreclr outerloop, runtime-coreclr libraries-jitstress

@azure-pipelines
Copy link

Azure Pipelines successfully started running 4 pipeline(s).

Copy link
Contributor

@sandreenko sandreenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I like the new code better, and if testing does not show anything bad that I am good with this change.

src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
Copy link
Member

@AntonLapounov AntonLapounov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for incorporating feedback. I have also some suggestions on the long comment in utils.cpp.

src/coreclr/jit/utils.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/utils.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/utils.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/utils.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/utils.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/utils.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/utils.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/utils.cpp Show resolved Hide resolved
AntonLapounov and others added 4 commits May 25, 2021 00:00
Co-authored-by: Anton Lapounov <anton.lapounov@microsoft.com>
The method is not prepared to handle them.
Also add a note about that to the header.
Also delete TODO-Review about it.

Right now the only caller of VNEvalShouldFold guards against
TYP_BYREF folding, so this assert is a safety measure against
future callers not taking byrefs into account.
@SingleAccretion
Copy link
Contributor Author

SingleAccretion commented May 25, 2021

Despite not all stress/outerloop legs having the chance to finish there were (I have to say, quite) some failures. Some look like known issues (#53042, #52954), others I do not understand (Discovered: System.Runtime.Handles.Tests (found 0 of 14 test cases) - consequence of the job being cancelled?), yet others look like something I should investigate (System.Linq.Expressions.Tests: Stack overflow.) - which I will do.

@sandreenko
Copy link
Contributor

Discovered: System.Runtime.Handles.Tests (found 0 of 14 test cases) - consequence of the job being cancelled?

Yes.

Let's start them again and wait for the results.

@sandreenko
Copy link
Contributor

/azp run runtime-coreclr jitstress, runtime-coreclr outerloop, runtime-libraries-coreclr outerloop, runtime-coreclr libraries-jitstress

@azure-pipelines
Copy link

Azure Pipelines successfully started running 4 pipeline(s).

@SingleAccretion
Copy link
Contributor Author

SingleAccretion commented May 25, 2021

Starting to classify the failures:

runtime-coreclr jitstress:
1. Libraries Test Run checked coreclr Linux x64 Release - segfault, could be related.
2. Libraries Test Run checked coreclr windows arm64 Release - CodeDom tests fail with NREs. Looks like #53042.

runtime-coreclr outerloop:
3. A lot of structreturn failures, look like #52954.

runtime-libraries-coreclr outerloop
4. System.Linq.Expressions.Tests fail with SO (see below).
Reproduces on Windows x64, Checked Debug Jit (with TC off - does not matter because dynamic methods are always Tier1). Does not Does reproduce on the base runtime (I was wrongly testing Checked base). Turns out this test has a history: #21374. Looks very possible that this PR has in some way increased the stack space required to compile the method. Working hypothesis: gtFoldExprConst, due to the new calls, now requires more stack space. That said, it is puzzling it fails so readily - the test only creates ~200 nodes, 100 calls to gtFoldExprConst, and apparently reserves 128 KB of stack.

Some further investigation appears not to support this. I made a method just like one in the test (100 adds that form a deep tree with 200 nodes), and tested the following code:

var t = new Thread(x => RuntimeHelpers.PrepareMethod(type.GetMethod("Problem", flags).MethodHandle), X * 1024);
t.Start();

For both base and diff, if X is less than 321, SO happens, and does not happen otherwise.

5. A bunch of Discovered: ... (found 0 of X test cases) type failures (I double-checked that I am looking at the restarted run). I think the cause is there is no category=OuterLoops for these tests - ?
6. Libraries Build Linux x64 Debug, System.Runtime.Tests.dll, ./RunTests.sh: line 162: 1193 Killed <the test process> - puzzling.

@sandreenko
Copy link
Contributor

sandreenko commented May 25, 2021

there are strange spmi failures on arm64 apple (link ) that I did not see before, will take a look.
Edit: it was failing before, not related

runtime-coreclr outerloop:
3. A lot of structreturn failures, look like #52954.

it was just resolved, lets trigger the job again to clear the output.

  1. A bunch of Discovered: ... (found 0 of X test cases) type failures (I double-checked that I am looking at the restarted run). I think the cause is there is no category=OuterLoops for these tests - ?

could you give me an example?

@sandreenko
Copy link
Contributor

/azp run runtime-coreclr jitstress, runtime-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@SingleAccretion
Copy link
Contributor Author

SingleAccretion commented May 25, 2021

@SingleAccretion
Copy link
Contributor Author

it was just resolved, lets trigger the job again to clear the output.

Hmm, I think I need to rebase for that one, don't I?

@sandreenko
Copy link
Contributor

sandreenko commented May 25, 2021

Hmm, I think I need to rebase for that one, don't I?

I could be wrong but I expect that runs that are triggered via /azp run are merged with the current main head, just like the default runtime-dev-innerloop.

From the log:
https://dev.azure.com/dnceng/public/_build/results?buildId=1156405&view=logs&j=e2176d62-3fb6-5179-e340-9cc434b4aa5a&t=a3ddac58-ea20-55ce-a88e-6de49e130568&l=596
Merge 7027588 into 959c327

@sandreenko
Copy link
Contributor

Ok, so all runtime-coreclr outerloop/jitstress failures are not related (#53267 for outerloop).

@SingleAccretion
Copy link
Contributor Author

SingleAccretion commented May 26, 2021

Final bucketing for failures of libraries tests.

runtime-libraries-coreclr outerloop:

1. S.L.E SO is a persistent failure. I looked at the pipeline, and it failed here, however, did not fail here, on 21st, suggesting something introduced it recently, but it should not be this PR (I am still looking at it).
2. Failures for Libraries Build OSX arm64 Debug all have the same pattern: 0 tests discovered and python crashes at the end: log 1, log 2, log 3.
3. Libraries Build Linux x64 Debug:

  • More 0 tests + python crashed failures: log 1, log 2.
  • System.Runtime.Tests killed for OOM: log. Identical failure also seen here - log.
  • System.Runtime.Tests killed: log. Too seen here: log

runtime-coreclr libraries-jitstress:

1. CodeDom tests fail with NREs - #53042.
2. Libraries Test Run checked coreclr Linux x64 Release: System.Runtime.Tests fail with a segfault (log). Also seen in a more recent run.

@sandreenko
Copy link
Contributor

Thanks @SingleAccretion for the detailed analysis and addressing the feedback.
I am going to merge it now, we should not have a global issue as we had after the first merge so even if something pops up later we should be able to disable individual tests and resolve the issues without reverting this work.

@sandreenko sandreenko merged commit 9603754 into dotnet:main May 26, 2021
@SingleAccretion SingleAccretion deleted the Fold-Overflow-Operations-In-Value-Numbering-No-Helpers branch June 8, 2021 01:33
@ghost ghost locked as resolved and limited conversation to collaborators Jul 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants