Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: use synthesis to repair some reconstruction issues #84312

Merged
merged 1 commit into from
Apr 4, 2023

Conversation

AndyAyersMS
Copy link
Member

In particular, run synthesis in repair mode for cases where there are profile counts within the method but zero counts in fgFirstBB.

Recall that sparse profiling effectively probes return blocks to determine the method entry count.

So the zero-entry but not zero-everywhere case can happen if we have a method with a very long running loop plus sparse profiling plus OSR -- we will only get profile counts from the instrumented Tier0 method, and it will never return (instead it will always escape to an OSR version which will eventually return, but that version won't be instrumented).

I originally was a bit more ambitious and ran repair for a broader set of reconstruction issues, but lead to a large number of diffs, in part because repair doesn't cope well with irreducible loops.

Leaving the entry count zero can have fairly disastrous impact on the quality of optimizations done in the method.

Addresses quite a few of the worst-performing benchmarks in #84264.

In particular, run synthesis in repair mode for cases where there are profile
counts within the method but zero counts in `fgFirstBB`.

Recall that sparse profiling effectively probes return blocks to determine the
method entry count.

So the zero-entry but not zero-everywhere case can happen if we have a method
with a very long running loop plus sparse profiling plus OSR -- we will only
get profile counts from the instrumented Tier0 method, and it will never return
(instead it will always escape to an OSR version which will eventually return,
but that version won't be instrumented).

I originally was a bit more ambitious and ran repair for a broader set of
reconstruction issues, but lead to a large number of diffs, in part because
repair doesn't cope well with irreducible loops.

Leaving the entry count zero can have fairly disastrous impact on the quality
of optimizations done in the method.

Addresses quite a few of the worst-performing benchmarks in dotnet#84264.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 4, 2023
@ghost ghost assigned AndyAyersMS Apr 4, 2023
@ghost
Copy link

ghost commented Apr 4, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

In particular, run synthesis in repair mode for cases where there are profile counts within the method but zero counts in fgFirstBB.

Recall that sparse profiling effectively probes return blocks to determine the method entry count.

So the zero-entry but not zero-everywhere case can happen if we have a method with a very long running loop plus sparse profiling plus OSR -- we will only get profile counts from the instrumented Tier0 method, and it will never return (instead it will always escape to an OSR version which will eventually return, but that version won't be instrumented).

I originally was a bit more ambitious and ran repair for a broader set of reconstruction issues, but lead to a large number of diffs, in part because repair doesn't cope well with irreducible loops.

Leaving the entry count zero can have fairly disastrous impact on the quality of optimizations done in the method.

Addresses quite a few of the worst-performing benchmarks in #84264.

Author: AndyAyersMS
Assignees: AndyAyersMS
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

Handful of diffs expected. If we had a PGO benchmarks collection we'd see quite a few more.

@EgorBo PTAL
cc @dotnet/jit-contrib

@AndyAyersMS AndyAyersMS requested a review from EgorBo April 4, 2023 16:44
Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two questions:

  1. Is it possible today to detect stale static PGO data and should we still use it or discard completely?
  2. Should we enable Dynamic PGO for coreclr_tests.run collection?

@AndyAyersMS
Copy link
Member Author

Two questions:

  1. Is it possible today to detect stale static PGO data and should we still use it or discard completely?

If the flowgraph for the method is quite different (that is, if we fail to find an edge in the flowgraph based on IL offsets from schema entries) we will hit the Mismatch cases and throw out all the data. We only expect this to happen with static PGO data but currently we don't assert that this must be so.

This can mean that some trivial/harmless edits to methods will lead to us tossing usable data. It is possible (though not easy) to build approximate matching algorithms that try and recognize when the graphs have the same shape but not the same identifying marks and use that to propagate the stale data. And I suppose do something simialr for class profiles if their IL offsets shift a bit. But not sure it's worth the trouble.

  1. Should we enable Dynamic PGO for coreclr_tests.run collection?

Yes, we need to enable more PGO driven collections. Benchmarks would be good to have too.

@AndyAyersMS AndyAyersMS merged commit 4b5491e into dotnet:main Apr 4, 2023
@ghost ghost locked as resolved and limited conversation to collaborators May 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants