PGO: Guarded isinst #55907
Looks like there is a superfluous null check in the hot path now. Can we get rid of it?
I'll see what I can do there, but for now I'm mostly interested in perf numbers, because without a measurable benefit these additional class probes are just useless overhead for tier0.
it inserts a fast-path for
Btw, even in Default mode we already expand casts to non-sealed classes with a quick fast path that can be wrong. PGO can help here as well, let me add
called from
It seems this benchmark is improved by this PR:
What do you mean that it can be wrong? I do not see anything wrong with it.
What is our test coverage for these PGO-specific expansions?
There are nightly runtime outerloop and weekend libraries tests under PGO:
We also run TE scenarios nightly via ASP.NET team's harness.
@jkotas By "wrong" I mean the fast path is not taken, and it even adds some overhead in cases where the object is actually a subclass. Example (Default mode, Main): We have a fast path for
@EgorBo any idea how many class probes this adds? It seems (as we discussed elsewhere) that many of these will be redundant with existing class probes, and I wonder whether we should hold off on this until we have some idea how to better address or exploit that redundancy. Note that we currently need to produce the same schema for both static and dynamic PGO, so if we enable this for dynamic PGO we'll also enable it for static PGO and thus potentially increase the size of the profile data in SPC and similar.
Good. Do we have an idea whether the hot paths in these tests provide good enough functional coverage? Should we be adding these optional PGO-specific expansions to JIT stress or something like that, so that we get coverage over a larger functional matrix for them?
Not sure yet, I was mostly interested in perf numbers. I've run most of the TE benchmarks; a few of them look slightly improved, but I expected a bit more. I still see various CastHelpers in the hot paths, but I suspect those are connected with shared generics, and this PR just gives up when it sees them. So I will probably close it for now and, in the future, try to guess a likely class from virtual calls inside blocks dominated by isinst/casts. PS: Plaintext-MVC is definitely improved by 2.3%. I've just had another round of 8 iterations of Main vs PR and the improvements are stable again.
There was an idea to invoke "Main", say, 100 times in the test runner for runtime tests (the ones without static state), or maybe even in tests for Libs. Prototype: #52874. Currently very few of them make it to tier1 without TC=0.
Until recently the jit had more or less the same logic with static and dynamic PGO. Static PGO data is pervasive in our assemblies now, so we get some PGO coverage all the time just from that. Coverage for dynamic PGO alone is not great (and similarly, it must be said, for tiered compilation in general). Last I looked, in coreclr outerloop only about 2500 methods make it to Tier1 with default tiering options. Libraries may fare better ... at least the orchestration code probably gets a good workout. For TE runs we get ~10K methods jitted with dynamic PGO:
I have an experimental harness I started working up which repeatedly invokes
There are ways to drive tests with out-of-band PGO data (sideloaded text-format PGO), but I haven't tried leveraging this much during testing. It's not clear how durable the PGO data would be. We also have some stress randomization capabilities within PGO in the jit, but they're not enabled in testing yet. We have not tried running PGO + JitStress or PGO + GcStress. We probably should (at least the latter).
Closes #55325
Insert class probes for all isinst and castclass opcodes so we can later insert a fast path for a class that showed up in PGO data.
Codegen for Test diff: https://www.diffchecker.com/4pbk6Dls (DOTNET_TieredPGO=1)
I'll check if it has any effect on TE benchmarks.
/cc @dotnet/jit-contrib @AndyAyersMS
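For readers who want a concrete picture of the idea in the description, here is a rough model in Python rather than JIT code. It is only a sketch under assumptions: `ClassProbe`, `slow_isinst`, `make_guarded_isinst`, and the 50% threshold are hypothetical names and values made up for illustration, not the runtime's actual data structures or heuristics. It shows a per-site histogram of exact types, followed by a type test that first compares against the most frequent type and falls back to the general check on a miss.

```python
from collections import Counter


class Base:
    pass


class Derived(Base):
    pass


class ClassProbe:
    """Hypothetical stand-in for a class probe at one isinst/castclass site."""

    def __init__(self):
        self.histogram = Counter()

    def record(self, obj):
        # Count the exact runtime type of every non-null object seen here.
        if obj is not None:
            self.histogram[type(obj)] += 1

    def likely_class(self, threshold=0.5):
        # Return the dominant type, or None if no type is frequent enough.
        # The threshold value is an assumption made for this sketch.
        total = sum(self.histogram.values())
        if total == 0:
            return None
        cls, hits = self.histogram.most_common(1)[0]
        return cls if hits / total >= threshold else None


def slow_isinst(obj, target):
    """Stand-in for the general cast helper: full hierarchy check."""
    return obj if isinstance(obj, target) else None


def make_guarded_isinst(target, likely):
    """Build an isinst-like test with a fast path for the profiled class."""
    if likely is None:
        # No usable profile data: always use the general check.
        return lambda obj: slow_isinst(obj, target)

    # Decided once, when the guarded test is built, not on every call.
    likely_passes = issubclass(likely, target)

    def guarded(obj):
        # Fast path: a single exact-type comparison against the guess.
        if type(obj) is likely:
            return obj if likely_passes else None
        # Guess missed (null, a different subclass, unrelated type).
        return slow_isinst(obj, target)

    return guarded


# "Tier0": collect the profile at the cast site.
probe = ClassProbe()
for o in [Derived(), Derived(), Derived(), Base(), None]:
    probe.record(o)

# "Tier1": use the profile to build the guarded test.
as_base = make_guarded_isinst(Base, probe.likely_class())
```

In this model a `Derived` instance is answered by the exact-type comparison alone, while a plain `Base` instance or `None` pays for the failed guess and then the general check, which is the same kind of miss overhead discussed in the thread.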