Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Handle some "field offset computation" patterns #81998

Merged
merged 10 commits into from
Apr 4, 2023

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Feb 11, 2023

Both during local morph and during VN.

Fix #40021

Saves 3 KB on BasicMinimalApi after #84095 (there's 1111 __GetFieldHelper functions). About 0.04%. There's still a null check kept for each offset computation, which we cannot really get rid of, but NAOT could maybe emit the IL such that there is a dominating null check so that only one is emitted.

Example:
Base:

.managed:0000000140347CC0 loc_140347CC0:                          ; CODE XREF: S_P_CoreLib_System_Collections_Generic_KeyValuePair_2_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey__System___Canon_____GetFieldHelper+23↑j
.managed:0000000140347CC0                                         ; DATA XREF: .rdata:__readonlydata_S_P_CoreLib_System_Collections_Generic_KeyValuePair_2_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey__System___Canon_____GetFieldHelper↓o
.managed:0000000140347CC0                 lea     rax, ??_7Boxed_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey@@6B@ ; jumptable 0000000140347CB3 case 0
.managed:0000000140347CC7                 mov     [r9], rax
.managed:0000000140347CCA                 cmp     [rcx], cl
.managed:0000000140347CCC                 lea     rax, [rcx+8]
.managed:0000000140347CD0                 sub     rax, rcx
.managed:0000000140347CD3                 add     rsp, 8
.managed:0000000140347CD7                 retn

Diff:

.managed:0000000140347AA0 loc_140347AA0:                          ; CODE XREF: S_P_CoreLib_System_Collections_Generic_KeyValuePair_2_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey__System___Canon_____GetFieldHelper+23↑j
.managed:0000000140347AA0                                         ; DATA XREF: .rdata:__readonlydata_S_P_CoreLib_System_Collections_Generic_KeyValuePair_2_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey__System___Canon_____GetFieldHelper↓o
.managed:0000000140347AA0                 lea     rax, ??_7Boxed_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey@@6B@ ; jumptable 0000000140347A93 case 0
.managed:0000000140347AA7                 mov     [r9], rax
.managed:0000000140347AAA                 cmp     [rcx], cl
.managed:0000000140347AAC                 mov     eax, 8
.managed:0000000140347AB1                 add     rsp, 8
.managed:0000000140347AB5                 retn

Local morph changes handle the pattern for local structs -- VN changes handle the pattern for classes (and more complicated struct cases, like storing them in locals, which there are a few examples of in #40021).

Both during import and during VN. Just want to see TP impact, not sure
if this is worth it.
@ghost ghost assigned jakobbotsch Feb 11, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 11, 2023
@ghost
Copy link

ghost commented Feb 11, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

Both during import and during VN. Just want to see TP impact, not sure if this is worth it.

Author: jakobbotsch
Assignees: jakobbotsch
Labels:

area-CodeGen-coreclr

Milestone: -

@MichalStrehovsky
Copy link
Member

Would this help in situation like #9791? NativeAOT generates similar code in support of ValueType.Equals/GetHashCode. The code RyuJIT currently generates for it is not particularly compact:

// An override of this method will be injected by the compiler into all valuetypes that cannot be compared
// using a simple memory comparison.
// This API is a bit awkward because we want to avoid burning more than one vtable slot on this.
// When index == GetNumFields, this method is expected to return the number of fields of this
// valuetype. Otherwise, it returns the offset and type handle of the index-th field on this type.
internal virtual int __GetFieldHelper(int index, out EETypePtr eeType)
{
// Value types that don't override this method will use the fast path that looks at bytes, not fields.
Debug.Assert(index == GetNumFields);
eeType = default;
return UseFastHelper;
}

@jakobbotsch
Copy link
Member Author

Would this help in situation like #9791?

Yes, this should handle those kinds of patterns.

@ghost ghost closed this Mar 15, 2023
@ghost
Copy link

ghost commented Mar 15, 2023

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

@jakobbotsch
Copy link
Member Author

Reopening this given the additional motivation of #84095

@jakobbotsch jakobbotsch reopened this Mar 29, 2023
Comment on lines 688 to 689
// TODO-Correctness: Due to inlining we may end up with incorrectly typed SUB trees here.
assert(node->TypeIs(TYP_I_IMPL, TYP_BYREF));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #84291 for this.

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Apr 4, 2023

Diffs. This is primarily for NAOT so doesn't really reflect in the diffs. BasicMinimalApi has 1111 injected __GetFieldHelper methods that use this pattern.

TP cost is really from the local morph changes:

Base: 99417650982, Diff: 99454354685, +0.0369%

?PostOrderVisit@LocalAddressVisitor@@QEAA?AW4fgWalkResult@Compiler@@PEAPEAUGenTree@@PEAU4@@Z : 31130610 : +6.61% : 73.44% : +0.0313%
?EvalUsingMathIdentity@ValueNumStore@@AEAAIW4var_types@@W4VNFunc@@II@Z                       : 6168480  : +6.05% : 14.55% : +0.0062%
`ValueNumStore::EvalUsingMathIdentity'::`2'::<lambda_2>::operator()                          : 860150   : NA     : 2.03%  : +0.0009%
?fgPerBlockLocalVarLiveness@Compiler@@QEAAXXZ                                                : 174224   : +0.04% : 0.41%  : +0.0002%
?fgMorphSmpOp@Compiler@@AEAAPEAUGenTree@@PEAU2@PEAUMorphAddrContext@1@PEA_N@Z                : 141174   : +0.01% : 0.33%  : +0.0001%
?gtSetEvalOrder@Compiler@@QEAAIPEAUGenTree@@@Z                                               : 115047   : +0.01% : 0.27%  : +0.0001%
GenTreeVisitor<`Compiler::optCSE_GetMaskData'::`2'::MaskDataWalker>::WalkTree                : 59296    : +0.14% : 0.14%  : +0.0001%
GenTreeVisitor<`Compiler::fgSetTreeSeq'::`2'::SetTreeSeqVisitor>::WalkTree                   : 57562    : +0.01% : 0.14%  : +0.0001%
?fgMorphTree@Compiler@@QEAAPEAUGenTree@@PEAU2@PEAUMorphAddrContext@1@@Z                      : 51860    : +0.01% : 0.12%  : +0.0001%
memset                                                                                       : -2570677 : -0.34% : 6.06%  : -0.0026%

Haven't really checked more in detail.

cc @dotnet/jit-contrib PTAL @EgorBo

@jakobbotsch jakobbotsch marked this pull request as ready for review April 4, 2023 17:04
src/coreclr/jit/lclmorph.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/lclmorph.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com>
@jakobbotsch jakobbotsch merged commit 337999d into dotnet:main Apr 4, 2023
@jakobbotsch jakobbotsch deleted the fold-local-offsets branch April 4, 2023 22:26
@ghost ghost locked as resolved and limited conversation to collaborators May 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT poor codegen for calculating field offsets of structs
4 participants