Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fold "cast(cast(obj, cls), cls)" to "cast(obj, cls)" #98337

Merged
merged 14 commits into from
Feb 15, 2024

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Feb 13, 2024

Example:

static T Test<T>(object o) where T : class
{
    return o as T;
}

Current (Main) codegen:

; Assembly listing for method Program:Test[System.__Canon](System.Object)
       push     rbx
       sub      rsp, 48
       mov      qword ptr [rsp+0x28], rcx
       mov      rbx, rcx
       mov      rcx, qword ptr [rbx+0x38]
       mov      rcx, qword ptr [rcx]
       call     CORINFO_HELP_ISINSTANCEOFANY
       mov      r8, rax
       test     r8, r8
       je       SHORT G_M31297_IG05
       mov      rcx, qword ptr [rbx+0x38]
       mov      rcx, qword ptr [rcx]
       cmp      qword ptr [r8], rcx
       je       SHORT G_M31297_IG05
       mov      rdx, rax
       call     CORINFO_HELP_CHKCASTANY
       mov      r8, rax
G_M31297_IG05:
       mov      rax, r8
       add      rsp, 48
       pop      rbx
       ret      
; Total bytes of code 65

New (PR) codegen:

; Assembly listing for method Program:Test[System.__Canon](System.Object)
       sub      rsp, 40
       mov      qword ptr [rsp+0x20], rcx
       mov      rcx, qword ptr [rcx+0x38]
       mov      rcx, qword ptr [rcx]
       call     CORINFO_HELP_ISINSTANCEOFANY
       nop      
       add      rsp, 40
       ret      
; Total bytes of code 27

@ghost ghost assigned EgorBo Feb 13, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 13, 2024
@ghost
Copy link

ghost commented Feb 13, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Example:

static T Test<T>(object o) where T : class
{
    return o as T;
}

Current (Main) codegen:

; Assembly listing for method Program:Test[System.__Canon](System.Object)
       push     rbx
       sub      rsp, 48
       mov      qword ptr [rsp+0x28], rcx
       mov      rbx, rcx
       mov      rcx, qword ptr [rbx+0x38]
       mov      rcx, qword ptr [rcx]
       call     CORINFO_HELP_ISINSTANCEOFANY
       mov      r8, rax
       test     r8, r8
       je       SHORT G_M31297_IG05
       mov      rcx, qword ptr [rbx+0x38]
       mov      rcx, qword ptr [rcx]
       cmp      qword ptr [r8], rcx
       je       SHORT G_M31297_IG05
       mov      rdx, rax
       call     CORINFO_HELP_CHKCASTANY
       mov      r8, rax
G_M31297_IG05:
       mov      rax, r8
       add      rsp, 48
       pop      rbx
       ret      
; Total bytes of code 65

New (PR) codegen:

; Assembly listing for method Program:Test[System.__Canon](System.Object)
       sub      rsp, 40
       mov      qword ptr [rsp+0x20], rcx
       mov      rcx, qword ptr [rcx+0x38]
       mov      rcx, qword ptr [rcx]
       call     CORINFO_HELP_ISINSTANCEOFANY
       nop      
       add      rsp, 40
       ret      
; Total bytes of code 27
Author: EgorBo
Assignees: EgorBo
Labels:

area-CodeGen-coreclr

Milestone: -

@EgorBo EgorBo force-pushed the opt-cast-cast-lookup branch from 72d642c to 4383fba Compare February 13, 2024 03:21
@EgorBo EgorBo marked this pull request as ready for review February 13, 2024 11:03
@@ -1677,8 +1677,7 @@ GenTree* Compiler::impRuntimeLookupToTree(CORINFO_RESOLVED_TOKEN* pResolvedToken
return slotPtrTree;
}

slotPtrTree = gtNewIndir(TYP_I_IMPL, slotPtrTree, GTF_IND_NONFAULTING);
slotPtrTree->gtFlags &= ~GTF_GLOB_REF; // TODO-Bug?: this is a quirk. Can we mark this indirection invariant?
slotPtrTree = gtNewIndir(TYP_I_IMPL, slotPtrTree, GTF_IND_NONFAULTING | GTF_IND_INVARIANT);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why it can't be invariant (and removed the GTF_GLOB_REF quirk). However, I most likely will have to unmark it as invariant in my "inlining for shared generics" work for certain cases, but it's irrelevant here.

Copy link
Member Author

@EgorBo EgorBo Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was done for my snippet in the PR's description, otherwise two exactly the same runtime lookups had different conservative VNs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would double-check this with somebody on the runtime team; getting this wrong can have subtle consequences.

Copy link
Member Author

@EgorBo EgorBo Feb 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So from my understanding there are two types of runtime lookups:

  1. Caller-side
  2. Callee-side

The caller side is typically not invariant (at least in JIT) and requires a helper call + expansion machinery (checking for null/dictionary resize).

So if a runtime lookup doesn't need a helper-call/null check/size check - it's a fully invariant bunch of ind loads on top of TypeCtx.

It might be a bit more complicated if we implement inlining for shared generics where we're going to have both caller and callee lookups in the same method, but I will keep that in mind once I do that sort of inlining (as part of net9). Currently we never have such cases.

cc @jkotas to confirm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generic dictionary slots that do not require helper-call/null check/size check can be treated as invariant. This change looks fine to me.

So from my understanding there are two types of runtime lookups:
Caller-side
Callee-side

I am not sure what you mean. There is no distinction like this that I am aware of.

Copy link
Member Author

@EgorBo EgorBo Feb 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what you mean. There is no distinction like this that I am aware of.

Yes, please, ignore that part. There are just 4 types of lookups:

  1. Helper call + nullcheck + INDs (fast path)
  2. Helper call + nullcheck + sizecheck + INDs (fast path)
  3. nullcheck + INDs (fast path)
  4. INDs (fast path)
    where INDs is [0..n] nested indirect loads.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nullcheck + INDs (fast path)

Nit: These do not exist. The null check needs a helper call slow path.

@EgorBo
Copy link
Member Author

EgorBo commented Feb 13, 2024

@dotnet/jit-contrib PTAL, ready-for-review. Diffs - a small number of regressions from marking GT_RUNTIMELOOKUP as invariant + CSE.

Hopefully, it setups a handy place for other VN-based foldings.

@EgorBo EgorBo requested a review from AndyAyersMS February 13, 2024 20:53
src/coreclr/jit/gentree.cpp Outdated Show resolved Hide resolved
// The original tree wrapped with a COMMA node that contains the side effects
// or just the tree itself if sideEffectSource has no side effects.
//
GenTree* Compiler::gtWrapWithSideEffects(GenTree* tree,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like commonly tree is a subtree of sideEffectsSource.

If tree itself has side effects, do those get handled properly?

Copy link
Member Author

@EgorBo EgorBo Feb 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good point, added a comment. Technically, it should be either a sideffect-free subtree of sideEffectsSource or pretty much any node with any sideffects if it's not a subtree. In two uses of this API where tree is a subnode of sideEffectsSource we do check it being side-effect free at the callsite. It's probably possible to be smarter here and extract the effects properly for any case, but not now. Setup-arg stuff in args makes it a bit more complicated 🙁

src/coreclr/jit/assertionprop.cpp Outdated Show resolved Hide resolved
Co-authored-by: Andy Ayers <andya@microsoft.com>
@EgorBo EgorBo force-pushed the opt-cast-cast-lookup branch from 36cb35a to 395c103 Compare February 14, 2024 11:12
@EgorBo
Copy link
Member Author

EgorBo commented Feb 14, 2024

@AndyAyersMS I've addressed your concerns and clarified the runtime lookup issue, this is ready for another look 🙂

@EgorBo EgorBo merged commit 0272fcc into dotnet:main Feb 15, 2024
129 checks passed
@EgorBo EgorBo deleted the opt-cast-cast-lookup branch February 15, 2024 19:55
@github-actions github-actions bot locked and limited conversation to collaborators Mar 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants