-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fold "cast(cast(obj, cls), cls)" to "cast(obj, cls)" #98337
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsExample: static T Test<T>(object o) where T : class
{
return o as T;
} Current (Main) codegen: ; Assembly listing for method Program:Test[System.__Canon](System.Object)
push rbx
sub rsp, 48
mov qword ptr [rsp+0x28], rcx
mov rbx, rcx
mov rcx, qword ptr [rbx+0x38]
mov rcx, qword ptr [rcx]
call CORINFO_HELP_ISINSTANCEOFANY
mov r8, rax
test r8, r8
je SHORT G_M31297_IG05
mov rcx, qword ptr [rbx+0x38]
mov rcx, qword ptr [rcx]
cmp qword ptr [r8], rcx
je SHORT G_M31297_IG05
mov rdx, rax
call CORINFO_HELP_CHKCASTANY
mov r8, rax
G_M31297_IG05:
mov rax, r8
add rsp, 48
pop rbx
ret
; Total bytes of code 65 New (PR) codegen: ; Assembly listing for method Program:Test[System.__Canon](System.Object)
sub rsp, 40
mov qword ptr [rsp+0x20], rcx
mov rcx, qword ptr [rcx+0x38]
mov rcx, qword ptr [rcx]
call CORINFO_HELP_ISINSTANCEOFANY
nop
add rsp, 40
ret
; Total bytes of code 27
|
72d642c
to
4383fba
Compare
@@ -1677,8 +1677,7 @@ GenTree* Compiler::impRuntimeLookupToTree(CORINFO_RESOLVED_TOKEN* pResolvedToken | |||
return slotPtrTree; | |||
} | |||
|
|||
slotPtrTree = gtNewIndir(TYP_I_IMPL, slotPtrTree, GTF_IND_NONFAULTING); | |||
slotPtrTree->gtFlags &= ~GTF_GLOB_REF; // TODO-Bug?: this is a quirk. Can we mark this indirection invariant? | |||
slotPtrTree = gtNewIndir(TYP_I_IMPL, slotPtrTree, GTF_IND_NONFAULTING | GTF_IND_INVARIANT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why it can't be invariant (and removed the GTF_GLOB_REF quirk). However, I most likely will have to unmark it as invariant in my "inlining for shared generics" work for certain cases, but it's irrelevant here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was done for my snippet in the PR's description, otherwise two exactly the same runtime lookups had different conservative VNs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would double-check this with somebody on the runtime team; getting this wrong can have subtle consequences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So from my understanding there are two types of runtime lookups:
- Caller-side
- Callee-side
The caller side is typically not invariant (at least in JIT) and requires a helper call + expansion machinery (checking for null/dictionary resize).
So if a runtime lookup doesn't need a helper-call/null check/size check - it's a fully invariant bunch of ind loads on top of TypeCtx.
It might be a bit more complicated if we implement inlining for shared generics where we're going to have both caller and callee lookups in the same method, but I will keep that in mind once I do that sort of inlining (as part of net9). Currently we never have such cases.
cc @jkotas to confirm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The generic dictionary slots that do not require helper-call/null check/size check can be treated as invariant. This change looks fine to me.
So from my understanding there are two types of runtime lookups:
Caller-side
Callee-side
I am not sure what you mean. There is no distinction like this that I am aware of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what you mean. There is no distinction like this that I am aware of.
Yes, please, ignore that part. There are just 4 types of lookups:
- Helper call + nullcheck + INDs (fast path)
- Helper call + nullcheck + sizecheck + INDs (fast path)
nullcheck + INDs (fast path)- INDs (fast path)
where INDs is [0..n] nested indirect loads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nullcheck + INDs (fast path)
Nit: These do not exist. The null check needs a helper call slow path.
@dotnet/jit-contrib PTAL, ready-for-review. Diffs - a small number of regressions from marking GT_RUNTIMELOOKUP as invariant + CSE. Hopefully, it setups a handy place for other VN-based foldings. |
// The original tree wrapped with a COMMA node that contains the side effects | ||
// or just the tree itself if sideEffectSource has no side effects. | ||
// | ||
GenTree* Compiler::gtWrapWithSideEffects(GenTree* tree, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like commonly tree
is a subtree of sideEffectsSource
.
If tree
itself has side effects, do those get handled properly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good point, added a comment. Technically, it should be either a sideffect-free subtree of sideEffectsSource
or pretty much any node with any sideffects if it's not a subtree. In two uses of this API where tree
is a subnode of sideEffectsSource
we do check it being side-effect free at the callsite. It's probably possible to be smarter here and extract the effects properly for any case, but not now. Setup-arg stuff in args makes it a bit more complicated 🙁
Co-authored-by: Andy Ayers <andya@microsoft.com>
36cb35a
to
395c103
Compare
@AndyAyersMS I've addressed your concerns and clarified the runtime lookup issue, this is ready for another look 🙂 |
Example:
Current (Main) codegen:
New (PR) codegen: