Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optRemoveRedundantZeroInits doesn't know about some safe points #102577

Closed
SingleAccretion opened this issue May 22, 2024 · 9 comments · Fixed by #102580
Closed

optRemoveRedundantZeroInits doesn't know about some safe points #102577

SingleAccretion opened this issue May 22, 2024 · 9 comments · Fixed by #102580
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI bug
Milestone

Comments

@SingleAccretion
Copy link
Contributor

SingleAccretion commented May 22, 2024

Looking at #102531, I realized the logic will miss implicit safe points created by the somewhat recent block store lowering changes.

Reproduction:

object x = null;
byte* p = stackalloc byte[1000];
Random.Shared.NextBytes(new Span<byte>(p, 1000));
Problem(p, p, &x);

[MethodImpl(MethodImplOptions.NoInlining)]
static void Problem(void* pDst, void* pSrc, object* pObj)
{
    Unsafe.CopyBlock(pDst, pSrc, 1000);
    StructWithManyObjs x = default;
    if (pDst != null)
    {
        x.ObjFour = *pObj;
        *(StructWithManyObjs*)pDst = x;
    }
}

struct StructWithManyObjs
{
    public object ObjOne;
    public object ObjTwo;
    public object ObjThree;
    public object ObjFour;
    public object ObjFive;
}

DOTNET_TieredCompilation=0, DOTNET_GCStress=0xC, DOTNET_ReadyToRun=0 (it looks like GC stress doesn't apply to precompiled code).

Expected result: nothing happens.
Actual result:

Assert failure(PID 17360 [0x000043d0], Thread: 14716 [0x397c]): !CREATE_CHECK_STRING(!"Detected use of a corrupted OBJECTREF. Possible GC hole.")

Cause: optRemoveRedundantZeroInits doesn't know that the block copy may become a safe point. Zero-init and future calls introduced late suffer from the same problem.

The most robust fix for this would probably be to move optRemoveRedundantZeroInits later (make it part of late liveness, say).

@SingleAccretion SingleAccretion added bug area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels May 22, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label May 22, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@EgorBo EgorBo self-assigned this May 22, 2024
@EgorBo EgorBo added this to the 9.0.0 milestone May 22, 2024
@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label May 22, 2024
@VSadov
Copy link
Member

VSadov commented May 22, 2024

probably explains #101890

@VSadov
Copy link
Member

VSadov commented May 22, 2024

The most robust fix for this would probably be to move optRemoveRedundantZeroInits later (make it part of late liveness, say).

It might also be nice to not rely on GTF_CALL.

If this optimization is done at a stage when everything that needs to be a call is already a call, it could be more precise about "No-GC" cases.
I suspect we may be missing some optimizable cases when "No-GC" call is an operand/subexpression.

@VSadov
Copy link
Member

VSadov commented May 22, 2024

Cause: optRemoveRedundantZeroInits doesn't know that the block copy may become a safe point. Zero-init and future calls introduced late suffer from the same problem.

Do we have a similar problem with casts that end up calling managed helpers or they are already calls at the time when this optimization runs?

@EgorBo
Copy link
Member

EgorBo commented May 22, 2024

The most robust fix for this would probably be to move optRemoveRedundantZeroInits later (make it part of late liveness, say).

It might also be nice to not rely on GTF_CALL.

If this optimization is done at a stage when everything that needs to be a call is already a call, it could be more precise about "No-GC" cases. I suspect we may be missing some optimizable cases when "No-GC" call is an operand/subexpression.

My understanding that we rely on liveness phase to drop the redundant zeroings so doing this in a late phase won't be that useful

@EgorBo
Copy link
Member

EgorBo commented May 22, 2024

Cause: optRemoveRedundantZeroInits doesn't know that the block copy may become a safe point. Zero-init and future calls introduced late suffer from the same problem.

Do we have a similar problem with casts that end up calling managed helpers or they are already calls at the time when this optimization runs?

I think they're always calls, aren't they? I mean they might be expanded, but they can't be collapsed back to calls

@VSadov
Copy link
Member

VSadov commented May 22, 2024

I think they're always calls, aren't they? I mean they might be expanded, but they can't be collapsed back to calls

I am just not that familiar with how JIT handles them - whether they are always calls or there are cast operators in the tree that later are lowered into calls.
I guess, considering how long we had managed cast helpers, we would know by now if they cause trouble.

@EgorBo
Copy link
Member

EgorBo commented May 22, 2024

I am just not that familiar with how JIT handles them - whether they are always calls or there are cast operators in the tree that later are lowered into calls.

Ah, you meant GT_CAST probably (to cast primitives). Let me check, maybe we might lower it into a call on some 32bit targets or/and in checked context.

@SingleAccretion
Copy link
Contributor Author

Most things end up as either calls or GTF_CALL-marked something (e. g. GT_INTRINSIC) by the time optRemoveRedundantZeroInits runs. Casts are turned into calls in global morph.

The cases that suffer from this issue are only those recent lowering additions where previously always-no-GC nodes get turned into potentially-GC helpers.

@github-actions github-actions bot locked and limited conversation to collaborators Jun 27, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants