JIT: Keep delegate object alive during invoke #105099

EgorBo · 2024-07-18T16:54:10Z

A quick fix for #105082: we just insert a KeepAlive node for the delegate object. diffWithGC for the snippet in the issue:

dotnet-policy-service · 2024-07-18T16:54:36Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

jkotas · 2024-07-18T19:28:10Z

src/coreclr/jit/lower.cpp

+    // Keep delegate alive
+    GenTree* baseClone     = comp->gtCloneExpr(base);
+    GenTree* keepBaseAlive = comp->gtNewKeepAliveNode(baseClone);
+    BlockRange().InsertAfter(call, baseClone, keepBaseAlive);


Is this fix going to introduce CQ regressions? Ideally, the fix for this issue would be zero codegen diff on x86/x64.

We need to keep delegate alive only until we make the call. The delegate does not need to be alive after call.

For my knowledge, will all the various stubs that we may go through on the way to the final target automatically keep the right loader allocator without the JIT at minimum putting the delegate instance in a callee-saved register?

I think so. There are really only 3:

Shuffle thunks: hand-emitted assembly so the GC cannot kick in while they are executing

Multicast stubs: they keep the delegate thisptr alive. (Adding comment that this is important may be nice.)

Marshalled function pointers: The delegate points to itself, so calling the method on the delegate will keep it alive.

It is possible that I may be missing a corner case somewhere, but I do not expect it to be hard to fix.

jakobbotsch · 2024-07-18T20:32:15Z

src/coreclr/jit/lower.cpp

+    {
+        GenTree* baseClone     = comp->gtCloneExpr(base);
+        GenTree* keepBaseAlive = comp->gtNewKeepAliveNode(baseClone);
+        BlockRange().InsertBefore(call, baseClone, keepBaseAlive);


This InsertBefore only works due to implementation details of the backend (lazy liveness updates for GC information about registers), which is quite subtle. But of course the fully ideal solution requires adding a new operand to GenTreeCall and special casing it throughout the backend as an invisible use, which is probably a bit overkill.

I wonder whether, in the uncontained case, we wouldn't be better off just inserting GT_START_NONGC and creating a nogc region on the single call instruction. Given that we already have a call and hence a GC safe point, this does not seem like it would be very harmful; in a less conservative JIT this basic block might have simply always been partially interruptible for that reason.

hm.. interesting idea! applied

The new diff for the snippet in the issue: https://www.diffchecker.com/CiuxVSBL/

I think you will also need to add a STOP_NOGC to turn it off again. Looks like we don't have that today.

> I think you will also need to add a STOP_NOGC to turn it off again. Looks like we don't have that today.

I guess it's not from a correctness standpoint, just to reduce the nongc area if we have a lot more statements in the current block?

I guess this one is tricky - we might end whatever was expected to be nongc even before we emitted the start_nogc

It's counted, so it doesn't turn off nogc until the count gets to zero. But regardless I wouldn't expect there to ever be another active nogc region in this case

> I guess it's not from a correctness standpoint, just to reduce the nongc area if we have a lot more statements in the current block?

I'm not really sure how the VM behaves if the return address is in a nogc region – but in any case it's unnecessary. I think we only need it to be active from the load of the target address (which is the last use) to the call – should be just one instruction in the end
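To illustrate the "it's counted" remark above, here is a minimal sketch of a counted no-GC region. Every name in it (`NoGcCounter`, `Start`, `Stop`, `IsInterruptible`) is hypothetical and chosen for illustration; it is not the emitter's actual data structure, only a model of the behaviour described in the comment: a stop does not re-enable GC until the nesting count returns to zero.

```cpp
#include <cassert>

// Toy model of a counted no-GC region. All names are hypothetical;
// this is not how the JIT emitter is actually implemented.
class NoGcCounter
{
    unsigned m_depth = 0; // number of currently active (nested) no-GC regions

public:
    // Models a START_NONGC marker: enter, or nest into, a no-GC region.
    void Start()
    {
        ++m_depth;
    }

    // Models a STOP_NONGC marker: leave one nesting level.
    void Stop()
    {
        assert(m_depth > 0 && "STOP without a matching START");
        --m_depth;
    }

    // GC may interrupt only when no region is active at all.
    bool IsInterruptible() const
    {
        return m_depth == 0;
    }
};
```

Under this model, a stop emitted for the delegate call cannot prematurely end an enclosing region, which is the concern raised above: the enclosing region's own start still holds the count above zero.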

jakobbotsch · 2024-07-19T10:07:56Z

src/coreclr/jit/lower.cpp

+#if !defined(TARGET_XARCH)
+    if (comp->GetInterruptible())
+    {
+        // If the target's backend doesn't support indirect calls with immediate operands (contained)
+        // and the method is marked as interruptible, we need to insert a GT_START_NONGC before the call.
+        // to keep the delegate object alive while we're obtaining the function pointer.
+        GenTree* startNonGCNode = new (comp, GT_START_NONGC) GenTree(GT_START_NONGC, TYP_VOID);
+        BlockRange().InsertAfter(thisArgNode, startNonGCNode);
+
+        // We should try to end the non-GC region just before the actual call.
+        *shouldEndNogcBeforeCall = true;
+    }
+#endif
+


I think it would be cleaner to insert both START and STOP at the end of LowerCall instead of doing half in here and half in LowerCall.

Also, is inserting the stop node before the call node really correct?

> Also, is inserting the stop node before the call node really correct?

Why not?

The GC is illegal when we are on the blr instruction -- it is not illegal when we are on the ldr instruction. https://www.diffchecker.com/NwHIxdIA/ looks like it makes the GC illegal on the ldr instruction but not on the blr instruction.

Can GC happen on a call instruction before we have entered the call?
If I need to extend the nogc past the call then it's easier for my PR; I just thought that I should not add a call to a nogc block (because GC may happen inside the call).

> can GC happen on call instructions before we entered the call?

Yes, I think that's the problem here. If GC happens on that call instruction nothing indicates the delegate is live and thus the target we are going to jump to may be collected.

> I just thought that I should not add a call to nogc block (because GC may happen inside the call)

Hmm, I'm not totally sure what happens, but the safepoint should logically be on the return address, not on the call, so I think we should be fine.

I was just wondering if we have any debug mechanism that checks that all calls inside nogc blocks can't trigger GC even internally, but I presume we don't; otherwise we'd have caught the recent bug with BULKBARRIER earlier.
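The window discussed in this thread can be pictured with a small model. This is only a sketch under stated assumptions: `Instr` and `FindUnsafeWindow` are hypothetical names, the instruction strings are schematic rather than real disassembly, and the `interruptible` and `delegateLive` flags are supplied by hand rather than derived from real GC info.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model, not JIT code. Each entry describes the state when the thread
// is stopped at that instruction (before executing it).
struct Instr
{
    std::string text;
    bool        interruptible; // a GC may suspend the thread at this instruction
    bool        delegateLive;  // GC info reports the delegate as live here
};

// Returns the index of the first instruction, up to and including the
// indirect call, where a GC could run while the delegate is unreported.
// Returns -1 if there is no such window.
int FindUnsafeWindow(const std::vector<Instr>& code, std::size_t callIndex)
{
    assert(callIndex < code.size());
    for (std::size_t i = 0; i <= callIndex; i++)
    {
        if (code[i].interruptible && !code[i].delegateLive)
        {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Take a two-instruction sequence: a load of the target from the delegate (its last use) followed by the indirect call. If the no-GC region covers only the load, the call is still interruptible with the delegate unreported, so the function returns the call's index. If the region covers the call instead, it returns -1. That matches the point above that GC has to be prevented on the `blr`, not on the `ldr`.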

jakobbotsch · 2024-07-19T10:23:49Z

src/coreclr/jit/lower.cpp

+#if !defined(TARGET_XARCH)
+    if (comp->GetInterruptible())
+    {
+        // If the target's backend doesn't support indirect calls with immediate operands (contained)
+        // and the method is marked as interruptible, we need to insert a GT_START_NONGC before the call.
+        // to keep the delegate object alive while we're obtaining the function pointer.
+        GenTree* startNonGCNode = new (comp, GT_START_NONGC) GenTree(GT_START_NONGC, TYP_VOID);
+        GenTree* stopNonGCNode  = new (comp, GT_STOP_NONGC) GenTree(GT_STOP_NONGC, TYP_VOID);
+        BlockRange().InsertAfter(thisArgNode, startNonGCNode);
+        BlockRange().InsertAfter(call, stopNonGCNode);
+    }
+#endif


I would still do this inside LowerCall since it's LowerCall's responsibility to insert the control expression returned by this function. This is adding an assumption about where LowerCall will insert that node.

> I would still do this inside LowerCall since it's LowerCall responsibility to insert the control expression returned by this function. This is adding an assumption about where LowerCall will insert that node.

but that's the point, no? LowerDelegateInvoke already inserts some trees and at the end of those we conservatively emit a START_NOGC so other functions like LowerCall itself, LowerCFGCall or LowerTailCallViaJitHelper can insert whatever they want after our NOGC and before the call. Because, presumably, they can extend the dangerous area where GC can kick in and collect our delegate

Hmm yeah, that's a good point too. What I dislike here is just that it still relies on LowerCall to insert the control expression after the START_NONGC. To be conservatively correct we have to insert the START_NOGC before the last use of the delegate, since emitDisableGC() means something like "the next instruction I emit is going to make GC information wrong, so disable GC after that instruction". Hence inserting START_NONGC after thisArgNode is only correct because there ends up being another use inserted after that point by LowerCall. The main idea behind moving the logic into LowerCall is that it has the full overview of what nodes it is going to insert and where, so it can make the decision with more information. So for instance it could insert it before the gtControlExpr in the delegate case, knowing that gtControlExpr has the last use of the delegate instance in the common case.
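The placement rule in the comment above can be sketched with a toy emitter. It models only the semantics as paraphrased there ("disable GC after the next instruction"); `ToyEmitter` and its members are hypothetical names, and the real `emitDisableGC()` may differ in detail.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One emitted instruction plus whether a GC may interrupt the thread at it.
struct EmittedInstr
{
    std::string text;
    bool        interruptible;
};

// Toy emitter, assuming DisableGC() takes effect only AFTER the next
// emitted instruction, as the review comment describes.
class ToyEmitter
{
    std::vector<EmittedInstr> m_code;
    bool m_noGc        = false; // currently inside a no-GC region
    bool m_pendingNoGc = false; // DisableGC requested, not yet in effect

public:
    void DisableGC()
    {
        if (!m_noGc)
        {
            m_pendingNoGc = true;
        }
    }

    void EnableGC()
    {
        m_noGc        = false;
        m_pendingNoGc = false;
    }

    void Emit(const std::string& text)
    {
        assert(!text.empty());
        // The instruction is interruptible unless a region is already active.
        m_code.push_back({text, !m_noGc});
        // A pending request becomes active once this instruction is emitted.
        if (m_pendingNoGc)
        {
            m_noGc        = true;
            m_pendingNoGc = false;
        }
    }

    const std::vector<EmittedInstr>& Code() const
    {
        return m_code;
    }
};
```

In this model, calling `DisableGC()` before emitting the load of the target (the last use of the delegate) leaves the load interruptible and the following call protected. Calling it after the load leaves the call itself interruptible, which is why the start marker has to precede the last use.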


dotnet-policy-service · 2024-09-01T01:56:42Z

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

EgorBo added 2 commits July 18, 2024 16:26

Disable Comparer_get_Default test on win-arm64-crossgen

4a90d51

Keep 'this' alive for delegate invoke

76c6f4d

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 18, 2024

dotnet-policy-service bot assigned EgorBo Jul 18, 2024

EgorBo changed the title ~~Keep this alive fptr~~ JIT: Keep delegate object alive during invoke Jul 18, 2024

EgorBo added 2 commits July 18, 2024 18:54

Update issues.targets

49b10a6

Update lower.cpp

8bf8365

This was referenced Jul 18, 2024

Build failure: Static graph-based restore failed with exit code .* but did not log an error. #103526

Open

Build failure: Static graph-based restore failed with exit code .* but did not log an error. dotnet/dnceng#3139

Closed

jkotas reviewed Jul 18, 2024


build-analysis bot mentioned this pull request Jul 18, 2024

LibraryImportGenerator.Unit.Tests crashing on linux-x64 mono interpreter #100800

Open

EgorBo added 2 commits July 18, 2024 22:07

less conservative version

96543b3

fix define

5db07f8

jakobbotsch reviewed Jul 18, 2024


EgorBo added 2 commits July 19, 2024 01:15

Apply Jakob's suggestion

118c8ad

GT_STOP_NOGC

ffe5317

This was referenced Jul 19, 2024

'System.Net.NameResolution.Functional.Tests' Failure #105092

Closed

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

jkotas mentioned this pull request Jul 19, 2024

Delete dead code related to intercept stubs #105127

Merged

EgorBo added 2 commits July 19, 2024 09:05

Merge branch 'main' into keep-this-alive-fptr

5558203

Update lower.cpp

bfc872f

This was referenced Jul 19, 2024

System.IO.Net5Compat.Tests and System.IO.Tests suddenly exiting with error 137 #100558

Open

SIGKILL (OOM?) while running LibraryImportGenerator.Tests w/o actionable log messages or artifacts dotnet/dnceng#2496

Open

jakobbotsch reviewed Jul 19, 2024


EgorBo added 2 commits July 19, 2024 12:20

Address feedback

300ab49

Address feedback

26da6ab

jakobbotsch reviewed Jul 19, 2024


This was referenced Jul 19, 2024

[mono][interpreter] Mono interpreter is crashing during System.Data.Odbc.Tests (linux-x64 Release Mono_Interpreter_LibrariesTests) #101370

Open

Assert failure: executionAborted in GcInfoDecoder::EnumerateLiveSlots #102370

Closed

build-analysis bot mentioned this pull request Jul 19, 2024

nativeaot\\SmokeTests\\Reflection\\Reflection\\Reflection.cmd fails with Access Violation #105136

Closed

EgorBo added 2 commits August 1, 2024 03:21

Merge branch 'main' of https://github.com/dotnet/runtime into keep-this-alive-fptr

f95c267

handle tail calls

b35c892

This was referenced Aug 1, 2024

msbuild crashes with "MSB0001: Internal MSBuild Error: must be valid" dotnet/dnceng#3304

Open

MSBuild crashing in the build #92290

Open

Merge branch 'main' into keep-this-alive-fptr

255c5bd

This was referenced Aug 1, 2024

Assertion failed (pThread->m_StateNC & Thread::TSNC_OwnsSpinLock) == 0 #104449

Closed

System.Linq.Expressions.Tests.WorkItemExecution fails with access violation #105704

Closed

EgorBo added 5 commits August 1, 2024 21:32

remove bogus assert

ffdc8e5

Merge branch 'keep-this-alive-fptr' of github.com:EgorBo/runtime-1 into keep-this-alive-fptr

9e34a9a

Update lower.cpp

1dac215

Update lower.cpp

7cb1d65

Update lower.cpp

f0964fd

This was referenced Aug 2, 2024

Nullref in FindImplSlotForCurrentType #105304

Closed

[NativeAOT-iOS] MacCatalyst fails to start/log nativeaot.SmokeTests tests #105804

Open

Update lower.cpp

126fca9

build-analysis bot mentioned this pull request Aug 2, 2024

Sigsegv in dotnet/msbuild when building repo #101049

Closed

dotnet-policy-service bot closed this Sep 1, 2024

github-actions bot locked and limited conversation to collaborators Oct 2, 2024
