-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable computation of topStack on Windows platforms #54986
Conversation
- And disable features which depend on it when it is no longer calculated (stack_limit and conservative GC) - Move logic which has been uncoditionally enabled if FEATURE_CONSERVATIVE_GC is enabled to be enabled unconditionally, as that is the current tested scenario. - When disabling stack_limit, keep the size/layout of the ScanContext the same, to preserve abi compat On Windows platforms when we hijack, we use SuspendThread from outside the thread, and if we suspend within an interruptible region, we can stop with the top Frame that is an InlinedCallFrame that is not active. In that situation, it is possible for the actual top of the stack to be a managed function which has a lower SP pointer than is reported from the InlinedCallFrame, or in the case of Windows x86 platforms, the InlinedCallFrame may report a GetCallSiteSP of 0, or of an value associated with some other callsite within the managed function that set up the InlinedCallFrame. This issue could be fixed by capturing the SP after suspension, and storing it on the thread object for use when setting stack_limit, but as conservative GC is not needed on any current Windows platforms, and the Unix scenario is fully functional, this PR change simply disables the feature.
@Maoni0 I made a small tweak to the gc handling of FEATURE_CONSERVATIVE_GC so that disabling that ifdef didn't send us down paths we haven't tested in years. Could you take a quick look when you get a chance? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
#ifdef FEATURE_CONSERVATIVE_GC | ||
if (interior >= heap_segment_allocated(seg)) | ||
return 0; | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
functionally I don't see a problem. I'm curious - could interior be actually >= heap_segment_allocated if GCConfig::GetConservativeGC() is false? this would mean it's reporting an interior pointer that's indeed on a segment but not pointing at a valid object on that seg.
#ifdef FEATURE_SUPPORTS_STACK_LIMIT | ||
uintptr_t stack_limit; // Lowest point on the thread stack that the scanning logic is permitted to read | ||
#else | ||
uintptr_t stack_limitUNUSED; // Keep the shape of ScanContext the same whether or not STACK_LIMIT is supported for use | ||
#endif // FEATURE_SUPPORTS_STACK_LIMIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really need this distinction in this file? GC doesn't care at all about whether this is a stack limit you maintain or not... it would be good to not have this knowledge in the GC interface unless it's mandatory...
#else | ||
#if defined(FEATURE_CONSERVATIVE_GC) | ||
// On Windows platforms when we hijack, we use SuspendThread from outside the thread, and if we suspend | ||
// within an interruptible region, we can stop with the top Frame that is an InlinedCallFrame |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect RedirectedThreadFrame
to be the top Frame when we stop within an interruptible region and start GC walk. Why are the problematic cases missing RedirectedThreadFrame
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have a point. I thought I had managed to reproduce the scenario (or something very like it) in the debugger, but I agree, I can't quite see how we didn't end up with a proper RedirectedThreadFrame. In any case, the best crash dump I could get from the CI system had a disabled InlinedCallFrame as the top of the framestack. (Unfortunately, for some reason, the thread that frame is on had been terminated, so it wasn't completely workable.) I'll try a again with some debugger fun times.
@jkotas Good catch, this really ends up just silencing the assert. The actual problem is that if we have a function that contains both a reverse p/invoke and an InlinedCallFrame, which is called from native, we see a weird state where the top of stack is an InlinedCallFrame which is inactive, and a actual interesting logic needs to begin, somewhere else. I'm experimenting with various approaches to address the problem, but I'm going to close this PR as not fixing the problem. |
On Windows platforms when we hijack, we use SuspendThread from outside the thread, and if we suspend
within an interruptible region, we can stop with the top Frame that is an InlinedCallFrame
that is not active. In that situation, it is possible for the actual top of the stack to be a managed
function which has a lower SP pointer than is reported from the InlinedCallFrame, or in the case of
Windows x86 platforms, the InlinedCallFrame may report a GetCallSiteSP of 0, or of an value associated
with some other callsite within the managed function that set up the InlinedCallFrame. This issue could
be fixed by capturing the SP after suspension, and storing it on the thread object for use when setting
stack_limit, but as conservative GC is not needed on any current Windows platforms, and the Unix scenario
is fully functional, this PR change simply disables the feature.
Fixes #45326