Fix missing concurrent continuation scanning issue #16614
ad77831
to
88c4106
Compare
runtime/oti/j9consts.h
Outdated
@@ -492,8 +492,10 @@ extern "C" {
#define J9_GC_MARK_MAP_LOG_SIZEOF_UDATA 0x5
#define J9_GC_MARK_MAP_UDATA_MASK 0x1F
#endif /* J9VM_ENV_DATA64 */
#define J9_GC_CONTINUATION_STATE_INITIAL 0
#define J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN 0x1
#define J9_GC_CONTINUATION_STATE_CONCURRENT_NONE 0
_CONCURRENT_SCAN_NONE
88c4106
to
1cb453d
Compare
runtime/oti/VMHelpers.hpp
Outdated
uintptr_t state;
uintptr_t retState;
do {
state = continuation->state & (J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_ALL & (~checkConcurrentState));
Let's introduce a small helper to make the code easier to understand. It could be used here twice, and in the 'exit' method, too.
/**
 * extract concurrent state of local GC, if this GC is global, and vice-versa
 * extract concurrent state of global GC, if this GC is local
 */
UDATA complementGCConcurrentState(UDATA continuationState, UDATA thisGCConcurrentState) {
	return continuationState & (J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_ALL & ~thisGCConcurrentState);
}
1cb453d
to
592bdd8
Compare
@@ -67,7 +67,8 @@ class GC_VMThreadStackSlotIterator
J9MODRON_OSLOTITERATOR *oSlotIterator,
bool includeStackFrameClassReferences,
bool trackVisibleFrameDepth,
bool syncWithContinuationMounting = false);
bool syncWithContinuationMounting = false,
small naming cleanup: let's rename this to isConcurrentGC (here and anywhere else this flag is used)
then both isConcurrentGC and isGlobalGC represent the facts as known by caller, without assuming how the callee will use them
@@ -802,10 +802,10 @@ MM_GlobalMarkingScheme::scanContinuationNativeSlots(MM_EnvironmentVLHGC *env, J9
stackFrameClassWalkNeeded = isDynamicClassUnloadingEnabled();
#endif /* J9VM_GC_DYNAMIC_CLASS_UNLOADING */

/* In STW GC there are no racing carrier threads doing mount and no need for the synchronization. */
bool syncWithContinuationMounting = (MM_VLHGCIncrementStats::mark_concurrent == static_cast<MM_CycleStateVLHGC*>(env->_cycleState)->_vlhgcIncrementStats._globalMarkIncrementType);
as mentioned earlier, let's rename this to isConcurrentGC
@@ -792,7 +792,7 @@ void
MM_GlobalMarkingScheme::scanContinuationNativeSlots(MM_EnvironmentVLHGC *env, J9Object *objectPtr, ScanReason reason)
{
J9VMThread *currentThread = (J9VMThread *)env->getLanguageVMThread();
if (MM_GCExtensions::needScanStacksForContinuationObject(currentThread, objectPtr)) {
if (MM_GCExtensions::needScanStacksForContinuationObject(currentThread, objectPtr, true)) {
let's introduce a line
const bool isGlobalGC = true;
and instead of passing 'true' pass isGlobalGC at the 2 spots
that will help with understanding what these flags we pass are about
update it for all other methods that use GC_VMThreadStackSlotIterator::scanSlots()
let's fix other callers of GC_VMThreadStackSlotIterator::scanSlots() to not rely on default isGlobalGC = true, where really that's not the case (CopyForward for example).
runtime/oti/VMHelpers.hpp
Outdated
uintptr_t concurrentState = J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_LOCAL;
if (isGlobalGC) {
concurrentState = J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_GLOBAL;
}
there is an extra empty space of indentation in 3 lines of the 'if' block
runtime/oti/VMHelpers.hpp
Outdated
static VMINLINE bool
isConcurrentlyScannedFromContinuationState(uintptr_t continuationState, bool isGlobalGC)
{
uintptr_t concurrentState = J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_LOCAL;
it's really a mask rather than a state, so rename it to concurrentGCMask
runtime/oti/j9consts.h
Outdated
#define J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_NONE 0
#define J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_LOCAL 0x1
#define J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_GLOBAL 0x2
#define J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_ALL (J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_LOCAL | J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_GLOBAL)
I prefer either _ANY or _MASK, rather than _ALL
runtime/oti/VMHelpers.hpp
Outdated
uintptr_t state;
uintptr_t retState;
do {
state = continuation->state & getGCConcurrentState(!isGlobalGC);
let's give this state a more specific name (complementGCConcurrentState?) and provide a comment above the line:
/* preserve the concurrent GC state for the other type of GC */
runtime/oti/VMHelpers.hpp
Outdated
uintptr_t retState;
do {
state = continuation->state & getGCConcurrentState(!isGlobalGC);
retState = VM_AtomicSupport::lockCompareExchange(&continuation->state, state, state | getGCConcurrentState(isGlobalGC));
rename to returnedState
runtime/oti/VMHelpers.hpp
Outdated
do {
state = continuation->state & getGCConcurrentState(!isGlobalGC);
retState = VM_AtomicSupport::lockCompareExchange(&continuation->state, state, state | getGCConcurrentState(isGlobalGC));
} while (state != (retState & state));
a comment above this line would be good:
if the other GC happened to change its concurrentGC state since us taking a snapshot of their state, we'll have to retry
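The retry discussed in this thread can be illustrated with a minimal, self-contained sketch. It is not the code under review: it uses `std::atomic` instead of `VM_AtomicSupport::lockCompareExchange`, and the names `kScanLocal`, `kScanGlobal`, `concurrentGCMask` and `tryWinScan` are hypothetical stand-ins. It assumes the state word holds the two scan bits in its low bits and a carrier thread ID in the remaining bits.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the *_CONCURRENT_SCAN_LOCAL / _GLOBAL constants.
constexpr std::uintptr_t kScanLocal = 0x1;
constexpr std::uintptr_t kScanGlobal = 0x2;

inline std::uintptr_t concurrentGCMask(bool isGlobalGC)
{
	return isGlobalGC ? kScanGlobal : kScanLocal;
}

// Returns true if this GC type claimed the continuation for scanning.
inline bool tryWinScan(std::atomic<std::uintptr_t> &state, bool isGlobalGC)
{
	const std::uintptr_t ownMask = concurrentGCMask(isGlobalGC);
	const std::uintptr_t otherMask = concurrentGCMask(!isGlobalGC);
	assert(0 == (ownMask & otherMask));

	std::uintptr_t observed = state.load();
	for (;;) {
		// Anything besides the other GC's bit means we lose: either a carrier
		// thread is recorded, or our own GC type is already scanning.
		if (0 != (observed & ~otherMask)) {
			return false;
		}
		// Preserve the other GC's bit and add ours.
		if (state.compare_exchange_weak(observed, observed | ownMask)) {
			return true;
		}
		// The exchange failed and 'observed' now holds the fresh value. If only
		// the other GC's bit changed, the check above passes and we retry.
	}
}
```

In this sketch a change in the other GC's bit never makes the caller lose; only a recorded carrier thread or an already set own bit does.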
runtime/oti/VMHelpers.hpp
Outdated
{
/* clear CONCURRENTSCANNING flag bit0:LocalConcurrentScanning /bit1:GlobalConcurrentScanning */
uintptr_t oldContinuationState = VM_AtomicSupport::bitAnd(&continuation->state, ~getGCConcurrentState(isGlobalGC));
if (!(oldContinuationState & getGCConcurrentState(!isGlobalGC))) {
store this into complementGCConcurrentState
2fbc065
to
1d3bbe0
Compare
runtime/oti/VMHelpers.hpp
Outdated
if (isGlobalGC) {
return J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_GLOBAL);
} else {
return J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_LOCAL);
spaces are used for indentation, while elsewhere tabs are used
@@ -60,14 +60,16 @@ MM_HeapWalkerDelegate::doContinuationNativeSlots(MM_EnvironmentBase *env, omrobj
{
J9VMThread *currentThread = (J9VMThread *)env->getLanguageVMThread();

if (MM_GCExtensions::needScanStacksForContinuationObject(currentThread, objectPtr)) {
const bool isGlobalGC = false;
This is a weird one, since the heap walker is not a GC to start with.
Either false or true is probably good, but since it's actually (so far) only instantiated by ParallelGlobalGC and used for a whole heap walk, I'd actually mark it as global.
If in the future it's used from different GCs, then we would have to resolve this at run-time, similar to how we do it for WriteOnceCompactor.
seems like you forgot to flip this to true
this has not been fixed yet
GC_VMThreadStackSlotIterator::scanSlots(currentThread, objectPtr, (void *)&localData, stackSlotIteratorForRealtimeGC, stackFrameClassWalkNeeded, false, syncWithContinuationMounting);
bool isConcurrentGC = _realtimeGC->isCollectorConcurrentTracing();
GC_VMThreadStackSlotIterator::scanSlots(currentThread, objectPtr, (void *)&localData, stackSlotIteratorForRealtimeGC, stackFrameClassWalkNeeded, false, isConcurrentGC, isGlobalGC);
sometimes you have an empty line between setting the concurrent flag and this line, and sometimes not
let's be consistent, and have an empty line (here and check other spots, too)
runtime/oti/VMHelpers.hpp
Outdated
return isContinuationMounted(continuation) || isConcurrentlyScannedFromContinuationState(continuation->state, isGlobalGC);
}

static VMINLINE uintptr_t getGCConcurrentState(bool isGlobalGC) {
you could rename it to getConcurrentGCMask, but I'll leave it up to you, if you like it
either way, you could use it at one more spot (where I got a motivation):
isConcurrentlyScannedFromContinuationState(uintptr_t continuationState, bool isGlobalGC)
returnedState = VM_AtomicSupport::lockCompareExchange(&continuation->state, complementGCConcurrentState, complementGCConcurrentState | getGCConcurrentState(isGlobalGC));
/* if the other GC happened to change its concurrentGC state since us taking a snapshot of their state, we'll have to retry */
} while (complementGCConcurrentState != (returnedState & complementGCConcurrentState));
|
add a comment:
if returned state does not contain carrier ID, return that we won
runtime/oti/VMHelpers.hpp
Outdated
J9VMThread *carrierThread = getCarrierThreadFromContinuationState(oldContinuationState);
if (NULL != carrierThread) {
omrthread_monitor_enter(carrierThread->publicFlagsMutex);
/* notify the waiting carrierThread that we just finished scanning, and it can proceed with mounting. */
expand this comment:
notify the waiting carrierThread that we just finished scanning and we were the only/last GC to scan it, so that it can proceed with mounting
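The "only/last GC to scan" condition in this comment can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's code: `exitScan`, `kScanLocal`, `kScanGlobal` and `kScanAny` are hypothetical names, `std::atomic::fetch_and` stands in for `VM_AtomicSupport::bitAnd`, and instead of entering a monitor and notifying, the function just returns the carrier bits the caller would need to notify (0 when there is nothing to do).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the two scan bits and their union.
constexpr std::uintptr_t kScanLocal = 0x1;
constexpr std::uintptr_t kScanGlobal = 0x2;
constexpr std::uintptr_t kScanAny = kScanLocal | kScanGlobal;

// Clears this GC type's scan bit. Returns the carrier thread bits to notify,
// or 0 if the other GC type is still scanning or no carrier thread is waiting.
inline std::uintptr_t exitScan(std::atomic<std::uintptr_t> &state, bool isGlobalGC)
{
	const std::uintptr_t ownMask = isGlobalGC ? kScanGlobal : kScanLocal;
	const std::uintptr_t otherMask = isGlobalGC ? kScanLocal : kScanGlobal;

	// fetch_and returns the value held before our bit was cleared.
	const std::uintptr_t oldState = state.fetch_and(~ownMask);
	// The caller is expected to have won the scan earlier.
	assert(0 != (oldState & ownMask));

	if (0 != (oldState & otherMask)) {
		// The other GC type is still scanning; it will notify when it exits.
		return 0;
	}
	// We were the only or last scanner: hand back the carrier bits, if any.
	return oldState & ~kScanAny;
}
```

With both bits set and a waiting carrier recorded, the first GC to exit gets 0 back and the second gets the carrier bits, so the mounting thread is woken exactly once.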
runtime/oti/VMHelpers.hpp
Outdated
{
return J9_GC_CONTINUATION_STATE_INITIAL == VM_AtomicSupport::lockCompareExchange(&continuation->state, J9_GC_CONTINUATION_STATE_INITIAL, J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN);
uintptr_t complementGCConcurrentState;
uintptr_t returnedState;
to comply with coding standards let's initialize these 2 with _NONE
runtime/oti/VMHelpers.hpp
Outdated
if (syncWithContinuationMounting && (NULL != continuation)) {
if (!tryWinningConcurrentGCScan(continuation)) {
if (isConcurrentGC && (NULL != continuation)) {
if (!tryWinningConcurrentGCScan(continuation, isGlobalGC)) {
/* If continuation is mounted or already being scanned by another GC thread, we do nothing */
adjust this comment:
if continuation is mounted or already being scanned by another GC thread of the same GC type, we do nothing
1d3bbe0
to
1f0aba4
Compare
StackIteratorData4Scavenge localData;
localData.scavengerDelegate = this;
localData.env = env;
localData.reason = reason;
localData.shouldRemember = &shouldRemember;
/* In STW GC there are no racing carrier threads doing mount and no need for the synchronization. */
bool syncWithContinuationMounting = _extensions->isConcurrentScavengerInProgress();
bool isConcurrentGC = _extensions->isConcurrentScavengerInProgress();
extra space (or indentation) between these 2: bool isConcurrentGC
runtime/oti/VMHelpers.hpp
Outdated
@@ -2053,13 +2053,29 @@ class VM_VMHelpers
static VMINLINE J9VMThread *
getCarrierThreadFromContinuationState(uintptr_t continuationState)
move this method below next to isContinuationMounted - they are related/similar
runtime/oti/VMHelpers.hpp
Outdated
return isContinuationMounted(continuation) || isConcurrentlyScannedFromContinuationState(continuation->state, isGlobalGC);
}

static VMINLINE uintptr_t getConcurrentGCMask(bool isGlobalGC) {
move this method up before isConcurrentlyScannedFromContinuationState that uses it
1f0aba4
to
8f4a858
Compare
runtime/oti/VMHelpers.hpp
Outdated
getCarrierThreadFromContinuationState(uintptr_t continuationState)
{
return (J9VMThread *)(continuationState & (~(uintptr_t)J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN));
static VMINLINE uintptr_t getConcurrentGCMask(bool isGlobalGC) {
move open curly bracket into the next line
a52672e
to
e32bd04
Compare
runtime/oti/VMHelpers.hpp
Outdated
static VMINLINE bool
isConcurrentlyScannedFromContinuationState(uintptr_t continuationState)
{
return J9_ARE_ANY_BITS_SET(continuationState, J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN);
return J9_ARE_ANY_BITS_SET(continuationState, J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_ANY);
|
remove empty line
runtime/oti/VMHelpers.hpp
Outdated
{
uintptr_t concurrentGCMask = getConcurrentGCMask(isGlobalGC);
return J9_ARE_ANY_BITS_SET(continuationState, concurrentGCMask);
|
remove empty line
@tajila pretty much all GC code, but formally a couple of files are VM, so if you want to have a quick look...
Changes look good, but I think the |
agree, it's been suggested already that all/most of continuation specific code moves out of this generic VMHelpers, and has been attempted, but apparently there were some compile dependencies that were not trivial to untangle
e32bd04
to
95a0ff2
Compare
StackIteratorData4HeapWalker localData;
localData.heapWalker = _heapWalker;
localData.env = env;
localData.fromObject = objectPtr;
localData.function = function;
localData.userData = userData;
/* so far there is no case we need ClassWalk for heapwalker, so we set stackFrameClassWalkNeeded = false */
GC_VMThreadStackSlotIterator::scanSlots(currentThread, objectPtr, (void *)&localData, stackSlotIteratorForHeapWalker, false, false);
const bool isConcurrentGC = true;
this is not correct, it should be isGlobalGC that should be set to true
95a0ff2
to
13ebc10
Compare
Jenkins test sanity aix jdk19
Jenkins compile win jdk11
Jenkins test sanity plinux jdk19
Jenkins test sanity aix jdk19 |
The issue is caused by concurrent scavenger scanning and concurrent
marking scanning overlapping for the same continuation object. During
concurrent continuation scanning, the current synchronization control
blocks continuation mounting and ignores any other concurrent scan.
But concurrent scavenger scanning and concurrent marking are independent
of each other, so ignoring one of them can cause a missed scan, after
which the related "live" object is recycled.
- Updated J9VMContinuation->state to use the two low bits for recording
  concurrent scanning (bit0 for concurrentScanningLocal and bit1 for
  concurrentScanningGlobal) instead of only one low bit.
- Pass the flags isConcurrentGC and isGlobalGC to
  GC_VMThreadStackSlotIterator::scanSlots().
- Handle J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_LOCAL and
  J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_GLOBAL independently.
- Only once both the J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_LOCAL and
  J9_GC_CONTINUATION_STATE_CONCURRENT_SCAN_GLOBAL bits have been cleared
  can we notify the blocked continuation mounting thread.
fix:#16591
Signed-off-by: Lin Hu <linhu@ca.ibm.com>
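
The state layout described in the commit message can be pictured with a small sketch. It is illustrative only: `encodeState`, `decodeState` and `DecodedState` are hypothetical names, and it assumes (as the masking in `getCarrierThreadFromContinuationState` suggests) that the carrier thread ID is aligned so that its two low bits are free to hold the flags.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins: bit0 records a local (scavenger) concurrent scan,
// bit1 records a global (marking) concurrent scan.
constexpr std::uintptr_t kScanLocal = 0x1;
constexpr std::uintptr_t kScanGlobal = 0x2;
constexpr std::uintptr_t kScanAny = kScanLocal | kScanGlobal;

struct DecodedState {
	std::uintptr_t carrier; // carrier thread bits, 0 if none is recorded
	bool localScan;         // bit0
	bool globalScan;        // bit1
};

// Packs a carrier ID and the two scan flags into one state word.
inline std::uintptr_t encodeState(std::uintptr_t carrier, bool localScan, bool globalScan)
{
	// The carrier ID must leave the two low bits free for the flags.
	assert(0 == (carrier & kScanAny));
	return carrier | (localScan ? kScanLocal : 0) | (globalScan ? kScanGlobal : 0);
}

// Splits a state word back into its carrier ID and scan flags.
inline DecodedState decodeState(std::uintptr_t state)
{
	return DecodedState{state & ~kScanAny, 0 != (state & kScanLocal), 0 != (state & kScanGlobal)};
}
```

Because each GC type owns its own bit, setting or clearing one never disturbs the other, which is what lets the two scans be tracked independently.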