Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow querying if the current GC may move objects. #1128

Merged
merged 7 commits into from
May 16, 2024

Conversation

wks
Copy link
Collaborator

@wks wks commented Apr 25, 2024

The major changes include:

  • Adding a method Plan::current_gc_may_move_object.
  • Immix-based plans now reset the in-defrag state at the end of GC.

The main change of this PR is allowing querying if the current GC may move any object.

Whether a plan moves objects or not is not a simple "yes or no" question. It may vary in each GC.

  • Some plans, such as MarkSweep, never moves any object.
  • Some plans, such as SemiSpace and MarkCompact, moves objects in every GC.
  • Some plans, such as Immix, moves object in defrag GC, while "fast" GCs are non-moving.
  • Some plans, such as Immix and StickyImmix, depends on Cargo features. "immix_non_moving" prevents all movement, and "sticky_immix_non_moving_nursery" prevents movement in nursery GCs.

The information of whether the current GC is useful inside mmtk-core. For example, we can skip clearing on-the-side forwarding bits in non-moving GCs because no objects will be forwarded. Because this should happen during the Release stage, we postponed the time for Immix-based plans to reset the in-defrag state to the end of GC so that Plan::current_gc_may_move_object remains callable during the Release stage. (Note that this PR does not fix #1118, but this mechanism introduced in this PR can be used to implement the clearing of side forwarding bits efficiently.)

The information is also useful for VM bindings. Therefore, Plan::current_gc_may_move_object is part of the public API.

  • For VMs that have "potential pinning parents" (PPP, i.e. non-root objects that have fields that cannot be updated), the VM binding must pin the children of those fields before starting tracing. If the VM binding knows the current GC never moves any object, it can skip pinning the children of PPPs.
  • For VMs that maintain weak tables (tables that contain weak references to heap objects) in the runtime, the VM binding needs to scan the weak tables, updating entries for objects that are moved, and deleting entries for unreachable objects. (Note the cost of updating entries may be high if the table is hashed by address.) If the VM binding knows the current GC never moves any object, it will not need to update the entries for moved live objects. And if the current GC is also a nursery GC, the VM binding will only need to scan entries added before the last GC.

This new method complements existing mechanisms for checking if objects can be moved. They are useful for different purposes and cannot replace each other.

  • The scope of the new Plan::current_gc_may_move_object is one GC. It is useful for the VM to decide whether to pin PPPs objects before a GC and how to process weak references or weak tables.
  • PlanConstraints::moves_objects is a per-plan constant. It is useful for the VM to determine the object layout w.r.t. address-based hashing.
  • The scope of ObjectReference::is_movable is a single object. It is useful for the VM to skip pinning certain objects when interacting with native code.
  • The scope of PlanTraceObject::may_move_object is a trace. It is an internal mechanism for MMTk core to specialize work packets for different kinds of traces.

@k-sareen
Copy link
Collaborator

k-sareen commented Apr 25, 2024

This feature already exists as ObjectReference.is_movable:

/// Can the object be moved?
pub fn is_movable<VM: VMBinding>(self) -> bool {
unsafe { SFT_MAP.get_unchecked(self.to_address::<VM>()) }.is_movable()
}

If you want it to be specific to current GC, you could edit the is_movable function itself instead of creating a new function. (Arguably in cases like the ones you mentioned above, the VM cares about specific objects, as opposed to all objects)

@wks
Copy link
Collaborator Author

wks commented Apr 25, 2024

If you want it to be specific to current GC, you could edit the is_movable function itself instead of creating a new function. (Arguably in cases like the ones you mentioned above, the VM cares about specific objects, as opposed to all objects)

The key is that the VM does not want to iterate through the list of PPP children or entries in weak tables and inspect if each of them may move. If it knows the current GC will not move any object (actually it is more likely for a GC to be non-moving than moving for Immix because only defrag GC moves object, and also true for StickyImmix with non-moving nursery GC), it can skip all PPPs and skip all existing entries in the weak table except the newly added ones.

@qinsoon
Copy link
Member

qinsoon commented Apr 25, 2024

I think the PR is generally okay. But there are a few things we need to get it clear.

  1. When can this method be used? It mentions 'current GC' which assumes it can only be called during a GC. However, it also uses information like is_current_gc_nursery() and immix_space.in_defrag(). is_current_gc_nursery() is available from ScheduleCollection to the end of a GC, and immix_space.in_defrag() is available from ScheduleCollection to release (it is set to false in release). Telling bindings that this method can only be called between ScheduleCollection and release seems revealing too much information about MMTk internals. I am a little concerned about this.
  2. When can a binding call this method? MMTk calls stop_mutators() but the information about this method is not yet available (defrag is unknown at this point). There seems no reasonable global calls to the binding since then until weak reference processing which seems too late for a binding to do anything useful. When do we expect a binding to call this method?
  3. With this new method, we now have many different methods to tell if a GC is moving or not. They are all related, but differs in some way.
    • PlanConstraints.moves_objects: tells if a plan may move objects (a plan will not move objects if all the spaces will never move objects).
    • PolicyTraceObject::may_move_object: tells if a trace may move objects for a policy.
    • PlanTraceObject::may_move_object: tells if a trace may move objects for a plan (a plan will not move objects for a trace if all the spaces will not move objects).
    • SFT::is_movable/ObjectReference.is_movable: tells if a policy may move objects. Ideally PlanConstraints.move_objects should base on this value for all its policies.
    • Plan::current_gc_may_move_object: tells if a plan may move objects for the current GC.

The 3rd issue is not introduced by this PR, so it does not need to be addressed in this PR. But I hope we can have some ideas about how to improve this before merging the PR. 1. In theory, we just need to define a constant may_move_object for each policy based on the trace, and everything else will depend on the constant (given we can list all the traces and all the spaces). 2. The function names should be more meaningful on what they return.

@k-sareen
Copy link
Collaborator

I mean technically is_movable is already incorrect since it does not take pinning bits/(transitively) pinned roots into account. I think this whole thing needs to be rethought.

@qinsoon
Copy link
Member

qinsoon commented Apr 26, 2024

I mean technically is_movable is already incorrect since it does not take pinning bits/(transitively) pinned roots into account. I think this whole thing needs to be rethought.

Only ObjectReference.is_movable knows the actual object into consideration. Everything else is based on policy.

@k-sareen
Copy link
Collaborator

Yes. I meant that a binding can't even rely on is_movable for accurate information since it basically only checks policy/build flags. I guess it depends on how fine-grained we want this information to be. For example, in the case of Immix, technically any object not in a defrag source will not move so we could even check that (I'm not saying we should do this though).

@wks
Copy link
Collaborator Author

wks commented Apr 26, 2024

1. When can this method be used? It mentions 'current GC' which assumes it can only be called during a GC. However, it also uses information like `is_current_gc_nursery()` and `immix_space.in_defrag()`. `is_current_gc_nursery()` is available from `ScheduleCollection` to the end of a GC, and `immix_space.in_defrag()` is available from `ScheduleCollection` to release (it is set to `false` in release). Telling bindings that this method can only be called between `ScheduleCollection` and release seems revealing too much information about MMTk internals. I am a little concerned about this.

This method should be callable from ScheduleCollection to the end of a GC. I think immix_space.in_defrag() should be changed so that it is cleared in EndOfGC instead. The doc comment of in_defrag says:

    /// Check if the current GC is a defrag GC.
    pub fn in_defrag(&self) -> bool {

So its scope should be "the current GC", not "if the GC is currently in the phase of defragmenting the heap". Other per-GC information should be cleared at the end of GC. I am actually considering introducing a data structure about "the current GC", and it should include fields like "is the current GC nursery", "is the current GC moving", "is the current GC xxxxxx", etc. That may be part of GlobalState.

Strangely, CommonGenPlan::gc_full_heap is never cleared. When GC ends, its value can still be queried using GenerationalPlan::last_collection_full_heap(), although it has no call sites in mmtk-core. I doubt if the binding will need to know if the last collection was full heap during mutator time.

2. When can a binding call this method? MMTk calls `stop_mutators()` but the information about this method is not yet available (defrag is unknown at this point). There seems no reasonable global calls to the binding since then until weak reference processing which seems too late for a binding to do anything useful. When do we expect a binding to call this method?

Actually in_defrag is decided in Immix::schedule_immix_full_heap_collection which is called in schedule_collection which happens before StopMutators.

And the newly introduced Plan::current_gc_may_move_object is intended to be called in stop_mutators(). (More precisely, it should be vm_prepare (the counterpart of vm_release which was later renamed to post_forwarding), but we decided that was unnecessary as it would be called right after stop_mutator.) For example, the Ruby binding needs to pin the children of PPPs before the first transitive closure, which means it must happen in Prepare. The work packets for pinning PPP children are created by the VM binding in stop_mutators. If it knows at that time the current GC is non-moving, it will skip creating those work packets.

3. With this new method, we now have many different methods to tell if a GC is moving or not. They are all related, but differs in some way.
   
   * `PlanConstraints.moves_objects`: tells if a plan may move objects (a plan will not move objects if all the spaces will never move objects).
   * `PolicyTraceObject::may_move_object`: tells if a trace may move objects for a policy.
   * `PlanTraceObject::may_move_object`: tells if a trace may move objects for a plan (a plan will not move objects for a trace if all the spaces will not move objects).
   * `SFT::is_movable`/`ObjectReference.is_movable`: tells if a policy may move objects. Ideally `PlanConstraints.move_objects` should base on this value for all its policies.
   * `Plan::current_gc_may_move_object`: tells if a plan may move objects for the current GC.

Those serve different purposes.

  • PlanConstraints.moves_objects: This is the most coarse-grained fact about a plan, i.e. if it would ever move any object. This is a constant, and can be obtained without actually doing a GC. There is no users in mmtk-core. Currently only the JikesRVM binding uses it (to determine the object layout w.r.t. address-based hashing).
  • PlanTraceObject::may_move_object and the per-space implementations PolicyTraceObject::may_move_object: They are used as an optimisation for PlanProcessEdges to elide a check after tracing an object. It is called very often (in the tracing loop), so it must be fast.
  • SFT::is_movable/ObjectReference.is_movable: This is per-object information. JikesRVM uses it (using the wrapper MemoryManager.willNeverMove) in native IO operations to elide pinning. The OpenJDK, Ruby and Julia bindings makes C wrapper bool mmtk_will_never_move(ObjectReference), but it is never used in any of the VMs or the bindings.

And the newly introduced mechanism:

  • Plan::current_gc_may_move_object: As described in this PR, this is mainly intended to let MMTk-core and bindings decide whether to schedule some work packets in the current GC. It's not called often, but it needs up-to-date information, and should not take implementation details (such as TraceKind) as arguments.

I think they serve distinct purposes, and cannot replace each other. For example, the VM must determine the object layout before doing GC, so PlanConstraints.moves_objects is necessary. But if the VM wants to determine whether to pin children of PPPs (which takes time), it must have per-GC information because even for a copying collector like Immix, most GCs are non-copying anyway. When doing IO, the VM cares about a specific object, not other objects.

But it is worth thinking whether it is cheaper to blindly set the pinning bit or checking if an object never moves (because it is in the LOS or the GC is non-moving), or add the object to a set of pinning roots using some kind of scope pinning mechanism like C#'s fixed (byte* ptr = array) { ... } statement.

The 3rd issue is not introduced by this PR, so it does not need to be addressed in this PR. But I hope we can have some ideas about how to improve this before merging the PR.

  1. In theory, we just need to define a constant may_move_object for each policy based on the trace, and everything else will depend on the constant (given we can list all the traces and all the spaces).

This assumes each GC has one trace. I think that's reasonable for all the GCs we currently know. MarkCompact computes two transitive closures in one GC, and we may consider it as a special case. Then the return value of Plan::current_gc_may_move_object can simply depend on the active trace of the current GC. We used to have a Trace type from JikesRVM. It is not exactly identical to ProcessEdgesWork and ScanObject, but we may re-introduce it if we need to. It will be a global data structure that holds information specific to one GC.

But what about reference-counting-based plans that are not based on tracing? Maybe we just call it CurrentCollection or CurrentGC. Trace can be one of its implementation.

  1. The function names should be more meaningful on what they return.

I think may_move_object is good enough in most cases. The scope (this plan, this trace, this object, this GC) can be prefixed by the type name, such as PlanConstrant::may_move_object, PlanTraceObject::<TRACE_KIND_DEFRAG>::may_move_object, ObjectReference::may_move.

Plan::current_gc_may_move_object may be a bit mouthful. Maybe mmtk.global_states.current_gc.may_move_object is easier to understand. If we re-introduce the Trace type, we may have mmtk.plan.current_trace().may_move_object().

@wks wks force-pushed the feature/current-gc-moves-object branch from 946d1fe to e27e488 Compare May 15, 2024 11:48
@wks wks marked this pull request as ready for review May 16, 2024 05:59
@wks wks requested a review from qinsoon May 16, 2024 05:59
/// of objects.
///
/// This function is callable during a GC. From the VM binding's point of view, the information
/// of whether the current GC is a defrag GC is available since `Collection::stop_mutators` is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioning defrag could be confusing here.

Suggested change
/// of whether the current GC is a defrag GC is available since `Collection::stop_mutators` is
/// of whether the current GC moves object or not is available since `Collection::stop_mutators` is

@@ -126,9 +126,6 @@ impl<VM: VMBinding> Plan for StickyImmix<VM> {

fn release(&mut self, tls: crate::util::VMWorkerThread) {
if self.is_current_gc_nursery() {
let was_defrag = self.immix.immix_space.release(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to call immix_space.release() here, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. New objects can be allocated into the immix_space and the LOS, and we still need to release them. I'll fix it.

@@ -158,6 +159,14 @@ impl<VM: VMBinding> Plan for StickyImmix<VM> {
self.gc_full_heap.load(Ordering::Relaxed) && self.immix.last_collection_was_exhaustive()
}

fn current_gc_may_move_object(&self) -> bool {
if self.is_current_gc_nursery() {
!cfg!(feature = "sticky_immix_non_moving_nursery")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be easier to understand and less error prone than directly checking the feature.

Suggested change
!cfg!(feature = "sticky_immix_non_moving_nursery")
crate::policy::immix_space::PREFER_COPY_ON_NURSERY_GC

@qinsoon qinsoon added the PR-extended-testing Run extended tests for the pull request label May 16, 2024
Copy link
Member

@qinsoon qinsoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wks wks enabled auto-merge May 16, 2024 07:48
@wks wks added this pull request to the merge queue May 16, 2024
Merged via the queue into mmtk:master with commit 9047e23 May 16, 2024
28 of 30 checks passed
@wks wks deleted the feature/current-gc-moves-object branch May 16, 2024 08:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR-extended-testing Run extended tests for the pull request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stale forwarding bits in nursery GC in StickyImmix
3 participants