Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weak reference processing #564

Merged
merged 30 commits into from
Apr 28, 2022
Merged

Weak reference processing #564

merged 30 commits into from
Apr 28, 2022

Conversation

qinsoon
Copy link
Member

@qinsoon qinsoon commented Mar 24, 2022

This PR revamps the reference processor, and schedules reference processing work. This PR is one step towards #544: it gets the old reference processor working again.

There is some more work needed for weak reference processing. So no_reference_types is true and by default, we do not process weak references.

Changes

  • Rework on ReferenceProcessor:
    • retain soft references properly, depending on whether the collection is an emergency collection or not.
    • add a separate reference enqueue step. For references that need to be enqueue'd (by Java semantics), the reference processor will store them in a buffer, and call to the binding at the end of a GC to enqueue the references. This makes sure the references are valid during the GC.
    • reference table is now an HashSet rather than Vec. We used to assume weak references are added to MMTk once they are created, so for one reference, it will be added once. Now we allow adding weak references during GC, and a weak reference may be traced multiple times in a GC, and added multiple times to MMTk. Using HashSet can deduplicate the references.
    • Remove unnecessary use of UnsafeCell for ReferenceProcessorSync.
    • Remove unnecessary unforwarded_references: I suspect it was intended to deal with meta-circularity in JikesRVM/Java MMTk (https://github.com/JikesRVM/JikesRVM/blob/5072f19761115d987b6ee162f49a03522d36c697/MMTk/ext/vm/jikesrvm/org/jikesrvm/mm/mmtk/ReferenceProcessor.java#L96), and is no longer needed for MMTk core. For MMTk core, any Rust struct and field is 'untraced' unless we specifically 'trace' it.
    • Add a allow_new_candidate field to avoid adding new candidates after we finish processing references. See the comments for the field for details.
    • process_reference() is moved from ReferenceGlue (in VMBinding traits) to ReferenceProcessor.
    • enqueue_references() is added to ReferenceGlue.
  • Add a few work packets for weak reference processing. They are simply wrapping the reference processor's methods, and are single threaded.
  • Add a few more work buckets for weak reference processing: SoftRefClosure, WeakRefClosure, PhantomRefClosure, FinalRefClosure, FinalizableForwarding.
  • Schedule the weak reference processing work properly.
  • Rename ProcessWeakRef (which call VMBinding::VMCollection::process_weak_refs()) to VMProcessWeakRef.
  • add_soft/weak/phantom_candidate() now no longer needs a referent argument.

@qinsoon qinsoon marked this pull request as ready for review April 7, 2022 05:26
@qinsoon qinsoon requested review from wks and caizixian April 7, 2022 05:32
reference: ObjectReference,
tls: VMWorkerThread,
) -> ObjectReference;
/// For reference types, if the referent is cleared during GC, the reference
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK for now because we currently implement Java Reference semantics. But the term "reference" and "reference type" is confusing for both Java and other languages, too, mostly other languages.

We already use the term "reference" to refer to a pointer to an object. Here we refer to a subclass of java.lang.ref.Reference.

In Java, some textbooks use the term "primitive types" (or "value types") to refer to bool, char, byte, short, int, long, float and double, and use the term "reference types" to refer to java.lang.Object and all of its descendants. Other languages may use this term similarly.

We should find a more generally acceptable term for "reference type" and "reference processor" and so on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we can use the word 'weak reference' as a general term to refer to all kinds of weak references (including soft/phantom references). Also we can use WeakReferenceGlue, and WeakReferenceProcessor.

src/scheduler/work_bucket.rs Outdated Show resolved Hide resolved
src/util/reference_processor.rs Outdated Show resolved Hide resolved
src/util/reference_processor.rs Outdated Show resolved Hide resolved
src/util/reference_processor.rs Outdated Show resolved Hide resolved
src/util/reference_processor.rs Outdated Show resolved Hide resolved
src/util/reference_processor.rs Outdated Show resolved Hide resolved
src/util/reference_processor.rs Outdated Show resolved Hide resolved
src/util/reference_processor.rs Outdated Show resolved Hide resolved
if !referent.is_null() {
trace.retain_referent(referent);
Self::keep_referent_alive(trace, referent);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep_referent_alive calls e.trace_object, and it may move the object, and return the forwarded reference. But we are not calling VMReferenceGlue::set_referent here. Currently the two call sites of set_referent are:

  1. in forward for MarkCompact.
  2. in process_reference. But scan calls process_reference when retain is false. But this line is executed when retain is true.

So if we use the SemiSpace plan, and a SoftReference holds the only pointer to another object, and this is the first time this edge is traced, will this SoftReference hold a dangling pointer after that?

Suggested change
Self::keep_referent_alive(trace, referent);
let new_referent = Self::keep_referent_alive(trace, referent);
<E::VM as VMBinding>::VMReferenceGlue::set_referent(reference, new_referent);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not necessary. However, the code is a bit confusing. What actually happens is this:

  1. In SoftRefClosure:
    • if emergency collection { retain soft refs (no update) }
  2. In WeakRefClosure:
    • scan soft refs (with update)
    • scan weak refs (with update)

I will change this to:

  1. In SoftRefClosure:
    • if emergency collection { retain soft refs (no update) }
    • scan soft refs (with update)
  2. In WeakRefClosure:
    • scan weak refs (with update)

@qinsoon
Copy link
Member Author

qinsoon commented Apr 8, 2022

binding-refs
OPENJDK_BINDING_REF=weak-ref
JIKESRVM_BINDING_REF=weak-ref
V8_BINDING_REF=update-pr-564

@qinsoon qinsoon added the PR-testing Run binding tests for the pull request (deprecated: use PR-extended-testing instead) label Apr 8, 2022
@qinsoon
Copy link
Member Author

qinsoon commented Apr 8, 2022

Performance results for the PR (Immix)

Benchmark time: 1% slowdown with weak ref processing
weak-ref-bmtime

STW time: 10% slowdown with weak ref
weak-ref-gctime

Immix process edges time: 5% slowdown with weak ref
weak-ref-process-edges

Data link (not publicly visible) link

@wks
Copy link
Collaborator

wks commented Apr 8, 2022

The following program crashes on OpenJDK with MMTk, but runs fine without -XX:+UseThirdPartyHeap:

import java.lang.ref.SoftReference;

class Foo {
    public SoftReference<Foo> sr;
    int num;
}

public class SoftStrongSoft {
    public static SoftReference<Foo> setUp() {
        Foo foo1 = new Foo();
        foo1.num = 30;

        Foo foo2 = new Foo();
        foo2.num = 40;

        SoftReference<Foo> sr1 = new SoftReference<Foo>(foo1);
        SoftReference<Foo> sr2 = new SoftReference<Foo>(foo2);

        foo1.sr = sr2;

        return sr1;
    }

    public static void fillUpTheHeap() {
        String[] ar = new String[1000000];
        for (int i = 1; i < 1000000; i++) {
            String str = String.valueOf(i);
            if (i % 100 == 0) {
                ar[i] = str;
            }
        }
    }

    public static void main(String[] args) {
        SoftReference<Foo> sr1 = setUp();

        fillUpTheHeap();
        //System.gc();
        //Runtime.getRuntime().gc(); // Just in case System.gc doesn't trigger GC.

        Foo foo1 = sr1.get();
        Foo foo2 = foo1.sr.get();

        System.out.println(foo1.num);
        System.out.println(foo2.num);
    }
}

Run with:

javac SoftStrongSoft.java -source 11 -target 11
export MMTK_NO_REFERENCE_TYPES=false
/path/to/java -XX:+UseThirdPartyHeap -Xmx64M SoftStrongSoft

Result:

[2022-04-08T04:37:18Z INFO  mmtk::memory_manager] Initialized MMTk with SemiSpace
[2022-04-08T04:37:19Z INFO  mmtk::plan::global]   [POLL] copyspace0: Triggering collection
[2022-04-08T04:37:19Z INFO  mmtk::scheduler::gc_work] End of GC
[2022-04-08T04:37:19Z INFO  mmtk::plan::global]   [POLL] copyspace1: Triggering collection
[2022-04-08T04:37:19Z INFO  mmtk::scheduler::gc_work] End of GC
30
Exception in thread "main" java.lang.NullPointerException
        at SoftStrongSoft.main(SoftStrongSoft.java:45)

The first SoftReference, i.e. sr1 is retained, but the second sr2 is not. I confirmed that emergency collection did not happen.

I think we should repeat the soft reference processing until no further objects can be reached by traversing soft-strong-soft-srong-soft-...-soft-strong references. FinalizerReference will need it, too. It should not be needed for weak/phantom references because weak/phantom references never keep their referents alive.

@qinsoon
Copy link
Member Author

qinsoon commented Apr 13, 2022

The following program crashes on OpenJDK with MMTk, but runs fine without -XX:+UseThirdPartyHeap:

import java.lang.ref.SoftReference;

class Foo {
    public SoftReference<Foo> sr;
    int num;
}

public class SoftStrongSoft {
    public static SoftReference<Foo> setUp() {
        Foo foo1 = new Foo();
        foo1.num = 30;

        Foo foo2 = new Foo();
        foo2.num = 40;

        SoftReference<Foo> sr1 = new SoftReference<Foo>(foo1);
        SoftReference<Foo> sr2 = new SoftReference<Foo>(foo2);

        foo1.sr = sr2;

        return sr1;
    }

    public static void fillUpTheHeap() {
        String[] ar = new String[1000000];
        for (int i = 1; i < 1000000; i++) {
            String str = String.valueOf(i);
            if (i % 100 == 0) {
                ar[i] = str;
            }
        }
    }

    public static void main(String[] args) {
        SoftReference<Foo> sr1 = setUp();

        fillUpTheHeap();
        //System.gc();
        //Runtime.getRuntime().gc(); // Just in case System.gc doesn't trigger GC.

        Foo foo1 = sr1.get();
        Foo foo2 = foo1.sr.get();

        System.out.println(foo1.num);
        System.out.println(foo2.num);
    }
}

Run with:

javac SoftStrongSoft.java -source 11 -target 11
export MMTK_NO_REFERENCE_TYPES=false
/path/to/java -XX:+UseThirdPartyHeap -Xmx64M SoftStrongSoft

Result:

[2022-04-08T04:37:18Z INFO  mmtk::memory_manager] Initialized MMTk with SemiSpace
[2022-04-08T04:37:19Z INFO  mmtk::plan::global]   [POLL] copyspace0: Triggering collection
[2022-04-08T04:37:19Z INFO  mmtk::scheduler::gc_work] End of GC
[2022-04-08T04:37:19Z INFO  mmtk::plan::global]   [POLL] copyspace1: Triggering collection
[2022-04-08T04:37:19Z INFO  mmtk::scheduler::gc_work] End of GC
30
Exception in thread "main" java.lang.NullPointerException
        at SoftStrongSoft.main(SoftStrongSoft.java:45)

The first SoftReference, i.e. sr1 is retained, but the second sr2 is not. I confirmed that emergency collection did not happen.

I think we should repeat the soft reference processing until no further objects can be reached by traversing soft-strong-soft-srong-soft-...-soft-strong references. FinalizerReference will need it, too. It should not be needed for weak/phantom references because weak/phantom references never keep their referents alive.

On second thought, I feel there is no guarantee that soft references in this example will be retained, and it is not a correctness bug if the soft references are not retained. It probably will depend on the GC implementation.

Copy link
Collaborator

@wks wks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I observed the same behaviour in JikesRVM. The Java API documentation seems to allow this behaviour for SoftReference.

This PR looks good now.

@qinsoon
Copy link
Member Author

qinsoon commented Apr 21, 2022

This PR introduces weak reference, but weak reference processing is disabled by default. The following results show that with disabled weak reference processing, the GC/STW time is unchanged, in comparison with master (benchmarked with OpenJDK). The performance issues with weak reference processing will be further addressed in future PRs.

I will merge this PR soon.

weak-ref-disable

@qinsoon qinsoon merged commit 2117612 into master Apr 28, 2022
@qinsoon qinsoon deleted the reference-processor branch April 28, 2022 05:29
qinsoon added a commit to mmtk/mmtk-openjdk that referenced this pull request Apr 28, 2022
This adds weak reference support, and updates MMTk core to mmtk/mmtk-core#564.

* if weak reference is enabled in MMTk, we will add weak references during our tracing. Otherwise, still treat them as strong references.
* implement new changes in mmtk/mmtk-core#564
* add test with weak reference enabled (by default, it is disabled).
qinsoon added a commit to mmtk/mmtk-jikesrvm that referenced this pull request Apr 28, 2022
This adds weak reference support, and updates MMTk core to mmtk/mmtk-core#564.

* implement new changes in mmtk/mmtk-core#564
* properly set `referent` in `add_xxxx_candidates()`.
* Implement `get_boolean_option()` (see changes in mmtk/jikesrvm#12).
* add test with weak reference enabled (by default, it is disabled).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR-testing Run binding tests for the pull request (deprecated: use PR-extended-testing instead)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants