Remove NULL ObjectReference #1043
Comments
|
MMTk never provides an official C API. So it is the bindings that wrap MMTk API functions and provide them as functions callable from C. If an MMTk API uses
// In MMTk core
fn some_api_function(maybe_object: Option<ObjectReference>) -> Option<ObjectReference> { ... }
Its wrapper should be
// In VM binding
extern "C" fn mmtk_some_api_function(maybe_object: usize) -> usize {
    let arg: Option<ObjectReference> = match NonZeroUsize::new(maybe_object) {
        None => None,
        Some(nzu) => Some(ObjectReference(nzu)),
    };
    let result: Option<ObjectReference> = some_api_function(arg);
    match result {
        None => 0,
        Some(object) => object.as_usize(),
    }
}
Or more concisely,
extern "C" fn mmtk_some_api_function(maybe_object: usize) -> usize {
    some_api_function(ObjectReference::from_usize_zeroable(maybe_object)).to_usize_zeroable()
}
where |
I kind of disagree that it is not required to be exposed. It is more natural to a binding developer to already use the type MMTk uses, i.e. |
If "their own API" is in Rust, they can use But if the binding needs an API for the runtime implemented in another language, or passing And yes. Having |
That's not FFI then. It's just Rust, which is perfectly fine. Unfortunately, most modern production systems are written in C or C++ so we need to think about FFI regardless. I think then the ideal is that |
Well, not 100% opaque. It has to satisfy some criteria. More detailed discussion is here: #1044 But I don't object to the idea that there are other possible ways to implement |
In the preliminary implementation #1064, pub fn from_raw_address(addr: Address) -> Option<ObjectReference>;
pub unsafe fn from_raw_address_unchecked(addr: Address) -> ObjectReference;
In the OpenJDK binding,
Sometimes I feel that adding a |
For languages that use a special singleton object to represent its 'NULL' reference, can they use |
It depends how that singleton is allocated. If the singleton is allocated in the MMTk heap, then If the singleton is a C static variable, or if it is allocated by malloc, it should return In CPython, the In Julia, the
|
This sounds like a pretty confusing definition of Also based on what you said, in some cases, the special singleton object will be put to the tracing queue every time we see such an object in an empty slot, and that probably will incur an overhead. Usually those should be dealt with using a check (like our old null check in |
The current doc is: pub trait Edge: Copy + Send + Debug + PartialEq + Eq + Hash {
/// Load object reference from the slot.
///
/// If the slot is not holding an object reference (For example, if it is holding NULL or a
/// tagged non-reference value. See trait-level doc comment.), this method should return
/// `None`.
///
/// If the slot holds an object reference with tag bits, the returned value shall be the object
/// reference with the tag bits removed.
fn load(&self) -> Option<ObjectReference>; It may be worth mentioning those singleton objects in the comments. The main idea is, return
"Can be traced" rules out "should be traced" main affects objects in the immortal space. It's harmless not to trace those immortal objects as long as (1) they don't point to other objects, or (2) they are treated as root. Statically allocated objects should not be traced by MMTk, unless the VM implements
Some special objects, such as Steve once proposed doing the dispatch when scanning the object and put objects in different queues. If we implement that in the future, those special objects (as static variables, or in immortal spaces, or objects known to be rooted and non-movable) will be filtered out before they are enqueued. |
Those objects should be immortal, and should not be traced. In the current |
I see your point. It's probably not a good idea to define the semantics of
The simplest way to decide is, if the slot contains a reference to an object in the heap, then return In the simplest case, the VM only ever allocate objects using But if the VM implements For Julia, For CPython, I think at this moment, we don't mention the optimization in
The same is true for This also implies that we haven't designed the reference counting counterpart of
And I also expect If we want to do the optimization of omitting some edges (such as those pointing to
Right.
One of them may be more efficient than others according to the nature of the concrete VM. |
@wenyuzhao asked how to store null pointer to an In general, this should be done in a VM-specific way, because not all programming languages have null pointers. For example, I made changes to the OpenJDK binding and added an OpenJDK-specific method
impl<const COMPRESSED: bool> OpenJDKEdge<COMPRESSED> {
    // ...
    pub fn store_null(&self) {
        if cfg!(any(target_arch = "x86", target_arch = "x86_64")) {
            if COMPRESSED {
                if self.is_compressed() {
                    self.x86_write_unaligned::<u32, true>(0)
                } else {
                    self.x86_write_unaligned::<Address, true>(Address::ZERO)
                }
            } else {
                self.x86_write_unaligned::<Address, false>(Address::ZERO)
            }
        } else {
            debug_assert!(!COMPRESSED);
            unsafe { self.addr.store(0) }
        }
    }
}
But for debug purposes, we may add a method to the
|
MMTk can only use I am wondering if we would like to introduce a type such as This also solves the issue in the above comment, as MMTk is able to represent a null reference. |
But the fact is, there is only one use of Speaking of information loss, the only case of information loss was the bug I mentioned in |
I don't think we want to focus on the current code base when discussing MEP. The current design is obviously very Java centric, and is not general. The question is whether the proposal is flexible for future. Being not able to express a proper null pointer could be a potential issue. But I don't think it is a show stopper, and it can be amended. |
This is slightly off-topic, but perhaps we want to add functions like |
I think it is OK for debugging. But just like I mentioned, currently MMTk doesn't seem to care about any form of null references. Let's see if we need them in the future, and we shall add them when needed. But if we want to implement reference processing, this may not be the right choice. For example, Julia has two forms of null references, |
During the meeting on 31 January 2024, we reached consensus on this MEP. We will wait for 24 hours for anyone to raise objections against this MEP. If nobody expresses an objection before 2pm, 1 February 2024 Canberra Time (UTC+11:00), we will declare this MEP as passed. |
I made some changes according to our discussion today.
|
Since no objections were raised after the meeting, this MEP passes the review. |
We changed `ObjectReference` so that it is backed by `NonZeroUsize`, and can no longer represent a NULL reference. For cases where an `ObjectReference` may or may not exist, `Option<ObjectReference>` should be used instead.
This is an API-breaking change.
- `ObjectReference::NULL` and `ObjectReference::is_null()` are removed.
- `ObjectReference::from_raw_address()` now returns `Option<ObjectReference>`. The unsafe method `ObjectReference::from_raw_address_unchecked()` assumes the address is not zero, and returns `ObjectReference`.
- API functions that may or may not return a valid `ObjectReference` now return `Option<ObjectReference>`. Examples include `Edge::load()` and `ReferenceGlue::get_referent()`.
If a VM binding needs to expose `Option<ObjectReference>` to native (C/C++) programs, a convenient type `crate::util::api_util::NullableObjectReference` is provided.
Fixes: #1043
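To make the shape of this change concrete, here is a minimal sketch of a `NonZeroUsize`-backed reference type with a checked and an unchecked constructor. It is not the mmtk-core source: `Address` is reduced to a plain wrapper around `usize`, and `to_raw_address` is included only so the result can be inspected.

```rust
use std::num::NonZeroUsize;

/// Simplified stand-in for mmtk-core's `Address` (assumption: a plain word).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Address(pub usize);

/// Simplified stand-in for `ObjectReference`; it can never hold 0.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjectReference(NonZeroUsize);

impl ObjectReference {
    /// Checked conversion: a zero address yields `None`.
    pub fn from_raw_address(addr: Address) -> Option<ObjectReference> {
        NonZeroUsize::new(addr.0).map(ObjectReference)
    }

    /// Unchecked conversion.
    ///
    /// # Safety
    /// The caller must guarantee that `addr` is not zero.
    pub unsafe fn from_raw_address_unchecked(addr: Address) -> ObjectReference {
        debug_assert!(addr.0 != 0);
        // SAFETY: the caller promises that `addr` is non-zero.
        ObjectReference(unsafe { NonZeroUsize::new_unchecked(addr.0) })
    }

    /// Convert back to an address (for illustration).
    pub fn to_raw_address(self) -> Address {
        Address(self.0.get())
    }
}
```

With a type like this, "no object" can only be written as `None`, so the compiler forces callers to handle that case before they can use the reference.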
This is the first attempt to use the MEP process for changing a fundamental part of MMTk.
TL;DR
Currently MMTk assumes ObjectReference can be either a pointer to an object or NULL, which is not general for all VMs, especially the VMs that can store tagged values in slots. Meanwhile, MMTk core never processes NULL references. We propose removing ObjectReference::NULL so that ObjectReference is always a valid object reference.
Goal
Remove ObjectReference::NULL so that ObjectReference always refers to an object.
Non-Goal
Requiring Address to be a non-zero address. If we need, we can add a new type NonZeroAddress separately.
nothing and missing in Julia, None in CPython, and null and undefined in V8) during tracing. I have opened a separate issue for it: Skip object graph edges to immortal non-moving objects when tracing #1076
Success Metric
object.is_null() from mmtk-core.
ObjectReference::NULL can still work, by using None or using other designs.
Motivation
Status-quo: All ObjectReference instances refer to objects, except ObjectReference::NULL.
Currently, ObjectReference::NULL is defined as ObjectReference(0), and is used to represent NULL pointers. However, it
NULL and 0 are not general enough
Not all languages have NULL references. Haskell, for example, is a functional language and all variables are initialized before use.
For some VMs (such as CRuby, V8 and Lua), a slot may hold non-reference values. Ruby and V8 can put small integers in slots. Ruby can also put special values such as true, false and nil in slots.
Even if a language has NULL references of some sort, they are not always encoded the same way. Some VMs (such as V8 and Julia) even have different flavors of NULL or "missing value" types.
null
null
nil; false is represented as 0
null; Oddball type
undefined; Oddball type
nothing (jl_nothing); Nothing type
missing; struct Missing type, defined in Julia
None (Py_None); NoneType
CRuby encodes nil as 4 instead of 0. Python uses a valid reference to a singleton object None to represent missing values.
Some languages have multiple representations of non-existing values. JavaScript has both null and undefined. Julia has both nothing and missing.
For reasons listed above, a single constant ObjectReference::NULL with numerical value 0 is not general at all to cover the cases of missing references or special non-reference values in languages and VMs.
NULL pollutes the API design.
Previously designed for Java, MMTk assumes that
This has various influences on the API design and the internal implementation of MMTk-core.
Processing slots (edges)
This issue is discussed in greater detail in #1031. It has been fixed in #1032. Before it was fixed, the method ProcessEdgesWork::process_edge behaved like this:
In these three lines,
- slot.load() loads from the slot verbatim, interpreting 0 as ObjectReference::NULL.
- trace_object handles NULL "gracefully" by returning NULL, too.
- slot.store(object) may overwrite NULL with NULL, which was supposed to be "benign".
Such assumptions break if (1) the VM does not use 0 to encode NULL, or (2) the VM can hold tagged non-reference values in slots. CRuby is affected by both.
PR #1032 fixes this problem by allowing slot.load() to return ObjectReference::NULL even if nil is encoded as 4, or if the slot holds small integers, and process_edge simply skips such slots. It is now general enough to support V8, Julia and CRuby. However, the use of ObjectReference::NULL to represent skipped fields is not idiomatic in Rust. We should use Option<ObjectReference> instead.
ReferenceProcessor
Note: In the future we may move ReferenceProcessor and ReferenceGlue out of mmtk-core. See: #694
ReferenceProcessor is designed for Java, and a Java Reference (soft/weak/phantom ref) can be cleared by setting the referent to null. The default implementation of ReferenceGlue works this way. ReferenceGlue::clear_referent sets the referent to ObjectReference::NULL, and ReferenceProcessor checks if a Reference is cleared by calling referent.is_null().
It works for Java, but not for Julia, because Julia uses a pointer jl_nothing to represent cleared references. Although ReferenceGlue::clear_referent can be overridden, it was not enough. Commit 9648aed added ReferenceGlue::is_referent_cleared so that ReferenceProcessor can compare the referent against jl_nothing instead of ObjectReference::NULL.
p.s. ReferenceGlue::clear_referent is the only place in mmtk-core (besides tests) that uses the constant ObjectReference::NULL. This means the major part of mmtk-core does not work with NULL references from the VM.
NULL-checking is hard to do right
ObjectReference can be NULL, and the type system cannot tell if a value of type ObjectReference is NULL or not. As a consequence, programmers have to insert NULL-checking statements everywhere. It's very easy to miss necessary checks and add redundant checks.
Missing NULL checks
In the reference processor, the following lines load an ObjectReference from a weak reference object, and try to get its forwarded address.
The code snippet calls get_forwarded_referent regardless of whether old_referent has been cleared or not. Because get_forwarded_referent calls trace_object and trace_object used to return NULL if passed NULL, the code used to be benign for Java. However, the code will not work if the VM does not use 0 to encode a null reference, or the slot can hold tagged non-reference values, for reasons we discussed before. Since the only VM that overrides ReferenceGlue::is_referent_cleared (Julia) does not use MarkCompact, this bug went undetected.
This bug has been fixed in #1032, but it shows how hard it is to manually check for NULL in all possible places.
Unnecessary NULL checks
Inside MMTk core, the most problematic functions are the trace_object methods of various spaces.
- trace_object: Some spaces check object.is_null() in trace_object and return NULL if it is null. But it is unnecessary because after SFT or PlanTraceObject dispatches the trace_object call to a concrete space by the address of ObjectReference, it is guaranteed not to be NULL.
Some API functions check for is_null() because we defined ObjectReference as NULL-able. Those API functions don't make sense for NULL pointers.
- is_in_mmtk_space(object): It checks if the argument is NULL only because the ObjectReference type is NULL-able. Any VMs that use this API function to distinguish references of MMTk objects from pointers from malloc, etc., will certainly check NULL first before doing anything else.
- ObjectReference::is_reachable(): It checks is_null() before using SFT to dispatch the call. If ObjectReference is not NULL-able in the first place, the NULL check will be unnecessary.
NULL encourages non-idiomatic Rust code
In Rust, the idiomatic way to represent the absence of a value is None (of type Option<T>). However, ObjectReference::NULL is sometimes used to represent the absence of ObjectReference.
- In MarkCompactSpace: Our current MarkCompact implementation stores a forwarding pointer in front of each object for forwarding. When the forwarding pointer is not set, that slot holds an ObjectReference::NULL (value 0). But what it really means is that "there is no forwarded object reference associated with the object".
- In Edge::load(): As we discussed before, since #1032, Edge::load() returns ObjectReference::NULL to mean "the slot is not holding an object reference", even if the slot is holding a tagged non-reference value or a null reference not encoded as numerical 0. In idiomatic Rust, the return type of Edge::load() should be Option<ObjectReference> and it should return None if the slot is not holding an object reference. We are currently not using Option<ObjectReference> as the return type because ObjectReference is currently backed by usize and can be 0. Consequently, Option<ObjectReference> has to be larger than a word, and will have additional overhead.
Description
We propose removing the constant ObjectReference::NULL, and making ObjectReference non-NULL-able.
Making ObjectReference non-zero
For performance concerns, we shall change the underlying type of ObjectReference from usize to std::num::NonZeroUsize.
And there is another good reason for forbidding 0, because no objects can be allocated at or near the address 0. (That assumes ObjectReference is an address. See #1044)
By doing this, Option<ObjectReference> will have the same size as usize due to null pointer optimization. Passing Option<ObjectReference> between functions (including FFI boundary) should have no overhead compared to passing ObjectReference directly.
An ObjectReference can be converted from Address in two ways: the checked ObjectReference::from_raw_address, which returns Option<ObjectReference>, and the unsafe ObjectReference::from_raw_address_unchecked, which returns ObjectReference directly.
Refactoring the Edge trait
The Edge trait will be modified so that Edge::load() now returns Option<ObjectReference>. If a slot does not hold an object reference (null, nil, true, false, small integers, etc.), it shall return None.
Edge::store(object: ObjectReference) still takes an ObjectReference as parameter because we can only forward valid references.
Refactoring the reference processor
Note: Ultimately ReferenceGlue and ReferenceProcessor will be moved outside mmtk-core. Here we describe a small-scale refactoring for this MEP.
The ReferenceGlue and ReferenceProcessor will be modified so that ReferenceGlue::get_referent now returns Option<ObjectReference>. It returns None if the reference is already cleared.
ReferenceGlue::is_referent_cleared will be removed.
ReferenceGlue::clear_referent will no longer have a default implementation because mmtk-core no longer assumes the reference object represents "the referent is cleared" by assigning 0 to the referent field.
ReferenceProcessor will no longer call is_referent_cleared, but will check if get_referent returns None or Some(referent).
ReferenceProcessor also contains many assertions to ensure references are not NULL. Those can be removed.
Removing unnecessary NULL checks
The PR #1032 already removed the NULL checks related to trace_object.
Public API functions is_in_mmtk_space and ObjectReference::is_reachable will no longer do NULL checks because ObjectReference cannot be NULL in the first place.
The forwarding pointer in MarkCompact
Instead of loading the forwarding pointer as ObjectReference directly, we load the forwarding pointer as an address, and convert it to Option<ObjectReference>. The conversion itself is a no-op.
MarkCompactSpace::compact() calls get_header_forwarding_pointer(obj). It always needs to check if obj has a forwarding pointer because obj may be dead, and dead objects don't have forwarding pointers (i.e. get_header_forwarding_pointer(obj) returns None if obj is dead). It used to check with forwarding_pointer.is_null().
Write barrier
Main issue: #1038
The barrier function Barrier::object_reference_write takes ObjectReference as parameters:
Here target is NULL-able because a user program may execute src.slot = null. (More generally, a JS program may have src.slot = "str"; src.slot = 42;, overwriting a reference with a number.) The type of target can be changed to Option<ObjectReference>. However, the main problem is that slot.store() no longer accepts NULL pointers. The root problem is the design of Barrier::object_reference_write, and that needs to be addressed separately. See #1038
The object_reference_write_pre and object_reference_write_post methods should still work after changing target to Option<ObjectReference>. The "pre" and "post" functions do not modify the slot.
For now, we may keep Barrier::object_reference_write as is, but it will not be applicable if target is NULL. Currently no officially supported bindings use Barrier::object_reference_write. Other bindings should call object_reference_write_pre and object_reference_write_post separately and manually store the new value to the slot before #1038 is properly addressed.
Impact on Performance
This MEP should have no visible impact on performance. Preliminary performance evaluation supports this: #1064
Because of null pointer optimization, Option<ObjectReference>, ObjectReference, Option<NonZeroUsize>, NonZeroUsize and usize all have the same layout.
When converting from Address to ObjectReference, neither ObjectReference::from_raw_address (returns Option<ObjectReference>) nor ObjectReference::from_raw_address_unchecked (returns ObjectReference directly) has overhead. But unwrapping the Option<ObjectReference> involves a run-time check.
The overhead of the None check (pattern matching or opt_objref.unwrap()) should be very small. But if the zero check is a performance bottleneck, we can always use ObjectReference::from_raw_address_unchecked as a fall-back, provided that we know it can't be zero.
There are three known use cases of Option<ObjectReference> in mmtk-core:
- slot.load() returns None if a slot doesn't hold a reference,
- ReferenceGlue::get_referent() returns None if a (weak) Reference is cleared, and
- the forwarding pointer in MarkCompact is None if it is not set.
In all those cases, the checks for None are necessary for correctness. Previously, those places checked against ObjectReference::NULL.
Impact on Software Engineering
mmtk-core
With ObjectReference guaranteed to be non-NULL, Option<ObjectReference> can be used to indicate an ObjectReference may not exist. As discussed above, typical use cases of Option<ObjectReference> are (1) slot.load(), (2) ReferenceGlue::get_referent() and (3) the forwarding pointer in MarkCompact. The use of Option<T> forces a check to convert Option<ObjectReference> to ObjectReference. By doing this, we can avoid bugs related to missing or redundant NULL checks.
Bindings
Some code needs to be changed in the OpenJDK binding due to this API change. The OpenJDK binding uses struct OpenJDKEdge (which implements trait Edge) to represent a slot in OpenJDK. Because trait Edge is designed from the perspective of mmtk-core, the Edge trait itself does not support storing NULL into the slot. I have to add an OpenJDK-specific method OpenJDKEdge::store_null() to store null to the slot in an OpenJDK-specific way. This is actually expected because not all VMs have null pointers, nor do they encode null, nil, nothing, etc. in the same way. OpenJDKEdge::store_null() also bypasses some bit operations related to compressed OOPs. This change added complexity to the OpenJDK binding, but I think it is the right way to do it.
Another quirk in software engineering is that we sometimes have to call unsafe { ObjectReference::from_raw_address_unchecked(addr) } to bypass the check against zero because we (as humans) are sure addr is never zero. That happens when:
- ObjectReference from the result of alloc or alloc_copy. We know newly allocated objects cannot have zero as their addresses, but the Rust language cannot figure it out unless we add NonZeroAddress, too.
- alloc, MMTk may find it is out of memory. Currently, the behavior is, MMTk core will call Collection::out_of_memory, and then alloc will return Address(0) to the caller. But the default implementation of Collection::out_of_memory panics, so the binding may assume alloc never returns Address(0) on normal returns. But if the binding overrides Collection::out_of_memory, it will need to actually check if the return value of alloc is 0 instead of using the unsafe from_raw_address_unchecked function.
- unsafe { ObjectReference::from_raw_address_unchecked(BASE.load(Ordering::Relaxed) + ((v as usize) << SHIFT.load(Ordering::Relaxed))) }, too. The Rust language cannot prove that the result can't be zero if v is not zero, but we as humans know the check against zero is unnecessary.
The presence of unsafe { ... } makes the code look unsafe, but it is actually as safe (or as unsafe) as before.
Risks
Long Term Performance Risks
Converting Address to ObjectReference has overhead only if we don't know whether the address can be zero or not. (We can always use unsafe { ObjectReference::from_raw_address_unchecked(addr) } if we know addr cannot be zero.)
This will remain true in the future. If we don't know if it is zero at compile time, then run-time checking will be necessary, and this MEP enforces the check to be done. Such overhead should always exist regardless of whether we allow ObjectReference to be NULL or not (and the overhead may be erroneously omitted if we fail to add a necessary NULL check).
Long Term Software Engineering Risks
Option<ObjectReference> across FFI boundary
One potential problem is the convenience of exposing Option<ObjectReference> to C code via FFI. Ideally, C programs should use uintptr_t for Option<NonZeroUsize>, with 0 representing None. However, Rust currently does not define the layout of Option<NonZeroUsize>. Even though the only possible encoding of None (of type Option<NonZeroUsize>) is 0usize, the Rust reference still states that transmuting None (of type Option<NonZeroUsize>) to usize has undefined behavior. So we have to manually write code to do the conversion, mapping None to 0usize. Despite that, the conversion functions should be easy to implement. We can implement two functions to make the conversion easy:
That should be concise enough for most use cases.
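A minimal sketch of such a pair of conversion functions is shown below. The names `from_usize_zeroable` and `to_usize_zeroable` follow the ones used informally in the comments of this issue; writing them as free functions, and the identity `some_api_function`, are assumptions made only for illustration.

```rust
use std::num::NonZeroUsize;

/// Simplified stand-in for `ObjectReference`; it can never hold 0.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjectReference(NonZeroUsize);

/// Map a word received from C to `Option<ObjectReference>`; 0 becomes `None`.
pub fn from_usize_zeroable(word: usize) -> Option<ObjectReference> {
    NonZeroUsize::new(word).map(ObjectReference)
}

/// Map `Option<ObjectReference>` to a word for C; `None` becomes 0.
pub fn to_usize_zeroable(maybe_object: Option<ObjectReference>) -> usize {
    maybe_object.map_or(0, |object| object.0.get())
}

/// Hypothetical core API function (identity, for illustration only).
fn some_api_function(maybe_object: Option<ObjectReference>) -> Option<ObjectReference> {
    maybe_object
}

/// Hypothetical binding-side wrapper; C callers would pass `uintptr_t`.
pub extern "C" fn mmtk_some_api_function(maybe_object: usize) -> usize {
    to_usize_zeroable(some_api_function(from_usize_zeroable(maybe_object)))
}
```

Neither function transmutes anything: both go through `NonZeroUsize::new` and `NonZeroUsize::get`, so the mapping between `None` and 0 is explicit.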
Currently, very few public API functions expose the Option<ObjectReference> type. They are:
- ObjectReference::get_forwarded_referent(self) -> Option<Self>
- vo_bit::is_vo_bit_set_for_addr(address: Address) -> Option<ObjectReference>: Although public, VM bindings tend to use is_mmtk_object instead.
With this MEP implemented, Edge::load() -> Option<ObjectReference> will be a new use case.
The software engineering burden should be reasonable for those three API functions. Specifically, the OpenJDK binding currently uses neither get_forwarded_referent nor is_vo_bit_set_for_addr, and Edge::load() is trivial to refactor.
If, in the future, mmtk-core introduces more API functions that involve Option<ObjectReference> (which I don't think is likely to happen), we (or the VM bindings) may introduce macros to automate the conversion.
VM Binding considerations
VM bindings can no longer use the ObjectReference type from mmtk-core to represent their reference types if the VM allows NULL references. Binding writers may find it inconvenient because they need to define their own null-able reference types. But existing VMs already have related types. The OpenJDK binding already has the oop type, and we know it may be encoded as u32 or usize depending on whether CompressedOOP is enabled. The Ruby binding has the VALUE type which is backed by unsigned long and can encode tagged unions.
I don't worry about new bindings because if the developer knows an ObjectReference must refer to an object and cannot be NULL or small integers, they will roll their own nullable or tagged reference type and get things right from the start. The problem may be with existing bindings (OpenJDK, Ruby, Julia and V8). If they assumed ObjectReference may be NULL or may hold tagged references, they need to be refactored.
Impact on Public API
The most obvious change is the Edge trait. Edge::load() will return Option<ObjectReference>, and Edge::store(object) will ensure the argument is not NULL. As stated above, OpenJDKEdge::load() has been trivially refactored to adapt to this change.
Other public API functions will no longer accept NULL ObjectReference, but most public API functions never accepted NULL as argument before.
The main problem is object_reference_write and its _pre, _post and _slow variants. As we discussed in the Write barrier section, object_reference_write will stop working for VMs that support null pointers or tagged pointers because we can no longer store NULL to an Edge. However, VMs are still able to use write barriers by calling the _pre and _post functions separately, or inlining the fast path and calling the _slow function separately.
Currently,
- _pre and _post functions separately.
- _post and the _slow functions.
Since currently no officially supported VM bindings use object_reference_write directly, there is no immediate impact.
But in the long term, we should redesign the write barrier functions to make them more general. See: #1038
Testing
We may add unit tests to ensure
- Option<ObjectReference>, ObjectReference, NonZeroUsize and usize all have the same size.
- Address to/from Option<ObjectReference> properly handles Address(0) and None.
And we should add micro benchmarks to ensure
- Address and Option<ObjectReference> should have no performance penalty.
- Address to ObjectReference should have no performance penalty.
- Some(ObjectReference) to ObjectReference (via matching) should be efficient.
- Option<ObjectReference> should be efficient.
It is better if we can verify the generated assembly code of the "no penalty" cases to make sure they are no-op.
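The two unit-test checks listed above could be sketched roughly like this. The types are simplified stand-ins rather than the mmtk-core definitions, and the helpers `to_object_reference`, `to_address`, `sizes_match` and `round_trip_ok` are hypothetical names introduced for the example.

```rust
use std::mem::size_of;
use std::num::NonZeroUsize;

/// Simplified stand-in for mmtk-core's `Address` (assumption: a plain word).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Address(usize);

/// Simplified stand-in for `ObjectReference`; it can never hold 0.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ObjectReference(NonZeroUsize);

/// `Address(0)` maps to `None`; any other address maps to `Some`.
fn to_object_reference(addr: Address) -> Option<ObjectReference> {
    NonZeroUsize::new(addr.0).map(ObjectReference)
}

/// `None` maps back to `Address(0)`.
fn to_address(maybe_object: Option<ObjectReference>) -> Address {
    Address(maybe_object.map_or(0, |object| object.0.get()))
}

/// Check that all the word-sized representations have the size of `usize`.
fn sizes_match() -> bool {
    size_of::<Option<ObjectReference>>() == size_of::<usize>()
        && size_of::<ObjectReference>() == size_of::<usize>()
        && size_of::<Option<NonZeroUsize>>() == size_of::<usize>()
        && size_of::<NonZeroUsize>() == size_of::<usize>()
}

/// Check that converting an address to an optional reference and back is lossless.
fn round_trip_ok(raw: usize) -> bool {
    to_address(to_object_reference(Address(raw))) == Address(raw)
}
```

In a real test suite these helpers would sit behind `#[test]` functions; they are plain functions here so the sketch stays self-contained.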
No tests need to be added around trace_object implementations because the Rust language will ensure the underlying NonZeroUsize will never hold the value 0.
Currently one test involves ObjectReference::NULL, that is, the test for is_in_mmtk_space. It tests if the function returns false when the argument is ObjectReference::NULL. We may remove that test case because we removed ObjectReference::NULL.
Alternatives
We may do nothing, keeping ObjectReference::NULL and using it to represent a missing ObjectReference. MMTk is still capable of performing GC and supporting our currently supported VMs without this refactoring. But the problems of this approach have been listed in the Motivation section, namely not general enough, polluting the API, hard to get NULL checks right, and non-idiomatic in Rust.
We may do the opposite, i.e. allowing ObjectReference to represent not only NULL encoded as 0, but also language-specific NULL variants such as nil, nothing, missing, undefined, etc., and allow the binding to define the possible NULL-like values. But if we take this approach, MMTk core will not only have to check for NULL everywhere, but also need to check for other special NULL-like values everywhere, too, making software engineering more difficult.
Assumptions
Currently ObjectReference is backed by usize, and all existing VM bindings implement ObjectReference as a pointer to an object, or to some offset from the start of an object. While this design (implementing ObjectReference as a pointer to object, possibly with an offset) is able to support fat pointers, offsetted pointers, and handles, we acknowledge that it may not be the only possible design. For example, we currently assume that ObjectReference can only represent references, but not non-reference values such as NULL, small integers, true, false, nil, undefined, etc.
If, in the future, we change the definition so that ObjectReference can also hold NULL, nil, true, false, small integers, etc., we will need to think about this MEP again. I (Kunshan) personally strongly disagree with the idea of letting ObjectReference hold a tagged non-reference value, such as a small integer. If ObjectReference can be nil, true, false, and small integers, then mmtk-core will need to check everywhere whether a given ObjectReference is such a special non-ref value, which is even worse than adding NULL checks everywhere.
MMTk core makes no assumption about how an object reference is stored in a slot. The VM (such as OpenJDK) may store compressed pointers in some slots. That is abstracted out by the Edge::load() method which decompresses the pointer and returns a Some(ObjectReference) or None. If the VM finds the slot is holding a NULL reference after decoding (or before decoding if 0u32 also represents NULL, as in OpenJDK), it still returns None.
Related Issues
Preliminary implementation PRs:
Other related issues and PRs:
Edge::load() can now return NULL for tagged values so that the slot can be skipped. It can be improved by this MEP if we use None instead of NULL. Also fixes a missing NULL check in the ReferenceProcessor.
ObjectReference. Can ObjectReference be addresses, handles, tagged pointers, etc.?
be addresses, handles, tagged pointers, etc.?The text was updated successfully, but these errors were encountered: