-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Only add big objarray to remset once #49315
Conversation
What's the performance impact of this PR on #49205? |
Makes it faster than on 1.8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for decide whether we need to chunk or push the array into the remset
could be a bit cleaner. I'd refactor:
size_t too_big = (obj_end - obj_begin) / MAX_REFS_AT_ONCE > step; // use this order of operations to avoid idiv
jl_value_t **scan_end = obj_end;
if (too_big) {
scan_end = obj_begin + step * MAX_REFS_AT_ONCE;
if ((nptr & 0x3) != 0x2) {
jl_gc_chunk_t c = {GC_objary_chunk, obj_parent, scan_end,
obj_end, NULL, NULL,
step, nptr};
gc_chunkqueue_push(mq, &c);
obj_end = scan_end;
}
}
for (; obj_begin < scan_end; obj_begin += step) {
new_obj = *obj_begin;
if (new_obj != NULL) {
verify_parent2("obj array", obj_parent, obj_begin, "elem(%d)",
gc_slot_to_arrayidx(obj_parent, obj_begin));
gc_try_claim_and_push(mq, new_obj, &nptr);
gc_heap_snapshot_record_array_edge(obj_parent, &new_obj);
}
}
if (too_big) {
if (obj_end != scan_end) {
jl_gc_chunk_t c = {GC_objary_chunk, obj_parent, scan_end,
obj_end, NULL, NULL,
step, nptr};
gc_chunkqueue_push(mq, &c);
}
}
else {
gc_mark_push_remset(ptls, obj_parent, nptr);
}
into:
size_t too_big = (obj_end - obj_begin) / MAX_REFS_AT_ONCE > step; // use this order of operations to avoid idiv
jl_value_t **scan_end = obj_end;
if (too_big) {
scan_end = obj_begin + step * MAX_REFS_AT_ONCE;
}
for (; obj_begin < scan_end; obj_begin += step) {
new_obj = *obj_begin;
if (new_obj != NULL) {
verify_parent2("obj array", obj_parent, obj_begin, "elem(%d)",
gc_slot_to_arrayidx(obj_parent, obj_begin));
gc_try_claim_and_push(mq, new_obj, &nptr);
gc_heap_snapshot_record_array_edge(obj_parent, &new_obj);
}
}
if (too_big) {
jl_gc_chunk_t c = {GC_objary_chunk, obj_parent, scan_end,
obj_end, NULL, NULL,
step, nptr};
gc_chunkqueue_push(mq, &c);
}
else {
gc_mark_push_remset(ptls, obj_parent, nptr);
}
Same change for gc_mark_array8/16
.
As long as you don't care that your change eliminates the possibility of intra-array thread-stealing and thus slightly reduces parallelism of threaded GC, yeah, that rearrangement is fine. |
This should be fine. If an array is large enough to be chunked then it will expose a lot of |
Also, in #48600 we prioritize stealing pointers before chunks. |
src/gc.c
Outdated
if (too_big) { | ||
if (ary8_end != scan_end) { | ||
jl_gc_chunk_t c = {GC_objary_chunk, ary8_parent, scan_end, | ||
ary8_end, NULL, NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace NULL, NULL
here with elem_begin, elem_end
. Same for array16
.
Co-authored-by: Jameson Nash <vtjnash@gmail.com>
Fixes #49205