The new marking loop has a regression when marking arrays of pointers #49205

gbaraldi · 2023-03-31T13:39:50Z

While looking into #49120, I found that the regression seen there was hiding a separate problem in the new mark loop: it is slower when marking a very large array of pointers.

using Random: seed!
seed!(1)

abstract type Cell end

struct CellA<:Cell
    a::Int
end

struct CellB<:Cell
    b::String
end

function fillcells!(mc::Array{Cell})
    for ind in eachindex(mc)
        mc[ind] = ifelse(rand() > 0.5, CellA(ind), CellB(string(ind)))
    end
    return mc
end

mcells = Array{Cell}(undef, 5000, 5000)
t1 = @elapsed fillcells!(mcells)
t2 = @elapsed fillcells!(mcells)

println("filling: $t1 s\nfilling again: $t2 s")

@time GC.gc()
@time GC.gc()

One of the GC.gc() calls at the end will do a full mark of the array and take a long time to do it. This probably wasn't noticed before because arrays of many pointers had really bad GC behaviour overall, but with #49185 it's quite clear.
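To see how a full collection compares with an incremental one, the two can be timed separately. The snippet below is only a minimal sketch: the array size and the variable names `refs`, `t_full` and `t_incr` are assumptions made for illustration, it is much smaller than the 5000x5000 reproducer, and it is not expected to reproduce the regression on its own.

```julia
# Minimal sketch: time a full and an incremental collection separately.
# The 1_000_000-element size is an assumption chosen to keep the run short;
# the reproducer above uses a 5000x5000 array.
refs = Vector{Any}(undef, 1_000_000)
for i in eachindex(refs)
    refs[i] = string(i)  # every slot holds a pointer to a heap-allocated String
end

t_full = @elapsed GC.gc(true)    # request a full collection
t_incr = @elapsed GC.gc(false)   # request an incremental collection

println("full: $t_full s\nincremental: $t_incr s")
```

Timings depend heavily on the Julia version and the machine, so the numbers are only meaningful when compared across builds on the same setup.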

d-netto · 2023-03-31T14:04:09Z

Seems like #49185 is adding a GC bit to encode whether an object is in the image.

Why could that amplify a potential regression when marking large arrays of pointers?

gbaraldi · 2023-03-31T14:19:20Z

That PR contains two different GC fixes; I just haven't updated the title. It also fixes a behaviour where we stopped increasing the interval size when encountering many pointers, which led to the regression in #49120. It seems we were doing so many GCs that they hid whatever is going on here.
#48935 has that fix but not the new mark loop, and it's quite a bit faster, if you want an easy test to compare.

d-netto · 2023-03-31T15:18:01Z

I can reproduce the performance difference across backports-release-1.9 and #49185.

Most of the degradation seems to be coming from chunking (the batching mechanism we use to mark large arrays of pointers).

gbaraldi · 2023-03-31T15:22:23Z

I think the chunking itself is fine. If you look, it's all in the hot loop of the function.

d-netto · 2023-03-31T15:27:51Z

I'm inclined to say it's coming from chunking. After increasing the batch size MAX_REFS_AT_ONCE from 2^16 to 2^20, the pathological full mark goes down from 20s to 3s.

This is not necessarily a solution, but it suggests that our chunking algorithm may need some adjustments.
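For a sense of scale, here is a rough sketch of how many batches the 5000x5000 array from the reproducer splits into at each batch size. It is only an illustration of the arithmetic: `chunk_ranges` is a hypothetical helper written for this example, and the actual chunking lives in the runtime's C code and may differ in its details.

```julia
# Hypothetical illustration only; this is not the GC's implementation.
# Split `nrefs` references into index ranges of at most `max_refs_at_once`.
function chunk_ranges(nrefs::Int, max_refs_at_once::Int)
    ranges = UnitRange{Int}[]
    start = 1
    while start <= nrefs
        stop = min(start + max_refs_at_once - 1, nrefs)
        push!(ranges, start:stop)
        start = stop + 1
    end
    return ranges
end

nrefs = 5000 * 5000  # number of pointers in the array from the reproducer
for batch in (2^16, 2^20)
    println("batch = $batch -> ", length(chunk_ranges(nrefs, batch)), " chunks")
end
# prints:
# batch = 65536 -> 382 chunks
# batch = 1048576 -> 24 chunks
```

The larger batch size cuts the number of chunks by roughly 16x, so any cost paid once per chunk shrinks by the same factor. The sketch says nothing about which per-chunk cost is actually responsible.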

gbaraldi · 2023-03-31T16:10:06Z

Oh, that's quite interesting.

d-netto · 2023-03-31T16:53:47Z

CC: @vchuravy

Fixes #49205

d-netto added the regression (Regression in behavior compared to a previous version) and GC (Garbage collector) labels Mar 31, 2023

gbaraldi mentioned this issue Apr 3, 2023

v1.9 rc1 significant slowdown in garbage collection #49120

Closed

d-netto mentioned this issue Apr 10, 2023

Only add big objarray to remset once #49315

Merged

vtjnash closed this as completed in #49315 Apr 20, 2023

vtjnash pushed a commit that referenced this issue Apr 20, 2023

Only add big objarray to remset once (#49315)

1f94d2e

Fixes #49205
