Cross platform weak bag implementation #2673
Co-authored-by: Arman Bilge <armanbilge@gmail.com>
build.sbt
("org.scala-js" %%% "scalajs-weakreferences" % JsWeakReferencesVersion)
  .cross(CrossVersion.for3Use2_13)
Some interesting reading about this: scala-js/scala-js-weakreferences#7 (comment)
"io.vasilev" %% "cats-effect" % "3.3-87-1d1f0ff"
Benchmark results:
Baseline, tracing off:
benchmarks/Jmh/run -wi 10 -i 10 -f 2 -t 1 -prof gc --jvmArgs -Dcats.effect.tracing.mode=none --jvmArgs -Dcats.effect.tracing.exceptions.enhanced=false ParallelBenchmark.par
Benchmark (cpuTokens) (size) Mode Cnt Score Error Units
ParallelBenchmark.parTraverse 10000 1000 thrpt 20 316.579 ± 3.402 ops/s
ParallelBenchmark.parTraverse:·gc.alloc.rate 10000 1000 thrpt 20 1417.315 ± 15.392 MB/sec
ParallelBenchmark.parTraverse:·gc.alloc.rate.norm 10000 1000 thrpt 20 4929586.675 ± 579.390 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space 10000 1000 thrpt 20 1429.037 ± 22.488 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space.norm 10000 1000 thrpt 20 4970221.334 ± 47058.032 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space 10000 1000 thrpt 20 0.222 ± 0.022 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space.norm 10000 1000 thrpt 20 771.322 ± 75.550 B/op
ParallelBenchmark.parTraverse:·gc.count 10000 1000 thrpt 20 919.000 counts
ParallelBenchmark.parTraverse:·gc.time 10000 1000 thrpt 20 1325.000 ms
Baseline, tracing on:
benchmarks/Jmh/run -wi 10 -i 10 -f 2 -t 1 -prof gc --jvmArgs -Dcats.effect.tracing.mode=cached --jvmArgs -Dcats.effect.tracing.exceptions.enhanced=true ParallelBenchmark.par
Benchmark (cpuTokens) (size) Mode Cnt Score Error Units
ParallelBenchmark.parTraverse 10000 1000 thrpt 20 204.719 ± 19.250 ops/s
ParallelBenchmark.parTraverse:·gc.alloc.rate 10000 1000 thrpt 20 1044.091 ± 97.878 MB/sec
ParallelBenchmark.parTraverse:·gc.alloc.rate.norm 10000 1000 thrpt 20 5616091.869 ± 2648.642 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space 10000 1000 thrpt 20 1052.138 ± 110.314 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space.norm 10000 1000 thrpt 20 5665606.945 ± 358765.601 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space 10000 1000 thrpt 20 0.814 ± 0.502 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space.norm 10000 1000 thrpt 20 4325.508 ± 2692.273 B/op
ParallelBenchmark.parTraverse:·gc.count 10000 1000 thrpt 20 92.000 counts
ParallelBenchmark.parTraverse:·gc.time 10000 1000 thrpt 20 17783.000 ms
This PR with tracing on:
benchmarks/Jmh/run -wi 10 -i 10 -f 2 -t 1 -prof gc --jvmArgs -Dcats.effect.tracing.mode=cached --jvmArgs -Dcats.effect.tracing.exceptions.enhanced=true ParallelBenchmark.par
Benchmark (cpuTokens) (size) Mode Cnt Score Error Units
ParallelBenchmark.parTraverse 10000 1000 thrpt 20 278.251 ± 1.785 ops/s
ParallelBenchmark.parTraverse:·gc.alloc.rate 10000 1000 thrpt 20 1391.663 ± 9.091 MB/sec
ParallelBenchmark.parTraverse:·gc.alloc.rate.norm 10000 1000 thrpt 20 5507256.392 ± 575.319 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space 10000 1000 thrpt 20 1407.085 ± 15.138 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space.norm 10000 1000 thrpt 20 5568406.214 ± 57756.867 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space 10000 1000 thrpt 20 0.246 ± 0.020 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space.norm 10000 1000 thrpt 20 972.881 ± 79.171 B/op
ParallelBenchmark.parTraverse:·gc.count 10000 1000 thrpt 20 812.000 counts
ParallelBenchmark.parTraverse:·gc.time 10000 1000 thrpt 20 1386.000 ms
With tracing on, this PR raises throughput from 204.719 to 278.251 ops/s (about 36%) and cuts GC time from 17783 ms to 1386 ms, on par with the 1325 ms measured without tracing.
On the advice of @armanbilge, this PR shades https://github.com/scala-js/scala-js-weakreferences by copying the source code (~100 LOC, not including license headers). This is due to the existence of https://github.com/scala-js/scala-js-fake-weakreferences, which is implemented in terms of strong references and would wreak havoc if selected over the real weak reference implementation.
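A rough illustration of why a strong-reference-backed substitute would be harmful here. This is only a sketch: `StrongBackedRef` and `RetentionDemo` are hypothetical names made up for the example, not classes from either linked library, and the example runs on the JVM with `java.lang.ref.WeakReference` standing in for the real implementation.

```scala
import java.lang.ref.WeakReference

// Hypothetical stand-in for a "weak" reference implemented with a strong one.
// Only the get() accessor is mirrored; the name is illustrative.
final class StrongBackedRef[A <: AnyRef](referent: A) {
  // Never returns null: the referent stays reachable as long as this object is.
  def get(): A = referent
}

object RetentionDemo {
  def main(args: Array[String]): Unit = {
    val real = new WeakReference[Array[Byte]](new Array[Byte](1 << 20))
    val fake = new StrongBackedRef[Array[Byte]](new Array[Byte](1 << 20))

    System.gc() // only a hint; collection is not guaranteed

    // The real weak reference may or may not have been cleared by now.
    println(s"real cleared: ${real.get() eq null}")
    // The strong-backed one can never be cleared.
    println(s"fake cleared: ${fake.get() eq null}")
  }
}
```

A bag of such references would keep every registered object alive until it is removed explicitly, which defeats the purpose of a weak bag.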
For reference these results are from
With tracing:
benchmarks/Jmh/run -wi 10 -i 10 -f 2 -t 1 -prof gc --jvmArgs -Dcats.effect.tracing.mode=cached --jvmArgs -Dcats.effect.tracing.exceptions.enhanced=true ParallelBenchmark.par
Benchmark (cpuTokens) (size) Mode Cnt Score Error Units
ParallelBenchmark.parTraverse 10000 1000 thrpt 20 290.942 ± 2.351 ops/s
ParallelBenchmark.parTraverse:·gc.alloc.rate 10000 1000 thrpt 20 1521.536 ± 12.438 MB/sec
ParallelBenchmark.parTraverse:·gc.alloc.rate.norm 10000 1000 thrpt 20 5758396.046 ± 580.724 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space 10000 1000 thrpt 20 1536.991 ± 20.210 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space.norm 10000 1000 thrpt 20 5816806.929 ± 53534.035 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space 10000 1000 thrpt 20 0.303 ± 0.024 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space.norm 10000 1000 thrpt 20 1147.693 ± 89.317 B/op
ParallelBenchmark.parTraverse:·gc.count 10000 1000 thrpt 20 887.000 counts
ParallelBenchmark.parTraverse:·gc.time 10000 1000 thrpt 20 1316.000 ms
Without tracing:
Benchmark (cpuTokens) (size) Mode Cnt Score Error Units
ParallelBenchmark.parTraverse 10000 1000 thrpt 20 295.814 ± 2.993 ops/s
ParallelBenchmark.parTraverse:·gc.alloc.rate 10000 1000 thrpt 20 1559.392 ± 10.000 MB/sec
ParallelBenchmark.parTraverse:·gc.alloc.rate.norm 10000 1000 thrpt 20 5804947.416 ± 40176.019 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space 10000 1000 thrpt 20 1575.089 ± 15.339 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space.norm 10000 1000 thrpt 20 5863488.584 ± 67068.118 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space 10000 1000 thrpt 20 0.318 ± 0.025 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space.norm 10000 1000 thrpt 20 1181.857 ± 90.029 B/op
ParallelBenchmark.parTraverse:·gc.count 10000 1000 thrpt 20 909.000 counts
ParallelBenchmark.parTraverse:·gc.time 10000 1000 thrpt 20 1338.000 ms
Completely replaces the WeakHashMap mechanism: on each WorkerThread, in the fallback, and on JS. A possible remedy for #2634.
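To make the idea of a weak bag concrete, here is a minimal sketch. It is not the implementation from this PR: `SimpleWeakBag` and its methods are hypothetical names, it targets the JVM only, and it assumes a single owner (for example one bag per thread), so it is not thread-safe.

```scala
import java.lang.ref.WeakReference
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified weak bag; not the code from this PR.
// Not thread-safe: assumes a single owning thread.
final class SimpleWeakBag[A <: AnyRef] {
  private val entries = ArrayBuffer.empty[WeakReference[A]]

  // Adds a value and returns a handle that removes it again.
  def insert(a: A): () => Unit = {
    val ref = new WeakReference[A](a)
    entries += ref
    () => { entries -= ref; () }
  }

  // Returns the values that are still reachable and drops
  // entries whose referent has been garbage collected.
  def toSet: Set[A] = {
    val live = Set.newBuilder[A]
    val kept = ArrayBuffer.empty[WeakReference[A]]
    entries.foreach { ref =>
      val a = ref.get()
      if (a ne null) {
        live += a
        kept += ref
      }
    }
    entries.clear()
    entries ++= kept
    live.result()
  }
}
```

Unlike a WeakHashMap, the bag holds only weak references to its values and needs no hashing or key equality. An entry disappears either when its handle is invoked or, at some later point, once the value is no longer strongly reachable elsewhere.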