Add ParallelBenchmark #2167

Merged: vasilmkd merged 2 commits into typelevel:series/3.x from the parallel-benchmark branch on Jul 28, 2021

Conversation

Contributor

@manufacturist manufacturist commented Jul 27, 2021

As requested by @vasilmkd, moved the ParallelBenchmark to a new PR. The reason for the 6 benchmarks is to compare the performance of parTraverse vs traverse under different CPU load.
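
For readers unfamiliar with the setup, here is a minimal sketch of what one parTraverse/traverse pair of such JMH benchmarks could look like. It is not the code merged in this PR: the class name `ParallelBenchmarkSketch` and the `work` helper are hypothetical, and simulating CPU load with JMH's `Blackhole.consumeCPU` is an assumption suggested by the "cpuTokens" naming. It assumes cats-effect 3 and JMH are on the classpath (for example via the sbt-jmh plugin).

```scala
import java.util.concurrent.TimeUnit

import cats.effect.IO
import cats.effect.unsafe.implicits.global
import cats.syntax.all._
import org.openjdk.jmh.annotations._
import org.openjdk.jmh.infra.Blackhole

// Hypothetical class name; a sketch, not the benchmark merged in this PR.
@State(Scope.Thread)
@BenchmarkMode(Array(Mode.Throughput))
@OutputTimeUnit(TimeUnit.SECONDS)
class ParallelBenchmarkSketch {

  // Number of elements traversed per benchmark invocation.
  @Param(Array("10000"))
  var size: Int = _

  // Assumption: CPU load per element is simulated with Blackhole.consumeCPU.
  private def work(cpuTokens: Long): IO[Unit] =
    IO(Blackhole.consumeCPU(cpuTokens))

  // One unit of work per element, each started on its own fiber.
  @Benchmark
  def parTraverseCpuTokens100(): Unit =
    List.range(0, size).parTraverse(_ => work(100L)).void.unsafeRunSync()

  // The same work, executed sequentially.
  @Benchmark
  def traverseCpuTokens100(): Unit =
    List.range(0, size).traverse(_ => work(100L)).void.unsafeRunSync()
}
```

The remaining four benchmarks would differ only in the token count passed to `work` (1000 and 10000).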

Posted the benchmarks for the series/3.x:

[info] Benchmark                                    (size)   Mode  Cnt    Score   Error  Units
[info] ParallelBenchmark.parTraverseCpuTokens100     10000  thrpt   20   52.561 ± 1.691  ops/s
[info] ParallelBenchmark.parTraverseCpuTokens1000    10000  thrpt   20   44.141 ± 1.213  ops/s
[info] ParallelBenchmark.parTraverseCpuTokens10000   10000  thrpt   20   18.174 ± 0.254  ops/s
[info] ParallelBenchmark.traverseCpuTokens100        10000  thrpt   20  741.288 ± 9.223  ops/s
[info] ParallelBenchmark.traverseCpuTokens1000       10000  thrpt   20  134.859 ± 5.774  ops/s
[info] ParallelBenchmark.traverseCpuTokens10000      10000  thrpt   20   15.097 ± 0.458  ops/s

Later edit:

Executing Machine Details

Processor Name:         Quad-Core Intel Core i7
Processor Speed:        2.2 GHz
Number of Processors:   1
Total Number of Cores:  4

@djspiewak
Member

djspiewak commented Jul 27, 2021

Outstanding! If I may ask, how many physical threads does your machine have? (Runtime.getRuntime().availableProcessors())

Edit: Oh, reading. :-D
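
As a side note on the call mentioned in the question above: `availableProcessors` reports the number of processors available to the JVM, which on CPUs with simultaneous multithreading typically counts hardware threads rather than physical cores, so the value may differ from the 4 cores listed in the machine details. A minimal sketch for checking it locally (the object name `CpuInfoSketch` is hypothetical):

```scala
object CpuInfoSketch {
  // Processors visible to the JVM; this usually counts hardware threads,
  // so it can be larger than the number of physical cores.
  def logicalProcessors(): Int =
    Runtime.getRuntime().availableProcessors()

  def main(args: Array[String]): Unit =
    println(s"availableProcessors = ${logicalProcessors()}")
}
```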

@manufacturist
Contributor Author

Outstanding! If I may ask, how many physical threads does your machine have? (Runtime.getRuntime().availableProcessors())

^ Just updated the machine details

@djspiewak djspiewak mentioned this pull request Jul 27, 2021
@vasilmkd
Member

I'm running the benchmark too, I will post base results.

@vasilmkd
Member

Benchmark                                    (size)   Mode  Cnt    Score   Error  Units
ParallelBenchmark.parTraverseCpuTokens100     10000  thrpt   20   95.980 ± 0.942  ops/s
ParallelBenchmark.parTraverseCpuTokens1000    10000  thrpt   20   67.336 ± 8.235  ops/s
ParallelBenchmark.parTraverseCpuTokens10000   10000  thrpt   20   26.116 ± 0.104  ops/s
ParallelBenchmark.traverseCpuTokens100        10000  thrpt   20  361.982 ± 7.369  ops/s
ParallelBenchmark.traverseCpuTokens1000       10000  thrpt   20   54.546 ± 0.286  ops/s
ParallelBenchmark.traverseCpuTokens10000      10000  thrpt   20    5.711 ± 0.016  ops/s

Member

@djspiewak djspiewak left a comment

Hold off on merging this. There are some changes that @manufacturist and I discussed directly.

Member

@vasilmkd vasilmkd left a comment

Waiting for the discussed changes, then.

@manufacturist
Contributor Author

manufacturist commented Jul 28, 2021

Ran the benchmark with cpuTokens values 100 & 100000 as well to get more data, and pushed the changes discussed yesterday:

  • Scale cpuTokens to 0.1k - 1kk (100 to 1,000,000)
  • Parametrize size as well

[info] Benchmark                      (cpuTokens)  (size)   Mode  Cnt      Score     Error  Units
[info] ParallelBenchmark.parTraverse          100     100  thrpt   20   6487.985 ± 251.364  ops/s
[info] ParallelBenchmark.parTraverse          100    1000  thrpt   20    791.526 ±   6.699  ops/s
[info] ParallelBenchmark.parTraverse          100   10000  thrpt   20     75.203 ±   0.372  ops/s
[info] ParallelBenchmark.parTraverse         1000     100  thrpt   20   4505.063 ± 170.821  ops/s
[info] ParallelBenchmark.parTraverse         1000    1000  thrpt   20    545.267 ±   3.686  ops/s
[info] ParallelBenchmark.parTraverse         1000   10000  thrpt   20     58.031 ±   4.109  ops/s
[info] ParallelBenchmark.parTraverse        10000     100  thrpt   20   2294.218 ±   7.110  ops/s
[info] ParallelBenchmark.parTraverse        10000    1000  thrpt   20    237.894 ±   1.132  ops/s
[info] ParallelBenchmark.parTraverse        10000   10000  thrpt   20     22.876 ±   0.288  ops/s
[info] ParallelBenchmark.parTraverse       100000     100  thrpt   20    320.065 ±   0.294  ops/s
[info] ParallelBenchmark.parTraverse       100000    1000  thrpt   20     32.039 ±   0.173  ops/s
[info] ParallelBenchmark.parTraverse       100000   10000  thrpt   20      3.186 ±   0.040  ops/s
[info] ParallelBenchmark.parTraverse      1000000     100  thrpt   20     33.128 ±   0.042  ops/s
[info] ParallelBenchmark.parTraverse      1000000    1000  thrpt   20      3.318 ±   0.008  ops/s
[info] ParallelBenchmark.parTraverse      1000000   10000  thrpt   20      0.351 ±   0.035  ops/s
[info] ParallelBenchmark.traverse             100     100  thrpt   20  68114.475 ± 750.640  ops/s
[info] ParallelBenchmark.traverse             100    1000  thrpt   20   8024.076 ± 104.133  ops/s
[info] ParallelBenchmark.traverse             100   10000  thrpt   20    918.410 ±   4.294  ops/s
[info] ParallelBenchmark.traverse            1000     100  thrpt   20  15431.432 ±  32.394  ops/s
[info] ParallelBenchmark.traverse            1000    1000  thrpt   20   1615.529 ±  19.228  ops/s
[info] ParallelBenchmark.traverse            1000   10000  thrpt   20    166.345 ±   0.537  ops/s
[info] ParallelBenchmark.traverse           10000     100  thrpt   20   1766.005 ±   7.986  ops/s
[info] ParallelBenchmark.traverse           10000    1000  thrpt   20    178.010 ±   0.157  ops/s
[info] ParallelBenchmark.traverse           10000   10000  thrpt   20     17.718 ±   0.106  ops/s
[info] ParallelBenchmark.traverse          100000     100  thrpt   20    179.566 ±   0.178  ops/s
[info] ParallelBenchmark.traverse          100000    1000  thrpt   20     17.847 ±   0.110  ops/s
[info] ParallelBenchmark.traverse          100000   10000  thrpt   20      1.728 ±   0.061  ops/s
[info] ParallelBenchmark.traverse         1000000     100  thrpt   20     17.867 ±   0.094  ops/s
[info] ParallelBenchmark.traverse         1000000    1000  thrpt   20      1.717 ±   0.084  ops/s
[info] ParallelBenchmark.traverse         1000000   10000  thrpt   20      0.178 ±   0.006  ops/s
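
One way to read the table is to divide the parTraverse score by the traverse score for the same parameters. The sketch below does this for the size = 10000 rows, with the scores copied from the output above; the object name `SpeedupSketch` and the helper `speedup` are hypothetical. A ratio below 1 means traverse had the higher throughput, and a ratio above 1 means parTraverse did. In these rows the crossover falls between 1000 and 10000 cpuTokens.

```scala
object SpeedupSketch {
  // (cpuTokens, parTraverse ops/s, traverse ops/s) for size = 10000,
  // copied from the benchmark output above.
  val rows: List[(Int, Double, Double)] = List(
    (100, 75.203, 918.410),
    (1000, 58.031, 166.345),
    (10000, 22.876, 17.718),
    (100000, 3.186, 1.728),
    (1000000, 0.351, 0.178)
  )

  // Ratio > 1 means parTraverse achieved higher throughput than traverse.
  def speedup(par: Double, seq: Double): Double = par / seq

  def main(args: Array[String]): Unit =
    rows.foreach { case (tokens, par, seq) =>
      println(f"cpuTokens=$tokens%7d  parTraverse/traverse = ${speedup(par, seq)}%.2f")
    }
}
```

These ratios describe only this one machine and run, so they are best treated as indicative rather than general.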

@vasilmkd
Member

Excellent.

djspiewak previously approved these changes Jul 28, 2021
Member

@djspiewak djspiewak left a comment

Outstanding! That's actually more data points than I was expecting. 🙂 This is going to be super useful for some other changes as well, thank you!

@vasilmkd
Member

Thank you @manufacturist! Great work.

@vasilmkd vasilmkd merged commit 167f902 into typelevel:series/3.x Jul 28, 2021
@manufacturist manufacturist deleted the parallel-benchmark branch July 29, 2021 07:03