Benchmark and analysis for possible reasons that hybrid is slower than MPI
ZHG2017 edited this page Jul 30, 2019 · 1 revision

| # CORES | N | B | # THREADS | TOTAL |
|---|---|---|---|---|
| 32 | 500 | 100 | 4 | 19.9467 |
| 32 | 500 | 100 | 8 | 17.3162 |
| 32 | 500 | 100 | 16 | 18.9274 |
| 32 | 500 | 100 | 32 | 19.4621 |
| 32 | 500 | 100 | 64 | 19.4981 |
| 32 | 500 | 100 | 128 | 24.2367 |

| # CORES | N | B | # THREADS | TOTAL |
|---|---|---|---|---|
| 32 | 500 | 100 | 2 | 41.8546 |
| 32 | 500 | 100 | 4 | 37.4821 |
| 32 | 500 | 100 | 8 | 35.8083 |
| 32 | 500 | 100 | 16 | 35.8087 |
| 32 | 500 | 100 | 32 | 40.0238 |

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Distributed -M CRA -s 1564490910':
        1 431 603 007 cache-misses
        36 953 889 dTLB-load-misses
        8 702 084 iTLB-load-misses
        405,456098902 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Distributed -M CRA -s 1564490910':
        1 450 247 295 cache-misses
        36 825 319 dTLB-load-misses
        9 923 298 iTLB-load-misses
        406,176510782 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Distributed -M CRA -s 1564490910':
        1 443 507 799 cache-misses
        35 004 060 dTLB-load-misses
        177 700 722 iTLB-load-misses
        406,753558858 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Distributed -M CRA -s 1564490910':
        1 446 514 817 cache-misses
        37 051 342 dTLB-load-misses
        38 164 972 iTLB-load-misses
        406,876224780 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Combined -M CRA -s 1564490910 -t 2':
        1 181 055 424 cache-misses
        29 891 598 dTLB-load-misses
        5 144 597 iTLB-load-misses
        250,105811705 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Combined -M CRA -s 1564490910 -t 2':
        1 188 476 630 cache-misses
        30 662 497 dTLB-load-misses
        9 799 303 iTLB-load-misses
        252,132564516 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Combined -M CRA -s 1564490910 -t 2':
        1 203 900 822 cache-misses
        31 508 846 dTLB-load-misses
        5 358 814 iTLB-load-misses
        252,838338050 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Combined -M CRA -s 1564490910 -t 4':
        1 097 648 012 cache-misses
        28 477 946 dTLB-load-misses
        3 887 310 iTLB-load-misses
        248,689815914 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Combined -M CRA -s 1564490910 -t 4':
        1 104 745 687 cache-misses
        30 894 988 dTLB-load-misses
        13 148 579 iTLB-load-misses
        249,487985907 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Combined -M CRA -s 1564490910 -t 4':
        1 094 830 287 cache-misses
        27 483 445 dTLB-load-misses
        16 707 693 iTLB-load-misses
        250,090669714 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Combined -M CRA -s 1564490910 -t 8':
        1 183 348 122 cache-misses
        28 191 602 dTLB-load-misses
        4 249 035 iTLB-load-misses
        249,730256828 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Combined -M CRA -s 1564490910 -t 8':
        1 187 749 563 cache-misses
        26 973 310 dTLB-load-misses
        4 091 277 iTLB-load-misses
        250,610157321 seconds time elapsed

    Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Combined -M CRA -s 1564490910 -t 8':
        1 170 709 431 cache-misses
        26 327 385 dTLB-load-misses
        4 242 060 iTLB-load-misses
        251,144711325 seconds time elapsed
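
To compare the Distributed and Combined configurations, the repeated `perf stat` runs for each command line can be averaged per counter. The following is a minimal sketch of one way to do that, not the script used to produce this page. It assumes the reports are available as plain text in the number format shown here (spaces as thousands separators, a comma as the decimal mark). The helper names `parse_perf_stat` and `summarize` are hypothetical, and `SAMPLE` contains only the first two `-t 2` runs as illustration.

```python
from statistics import mean

# Two of the "-t 2" reports from this page, used only as sample input.
SAMPLE = """\
Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Combined -M CRA -s 1564490910 -t 2':
1 181 055 424 cache-misses
29 891 598 dTLB-load-misses
5 144 597 iTLB-load-misses
250,105811705 seconds time elapsed
Performance counter stats for './benchmark-dense-solve -n 500 -b 100 -d Combined -M CRA -s 1564490910 -t 2':
1 188 476 630 cache-misses
30 662 497 dTLB-load-misses
9 799 303 iTLB-load-misses
252,132564516 seconds time elapsed
"""


def parse_perf_stat(text):
    """Split concatenated perf stat reports into one dict per run.

    Assumption: digit groups are separated by whitespace and the decimal
    mark is a comma, as in the output on this page. Other locales would
    need different handling.
    """
    runs = []
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if raw.lstrip().startswith("Performance counter stats for"):
            # The profiled command sits between the single quotes.
            runs.append({"command": raw.split("'")[1]})
            continue
        if not runs:
            continue
        # Collect the leading numeric tokens, e.g. ["1", "181", "055", "424"].
        digits = []
        while tokens and tokens[0][0].isdigit():
            digits.append(tokens.pop(0))
        if not digits or not tokens:
            continue
        number = "".join(digits)
        # "seconds time elapsed" is stored under the key "seconds".
        name = tokens[0]
        if "," in number:
            runs[-1][name] = float(number.replace(",", "."))
        else:
            runs[-1][name] = int(number)
    return runs


def summarize(runs):
    """Average every counter over all runs that share a command line."""
    grouped = {}
    for run in runs:
        grouped.setdefault(run["command"], []).append(run)
    return {
        command: {
            key: mean(r[key] for r in group)
            for key in group[0]
            if key != "command"
        }
        for command, group in grouped.items()
    }


if __name__ == "__main__":
    for command, averages in summarize(parse_perf_stat(SAMPLE)).items():
        print(command)
        for counter, value in averages.items():
            print(f"  {counter}: {value}")
```

Grouping by the full command line keeps runs with different `-t` values separate, so each thread count gets its own averages. With only three or four runs per configuration, a single outlier (such as one unusually high iTLB-load-misses count) can dominate the mean, so a median may be worth reporting alongside it.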