enh: more realistic benchmark #26

zerothi · 2021-07-02T17:08:10Z

I think it is vital that benchmarks are realistic. Using fixed size arrays is not realistic in my opinion.

This PR does many things:

benchmarks should not rely on convergence, rather a fixed number of iterations is more appropriate, otherwise timings are irrelevant
The problem of inconsistent iterations between the C and Fortran versions is fixed. It was an indexing error when calculating the cell for the rhoarr array (C is 0-based, F is 1-based). So they now run the same number of iterations, regardless of M.
Moved timings about to make them more comparable.
Ensured that all examples now use allocate/malloc
Calls in C and F code are much more similar

For running benchmarks it is vital that M is so large that the 2D array goes beyond the L3 cache to match realistic cases.

certik · 2021-07-02T17:19:50Z

@zerothi thanks for this!

I have an idea: why don't you leave poisson2d/optimized.f90 and poisson2d/optimized.c as they are, but submit new files both for Fortran and C with your version. I actually want to have both. I want to see how long it takes when only the computation is performed, but I also want to see when you benchmark everything, including memory allocation, as you did. The same with the changes you made --- I actually would like to have both.

Regarding naming, we can even just do version1.f90, version2.f90, etc.

What do you think?

zerothi · 2021-07-02T18:15:06Z

Good idea, give me some time :)

Benchmarks should not rely on convergence, rather a fixed number of iterations is more appropriate. The problem of inconsistent iterations between the C and Fortran versions was also fixed because of an indexing error when calculating the cell coordinates (C is 0-based, F is 1-based). So they now run the same number of iterations, regardless of M. Note, this issue still persists in 2 and 3, but not in 4, I can't seem to figure out why. Moved timings about to make them more comparable. All different benchmarks write out the same format output to more easily compare them. The later versions are more realistic since they use allocate/malloc which is more appropriate. Heavily updated the README.md file to explain what is going on.

zerothi · 2021-07-02T19:44:59Z

Please have a look at the updated look. It is now more complete and easier to extend with new versions!

zerothi · 2021-07-02T19:50:43Z

Note that benchmarking the allocations is not really fair I think. In this case I have left all allocation timings out since that should be a benchmark by it-self.
The reason for not doing it is that some things may be allocated on the stack which one can't easily time without complicating matters quite a bit.

zerothi force-pushed the realistic branch from 5184dda to 33d5b68 Compare July 2, 2021 19:43

zerothi marked this pull request as ready for review July 2, 2021 19:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

enh: more realistic benchmark #26

enh: more realistic benchmark #26

zerothi commented Jul 2, 2021

certik commented Jul 2, 2021

zerothi commented Jul 2, 2021

zerothi commented Jul 2, 2021

zerothi commented Jul 2, 2021

enh: more realistic benchmark #26

Are you sure you want to change the base?

enh: more realistic benchmark #26

Conversation

zerothi commented Jul 2, 2021

certik commented Jul 2, 2021

zerothi commented Jul 2, 2021

zerothi commented Jul 2, 2021

zerothi commented Jul 2, 2021