Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enh: more realistic benchmark #26

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

zerothi
Copy link

@zerothi zerothi commented Jul 2, 2021

I think it is vital that benchmarks are realistic. Using fixed size arrays is not realistic in my opinion.

This PR does many things:

  • benchmarks should not rely on convergence, rather a fixed number of iterations is more appropriate, otherwise timings are irrelevant
  • The problem of inconsistent iterations between the C and Fortran versions is fixed. It was an indexing error when calculating the cell for the rhoarr array (C is 0-based, F is 1-based). So they now run the same number of iterations, regardless of M.
  • Moved timings about to make them more comparable.
  • Ensured that all examples now use allocate/malloc
  • Calls in C and F code are much more similar

For running benchmarks it is vital that M is so large that the 2D array goes beyond the L3 cache to match realistic cases.

@certik
Copy link
Member

certik commented Jul 2, 2021

@zerothi thanks for this!

I have an idea: why don't you leave poisson2d/optimized.f90 and poisson2d/optimized.c as they are, but submit new files both for Fortran and C with your version. I actually want to have both. I want to see how long it takes when only the computation is performed, but I also want to see when you benchmark everything, including memory allocation, as you did. The same with the changes you made --- I actually would like to have both.

Regarding naming, we can even just do version1.f90, version2.f90, etc.

What do you think?

@zerothi
Copy link
Author

zerothi commented Jul 2, 2021

Good idea, give me some time :)

Benchmarks should not rely on convergence, rather a fixed
number of iterations is more appropriate.

The problem of inconsistent iterations between the C and Fortran versions
was also fixed because of an indexing error when calculating the cell
coordinates (C is 0-based, F is 1-based). So they now run the same number
of iterations, regardless of M.
Note, this issue still persists in 2 and 3, but not in 4, I can't seem
to figure out why.

Moved timings about to make them more comparable.

All different benchmarks write out the same format output to more easily
compare them.

The later versions are more realistic since they use allocate/malloc
which is more appropriate.

Heavily updated the README.md file to explain what is going on.
@zerothi
Copy link
Author

zerothi commented Jul 2, 2021

Please have a look at the updated look. It is now more complete and easier to extend with new versions!

@zerothi zerothi marked this pull request as ready for review July 2, 2021 19:45
@zerothi
Copy link
Author

zerothi commented Jul 2, 2021

Note that benchmarking the allocations is not really fair I think. In this case I have left all allocation timings out since that should be a benchmark by it-self.
The reason for not doing it is that some things may be allocated on the stack which one can't easily time without complicating matters quite a bit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants