PABLO: simplify 2:1 balance #335
54d4456 to 9da552c
I tested this on several complex geometries. The generated mesh is essentially identical to the one generated by the previous version. The maximum speedup I observed in mesh generation is about 5% relative to the previous version. The tests were based on: the number of global elements, a one-to-one check of slices of the mesh, a comparison by subtraction of global ids, and a check of the number of elements and of their configuration level by level.
We must keep paying attention to the effects of this change, because many configurations have not been tested yet.
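The checks listed above (element count, per-level counts, subtraction of global ids) could be sketched roughly as follows. This is only an illustration: `OctantRecord`, `countPerLevel` and `meshesMatch` are hypothetical names, not part of PABLO, and the sketch assumes both meshes have been exported as flat lists in the same order.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical flat record for one octant (not a PABLO type).
struct OctantRecord {
    std::uint64_t globalId;
    int level;
};

// Count how many octants live on each refinement level.
std::map<int, std::size_t> countPerLevel(const std::vector<OctantRecord> &mesh)
{
    std::map<int, std::size_t> counts;
    for (const OctantRecord &octant : mesh) {
        ++counts[octant.level];
    }
    return counts;
}

// Return true when the two meshes agree on size, on the per-level counts,
// and element by element (global id difference equal to zero, same level).
bool meshesMatch(const std::vector<OctantRecord> &reference,
                 const std::vector<OctantRecord> &candidate)
{
    if (reference.size() != candidate.size()) {
        return false;
    }
    if (countPerLevel(reference) != countPerLevel(candidate)) {
        return false;
    }
    for (std::size_t i = 0; i < reference.size(); ++i) {
        if (reference[i].globalId - candidate[i].globalId != 0) {
            return false;
        }
        if (reference[i].level != candidate[i].level) {
            return false;
        }
    }
    return true;
}
```

A check like this only confirms equality for the geometries that were actually run; it does not replace testing further configurations.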
If we only need the Morton number, we can just compute the Morton number.
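For readers unfamiliar with the term, a Morton number is obtained by interleaving the bits of the integer coordinates of an octant. The sketch below is a generic 3D version with a hypothetical name (`mortonEncode3D`); the bit ordering and coordinate width used by PABLO may differ.

```cpp
#include <cstdint>

// Hypothetical helper (not PABLO's API): interleave the low 21 bits of
// x, y and z into a 63-bit Morton number, with x in the lowest position.
std::uint64_t mortonEncode3D(std::uint32_t x, std::uint32_t y, std::uint32_t z)
{
    std::uint64_t morton = 0;
    for (unsigned bit = 0; bit < 21; ++bit) {
        morton |= static_cast<std::uint64_t>((x >> bit) & 1u) << (3 * bit);
        morton |= static_cast<std::uint64_t>((y >> bit) & 1u) << (3 * bit + 1);
        morton |= static_cast<std::uint64_t>((z >> bit) & 1u) << (3 * bit + 2);
    }
    return morton;
}
```

Computing the number directly from the coordinates like this avoids building any intermediate object when only the Morton number is needed.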
…ance It's simpler to process all the ghosts of the first layer rather than rely on the AUX bit for 2:1 balance. Moreover, that bit was not properly reset and 2:1 balance was checked on many more octants than actually needed.
2e8a6d5 to 280309b
Rebased on current master.
I've rewritten some functions to reduce code duplication. The refactoring removed 1300 lines of code.
There are also some performance improvements. The biggest one is related to the last commit. After the latest code refactoring, it is no longer necessary to rely on the AUX bit for 2:1 balance. Moreover, that bit was not properly reset and 2:1 balance was checked on many more octants than actually needed. This leads to a 25x speed improvement for the function that performs load balance. All tests pass; however, a thorough code review and some more tests are needed to make sure the changes are fine.
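To illustrate the idea of visiting every first-layer ghost instead of relying on a per-octant flag, here is a deliberately simplified sketch. It is not the code in this PR: `balanceLocal` and its parameters are hypothetical, octants are reduced to a level stored in a vector (local octants first, ghosts after index `nLocal`), and raising a level stands in for actually refining an octant.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch of a 2:1 balance sweep (not PABLO's implementation).
// levels[i] is the level of octant i; indices >= nLocal are ghosts and are
// read-only. adjacency lists pairs of neighbouring octants, including every
// local/ghost pair of the first ghost layer.
// Returns how many times a local octant had its level raised.
std::size_t balanceLocal(std::vector<int> &levels, std::size_t nLocal,
                         const std::vector<std::pair<std::size_t, std::size_t>> &adjacency)
{
    std::size_t raised = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto &pair : adjacency) {
            const std::size_t ids[2] = {pair.first, pair.second};
            for (int k = 0; k < 2; ++k) {
                const std::size_t coarse = ids[k];
                const std::size_t fine = ids[1 - k];
                // Only local octants may change; a neighbour more than one
                // level finer violates the 2:1 constraint.
                if (coarse < nLocal && levels[fine] > levels[coarse] + 1) {
                    levels[coarse] = levels[fine] - 1;
                    changed = true;
                    ++raised;
                }
            }
        }
    }
    return raised;
}
```

Because every neighbouring pair is visited on each sweep, no flag has to be set or reset between calls, which is the kind of bookkeeping the AUX bit required.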
Original profile of localBalance:
Updated profile of localBalance: