Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update previous cells instead of re-initializing all cells #89

Open
efaulhaber opened this issue May 31, 2023 · 11 comments
Open

Update previous cells instead of re-initializing all cells #89

efaulhaber opened this issue May 31, 2023 · 11 comments
Labels
enhancement New feature or request performance

Comments

@efaulhaber
Copy link
Contributor

Currently, UpdateCellList! clears all cells and adds particles again. For use cases like SPH simulations where only small changes in position happen between updates, it will be much faster to only update cells that changed, using the existing cell lists from the previous step.

@lmiq
Copy link
Member

lmiq commented Jun 3, 2023

That is not trivial, because the construction of the lists must run over particles. One has to check the cell for each particle, and then adding it to a list is not more expensive than doing any kind of cell update. (I do not clear the particle lists memory - wise).

In MD what we do is to not update the lists at every simulation step. We compute the lists up to a cutoff greater than the actual simulation cutoff and update the lists only at every few steps. Possibly by following the displacement of the particles in an auxiliary array.

(Most times packages store the explicit neighbor lists between steps, but this may be prohibited for very large systems).

Maybe something or the sort can be done in SPH simulations as well.

@lmiq
Copy link
Member

lmiq commented Jun 6, 2023

By the way, from what I remember of the presentation of Trixi.jl, the scalability you are getting with it is of another order of what CellListMap can deliver. So I'm unsure if you will be able to use it for the kind of simulations your group is used. Nevertheless, I would be very grateful if you provide any kind of feedback on what could be missing, or what could be improved about the usability here, which may guide me towards future improvements. Also, if there is any kind of system for which CellListMap is useful to you, I would love to know. Thanks for the feedback and contributions.

@efaulhaber
Copy link
Contributor Author

Thanks! I am currently working on an SPH multiphysics code, which is separate from Trixi.jl, but we're trying to get a similar multithreading scalability as with Trixi.jl. So far, I've been using my own cell list implementation for the neighborhood search, but now I implemented one based on CellListMap.jl, and I am looking to compare the two.

@lmiq
Copy link
Member

lmiq commented Jun 7, 2023

Do you need periodic boundaries? Most of the complications here are to support triclinic PBCs.

@efaulhaber
Copy link
Contributor Author

We want to have periodic boundaries soon, that's also a reason why I implemented the NHS based on CellListMap.jl. Our implementation doesn't support that.

@lmiq
Copy link
Member

lmiq commented Jun 7, 2023

Maybe helpful:

Screenshot_20230607_073153_Chrome.jpg

The issue with the scaling here is on the construction of the cell lists. The mapping scales well for systems large enough.

However, with many cores, the cell list construction end up being the limiting step sooner or later.

In my tests, simply reducing lists that are computed in parallel is enough to break the scaling. Probably to improve on that we need to compute redundant lists that are not reduced.

@efaulhaber
Copy link
Contributor Author

I found similar results in my implementation. Reducing was actually worse than the serial implementation for me. I tried one list per thread, which made the update scale perfectly, but ended up ruining the map speed because I had to iterate over all lists.

I ended up looping (multithreaded) over all cells, marking all cells containing particles that are supposed to be in another cell. Then, in a serial loop, I update the marked cells. This doesn't scale for full updates (all particles changed cells), but it scales well when only a few particles changed cells.

@lmiq
Copy link
Member

lmiq commented Jun 7, 2023

Ah, I see, running over all cells to check which cells have to be updated can be multi-threaded, and then that can be useful. I think something like that can be implemented here as well. I'll take a shot at it when I have some time.

@lmiq lmiq added performance enhancement New feature or request labels Oct 25, 2023
@efaulhaber
Copy link
Contributor Author

I recently implemented a cell list neighborhood search that works with GPUs. For that, I had to write a fully parallel update with atomic operations. This might be interesting to you.

trixi-framework/PointNeighbors.jl#42

I now get much better scaling of the incremental update step. I haven't implemented the initialization yet, but I'm sure it will scale similarly.
grafik
"semi parallel" refers to my previous strategy that I explained earlier.

I ended up looping (multithreaded) over all cells, marking all cells containing particles that are supposed to be in another cell. Then, in a serial loop, I update the marked cells.

@lmiq
Copy link
Member

lmiq commented Jul 5, 2024

Do have already some instructions on how to use it? I'm certainly interested.

(does it support already PBCs?)

@efaulhaber
Copy link
Contributor Author

efaulhaber commented Jul 5, 2024

We extracted the neighborhood search from TrixiParticles.jl to a new package.
The main functions are initialize!, update! (self-explanatory) and foreach_point_neighbor, which is similar to your map_pairwise!.
Check out the docs for these functions here: https://trixi-framework.github.io/PointNeighbors.jl/dev/reference/

We also have a PR for a CellListMap.jl backend, which just wraps a CellListMap in our API, so that we can use CellListMap.jl as a backend for TrixiParticles.jl and to easily run benchmarks against your implementation.
trixi-framework/PointNeighbors.jl#8

Our implementations only support coordinate-aligned rectangular boxes for PBC, but after the PR above is merged, I'm planning on using your complex PBCs. Then users can use our GPU-compatible NHS for simulations on GPUs and the CellListMap.jl backend for complex PBCs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request performance
Projects
None yet
Development

No branches or pull requests

2 participants