add capability to use James solver for gravity BCs #2271

Open
zingale opened this issue Aug 12, 2022 · 1 comment

Comments

@zingale
Member

zingale commented Aug 12, 2022

AMReX now has a James BC solver:

AMReX-Codes/amrex#2912

We should add the ability to use this instead of the multipole solver for the gravity BCs.

@maxpkatz
Member

As a baseline for an apples-to-apples performance comparison, build Exec/science/wdmerger with make USE_CUDA=TRUE USE_MPI=TRUE DIM=3 TINY_PROFILE=TRUE. This is the test case I ran on Perlmutter across 4 GPUs:

srun nsys profile -f true -s none -o wdmerger_256_%q{SLURM_PROCID} ./Castro3d.gnu.TPROF.MPI.CUDA.ex inputs amr.n_cell = 256 256 256 max_step = 10 amr.max_grid_size = 64 amr.blocking_factor = 16 amr.plot_files_output = 0 amr.checkpoint_files_output = 0

(The interactive job was allocated with salloc -N 1 --gpus-per-task=1 --gpu-bind=map_gpu:0,1,2,3 --tasks-per-node=4 -t 120 --qos=interactive -A m3018_g -C gpu).
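For a rough sense of the work distribution in this run, here is a minimal sketch that counts the coarse-level boxes implied by amr.n_cell and amr.max_grid_size. It assumes the coarse level is chopped into boxes of exactly max_grid_size cells per side and that the boxes are spread evenly over the 4 MPI ranks; refined levels, if the inputs file enables any, are not counted. The helper name `level0_boxes` is made up for illustration and is not part of Castro or AMReX.

```python
from math import ceil

def level0_boxes(n_cell, max_grid_size):
    """Count boxes when each dimension is cut into pieces of at most max_grid_size cells."""
    count = 1
    for n in n_cell:
        count *= ceil(n / max_grid_size)
    return count

# Values taken from the srun command line above.
n_boxes = level0_boxes((256, 256, 256), 64)
n_ranks = 4  # one MPI rank per GPU, per the salloc line

print(f"coarse-level boxes: {n_boxes}")                 # 64
print(f"boxes per GPU (if even): {n_boxes / n_ranks}")  # 16.0
```

Under these assumptions each GPU holds about 16 boxes of 64^3 cells on the coarse level.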

This is what the profile looks like on the last timestep.

[Image: profile]

The gravity solve takes 79 ms, of which 24 ms is spent in the multipole BC and 55 ms is spent in the Poisson solve.
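To put those numbers in perspective, the sketch below computes the fraction of the gravity solve spent in the BC calculation and an Amdahl-style bound on how much the gravity solve could speed up if only the BC part got faster. It assumes the Poisson solve cost stays at 55 ms, which may not hold for a different BC method, and it says nothing about the full timestep. The function name `gravity_speedup` is hypothetical.

```python
# Timings quoted above for the last timestep (milliseconds).
gravity_ms = 79.0
bc_ms = 24.0
poisson_ms = 55.0

# Share of the gravity solve spent computing the boundary conditions.
bc_fraction = bc_ms / gravity_ms

def gravity_speedup(bc_speedup):
    """Speedup of the gravity solve if only the BC part runs bc_speedup times faster."""
    new_time = poisson_ms + bc_ms / bc_speedup
    return gravity_ms / new_time

# Upper bound: the BC cost drops to zero and only the Poisson solve remains.
max_speedup = gravity_ms / poisson_ms

print(f"BC fraction of gravity solve: {bc_fraction:.1%}")          # 30.4%
print(f"gravity speedup with 2x faster BCs: {gravity_speedup(2):.2f}x")  # 1.18x
print(f"upper bound on gravity speedup: {max_speedup:.2f}x")       # 1.44x
```

So for this configuration, even a free BC calculation would make the gravity solve at most about 1.44x faster.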

This is not the only relevant configuration to consider; at larger scale, gravity tends to dominate over hydro and the profile looks a bit different, making the BCs less important. But it's a useful starting point for analysis.
