
Using Dedalus on the NYU Greene Cluster

Dedalus is a flexible differential equations solver using spectral methods. It is MPI-parallelized and can therefore make efficient use of high-performance computing resources like the NYU Greene cluster. The cluster uses Singularity containers to manage packages and Slurm for job scheduling, and it is tricky to construct a Singularity container for Dedalus v3 that interacts well with both. Luckily, the NYU HPC staff have worked out most of the details. This note describes how to use the Singularity container on a single node, on multiple nodes, and in JupyterLab. At the end we also briefly comment on how the container is built, so that you can build your own customized version.

Note: If anything in this note does not work, or if you run into trouble, please let me know at ryan_sjdu@nyu.edu. I would be happy to help.

Table of contents

  1. Using Dedalus on a single node
    1. On the command line, interactively
    2. On the command line, using Slurm via srun
    3. Submitting a job using Slurm
    4. JupyterLab using Open OnDemand (OOD)
  2. Using Dedalus on multiple nodes
    1. On the command line, using Slurm via srun
    2. Submitting a job using Slurm
  3. Testing performance
  4. Building the Singularity
  5. Acknowledgment

Using Dedalus on a single node

On the command line, interactively

We first use Dedalus on the command line. On the command line (as opposed to in JupyterLab), Dedalus can use multiple cores (though still on a single node) to improve speed. Simulations that involve heavy computation should use this method.

Once we are logged into the Greene cluster, cd into your scratch directory and request a compute node so that we can run some code for testing (do not run CPU-heavy jobs on the login node):

cd $SCRATCH
srun --nodes=1 --tasks-per-node=4 --cpus-per-task=1 --time=2:00:00 --mem=4GB --pty /bin/bash

Once we are in, paste the following commands to start the pre-built Singularity container:

singularity exec \
  --overlay /scratch/work/public/singularity/dedalus-3.0.0a0-openmpi-4.1.2-ubuntu-22.04.1.sqf:ro \
  /scratch/work/public/singularity/ubuntu-22.04.1.sif /bin/bash
unset -f which
source /ext3/env.sh
export OMP_NUM_THREADS=1; export NUMEXPR_MAX_THREADS=1

The last command turns off shared-memory parallelism. This is recommended for performance because Dedalus does not use hybrid parallelism (see the Dedalus documentation on disabling multithreading). We can check that the settings took effect by echoing the two variables; both should be 1.
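That is (expected output shown as comments):

echo $OMP_NUM_THREADS
#1
echo $NUMEXPR_MAX_THREADS
#1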

We can now run an example script, e.g. the Rayleigh-Benard convection (2D IVP) example. We clone it from the Dedalus GitHub repo; to avoid downloading a lot of files, we use a sparse checkout.

git clone --depth 1 --filter=blob:none --sparse https://github.com/DedalusProject/dedalus.git
cd dedalus/
git sparse-checkout set examples
cd examples/ivp_2d_rayleigh_benard

Now we can run the example. Note that we requested 4 cores and are launching 4 MPI processes; these two numbers should match.

mpiexec -n 4 python3 rayleigh_benard.py
mpiexec -n 4 python3 plot_snapshots.py snapshots/*.h5

We now see the script outputting time-stepping information. If we look at the CPU usage on the node using htop -u ${USER}, we should see near 100% usage on 4 cores. Satisfying.

Note that we did not use the Dedalus-provided test python3 -m dedalus test. This is intentional: the test function does not work consistently with our setup. But since the Dedalus examples run, we clearly have a working Singularity.
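If you want a quicker sanity check than running a full example, a minimal import test inside the Singularity (not a substitute for the test suite) also works:

python3 -c "import dedalus.public as d3; print('Dedalus import OK')"
#Dedalus import OK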

On the command line, using Slurm via srun

All of the above commands are wrapped up in a script that we can simply call. We can take a peek at the script via

cat /scratch/work/public/singularity/run-dedalus-3.0.0a0.bash

Now, to run the same Dedalus code, we can just enter this command on the login node:

srun --nodes=1 --tasks-per-node=4 --cpus-per-task=1 --time=2:00:00 --mem=4GB \
  /scratch/work/public/singularity/run-dedalus-3.0.0a0.bash python rayleigh_benard.py
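Note that this assumes we are still in the example directory (dedalus/examples/ivp_2d_rayleigh_benard from above). The wrapper passes its arguments through as the command to run inside the container, so the plotting step can presumably be launched the same way:

srun --nodes=1 --tasks-per-node=4 --cpus-per-task=1 --time=2:00:00 --mem=4GB \
  /scratch/work/public/singularity/run-dedalus-3.0.0a0.bash python plot_snapshots.py snapshots/*.h5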

Submitting a job using Slurm

To run many heavy simulations, one should queue the jobs on Greene using Slurm scripts. Here is the general tutorial for Slurm on Greene. In this section, we will run an example Slurm Dedalus job.

In the repository of this note, there is an example Slurm script that runs the Periodic shear flow (2D IVP) example. To use it, we first clone this repo:

cd $SCRATCH
git clone https://github.com/CAOS-NYU/Dedalusv3_GreeneSingularity.git
cd Dedalusv3_GreeneSingularity

We can take a look at the script

cat slurm_example_singlenode.SBATCH

You need to fill in your NYU ID to use this script. We see that the script contains essentially the same srun command we used on the command line. Nothing mysterious.
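For orientation, a single-node script along these lines might look roughly like the sketch below (the script in this repo is the authoritative version; the job name, email address, and exact file names here are illustrative):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --tasks-per-node=4
#SBATCH --cpus-per-task=1
#SBATCH --time=2:00:00
#SBATCH --mem=4GB
#SBATCH --job-name=dedalus_shear_flow       # placeholder job name
#SBATCH --mail-type=END                     # email when the job finishes
#SBATCH --mail-user=<your NYU ID>@nyu.edu   # fill in your NYU ID
#SBATCH --output=slurm_%j.out

# run the periodic shear flow example through the public wrapper script
cd $SCRATCH/dedalus/examples/ivp_2d_shear_flow
srun /scratch/work/public/singularity/run-dedalus-3.0.0a0.bash python shear_flow.py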

Now we submit the script

sbatch slurm_example_singlenode.SBATCH

and check the queue

squeue -u ${USER}

Now we see it in the queue. We can check the reasons for the wait here. After some patience, once the job has run (you will receive an email when it is done), we can check the output in the Dedalusv3_GreeneSingularity directory: there should be a file named slurm_<yourjobid>.out containing the terminal output. We can also find the data output in the code folder:

cd $SCRATCH/dedalus/examples/ivp_2d_shear_flow
ls snapshots
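To view the Slurm log itself (replace <yourjobid> with the job ID printed by sbatch or shown by squeue):

cd $SCRATCH/Dedalusv3_GreeneSingularity
cat slurm_<yourjobid>.out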

JupyterLab using Open OnDemand (OOD)

Sometimes it is convenient to use JupyterLab for code development. Note that running Dedalus in JupyterLab means we can use only one core. This is acceptable if the computation is light. We should request only one core, because more would be wasteful. (Note: you can run mpiexec from Jupyter, but at that point one should just use the command line.)

Instructions for using Open OnDemand (OOD) with Conda/Singularity on Greene are available here. Since we have an already-made Singularity, we can skip most of the steps.

We create a kernel named dedalus3 by copying my files to your home directory.

mkdir -p ~/.local/share/jupyter/kernels
cd ~/.local/share/jupyter/kernels
cp -R /scratch/work/sd3201/dedalus3/dedalus3 ./dedalus3
cd ./dedalus3

ls
#kernel.json logo-32x32.png logo-64x64.png python 
#files now in the ~/.local/share/jupyter/kernels/dedalus3 directory

After this, we can enjoy Dedalus in Jupyter on OOD by following this tutorial. Remember to request only one core because we can only use one!

To learn about the details of the files you copied, you can read the python and kernel.json files. The Singularity they use is mine. For instructions on how to make your own, see the section on building the Singularity below.
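For example (roughly speaking, kernel.json tells Jupyter which command to run to start the kernel, and the python file is the wrapper that launches Python inside the Singularity; the files themselves are the authoritative reference):

cd ~/.local/share/jupyter/kernels/dedalus3
cat kernel.json   # the command Jupyter runs to start the kernel
cat python        # the wrapper that starts Python inside the Singularity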

Using Dedalus on multiple nodes

Since Dedalus uses MPI, we can use multiple nodes for our computation. On Greene, requesting multiple nodes and launching Python on each node is managed by Slurm via srun, so we cannot use multiple nodes in an interactive session. We therefore start with

On the command line, using Slurm via srun

On a login node, run

srun --nodes=4 --tasks-per-node=4 --cpus-per-task=1 --time=2:00:00 --mem=4GB \
  /scratch/work/public/singularity/run-dedalus-3.0.0a0.bash python rayleigh_benard.py

Because we have to disable multithreading, we should keep --cpus-per-task=1.

After some wait for the job to start, we should see the code running. We can see four nodes used via

squeue -u $USER

On each node, four CPU cores are used, all near 100%. Nice.
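To scale up further, increase --nodes and/or --tasks-per-node; the total number of MPI processes is their product, and --mem is requested per node. For example, a hypothetical 8-node run with 16 tasks per node would use 128 MPI processes (your_script.py is a placeholder, and its resolution must be large enough to split across that many processes):

srun --nodes=8 --tasks-per-node=16 --cpus-per-task=1 --time=2:00:00 --mem=8GB \
  /scratch/work/public/singularity/run-dedalus-3.0.0a0.bash python your_script.py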

Submitting a job using Slurm

It is straightforward to convert the above command into a Slurm script. We provide an example in this repo.
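For orientation, a minimal sketch of such a multi-node script, mirroring the srun command above (the example script in this repo is the authoritative version; the paths are the ones used earlier in this note):

#!/bin/bash
#SBATCH --nodes=4
#SBATCH --tasks-per-node=4
#SBATCH --cpus-per-task=1
#SBATCH --time=2:00:00
#SBATCH --mem=4GB
#SBATCH --output=slurm_%j.out

# 4 nodes x 4 tasks per node = 16 MPI processes
cd $SCRATCH/dedalus/examples/ivp_2d_rayleigh_benard
srun /scratch/work/public/singularity/run-dedalus-3.0.0a0.bash python rayleigh_benard.py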

Testing performance

Please see the drag_race folder for some performance tests of Dedalus on Greene. The tests show that our setup is working well.

Building the Singularity

We will build the Singularity by first following the standard steps. We will install Dedalus by building it from source. First run

mkdir $SCRATCH/dedalus_sing
cd $SCRATCH/dedalus_sing

cp -rp /scratch/work/public/overlay-fs-ext3/overlay-1GB-400K.ext3.gz .
gunzip overlay-1GB-400K.ext3.gz

Then launch the Singularity

singularity exec \
  --overlay overlay-1GB-400K.ext3 \
  /scratch/work/public/singularity/ubuntu-22.04.1.sif /bin/bash

Inside the Singularity, install miniconda

bash /share/apps/utils/singularity-conda/setup-conda.bash
source /ext3/env.sh

Then clone the Dedalus source code

cd /ext3
git clone https://github.com/DedalusProject/dedalus.git
cd /ext3/dedalus

and build and install Dedalus

CC=mpicc \
  MPI_INCLUDE_PATH=/usr/lib/x86_64-linux-gnu/openmpi/include \
  MPI_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/openmpi \
  FFTW_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu \
  FFTW_INCLUDE_PATH=/usr/include \
  python3 -m pip install --no-cache .
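To confirm that the build succeeded and picked up the system MPI, we can run a quick smoke test inside the Singularity (the exact version strings will differ):

python3 -c "import dedalus; print(dedalus.__version__)"
python3 -c "from mpi4py import MPI; print(MPI.Get_library_version().splitlines()[0])"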

At this stage, we can add more packages to the Singularity. For example, we could add cmocean, a beautiful colormap package, to the existing Singularity.

pip install cmocean

To test that we indeed have the package, run

source /ext3/env.sh
python -c "import cmocean; print(cmocean.__version__); print(cmocean.__file__)"
#v3.0.3
#/ext3/miniconda3/lib/python3.10/site-packages/cmocean/__init__.py
#your package should be here, not .local

Now you have your own Dedalus Singularity that you can edit. You can replace /scratch/work/public/singularity/dedalus-3.0.0a0-openmpi-4.1.2-ubuntu-22.04.1.sqf in this note with $SCRATCH/dedalus_sing/overlay-1GB-400K.ext3. If you want to share your Singularity, run the following command inside the Singularity

mksquashfs /ext3 dedalus_readonly.sqf -keep-as-directory

to make a read-only version. You should not let others read your ext3 file: read access means write access for ext3 files!
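As an optional precaution, you can check and tighten the permissions on your writable overlay, and once you have the read-only .sqf you can mount it just like the public one (the .sqf path below assumes mksquashfs was run from $SCRATCH/dedalus_sing; adjust it to wherever your file ended up):

# keep the writable overlay private (read access implies write access for ext3 overlays)
ls -l $SCRATCH/dedalus_sing/overlay-1GB-400K.ext3
chmod 600 $SCRATCH/dedalus_sing/overlay-1GB-400K.ext3

# use your own read-only image in place of the public .sqf
singularity exec \
  --overlay $SCRATCH/dedalus_sing/dedalus_readonly.sqf:ro \
  /scratch/work/public/singularity/ubuntu-22.04.1.sif /bin/bash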

The version I use will be available at /scratch/work/sd3201/dedalus3/dedalus_ryansingularity.sqf if you would like to use it. A keen reader might have already realized that the Singularity used for JupyterLab is my version.

Acknowledgment

The Singularity files in this note were made by Shenglong Wang of the NYU HPC team. We thank the NYU HPC team for their help with training and troubleshooting.
