In HPC Intro, learners explored the scheduler on their cluster by
launching a program called amdahl. The objective of this lesson is
to adapt the manual job submission process into a repeatable, reusable workflow
with minimal human intervention. This is accomplished using
Snakemake, a modern workflow engine.
If you are interested in learning more about workflow tools, please visit The Workflows Community.
NERSC's Snakemake docs list Snakemake's "cluster mode" as a disadvantage, since it submits each "rule" as a separate job, thereby spamming the scheduler with dependent tasks. The main Snakemake process also stays on the login node until all jobs have finished, occupying some resources there.
If you wish to adapt your Python-based program for multi-node cluster execution, consider applying the workflow principles learned from this lesson to the Parsl framework. Again, NERSC's Parsl docs provide helpful tips.
This is a translation of the old HPC Workflows lesson using The Carpentries Workbench and R Markdown (Rmd). You are cordially invited to contribute! Please check the list of issues if you're unsure where to start.
If you edit the lesson, it is important to verify that the changes are rendered
properly in the online version. The best way to do this is to build the lesson
locally. You will need an R environment to do this: as described in the
{sandpaper} docs, the environment can be either your terminal or RStudio.
The environment.yml file describes a Conda virtual environment that
includes R, Snakemake, amdahl, pandoc, and termplotlib: the tools you'll need to
develop and run this lesson, as well as some dependencies. To prepare the
environment, install Miniconda following the official
instructions. Then open a shell application and create a new environment:
you@yours:~$ cd path/to/local/hpc-workflows
you@yours:hpc-workflows$ conda env create -f environment.yaml
N.B.: the environment will be named "workflows" by default. If you prefer another name, add -n «alternate_name» to the command.
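Once the environment exists, it has to be activated before its tools become visible; with the default name, that would be conda activate workflows. The following is a minimal sketch, not part of the lesson itself, of a sanity check that the command-line tools listed above are on your PATH. The function name check_tools is hypothetical, and the executable names (R, snakemake, amdahl, pandoc) are assumptions about how the packages expose themselves. termplotlib is a Python library rather than a command, so it is not checked here.

```bash
# Report whether each named command can be found on PATH.
# Returns non-zero if at least one of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "MISSING: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names are assumptions; adjust them to match your environment.
check_tools R snakemake amdahl pandoc ||
    echo "Some tools are missing: is the environment activated?"
```

If any tool is reported as missing, activating the environment (or recreating it) is a reasonable first thing to try.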
{sandpaper} is the engine behind The Carpentries Workbench lesson layout and static website generator. It is an R package, and has not yet been installed. Paraphrasing the installation instructions, start R or radian, then install:
you@yours:hpc-workflows$ R --no-restore --no-save
install.packages(c("sandpaper", "varnish", "pegboard", "tinkr"),
                 repos = c("https://carpentries.r-universe.dev/", getOption("repos")))
Now you can render the site! From your R session, run:
library("sandpaper")
sandpaper::serve()
This should output something like the following:
Output created: hpc-workflows/site/docs/index.html
To stop the server, run servr::daemon_stop(1) or restart your R session
Serving the directory hpc-workflows/site/docs at http://127.0.0.1:4321
Click on the link to http://127.0.0.1:4321 or copy and paste it into your browser. You should see any changes you've made to the lesson on the corresponding page(s). If it looks right, you're set to proceed!
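If you would rather confirm the build from the shell than from a browser, one option is to look for the rendered index page reported in the sample output. This is a minimal sketch under that assumption: the helper name site_built is hypothetical, and the site/docs/index.html path is taken from the sample output, so it may differ between {sandpaper} versions.

```bash
# Succeed if the rendered index page exists under the given lesson directory.
# The relative path site/docs/index.html is taken from the sample output.
site_built() {
    lesson_dir="${1:-.}"
    [ -f "$lesson_dir/site/docs/index.html" ]
}

# Run from the lesson's root directory (hpc-workflows in the example).
if site_built .; then
    echo "Rendered site found."
else
    echo "No rendered site yet: has the lesson been built?"
fi
```

This only checks that a page was written; it does not replace viewing the rendered lesson to confirm your changes look right.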