[user]: Elyssa Collins #27

Open
1 task done
elyssac02 opened this issue Mar 24, 2022 · 2 comments

@elyssac02

Your Name

Elyssa Collins

Your Institution

Center for Geospatial Analytics, North Carolina State University

Your Location

Raleigh, North Carolina, United States

Your Use Case

I'm using RAPID to develop climate projections (from 32 global climate models) of streamflow throughout the NHDPlus network under a moderate and a high greenhouse gas scenario. These streamflow data will then be used with a terrain-based inundation model to investigate how climate change will influence the spatial distribution of future flood probabilities. The study domain is still being decided, but will probably cover a few HUC6 watersheds in North Carolina. The temporal domain is 2006 - 2100 (2006 - 2025 for the reference period and 2026 - 2100 for the projected period). The runoff data used for RAPID optimization come from Livneh et al., 2015, and the runoff data used for the climate change part of the study come from Vano et al., 2020; specifically, we are using the LOcalized Constructed Analogs CMIP5 hydrology projections.

Why RAPID?

We chose RAPID because it's open source, can be applied to large spatial domains, is computationally efficient, has strong documentation, and has many papers to draw on for optimization and applications of the model. The fact that RAPID uses mapped rivers as its computational elements was also a requirement for us. While RAPID being dockerized wasn't initially something we considered, it has proven to be very beneficial.

How do you use RAPID?

We use your Docker images. I use RAPID on my local machine (Windows 10, but I also use Ubuntu 20.04.1 with WSL) and on a high-performance computing (HPC) cluster. For running RAPID on HPC, I converted the Docker image to a Singularity image.

Anything else you'd like us to know?

No response

Are you aware of our philosophy for open source?

  • I have read the last paragraph of Section 6.3 in David et al., 2016 (the one starting with
    "Finally, and contrary to common belief, open source software does
    not mean...")
@c-h-david
Owner

@elyssac02
Hi Elyssa, thank you so much for taking the time to fill out the form! It's great to hear about your exciting project. I had heard of Singularity images but have yet to look into them. Do you perhaps have an example somewhere? Maybe something similar to the RAPID Dockerfile? Does Singularity have a hub similar to Docker Hub for RAPID? We could perhaps consider adding a Singularity capability to the RAPID repository. Thanks again!

@elyssac02
Author

@c-h-david
Of course! Sorry for taking a while to get back to you on this. A little while ago, I created a repo for how to use Singularity on HPC (with some specific instructions for my university's cluster). In the repo, you'll see that it's relatively straightforward to convert a Docker image to a Singularity image, and this is what I did with RAPID. However, I did have to slightly modify a couple of lines of code (e.g., in rapid_main.F90), rebuild RAPID, and save a copy of the image as a .tar file (see here) to get it to work when running multiple optimization procedures and simulations simultaneously on HPC. Since the rapid_namelist is read directly, running multiple procedures/simulations simultaneously also required writing to local scratch on the compute node (although I'm sure there are other approaches to this).
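
To illustrate the local-scratch idea, here is a minimal Python sketch, not the actual scripts used in this work. It assumes each run can be isolated by giving it its own directory containing a file named `rapid_namelist`; the function name `stage_run`, the `tag` argument, and the directory layout are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def stage_run(namelist: Path, scratch_root: Path, tag: str) -> Path:
    """Copy one run's namelist into its own scratch directory.

    Assumption: the model reads a file literally named 'rapid_namelist'
    from the directory it runs in, so each simultaneous run needs a
    separate directory to avoid overwriting another run's settings.
    """
    scratch_root.mkdir(parents=True, exist_ok=True)
    # mkdtemp guarantees a unique directory even if tags repeat.
    run_dir = Path(tempfile.mkdtemp(prefix=f"rapid_{tag}_", dir=scratch_root))
    shutil.copyfile(namelist, run_dir / "rapid_namelist")
    return run_dir
```

Each call returns a fresh directory, so two runs staged from different namelists never share a `rapid_namelist` file. Cleaning up scratch afterwards is left to the job script.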

I don't have an example on GitHub yet. However, I would be happy to either create one in my Singularity repo or create a pull request containing example scripts here in the RAPID repo (which might make more sense). I have examples for running 1) a single optimization procedure, 2) a single regular run, 3) simultaneous optimization procedures, and 4) simultaneous regular runs. The simultaneous procedures/runs involve deploying multiple Singularity containers with different inputs as tasks using a tool called pynodelauncher, along with writing to local scratch, as mentioned above.
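
As a rough sketch of what "multiple containers with different inputs as tasks" could look like, the Python below builds one `singularity exec` command line per run directory. The image name `rapid.sif`, the in-container working directory `/work`, and the executable name `rapid` are assumptions for illustration only. How the resulting lines are handed to a launcher such as pynodelauncher depends on the cluster and is not shown.

```python
import shlex
from typing import Iterable, List


def task_commands(
    image: str,
    run_dirs: Iterable[str],
    executable: str = "rapid",   # hypothetical executable name inside the image
    workdir: str = "/work",      # hypothetical mount point inside the container
) -> List[str]:
    """Build one shell command per run directory.

    Each command bind-mounts a different host directory at the same
    in-container path and starts the executable from there, so every
    task sees only its own inputs.
    """
    commands = []
    for run_dir in run_dirs:
        argv = [
            "singularity", "exec",
            "--bind", f"{run_dir}:{workdir}",  # host dir -> container dir
            "--pwd", workdir,                  # start in the mounted dir
            image,
            executable,
        ]
        commands.append(shlex.join(argv))
    return commands
```

Because only the host side of the bind mount changes between tasks, the same image can be reused unmodified for every run.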

I think the equivalent of a Dockerfile is a Singularity Recipe. However, I didn't have to create one since I converted the Docker image to a Singularity image. It looks like there once was a Singularity Hub, but it appears to have gone read-only in 2021. In any case, I think including instructions in the RAPID repository for converting the Docker image to a Singularity image could be helpful. If I create a pull request, I could also include the .tar file (same contents as the current RAPID Docker image, just with slight modifications to a couple of scripts) used to build the Singularity image on HPC. This .tar file should work for any use case; I would just need to add some specific instructions for paths.
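
For reference, a common way to do this kind of conversion is to export the Docker image with `docker save` and build from the archive with `singularity build` and a `docker-archive://` source. The Python sketch below only assembles those two command lines and does not run them. The image and file names are placeholders, and this is not necessarily the exact procedure used for the modified RAPID image.

```python
from typing import List


def conversion_steps(docker_image: str, tar_path: str, sif_path: str) -> List[List[str]]:
    """Return the two commands for a Docker -> Singularity conversion.

    Step 1 exports the Docker image to a .tar archive (on a machine
    with Docker). Step 2 builds a Singularity image from that archive
    (on a machine with Singularity, e.g. the HPC cluster).
    """
    return [
        ["docker", "save", "-o", tar_path, docker_image],
        ["singularity", "build", sif_path, f"docker-archive://{tar_path}"],
    ]


# Placeholder names, for illustration only.
for step in conversion_steps("rapid-image:latest", "rapid.tar", "rapid.sif"):
    print(" ".join(step))
```

Each returned list can be passed to `subprocess.run(step, check=True)` once the names are replaced with real ones. Keeping the .tar as an intermediate file is what allows the archive to be copied to a cluster where Docker is not available.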

Hope this answers your question and is helpful! Let me know if you would like me to create a pull request and if so, what you would like me to include (e.g., the .tar file, example scripts + documentation, etc.).

@c-h-david c-h-david added the user label Sep 20, 2022