Commit

Merge pull request #150 from nmfs-opensci/dev
remove childimage
eeholmes authored Nov 6, 2024
2 parents afad8e0 + 8f1cf8a commit b0bccf7
Showing 15 changed files with 322 additions and 180 deletions.
3 changes: 0 additions & 3 deletions appendix
@@ -51,9 +51,6 @@ RUN chown -R root:staff /pyrocket_scripts && \
# Convert NB_USER to ENV (from ARG) so that it passes to the child dockerfile
ENV NB_USER=${NB_USER}

-# Copy the child repo files into childimage so they are available to scripts
-ONBUILD COPY --chown=${NB_USER}:${NB_USER} . ${REPO_DIR}/childimage
-
# Revert to default user and home as pwd
USER ${NB_USER}
WORKDIR ${HOME}
27 changes: 21 additions & 6 deletions book/configuration_files.qmd
@@ -7,7 +7,9 @@ The `install-conda-packages.sh` script will install conda packages to the conda
Here is the code for your Docker file. You can name the conda package file to something other than `environment.yml`. Make sure your file has `name:`. The name is arbitrary. It is ignored but required for the script.

```
+COPY environment.yml environment.yml
RUN /pyrocket_scripts/install-conda-packages.sh environment.yml
+RUN rm environment.yml
```

This is a standard format. You can add version pinning.
@@ -27,16 +29,21 @@ dependencies:
Instead of a list of conda packages (typically called environment.yml), you can use a conda lock file.

Here is the code for your Docker file. You can name your conda lock file something other than `conda-lock.yml`.

```
+COPY conda-lock.yml conda-lock.yml
RUN /pyrocket_scripts/install-conda-packages.sh conda-lock.yml
+RUN rm conda-lock.yml
```

-## requirments.txt
+## requirements.txt

The `install-pip-packages.sh` script will install packages using `pip`. Here is the code for your Docker file. You can name your pip package file something other than `requirements.txt`.

```
+COPY requirements.txt requirements.txt
RUN /pyrocket_scripts/install-pip-packages.sh requirements.txt
+RUN rm requirements.txt
```

requirements.txt
@@ -53,7 +60,9 @@ The `install-r-packages.sh` script will run the supplied R script which you can
Here is the code for your Docker file. You can name the R script file to something other than `install.R`. Make sure your file is an R script.

```
+COPY install.R install.R
RUN /pyrocket_scripts/install-r-packages.sh install.R
+RUN rm install.R
```

install.R example
@@ -85,7 +94,9 @@ The `install-apt-packages.sh` script will install packages with `apt-get`. Here

```
USER root
+COPY apt.txt apt.txt
RUN /pyrocket_scripts/install-apt-packages.sh apt.txt
+RUN rm apt.txt
USER ${NB_USER}
```

@@ -100,10 +111,12 @@ cmocean

## postBuild

-The `run-postbuild.sh` script can be run as root or jovyan (`${NB_USER}`). This script does not accept a file name. You need to name your postBuild script `postBuild` and put at the base level with your Docker file.
+The `run-postbuild.sh` script can be run as root or jovyan (`${NB_USER}`). The script has some extra code to remove leftover files after installing Python extensions.

```
-RUN /pyrocket_scripts/run-postbuild.sh
+COPY postBuild postBuild
+RUN /pyrocket_scripts/run-postbuild.sh postBuild
+RUN rm postBuild
```

postBuild
@@ -116,12 +129,14 @@ set -e

## start

-The `start` bash code is run when the image starts. py-rocker-base has a start script at `${REPO_DIR}/start` which loads the Desktop applications. If you change that start file (by copying your start file onto that location), then the Desktop apps will not be loaded properly. Instead, the `setup-start.sh` will add your start file to the end of `${REPO_DIR}/start` so that the Desktop is still setup properly.
+The `start` bash code is run when the image starts. py-rocker-base has a start script at `${REPO_DIR}/start` which loads the Desktop applications. If you change that start file (by copying your start file onto that location), then the Desktop apps will not be loaded properly. Instead, the `setup-start.sh` will add your start file to a directory `${REPO_DIR}/childstarts` and will run all those scripts after `${REPO_DIR}/start`.

-The `setup-start.sh` script does not accept a file name. You need to name your start script `start` and put at the base level with your Docker file.
+The `setup-start.sh` script will move the file you provide into `${REPO_DIR}/childstarts`. As usual you can name your script something other than `start`.

```
-RUN /pyrocket_scripts/setup-start.sh
+COPY start start
+RUN /pyrocket_scripts/setup-start.sh start
+RUN rm start
```
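
The behavior described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions, not the actual `setup-start.sh`: the function name `register_start_script` is hypothetical, a temporary directory stands in for `${REPO_DIR}`, the directory name `childstarts` is taken from the prose above, and the file is copied rather than moved so that a later `RUN rm start` would still find it.

```shell
#!/bin/sh
# Hypothetical sketch; the real setup-start.sh may differ.
# A temporary directory stands in for ${REPO_DIR} so the demo runs anywhere.
REPO_DIR=$(mktemp -d)

register_start_script() {
    src="$1"
    # Fail early with a clear message if the file was never copied into the image.
    if [ ! -f "$src" ]; then
        echo "Error: start script '$src' not found" >&2
        return 1
    fi
    mkdir -p "${REPO_DIR}/childstarts"
    # cp rather than mv (an assumption), so a later `rm` of the source still succeeds.
    cp "$src" "${REPO_DIR}/childstarts/"
    chmod +x "${REPO_DIR}/childstarts/$(basename "$src")"
}

# Demo: register a placeholder start script and list the result.
demo_script="${REPO_DIR}/my-start"
echo 'echo "hello from child start"' > "$demo_script"
register_start_script "$demo_script"
ls "${REPO_DIR}/childstarts"
```

Keeping each child script as a separate file in its own directory means the parent `start` file is never overwritten, which is the problem the paragraph above warns about.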

## Desktop applications
36 changes: 32 additions & 4 deletions book/customizing.qmd
@@ -2,22 +2,44 @@

py-rocket-base is designed to be used in the FROM line of a Dockerfile similar to rocker images. It includes directories called `/pyrocket_scripts` and `/rocker_scripts` that will help you do common tasks for scientific docker images. You do not have to use these scripts. If you are familiar with writing Docker files, you can write your own code. The exception is installation of Desktop files. Properly adding Desktop applications to py-rocket-base requires the use of the `/pyrocket_scripts/install-desktop.sh` script. The start file is also an exception. See the discussion in the configuration files.

-**Calling the scripts**
+## helper scripts

-The format for calling the pyrocket and rocker scripts is the following.
+### pyrocket scripts
+
+How to use the helper scripts is shown in [configuration files](configuration_files.html). The helper scripts provide code to do common tasks. Users can write their own Docker file code to do these tasks but the helper scripts provide standardized code. The scripts are
+
+* `install-conda-packages.sh`
+* `install-pip-packages.sh`
+* `install-r-packages.sh`
+* `install-apt-packages.sh`
+* `install-desktop.sh`
+* `setup-start.sh`
+* `run-postbuild.sh`
+
+### rocker scripts
+
+The rocker docker stack also includes a set of scripts for extending rocker packages. These are included in py-rocket-base.
+
+### Calling the scripts
+
+The format for calling the pyrocket and rocker scripts is the following.
+
+pyrocket scripts take files (or a path to a directory with Desktop files) as arguments. The `COPY` command is needed to copy the file from the Docker build context into the image, where it can be used in `RUN` commands. Without this you will get a "file not found" error. Removing the file after you are done with it will clean up your image files.
```
+COPY environment.yml environment.yml
RUN /pyrocket_scripts/install-conda-packages.sh environment.yml
+RUN rm environment.yml
```

-Note that PATH must be given for the subshell since rocker installation scripts will fail with conda on the path.
+Rocker scripts do not take arguments. Note that PATH must be given since rocker installation scripts will fail with conda on the path. The path specification is only within the specific RUN context.
```
USER root
RUN PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin && \
/rocker_scripts/install_geospatial.sh
USER ${NB_USER}
```
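
The point that the `PATH` assignment is confined to one `RUN` instruction can be checked in an ordinary shell. The sketch below is an illustration, not py-rocket-base code: it assumes each `RUN` behaves like a fresh `sh -c` child process, and the variable names are made up for the demo.

```shell
#!/bin/sh
# Illustration only: each Docker RUN starts a fresh shell, modeled here with `sh -c`.
original_path="$PATH"

# First "RUN": PATH is reassigned inside the child shell, as in the Dockerfile example.
inside_path=$(sh -c 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin && echo "$PATH"')

# Second "RUN": a new child shell inherits the parent's PATH again.
after_path=$(sh -c 'echo "$PATH"')

echo "inside: $inside_path"
echo "after:  $after_path"
```

Because the assignment lives only in the child process, the conda directories return to the path in the next instruction, so any later rocker script call needs the same `PATH=...` prefix.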

-## File structure
+## Repository file structure

Here is a typical repo structure. Only the Dockerfile is required. The rest are optional. The files names, `apt.txt`, `environment.yml`, `requirements.txt` and `install.R` are optional. You can name these files other names and still use them in the pyrocket scripts. `Desktop`, `start` and `postBuild` must be named exactly this to use them in the pyrocket scripts (these scripts do not take filename arguments).

@@ -54,7 +76,9 @@ Dockerfile
```
FROM ghcr.io/nmfs-opensci/py-rocket-base:latest
+COPY environment.yml environment.yml
RUN /pyrocket_scripts/install-conda-packages.sh environment.yml
+RUN rm environment.yml
```

environment.yml
@@ -81,7 +105,9 @@ Dockerfile
```
FROM ghcr.io/nmfs-opensci/py-rocket-base:latest
+COPY install.R install.R
RUN /pyrocket_scripts/install-r-packages.sh install.R
+RUN rm install.R
```

install.R
@@ -108,7 +134,9 @@ Dockerfile
FROM ghcr.io/nmfs-opensci/py-rocket-base:latest
USER root
+COPY apt.txt apt.txt
RUN /pyrocket_scripts/install-apt-packages.sh apt.txt
+RUN rm apt.txt
USER ${NB_USER}
```

8 changes: 5 additions & 3 deletions book/desktop.qmd
@@ -8,17 +8,19 @@ py-rocket-base puts these `.desktop` files in `/usr/share/Desktop`. Typically th

## Adding an application in your child docker image

-Use the pyrocket script, `install-desktop.sh` to set up the desktop and move your files to the proper location. Here is the code for your Docker file. This script must be run as root. It does not take any arguments. Instead you include your desktop files in a directory called `Desktop` at the base level with your Docker file.
+Use the pyrocket script `install-desktop.sh` to set up the desktop and move your files to the proper location. Provide a path to a directory with your Desktop files as the argument to the script. Here is the code for your Docker file. This script must be run as root.

```
USER root
-RUN /pyrocket_scripts/install-desktop.sh
+COPY ./Desktop /tmp/Desktop
+RUN /pyrocket_scripts/install-desktop.sh /tmp/Desktop
+RUN rm -rf /tmp/Desktop
USER ${NB_USER}
```

### Create the Desktop directory

-Create the directory and add the .desktop and optional .png and .xml files. py-rocket-base will move them to the correct places (`/usr/share/applications` and `/usr/share/Desktop`, `/usr/share/mime/packages` and icon locations).
+Create the directory and add the .desktop and optional .png and .xml files. The pyrocket script `install-desktop.sh` will move them to the correct places (`/usr/share/applications` and `/usr/share/Desktop`, `/usr/share/mime/packages` and icon locations).
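
Routing files by extension, as described above, might look roughly like the following. This is a hedged sketch and not the contents of `install-desktop.sh`: the function name `route_desktop_files` is hypothetical, `DEST_ROOT` is a temporary stand-in for `/usr/share` so the demo needs no root access, and the single `icons` directory is an assumption because the text only says "icon locations".

```shell
#!/bin/sh
# Hypothetical sketch of routing Desktop files by extension.
# DEST_ROOT stands in for /usr/share so the demo runs without root.
DEST_ROOT=$(mktemp -d)

route_desktop_files() {
    src_dir="$1"
    mkdir -p "$DEST_ROOT/applications" "$DEST_ROOT/Desktop" \
             "$DEST_ROOT/mime/packages" "$DEST_ROOT/icons"
    for f in "$src_dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        case "$f" in
            *.desktop)
                cp "$f" "$DEST_ROOT/applications/"
                cp "$f" "$DEST_ROOT/Desktop/"
                ;;
            *.xml) cp "$f" "$DEST_ROOT/mime/packages/" ;;
            *.png) cp "$f" "$DEST_ROOT/icons/" ;;   # icon location is assumed
        esac
    done
}

# Demo with empty placeholder files.
demo_dir=$(mktemp -d)
touch "$demo_dir/myapp.desktop" "$demo_dir/myapp.xml" "$demo_dir/myapp.png"
route_desktop_files "$demo_dir"
find "$DEST_ROOT" -type f | sort
```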

```
your-repo/
27 changes: 16 additions & 11 deletions book/developers.qmd
@@ -11,7 +11,7 @@ py-rocket-base is inspired by [repo2docker](https://github.com/jupyterhub/repo2d

The Pangeo Docker stack does not use repo2docker, but mimics repo2docker's environment design. The Pangeo base-image behaves similar to repo2docker in that using the base-image in the `FROM` line of a Dockerfile causes the build to look for files with the same names as repo2docker's [configuration files](https://repo2docker.readthedocs.io/en/latest/config_files.html) and then do the proper action with those files. This means that routine users do not need to know how to write Dockerfile code in order to extend the image with new packages or applications. py-rocker-base Docker image uses this Pangeo base-image design. It is based on `ONBUILD` commands in the Dockerfile that trigger actions only when the image is used in the `FROM` line of another Dockerfile.

-py-rocket-base does not include this `ONBUILD` behavior. Instead it follows the [rocker docker stack](https://github.com/rocker-org/rocker-versioned2) design. py-rocket-base a directory called `\pyrocket_scripts`that will help you do common tasks for scientific docker images.These scripts are not required. If users are familiar with writing Docker files, they can write their own code. The use of helper scripts was used after feedback that the Pangeo ONBUILD behavior makes it harder to customize images that need very specific structure or order of operations.
+py-rocket-base does not include this `ONBUILD` behavior. Instead it follows the [rocker docker stack](https://github.com/rocker-org/rocker-versioned2) design and provides helper scripts for building on the base image. py-rocket-base includes a directory called `/pyrocket_scripts` that will help you do common tasks for scientific docker images. These scripts are not required. If users are familiar with writing Docker files, they can write their own code. Helper scripts were adopted after feedback that the Pangeo ONBUILD behavior makes it harder to customize images that need very specific structure or order of operations.

*There are many ways to install R and RStudio into an image designed for JupyterHubs.* The objective of py-rocket-base is not to install R and RStudio per se, and there are other leaner and faster ways to install R/RStudio if that is your goal[^1]. The objective of py-rocket-base is to create a JupyterHub image such that when you click the RStudio button in the JupyterLab UI to enter the RStudio UI, you enter an environment that is the same as if you had used a Rocker image. If you are in the JupyterLab UI, the environment is the same as if you had used repo2docker (or Pangeo base-image) to create the environment.

@@ -166,12 +166,9 @@ RUN chown -R root:staff /pyrocket_scripts && \ # <8>
# Convert NB_USER to ENV (from ARG) so that it passes to the child dockerfile
ENV NB_USER=${NB_USER} # <9>

-# Copy the child repo files into childimage so they are available to scripts
-ONBUILD COPY --chown=${NB_USER}:${NB_USER} . ${REPO_DIR}/childimage # <10>
-
# Revert to default user and home as pwd
-USER ${NB_USER} # <11>
-WORKDIR ${HOME} # <11>
+USER ${NB_USER} # <10>
+WORKDIR ${HOME} # <10>
```
1. Some commands need to be run as root, such as installing linux packages with `apt-get`
2. Set variables. CONDA_ENV is useful for child builds
@@ -182,7 +179,6 @@ WORKDIR ${HOME} # <11>
7. `book` and `docs` are the documentation files and are not needed in the image.
8. Copy the pyrocket helper scripts to the `/pyrocket_scripts` directory and set to executable.
9. The `NB_USER` environmental variable is not exported by repo2docker (it is an argument confined to the parent build) but is very useful for child builds. So it is converted to an environmental variable.
-10. Copy the child build context (files with the Docker file) into `${REPO_DIR}`. Make sure that jovyan owns the directory. Note, jovyan owns `${REPO_DIR}` (this is set by repo2docker).
11. The parent docker build completes by setting the user to jovyan and the working directory to `${HOME}`. Within a JupyterHub deployment, `${HOME}` will often be re-mapped to the user persistent memory so it is important not to write anything that needs to be persistent to `${HOME}`, for example configuration. You can do this in the `start` script since that runs after the user directory is mapped or you can put configuration files in some place other than `${HOME}`.
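
As an illustration of the last point, a child `start` script could write per-user configuration at start-up, after `${HOME}` has been mapped. Everything named here is hypothetical (the function `write_user_config`, the `myapp` directory, and the `theme=dark` setting); it is a sketch of the pattern, not a py-rocket-base convention.

```shell
#!/bin/sh
# Hypothetical child start script: create a per-user config file at start-up,
# after ${HOME} has been mapped to persistent storage.
# The file name and setting are placeholders.
write_user_config() {
    config_dir="${HOME}/.config/myapp"
    config_file="${config_dir}/settings.conf"
    # Only write defaults if the user does not already have a config.
    if [ ! -f "$config_file" ]; then
        mkdir -p "$config_dir"
        echo "theme=dark" > "$config_file"
    fi
}

# Demo: use a throwaway HOME so the sketch does not touch a real home directory.
HOME=$(mktemp -d)
write_user_config
cat "${HOME}/.config/myapp/settings.conf"
```

Checking for an existing file first keeps a user's own edits from being overwritten each time the server starts.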

## rocker.sh
@@ -263,13 +259,22 @@ set -euo pipefail
source ${REPO_DIR}/env.txt # <1>
# End - Set any environment variables here # <1>
-# Run child start in a subshell to contain its environment
-[ -f ${REPO_DIR}/childimage/start ] && ( source ${REPO_DIR}/childimage/start ) # <2>
+# Run child start scripts in a subshell to contain its environment
+# ${REPO_DIR}/childstart/ is created by setup-start.sh
+if [ -d "${REPO_DIR}/childstart/" ]; then # <2>
+ for script in ${REPO_DIR}/childstart/*; do # <2>
+ if [ -f "$script" ]; then # <2>
+ echo "Sourcing script: $script" # <2>
+ source "$script" || { # <2>
+ echo "Error: Failed to source $script. Moving on to the next script." # <2>
+ } # <2>
+ fi # <2>
+ done # <2>
+fi # <2>
exec "$@"
```
1. A Docker file has no way to set environmental variables dynamically, so the `env.txt` file with the `export <var>=<value>` lines is sourced at start up.
-2. Run any child start script in a subshell. Run in a subshell to contain any `set` statements or similar.
+2. Run any child start script in a subshell. Run in a subshell to contain any `set` statements or similar. start scripts are moved into `childstarts` by the `setup-start.sh` pyrocket script.
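
What "containing" a script in a subshell means can be shown with a small, self-contained demo. This is a generic shell illustration with a made-up child script, not code from py-rocket-base; it uses `.`, the POSIX spelling of `source`, so it also runs under plain `sh`.

```shell
#!/bin/sh
# Hypothetical child start script written to a temporary file for the demo.
child_script=$(mktemp)
cat > "$child_script" <<'EOF'
set -u
CHILD_VAR="set by child"
EOF

# Sourced inside ( ... ): the variable and the `set -u` option stay in the subshell.
( . "$child_script" )
subshell_result="${CHILD_VAR:-unset}"

# Sourced directly: the variable and the shell option leak into the calling shell.
. "$child_script"
direct_result="${CHILD_VAR:-unset}"

set +u   # undo the option that leaked in from the direct source
echo "after subshell: $subshell_result"
echo "after direct source: $direct_result"
rm -f "$child_script"
```

Only the parenthesized form isolates the caller. A bare `source` inside a loop, without the surrounding parentheses, gives no such isolation.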

## desktop.sh
