Add local repo2docker information. #6505

Merged
merged 3 commits into from
Nov 21, 2024
1 change: 1 addition & 0 deletions docs/_quarto.yml
@@ -46,6 +46,7 @@ website:
- tasks/rebuild-postgres-image.qmd
- tasks/managing-multiple-user-image-repos.qmd
- tasks/new-image.qmd
- tasks/repo2docker-local.qmd
- tasks/transition-image.qmd
- tasks/new-packages.qmd
- tasks/course-config.qmd
93 changes: 46 additions & 47 deletions docs/tasks/new-image.qmd
@@ -19,22 +19,20 @@ If that is the case, only a `Dockerfile` format will work.

As always, create a feature branch for your changes, and submit a PR when done.

## Use an existing image as a template
There are two approaches to pre-populate the image's assets:

Browse through our [image repos](https://github.com/orgs/berkeley-dsep-infra/repositories?language=&q=image&sort=&type=all)
to find a hub that is similar to the one you are trying to create. This will
give you a good starting point.
- Use an existing image as a template.
Browse through our [image
repos](https://github.com/orgs/berkeley-dsep-infra/repositories?language=&q=image&sort=&type=all)
to find a hub that is similar to the one you are trying to create. This will
give you a good starting point.

## Create the image repos
- Fork [hub-user-image-template](https://github.com/berkeley-dsep-infra/hub-user-image-template). Click "Use this template" > "Create a new repository".
Be sure to follow convention and name the repo `<hubname>-user-image`, and
the owner needs to be `berkeley-dsep-infra`. When that is done, create your
own fork of the new repo.

Create a new image repo from the [hub-user-image-template](https://github.com/berkeley-dsep-infra/hub-user-image-template).
Click "Use this template" > "Create a new repository".

Be sure to follow convention and name the repo `<hubname>-user-image`, and the
owner needs to be `berkeley-dsep-infra`. When that is done, create your own
fork of the new repo.

### Configuring the root image repo
### Image Repository Settings

There are now a few steps to set up the CI/CD for the new image repo. In the
`berkeley-dsep-infra` image repo, click on `Settings`, and under `General`,
@@ -53,7 +51,7 @@ will be adding two new variables:
always be `ucb-datahub-2018/user-images/<image-name>` and the
image name will always be the same as the repo: `<hubname>-user-image`.

### Configure your fork
### Your Fork's Repository Settings

Now you will want to disable GitHub Actions for your fork of the image repo.
If you don't, whenever you push PRs to the root repo the workflows *in your
@@ -64,25 +62,29 @@ failure.
To disable this for your fork, click on `Settings`, `Actions` and `General`.
Check the `Disable actions` box and click save.

### Add the root image repo to the list of allowed repos in the `berkeley-dsep-infra` secrets.
### Enable Artifact Registry Pushing

Now, go to the `berkeley-dsep-infra` [Secrets and Variables](https://github.com/organizations/berkeley-dsep-infra/settings/secrets/actions).
You will need to give your repo permissions to push to the Artifact Registry,
The image repository needs to be added to the list of allowed repositories in
the `berkeley-dsep-infra` secrets. Go to the `berkeley-dsep-infra` [Secrets and
Variables](https://github.com/organizations/berkeley-dsep-infra/settings/secrets/actions).
Give your repository permissions to push to the Artifact Registry,
as well as to push a branch to the [datahub repo](https://github.com/berkeley-dsep-infra/datahub).

Edit both `DATAHUB_CREATE_PR` and `GAR_SECRET_KEY`, and click on the gear icon,
search for your repo name, check the box and save.

### Update your deployment's `hubploy.yaml` and add the image to the primary list of repos.
### Configure `hubploy`

You need to let `hubploy` know the specifics of the image. Change the `name` of the image in
`deployments/<hubname>/hubploy.yaml` to point to your new image name, and after the name add
`:PLACEHOLDER` in place of the image sha. This will be automatically updated after your new image
is built and pushed to the Artifact Registry.
You need to let `hubploy` know the specifics of the image by updating your
deployment's `hubploy.yaml`. Change the `name` of the image in
`deployments/<hubname>/hubploy.yaml` to point to your new image name, and after
the name add `:PLACEHOLDER` in place of the image sha. This will be
automatically updated after your new image is built and pushed to the Artifact
Registry.

Example:

```
```yaml
images:
images:
- name: us-central1-docker.pkg.dev/ucb-datahub-2018/user-images/fancynewhub-user-image:PLACEHOLDER
@@ -102,56 +104,53 @@ Create a PR and merge to staging. You can cancel the
[`Deploy staging and prod hubs` job in Actions](https://github.com/berkeley-dsep-infra/datahub/actions/workflows/deploy-hubs.yaml),
or just let it fail.

## Add a github bot notification in Slack
## Subscribe to GitHub Repo in Slack

Go to the #ucb-datahubs-bots channel, and run the following command:

```
/github subscribe berkeley-dsep-infra/<your repo name>
```

## Modify the image configuration as necessary
## Modify the Image

This step is straightforward: create a feature branch, edit/modify/delete/add
any files in the image repo to configure the image as needed.
This step is straightforward: create a feature branch, and edit, delete, or add
any files to configure the image as needed.

We also strongly recommend copying `README-template.md` over the default
`README.md`, and modifying it to replace all occurrences of `<HUBNAME>` with
the name of your image.

## Submitting a pull request
## Submit Pull Requests

Familiarize yourself with [pull
requests](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)
and [repo2docker](https://github.com/jupyter/repo2docker) , and create a
fork of the [datahub staging
branch](https://github.com/berkeley-dsep-infra/datahub).

1. Set up your git/dev environment by [following the instructions
here](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/CONTRIBUTING.md).
: - This guide is also located in your image repo!

2. Test the changes locally using `repo2docker`, then submit a PR to `staging`.
and [repo2docker](https://github.com/jupyter/repo2docker), and create a fork of
the [datahub staging branch](https://github.com/berkeley-dsep-infra/datahub).

: - To use `repo2docker`, be sure that you are inside the image
repo directory on your device, and then run `repo2docker .`.
1. Set up your git/dev environment by following the [image template's
contributing
guide](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/CONTRIBUTING.md).

3. Commit and push your changes to your fork of the image repo, and
1. [Test the image locally](repo2docker-local.qmd) using `repo2docker`.
1. Submit a PR to `staging`.
1. Commit and push your changes to your fork of the image repo, and
create a new pull request at
https://github.com/berkeley-dsep-infra/<hubname-user-image>.

4. After the build passes, merge your PR in to `main` and the image will
1. After the build passes, merge your PR into `main` and the image will
be built again and pushed to the Artifact Registry. If that succeeds,
then a commit will be crafted that will update the `PLACEHOLDER` field in
`hubploy.yaml` with the image's SHA and pushed to the datahub repo.
You can check on the progress of this workflow in your root image repo's
`Actions` tab.

5. After 4 is completed successfully, go to the Datahub repo and click on
the [New pull request](https://github.com/berkeley-dsep-infra/datahub/compare)
1. After the previous step is completed successfully, go to the Datahub repo
and click on the [New pull
request](https://github.com/berkeley-dsep-infra/datahub/compare)
button. Next, click on the `compare: staging` drop down, and you should see
a branch named something like `update-<hubname>-image-tag-<SHA>`. Select that,
and create a new pull request.
a branch named something like `update-<hubname>-image-tag-<SHA>`. Select
that, and create a new pull request.

6. Once the checks has passed, merge to `staging` and your new image will be
deployed! You can watch the progress [here](https://github.com/berkeley-dsep-infra/datahub/actions/workflows/deploy-hubs.yaml).
1. Once the checks have passed, merge to `staging` and your new image will be
deployed! You can watch the progress in the [deploy-hubs workflow](https://github.com/berkeley-dsep-infra/datahub/actions/workflows/deploy-hubs.yaml).
68 changes: 68 additions & 0 deletions docs/tasks/repo2docker-local.qmd
@@ -0,0 +1,68 @@
---
title: Test User Images Locally
---

You should use `repo2docker` to build and test the image on your own device before you push and create a PR. This is often faster than relying on CI/CD, since you can take advantage of local caching and iterate rapidly, and it avoids spending GitHub Actions minutes on test builds.

## Common Usage

Run `repo2docker /path/to/image/assets`. For example, if you have changed into the directory containing the `repo2docker` files (such as `environment.yml` and/or `Dockerfile`), the command is:

```shell
repo2docker .
```

This works on Linux and Windows Subsystem for Linux (WSL). It builds the image, then launches a Jupyter server and displays a localhost URL. Copy the URL and paste it into a local web browser.

If you just want to build the image without also running the server,
add the `--no-run` argument:

```shell
repo2docker --no-run .
```
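
Before starting a build, it can help to confirm that the target directory actually contains image assets. The following is a minimal sketch and not part of `repo2docker` itself: the function name `has_image_assets` is hypothetical, and it checks only for the two files named above (`environment.yml` and `Dockerfile`), so extend the list if your image relies on other configuration files.

```shell
# Hypothetical helper (not part of repo2docker): succeed if the directory
# contains at least one of the asset files mentioned above.
has_image_assets() {
  dir="${1:-.}"
  for f in environment.yml Dockerfile; do
    if [ -f "$dir/$f" ]; then
      return 0
    fi
  done
  return 1
}

# Usage: build only when assets are present.
# has_image_assets . && repo2docker --no-run .
```

The check is deliberately conservative: it only guards against running the build from the wrong directory and says nothing about whether the assets are valid.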

## On Apple Silicon

Apple's ARM-based CPUs (the "M" chips) use a different architecture from the CPUs of the virtual machines in our clusters. macOS is capable of emulating x86_64/amd64, but you need to optimize Docker for this emulation and explicitly tell your local Docker runtime to build images for the `linux/amd64` platform.

In Docker's settings:

- Under **General** > **Virtual Machine Options**, either enable both **Apple Virtualization framework** and **Use Rosetta for x86_64/amd64 emulation on Apple Silicon**, or enable **Docker VMM**.
- Under **Resources**, it is also recommended to raise the memory limit to at least 4 GB.

There are two methods for building `linux/amd64` images. The default uses `repo2docker`'s support for `docker-py`, while the second uses a `repo2docker` plugin that can invoke your local docker command-line interface.

### docker-py (default)

Run `repo2docker` (also installed as `jupyter-repo2docker`) with the following arguments:

```shell
repo2docker \
  --Repo2Docker.platform=linux/amd64 \
  -e PLAYWRIGHT_BROWSERS_PATH=/srv/conda \
  --user-id=1000 --user-name=jovyan \
  --target-repo-dir=/home/jovyan/.cache \
  .
```

The final parameter is the path to the assets, or `.` if they are in the current directory.

The `--user-id` and `--user-name` options are for non-Dockerfile based builds. Images with Dockerfiles do not need those options.

Note that you may see (possibly harmless) architecture mismatch warnings with this method.
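
If the same checkout is built on both Apple Silicon and x86_64 machines, the platform flag can be added only where it is needed. This is a sketch under assumptions rather than a recommended workflow: the function name `r2d_platform_args` is hypothetical, and it assumes that `uname -m` reports `arm64` (macOS) or `aarch64` (Linux) on ARM hardware.

```shell
# Hypothetical helper: print the extra repo2docker flag needed on ARM hosts.
# The machine name defaults to `uname -m`; pass one explicitly to try it out.
r2d_platform_args() {
  machine="${1:-$(uname -m)}"
  case "$machine" in
    arm64|aarch64) printf '%s\n' '--Repo2Docker.platform=linux/amd64' ;;
    *) : ;;  # other hosts (e.g. x86_64) get no extra flag
  esac
}

# Usage (left unquoted on purpose so an empty result adds no argument):
# repo2docker $(r2d_platform_args) --no-run .
```

Taking the machine name as an optional argument keeps the function easy to exercise on any host without changing how it behaves by default.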

### `docker` CLI

You can instruct `repo2docker` to use your machine's local `docker` executable directly rather than the default of `docker-py`. You will first need to install [repo2podman](https://github.com/manics/repo2podman), a plugin that lets you use any container runtime with a command-line user interface similar to that of `docker`. This is useful if you want to leverage [docker buildx](https://github.com/docker/buildx/) (for things like multi-stage builds) or if you want to use an alternative executable like `podman`. This also eliminates architecture mismatch warnings.

::: {.callout-warning}
repo2podman reportedly does not work yet on WSL.
:::

```shell
repo2docker \
  --Repo2Docker.platform=linux/amd64 \
  -e PLAYWRIGHT_BROWSERS_PATH=/srv/conda \
  --engine podman --PodmanEngine.podman_executable=docker \
  .
```