Merge pull request #55 from eth-cscs/docs
Update docs
rsarm authored Dec 6, 2024
2 parents 088e664 + 22999dc commit 24a6521
Showing 5 changed files with 308 additions and 301 deletions.
18 changes: 7 additions & 11 deletions docs/source/authentication.rst
@@ -22,7 +22,7 @@ Authorization Code Flow
-----------------------

For CSCS's requirements, the most suitable authentication method is the Authorization Code Flow, where users log into JupyterHub and receive an access token and a refresh token from Keycloak.
The access token is then given to the spawner to be used for authentication with FirecREST.
Since access tokens expire after a few minutes, they must be refreshed before performing any spawner operations.
New access tokens are requested by providing a refresh token, which has a longer lifespan.
During the validity period of a refresh token, also known as the single sign-on (SSO) session, the process of requesting new access tokens can be managed either by JupyterHub or by the spawner itself.
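
As an illustration (this is not code from FirecRESTSpawner), a new access token can be requested from Keycloak's token endpoint by presenting the refresh token; the server, realm, client and token below are placeholders:

.. code-block:: Shell

   # Sketch of a token refresh against Keycloak; a confidential client would
   # additionally have to pass -d "client_secret=<client-secret>".
   curl -s -X POST \
        -d "grant_type=refresh_token" \
        -d "client_id=<client-id>" \
        -d "refresh_token=<refresh-token>" \
        "https://<keycloak-server>/realms/<realm>/protocol/openid-connect/token"

The JSON response contains a new ``access_token`` together with a new ``refresh_token``.
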
@@ -36,8 +36,8 @@ To prevent this issue, FirecRESTSpawner can check the notebook status using an a…
Client Credentials Flow
-----------------------

In this approach, JupyterHub uses a single Client Credentials Flow service account to poll for the job status of every user.
In contrast to the Authorization Code Flow, this Keycloak authentication method makes it possible to refresh access tokens by providing only a client ID and secret.
Note that such a service account has no permission to start or stop user jobs.
Those actions can only be done by providing the user's access token.
By combining this method with JupyterHub's own session management, which operates independently of Keycloak, potential polling failures can be completely avoided.
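
For illustration, such a service account obtains its access tokens directly from Keycloak's token endpoint; the client ID and secret below are placeholders, not the actual CSCS setup:

.. code-block:: Shell

   # Sketch of the Client Credentials Flow: no user refresh token is involved.
   curl -s -X POST \
        -d "grant_type=client_credentials" \
        -d "client_id=<service-client-id>" \
        -d "client_secret=<service-client-secret>" \
        "https://<keycloak-server>/realms/<realm>/protocol/openid-connect/token"
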
@@ -61,17 +61,13 @@ Stopping the server
Enabling JupyterHub's authentication state
------------------------------------------

The access and refresh tokens are obtained by the spawner via `JupyterHub's authentication state <https://jupyterhub.readthedocs.io/en/latest/reference/authenticators.html#authenticator-auth-state>`_.
That information must be stored in the hub's database so that it is accessible to the spawner.
By default, the authentication state is not persisted.
That feature must be enabled in the configuration by setting

.. code-block:: Python

   c.Authenticator.enable_auth_state = True

Since the authentication state is encrypted before being stored in the database, ``c.CryptKeeper.keys`` must be set in JupyterHub's configuration and the environment variable ``JUPYTERHUB_CRYPT_KEY`` must be defined on the system where the notebook servers will run.
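
For instance, a suitable key can be generated and exported with ``openssl`` (one common convention, not a requirement of JupyterHub):

.. code-block:: Shell

   # Generate a 32-byte hex key; the same key must be available to the hub
   # (via c.CryptKeeper.keys) and on the system running the notebook server.
   export JUPYTERHUB_CRYPT_KEY=$(openssl rand -hex 32)
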
287 changes: 6 additions & 281 deletions docs/source/deployment.rst
@@ -1,13 +1,10 @@
Deployment
==========

Overview
--------

Deploying JupyterHub has two components:

Hub and proxy
   Users access the hub (JupyterHub), which is a multi-user platform from where Jupyter notebook servers can be launched.
   When using FirecRESTSpawner, notebook servers are started via FirecREST on the compute nodes of HPC clusters.
   The proxy routes the communication from the user's browser to the hub or to the notebook servers.
   Besides access to the internet and to the FirecREST server, no special requirements are necessary for the platforms running the hub and the proxy.
@@ -18,282 +15,10 @@ Jupyter notebook servers
   That can be done either natively or as a container image.
   This part of the deployment doesn't require FirecREST.

Reference deployment at CSCS
----------------------------

At the Swiss National Supercomputing Centre (CSCS), JupyterHub is deployed on Kubernetes.
From there, JupyterLab servers are launched on different HPC clusters via FirecREST.
Each deployment targets a single cluster.

JupyterHub is deployed via ArgoCD using the `f7t4jhub <https://eth-cscs.github.io/firecrestspawner>`_ Helm chart.
The chart is available in the `spawner's repository <https://github.com/eth-cscs/firecrestspawner/tree/main/chart>`_.
It has been designed mainly for CSCS, but it's general enough for use at other sites.

.. figure:: images/chart.png
   :alt: Schematic representation of the f7t4jhub chart
   :width: 700px
   :align: center

   Schematic representation of the f7t4jhub chart

In our deployments at CSCS, the hub and proxy run in their own pods.
That's a standard practice that allows the hub to be restarted (to apply a new configuration, for instance) without affecting users with running JupyterLab servers.
The deployment uses the following images:

Proxy
   JupyterHub's default `configurable-http-proxy <https://github.com/jupyterhub/configurable-http-proxy>`_ is used as a proxy.
   We package it in the container image `ghcr.io/eth-cscs/chp <https://github.com/eth-cscs/firecrestspawner/pkgs/container/chp>`_.
   Initially we used ``quay.io/jupyterhub/configurable-http-proxy:4.6.1``, but for security reasons we now build our own image based on the newer ``node:lts-alpine3.19``.

Hub
   For the hub, we use our container image ``ghcr.io/eth-cscs/f7t4jhub``, which includes JupyterHub and FirecRESTSpawner.
   The corresponding Dockerfile can be found `here <https://github.com/eth-cscs/firecrestspawner/blob/main/dockerfiles/Dockerfile>`_.

The following figure shows a schematic representation of the deployment:

.. figure:: images/cscs-deployment.png
   :alt: JupyterHub deployment at CSCS
   :width: 500px
   :align: center

   JupyterHub deployment at CSCS

Access to Keycloak
~~~~~~~~~~~~~~~~~~

At CSCS, the Keycloak client IDs and secrets used to log in to JupyterHub are stored in `Vault <https://www.vaultproject.io>`_.
They can be accessed in our Kubernetes deployment via a set of secrets:

- The ``vault-approle-secret`` Kubernetes ``Secret``, which contains the credentials to access Vault.
  This secret is not part of the Helm chart. It must be created manually in the namespace where the chart will be deployed.

- A `SecretStore <https://github.com/eth-cscs/firecrestspawner/blob/main/chart/f7t4jhub/templates/secret-store.yaml>`_, which interacts with the ``vault-approle-secret`` secret.

- An `ExternalSecret <https://github.com/eth-cscs/firecrestspawner/blob/main/chart/f7t4jhub/templates/external-secret.yaml>`_, which interacts with the ``SecretStore``, allowing the deployment to access the client IDs and secrets.

- An optional `ExternalSecret to access credentials for a custom container registry <https://github.com/eth-cscs/firecrestspawner/blob/main/chart/f7t4jhub/templates/external-secret-registry.yaml>`_. That's currently not in use.

The section of the chart related to Vault is optional and can be disabled in the ``values.yaml``.
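
As a sanity check (not part of the chart itself), the synchronization of these resources can be inspected with ``kubectl``, assuming the External Secrets Operator is installed in the cluster:

.. code-block:: Shell

   kubectl get secretstore -n<namespace>
   kubectl get externalsecret -n<namespace>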

JupyterHub configuration
~~~~~~~~~~~~~~~~~~~~~~~~

Another key element of the chart is the ``ConfigMap`` mentioned above, which provides
the `JupyterHub configuration <https://jupyterhub.readthedocs.io/en/stable/tutorial/getting-started/config-basics.html>`_.
While the configuration includes many parameters, only a handful need to be modified from one deployment to another.
Therefore, templating only those parameters seems to be sufficient to create a generic chart for all CSCS deployments,
requiring only minor adjustments in the ``values.yaml``.
In our deployments, the required changes are typically related to the authentication settings and the batch script used by the spawner
to submit the Jupyter notebook servers, as Slurm settings may vary between clusters.
All JupyterHub configuration parameters are set under ``config`` in the ``values.yaml``.
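
The default values, including the ``config`` section, can be inspected with ``helm show values`` once the chart repository has been added locally (see the Helm section below):

.. code-block:: Shell

   helm show values f7t4jhub/f7t4jhub --version <chart-version>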

Live updates
~~~~~~~~~~~~

The chart uses `Reloader <https://github.com/stakater/Reloader>`_ to ensure that the hub pod is restarted if the configuration is modified or if secrets are changed in Vault.
Since the hub and the proxy run in different pods, and the JupyterHub database is stored on a persistent volume, it's possible to apply new configurations without affecting users that have JupyterLab running.

HTTPS Provisioning
~~~~~~~~~~~~~~~~~~

HTTPS is automatically provided by `cert-manager <https://cert-manager.io/>`_, which handles the management of SSL/TLS certificates to ensure secure connections.


Deploying the chart
~~~~~~~~~~~~~~~~~~~

This section explains how the chart is deployed with Helm or ArgoCD.
For either option, there's a common first step, which is the creation of the ``vault-approle-secret``.
That can be done in a namespace with the following commands:

.. code-block:: Shell

   kubectl create namespace <namespace>
   kubectl create secret generic vault-approle-secret --from-literal secret-id=<approle-secret-id> -n<namespace>

Here ``secret-id=<approle-secret-id>`` is a key-value pair.
The actual value of ``<approle-secret-id>`` can be copied from an existing ``vault-approle-secret``:

.. code-block:: Shell

   kubectl get secret vault-approle-secret -n<existing-namespace> -o yaml
   # apiVersion: v1
   # data:
   #   secret-id: <approle-secret-id-base64>
   # kind: Secret
   # metadata:
   #   creationTimestamp: "2024-03-06T16:22:23Z"
   #   name: vault-approle-secret
   #   namespace: jhub-clariden-tds
   #   resourceVersion: "206319585"
   #   uid: 29490228-a546-4609-bba3-102dc9b113b9
   # type: Opaque

In the output, ``<approle-secret-id-base64>`` is the ``<approle-secret-id>`` encoded in Base64.
It must be decoded in order to use it with ``kubectl create secret``.

In short:

.. code-block:: Shell

   kubectl get secret vault-approle-secret -njhub-eiger-dev -o jsonpath="{.data.secret-id}" | base64 --decode

Helm
^^^^

The chart repository can be added to the local Helm repositories with

.. code-block:: Shell

   helm repo add f7t4jhub https://eth-cscs.github.io/firecrestspawner
   helm repo update

Now, for instance, the available versions can be displayed:

.. code-block:: Shell

   helm search repo f7t4jhub/f7t4jhub --versions
   # NAME               CHART VERSION  APP VERSION  DESCRIPTION
   # f7t4jhub/f7t4jhub  0.6.0          4.1.5        A Helm chart to Deploy JupyterHub with the Fire...
   # f7t4jhub/f7t4jhub  0.5.2          4.1.5        A Helm chart to Deploy JupyterHub with the Fire...
   # f7t4jhub/f7t4jhub  0.5.1          4.1.5        A Helm chart to Deploy JupyterHub with the Fire...
   # f7t4jhub/f7t4jhub  0.5.0          4.1.5        A Helm chart to Deploy JupyterHub with the Fire...
   # f7t4jhub/f7t4jhub  0.3.0          4.1.5        A Helm chart to Deploy JupyterHub with the Fire...

Once available locally, the chart can be installed with

.. code-block:: Shell

   helm dependency build
   helm install <namespace> -n<namespace> f7t4jhub/f7t4jhub --values values.yaml --version <chart-version>

and updated live with

.. code-block:: Shell

   helm dependency build
   helm upgrade <namespace> -n<namespace> f7t4jhub/f7t4jhub --values values.yaml

Here we have used the same name for the namespace and the Helm release.
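
The state of the release can then be verified with, for instance:

.. code-block:: Shell

   helm list -n<namespace>
   helm status <namespace> -n<namespace>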

ArgoCD
^^^^^^

The ``values.yaml``, as presented in the spawner's repository, is written for a deployment with Helm.
To deploy the chart with ArgoCD, because of the way we defined the dependencies, both the ``reloader`` and the ``f7t4jhub`` sections must be indented into another section of the same name.
The structure should look like the following code block, where we have highlighted the two new sections:

.. code-block:: Yaml
   :emphasize-lines: 1, 8

   reloader:
     reloader:
       reloader:
         # Set to true to enable the reloader for automatically restarting pods on ConfigMap/Secret changes.
         enabled: true
       ...

   f7t4jhub:
     f7t4jhub:
       setup:
         # URL for the Firecrest service (replace with your own Firecrest URL)
         firecrestUrl: "https://firecrest.cscs.ch"
       ...

The dependencies are defined as in the following ``Chart.yaml`` for version ``0.8.6`` of the chart:

.. code-block:: Yaml

   apiVersion: v2
   name: f7t4jhub
   description: A Helm chart to Deploy JupyterHub with the FirecREST Spawner
   type: application
   version: 0.8.6  # same as the chart version
   appVersion: "4.1.5"
   dependencies:
     - name: f7t4jhub
       version: 0.8.6  # chart version
       repository: https://eth-cscs.github.io/firecrestspawner
     - name: reloader
       version: v1.0.51
       repository: https://stakater.github.io/stakater-charts
       condition: reloader.reloader.enabled

For more information about the ArgoCD deployment, please get in contact with us.


Software installation in the cluster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A JupyterLab installation including the spawner must be available in the HPC cluster.
From the spawner, only the ``firecrestspawner-singleuser`` script is used since it's needed to launch the JupyterLab server.
The needed software can be installed, for instance, with:

.. code-block:: Shell

   pip install --no-cache jupyterhub==4.1.5 pyfirecrest==2.1.0 SQLAlchemy==1.4.52 oauthenticator==16.0.7 jupyterlab==4.1.8
   git clone https://github.com/eth-cscs/firecrestspawner.git
   cd firecrestspawner
   git checkout test-eiger
   pip install .

That software can be installed in a Python virtual environment, in a container image, or in a `uenv <https://github.com/eth-cscs/uenv>`_ image.
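
For a native installation, for example, the commands above can be run inside a virtual environment; the path below is a placeholder:

.. code-block:: Shell

   # Create and activate a virtual environment, then run the installation
   # commands shown above.
   python -m venv /path/to/jhub-env
   source /path/to/jhub-env/bin/activate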

Container images
^^^^^^^^^^^^^^^^

As an example, this is a Dockerfile that installs JupyterLab and the spawner within a PyTorch image from the `NVIDIA GPU Cloud <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch>`_:

.. code-block:: Dockerfile

   FROM nvcr.io/nvidia/pytorch:24.07-py3

   RUN pip install --no-cache jupyterlab jupyterhub==4.1.6 pyfirecrest==2.1.0 SQLAlchemy==1.4.52 oauthenticator==16.3.1 notebook==7.2.1

   RUN git clone https://github.com/eth-cscs/firecrestspawner.git && \
       cd firecrestspawner && \
       pip install .

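
The image can then be built and pushed with the usual Docker workflow; the image name below is only an example:

.. code-block:: Shell

   docker build -t <registry>/pytorch-jupyterlab:24.07-py3 .
   docker push <registry>/pytorch-jupyterlab:24.07-py3
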
Uenvs
^^^^^

A simple way to create a uenv to be used with the JupyterHub deployment is to start from the `prgenv-gnu <https://github.com/eth-cscs/alps-uenv/tree/main/recipes/prgenv-gnu/23.11/mc>`_ recipe. One way to go is to include the ``py-pip`` Spack package in the ``environment.yaml`` (the ``osu-micro-benchmarks@5.9`` package can be removed):

.. code-block:: Yaml
   :emphasize-lines: 18

   gcc-env:
     compiler:
       - toolchain: gcc
         spec: gcc@12
     mpi:
       spec: cray-mpich
       gpu: Null
     unify: true
     specs:
       - cmake
       - fftw
       - fmt
       - hdf5
       - ninja@1.11
       - openblas
       - python@3.11
       - py-pybind11
       - py-pip
     variants:
       - +mpi
     views:
       default:

and to add a post-install script that takes care of installing the necessary software:

.. code-block:: Shell

   export PATH=/user-environment/env/default/bin:$PATH
   pip install --no-cache jupyterhub==4.1.5 pyfirecrest==2.1.0 SQLAlchemy==1.4.52 oauthenticator==16.0.7 jupyterlab==4.1.8
   git clone https://github.com/eth-cscs/firecrestspawner.git
   cd firecrestspawner
   pip install .

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   deployment_cscs
   deployment_demo