feat: run notebooks in data service (#375)
Co-authored-by: Samuel Gaist <samuel.gaist@idiap.ch>

squashme: resolve package version conflicts

feat: update and expand apispec for environments

chore: filter environments by owner type

squashme: address comments

feat!: expand environment specification

This is a breaking change in the API.

chore: add tests and minor fixes

chore: test the global environments migration

chore: fix tests

chore: minor improvements to db session handling

squashme: minor fix

squashme: fixups for conflict resolution after merge

squashme: fix failing tests

chore: address comments

feat: add command and args to environments

squashme: notebooks changes

This includes major edits to the notebooks code to work with the data
service.

chore: resolve changes from conflict resolution

chore: do not use the complicated notebooks gitlab header

The gitlab credentials header from the notebooks is really complicated.
We used it here just to get the access token expiry. I modified the
gateway to now pass in an extra header value to indicate the gitlab
token expiry.

squashme: handle per secret adoption in amalthea

squashme: fix parsing of PosixPath in orm

squashme: display the right status and state

squashme: address comments from review pt1

refactor: make APIUser a frozen dataclass

There is no reason for these objects to be modified within the
services, so freezing them simplifies their handling.
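
As a rough illustration of what freezing buys, the sketch below uses a stand-in class rather than the real `APIUser`; the class name `FrozenUser` and its fields are hypothetical. It shows that assignment on a frozen dataclass raises, and that a changed copy can still be produced with `dataclasses.replace`.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class FrozenUser:
    """Hypothetical stand-in for APIUser with only two fields."""

    id: str
    is_admin: bool = False


user = FrozenUser(id="user-1")

try:
    user.is_admin = True  # assignment is rejected on a frozen dataclass
except FrozenInstanceError:
    mutated = False
else:
    mutated = True

# A modified copy is created explicitly; the original stays untouched.
admin = replace(user, is_admin=True)
```

Frozen dataclasses that keep the default `eq=True` are also hashable, provided their field values are hashable.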

squashme: address comments from review pt2

squashme: fixups from rebasing

squashme: use PurePosixPath for workdir and mount
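
A short sketch of why `PurePosixPath` suits container paths: it never touches the local filesystem and always uses forward slashes, whatever platform the service runs on. The concrete paths below are hypothetical examples, not values taken from this commit.

```python
from pathlib import PurePosixPath

# Hypothetical working directory inside a session container.
workdir = PurePosixPath("/home/jovyan/work")

# Joining is pure string manipulation; nothing is checked on disk.
mount = workdir / "data"
```

Because the class is "pure", methods that would need filesystem access, such as `exists()` or `resolve()`, are not available on it.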

squashme: add saved cloud storage model
olevski committed Sep 23, 2024
1 parent 99e2296 commit 1b981e9
Showing 71 changed files with 6,863 additions and 1,482 deletions.
5 changes: 4 additions & 1 deletion .devcontainer/devcontainer.json
@@ -20,7 +20,10 @@
"ghcr.io/devcontainers/features/kubectl-helm-minikube:1": {
"minikube": "none"
},
"ghcr.io/eitsupi/devcontainer-features/jq-likes:2": {},
"ghcr.io/eitsupi/devcontainer-features/jq-likes:2": {
"jqVersion": "latest",
"yqVersion": "latest"
},
"ghcr.io/dhoeric/features/k9s:1": {},
"ghcr.io/EliiseS/devcontainer-features/bash-profile:1": {
"command": "alias k=kubectl"
3 changes: 3 additions & 0 deletions .devcontainer/docker-compose.yml
@@ -23,6 +23,8 @@ services:
ZED_TOKEN: renku
ZED_INSECURE: "true"
POETRY_CACHE_DIR: "/poetry_cache"
NB_SERVER_OPTIONS__DEFAULTS_PATH: /workspace/server_defaults.json
NB_SERVER_OPTIONS__UI_CHOICES_PATH: /workspace/server_options.json
network_mode: service:db
depends_on:
- db
@@ -43,6 +45,7 @@ services:
- "8080:8080"
- "5678:5678"
- "50051:50051"
- "8888:80"

swagger:
image: swaggerapi/swagger-ui
4 changes: 4 additions & 0 deletions .github/workflows/acceptance-tests.yml
@@ -25,6 +25,8 @@ jobs:
renku-graph: ${{ steps.deploy-comment.outputs.renku-graph}}
renku-notebooks: ${{ steps.deploy-comment.outputs.renku-notebooks}}
renku-ui: ${{ steps.deploy-comment.outputs.renku-ui}}
amalthea-sessions: ${{ steps.deploy-comment.outputs.amalthea-sessions}}
amalthea: ${{ steps.deploy-comment.outputs.amalthea}}
test-enabled: ${{ steps.deploy-comment.outputs.test-enabled}}
test-cypress-enabled: ${{ steps.deploy-comment.outputs.test-cypress-enabled}}
persist: ${{ steps.deploy-comment.outputs.persist}}
@@ -84,6 +86,8 @@ jobs:
renku_graph: "${{ needs.check-deploy.outputs.renku-graph }}"
renku_notebooks: "${{ needs.check-deploy.outputs.renku-notebooks }}"
renku_data_services: "@${{ github.head_ref }}"
amalthea: "${{ needs.check-deploy.outputs.amalthea }}"
amalthea_sessions: "${{ needs.check-deploy.outputs.amalthea-sessions }}"
extra_values: "${{ needs.check-deploy.outputs.extra-values }}"

selenium-acceptance-tests:
41 changes: 41 additions & 0 deletions .github/workflows/save_cache.yml
@@ -0,0 +1,41 @@
name: Create cache from commits on main

on:
push:
branches:
- main
- chore-add-kind
workflow_dispatch:


jobs:
save-poetry-cache:
runs-on: ubuntu-latest
env:
CACHE_KEY: main-branch-poetry-cache-ubuntu
CACHE_PATH: .devcontainer/.poetry_cache
DEVCONTAINER_IMAGE_CACHE: ghcr.io/swissdatasciencecenter/renku-data-services/devcontainer

steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Login to Docker Hub
uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Install python deps
uses: devcontainers/ci@v0.3
with:
runCmd: poetry install --with dev
push: always
skipContainerUserIdUpdate: false
imageName: ${{ env.DEVCONTAINER_IMAGE_CACHE }}
cacheFrom: ${{ env.DEVCONTAINER_IMAGE_CACHE }}
- uses: actions/cache/save@v3
name: Create cache
with:
path: ${{ env.CACHE_PATH }}
key: ${{ env.CACHE_KEY }}
15 changes: 15 additions & 0 deletions .github/workflows/test_publish.yml
@@ -70,6 +70,11 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: actions/cache/restore@v3
name: Restore cache
with:
path: ${{ env.CACHE_PATH }}
key: ${{ env.CACHE_KEY }}
- name: Set Git config
shell: bash
run: |
@@ -111,6 +116,11 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: actions/cache/restore@v3
name: Restore cache
with:
path: ${{ env.CACHE_PATH }}
key: ${{ env.CACHE_KEY }}
- name: Set Git config
shell: bash
run: |
@@ -155,6 +165,11 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: actions/cache/restore@v3
name: Restore cache
with:
path: ${{ env.CACHE_PATH }}
key: ${{ env.CACHE_KEY }}
- name: Set Git config
shell: bash
run: |
16 changes: 11 additions & 5 deletions Makefile
@@ -1,7 +1,7 @@
.PHONY: schemas tests test_setup main_tests schemathesis_tests collect_coverage style_checks pre_commit_checks run download_avro check_avro avro_models update_avro kind_cluster install_amaltheas all

AMALTHEA_JS_VERSION ?= 0.11.0
AMALTHEA_SESSIONS_VERSION ?= 0.0.1-new-operator-chart
AMALTHEA_JS_VERSION ?= 0.12.2
AMALTHEA_SESSIONS_VERSION ?= 0.0.9-new-operator-chart
codegen_params = --input-file-type openapi --output-model-type pydantic_v2.BaseModel --use-double-quotes --target-python-version 3.12 --collapse-root-models --field-constraints --strict-nullable --set-default-enum-member --openapi-scopes schemas paths parameters --set-default-enum-member --use-one-literal-as-default --use-default

define test_apispec_up_to_date
@@ -153,7 +153,13 @@ kind_cluster: ## Creates a kind cluster for testing
sleep 15
kubectl wait --namespace ingress-nginx --for=condition=ready pod --selector=app.kubernetes.io/component=controller --timeout=90s

install_amaltheas: ## Installs both version of amalthea in the currently active k8s context.
install_amaltheas: ## Installs both versions of amalthea. NOTE: It uses the currently active k8s context.
helm repo add renku https://swissdatasciencecenter.github.io/helm-charts
helm install amalthea-js renku/amalthea --version $(AMALTHEA_JS_VERSION)
helm install amalthea-sessions renku/amalthea-sessions --version $(AMALTHEA_SESSIONS_VERSION)
helm repo update
helm upgrade --install amalthea-js renku/amalthea --version $(AMALTHEA_JS_VERSION)
helm upgrade --install amalthea-sessions amalthea-sessions-0.0.9-new-operator-chart.tgz --version $(AMALTHEA_SESSIONS_VERSION)

# TODO: Add the version variables from the top of the file here when the charts are fully published
amalthea_schema: ## Regenerates pydantic classes from CRDs
curl https://raw.githubusercontent.com/SwissDataScienceCenter/amalthea/feat-add-cloud-storage/config/crd/bases/amalthea.dev_amaltheasessions.yaml | yq '.spec.versions[0].schema.openAPIV3Schema' | poetry run datamodel-codegen --input-file-type jsonschema --output-model-type pydantic_v2.BaseModel --output components/renku_data_services/notebooks/cr_amalthea_session.py --use-double-quotes --target-python-version 3.12 --collapse-root-models --field-constraints --strict-nullable --base-class renku_data_services.notebooks.cr_base.BaseCRD --allow-extra-fields --use-default-kwarg
curl https://raw.githubusercontent.com/SwissDataScienceCenter/amalthea/main/controller/crds/jupyter_server.yaml | yq '.spec.versions[0].schema.openAPIV3Schema' | poetry run datamodel-codegen --input-file-type jsonschema --output-model-type pydantic_v2.BaseModel --output components/renku_data_services/notebooks/cr_jupyter_server.py --use-double-quotes --target-python-version 3.12 --collapse-root-models --field-constraints --strict-nullable --base-class renku_data_services.notebooks.cr_base.BaseCRD --allow-extra-fields --use-default-kwarg
22 changes: 22 additions & 0 deletions bases/renku_data_services/data_api/app.py
@@ -16,6 +16,7 @@
UserResourcePoolsBP,
)
from renku_data_services.namespace.blueprints import GroupsBP
from renku_data_services.notebooks.blueprints import NotebooksBP, NotebooksNewBP
from renku_data_services.platform.blueprints import PlatformConfigBP
from renku_data_services.project.blueprints import ProjectsBP
from renku_data_services.repositories.blueprints import RepositoriesBP
@@ -134,6 +135,25 @@ def register_all_handlers(app: Sanic, config: Config) -> Sanic:
authenticator=config.authenticator,
internal_gitlab_authenticator=config.gitlab_authenticator,
)
notebooks = NotebooksBP(
name="notebooks_old",
url_prefix=url_prefix,
authenticator=config.authenticator,
nb_config=config.nb_config,
internal_gitlab_authenticator=config.gitlab_authenticator,
git_repo=config.git_repositories_repo,
)
notebooks_new = NotebooksNewBP(
name="notebooks",
url_prefix=url_prefix,
authenticator=config.authenticator,
nb_config=config.nb_config,
project_repo=config.project_repo,
session_repo=config.session_repo,
storage_repo=config.storage_v2_repo,
rp_repo=config.rp_repo,
internal_gitlab_authenticator=config.gitlab_authenticator,
)
platform_config = PlatformConfigBP(
name="platform_config",
url_prefix=url_prefix,
@@ -161,6 +181,8 @@ def register_all_handlers(app: Sanic, config: Config) -> Sanic:
oauth2_clients.blueprint(),
oauth2_connections.blueprint(),
repositories.blueprint(),
notebooks.blueprint(),
notebooks_new.blueprint(),
platform_config.blueprint(),
]
)
12 changes: 10 additions & 2 deletions components/renku_data_services/app_config/config.py
@@ -52,6 +52,7 @@
from renku_data_services.message_queue.interface import IMessageQueue
from renku_data_services.message_queue.redis_queue import RedisQueue
from renku_data_services.namespace.db import GroupRepository
from renku_data_services.notebooks.config import _NotebooksConfig
from renku_data_services.platform.db import PlatformRepository
from renku_data_services.project.db import ProjectMemberRepository, ProjectRepository
from renku_data_services.repositories.db import GitRepositoriesRepository
@@ -144,6 +145,7 @@ class Config:
kc_api: IKeycloakAPI
message_queue: IMessageQueue
gitlab_url: str | None
nb_config: _NotebooksConfig

secrets_service_public_key: rsa.RSAPublicKey
"""The public key of the secrets service, used to encrypt user secrets that only it can decrypt."""
@@ -208,6 +210,10 @@ def __post_init__(self) -> None:
with open(spec_file) as f:
repositories = safe_load(f)

spec_file = Path(renku_data_services.notebooks.__file__).resolve().parent / "api.spec.yaml"
with open(spec_file) as f:
repositories = safe_load(f)

spec_file = Path(renku_data_services.platform.__file__).resolve().parent / "api.spec.yaml"
with open(spec_file) as f:
platform = safe_load(f)
@@ -408,8 +414,8 @@ def from_env(cls, prefix: str = "") -> "Config":
gitlab_client: base_models.GitlabAPIProtocol
user_preferences_config: UserPreferencesConfig
version = os.environ.get(f"{prefix}VERSION", "0.0.1")
server_options_file = os.environ.get("SERVER_OPTIONS")
server_defaults_file = os.environ.get("SERVER_DEFAULTS")
server_options_file = os.environ.get("NB_SERVER_OPTIONS__UI_CHOICES_PATH")
server_defaults_file = os.environ.get("NB_SERVER_OPTIONS__DEFAULTS_PATH")
k8s_namespace = os.environ.get("K8S_NAMESPACE", "default")
max_pinned_projects = int(os.environ.get(f"{prefix}MAX_PINNED_PROJECTS", "10"))
user_preferences_config = UserPreferencesConfig(max_pinned_projects=max_pinned_projects)
@@ -491,6 +497,7 @@ def from_env(cls, prefix: str = "") -> "Config":
sentry = SentryConfig.from_env(prefix)
trusted_proxies = TrustedProxiesConfig.from_env(prefix)
message_queue = RedisQueue(redis)
nb_config = _NotebooksConfig.from_env(db)

return cls(
version=version,
@@ -511,4 +518,5 @@ def from_env(cls, prefix: str = "") -> "Config":
encryption_key=encryption_key,
secrets_service_public_key=secrets_service_public_key,
gitlab_url=gitlab_url,
nb_config=nb_config,
)
18 changes: 16 additions & 2 deletions components/renku_data_services/authn/dummy.py
@@ -7,6 +7,7 @@
from typing import Optional

from sanic import Request
from ulid import ULID

import renku_data_services.base_models as base_models

@@ -39,10 +40,22 @@ class DummyAuthenticator:
"""

token_field = "Authorization" # nosec: B105
anon_id_header_key: str = "Renku-Auth-Anon-Id"
anon_id_cookie_name: str = "Renku-Auth-Anon-Id"

@staticmethod
async def authenticate(access_token: str, request: Request) -> base_models.APIUser:
async def authenticate(self, access_token: str, request: Request) -> base_models.APIUser:
"""Indicates whether the user has successfully logged in."""
access_token = request.headers.get(self.token_field) or ""
if not access_token or len(access_token) == 0:
# Try to get an anonymous user ID if the validation of keycloak credentials failed
anon_id = request.headers.get(self.anon_id_header_key)
if anon_id is None:
anon_id = request.cookies.get(self.anon_id_cookie_name)
if anon_id is None:
anon_id = f"anon-{str(ULID())}"
return base_models.AnonymousAPIUser(id=str(anon_id))

access_token = access_token.removeprefix("Bearer ").removeprefix("bearer ")
user_props = {}
with contextlib.suppress(Exception):
user_props = json.loads(access_token)
@@ -64,4 +77,5 @@ async def authenticate(access_token: str, request: Request) -> base_models.APIUs
last_name=user_props.get("last_name", "Doe") if is_set else None,
email=user_props.get("email", "john.doe@gmail.com") if is_set else None,
full_name=user_props.get("full_name", "John Doe") if is_set else None,
refresh_token=request.headers.get("Renku-Auth-Refresh-Token"),
)
16 changes: 13 additions & 3 deletions components/renku_data_services/authn/gitlab.py
@@ -2,10 +2,13 @@

import contextlib
import urllib.parse as parse
from contextlib import suppress
from dataclasses import dataclass
from datetime import datetime

import gitlab
from sanic import Request
from sanic.compat import Header

import renku_data_services.base_models as base_models
from renku_data_services import errors
@@ -23,6 +26,7 @@ class GitlabAuthenticator:
gitlab_url: str

token_field: str = "Gitlab-Access-Token"
expires_at_field: str = "Gitlab-Access-Token-Expires-At"

def __post_init__(self) -> None:
"""Properly set gitlab url."""
@@ -36,10 +40,10 @@ async def authenticate(self, access_token: str, request: Request) -> base_models
if self.token_field != "Authorization": # nosec: B105
access_token = str(request.headers.get(self.token_field))

result = await self._get_gitlab_api_user(access_token)
result = await self._get_gitlab_api_user(access_token, request.headers)
return result

async def _get_gitlab_api_user(self, access_token: str) -> base_models.APIUser:
async def _get_gitlab_api_user(self, access_token: str, headers: Header) -> base_models.APIUser:
"""Get and validate a Gitlab API User."""
client = gitlab.Gitlab(self.gitlab_url, oauth_token=access_token)
try:
@@ -69,12 +73,18 @@ async def _get_gitlab_api_user(self, access_token: str) -> base_models.APIUser:
if len(name_parts) >= 1:
last_name = " ".join(name_parts)

expires_at: datetime | None = None
expires_at_raw: str | None = headers.get(self.expires_at_field)
if expires_at_raw is not None and len(expires_at_raw) > 0:
with suppress(ValueError):
expires_at = datetime.fromtimestamp(float(expires_at_raw))

return base_models.APIUser(
is_admin=False,
id=str(user_id),
access_token=access_token,
first_name=first_name,
last_name=last_name,
email=email,
full_name=full_name,
access_token_expires_at=expires_at,
)