Merge pull request #12 from AustrianDataLAB/feature/operator
Add operator
winklermichael authored May 19, 2024
2 parents 1f8e6f2 + 87c9e47 commit 17b8ca9
Showing 9 changed files with 869 additions and 0 deletions.
87 changes: 87 additions & 0 deletions .github/workflows/build_push_operator.yml
@@ -0,0 +1,87 @@
name: build-push-operator

# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.

on:
  # schedule:
  #   - cron: '40 16 * * *'
  push:
    branches: [ "feature/operator" ]
    # Publish semver tags as releases.
    #tags: [ 'v*.*.*' ]
  pull_request:
    branches: [ "dev","main" ]

env:
  # Use docker.io for Docker Hub if empty
  REGISTRY: ghcr.io
  # github.repository as <account>/<repo>
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-scan-push:
    runs-on: ubuntu-latest
    steps:
      -
        name: Checkout
        uses: actions/checkout@v3
      -
        name: Set up QEMU
        uses: docker/setup-qemu-action@v2
      -
        name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      -
        name: Log into registry ${{ env.REGISTRY }}
        uses: docker/login-action@343f7c4344506bcbf9b4de18042ae17996df046d # v3.0.0
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      -
        name: Extract Docker metadata
        id: meta
        uses: docker/metadata-action@96383f45573cb7f253c731d3b3ab81c87ef81934 # v5.0.0
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
      -
        name: Build and load
        uses: docker/build-push-action@v4
        with:
          load: true
          context: ./operator
          file: ./operator/Dockerfile
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
      -
        name: Scan for vulnerabilities
        id: scan
        uses: crazy-max/ghaction-container-scan@v3
        with:
          image: ${{ steps.meta.outputs.tags }}
          dockerfile: ./operator/Dockerfile

      -
        name: Upload SARIF file
        if: ${{ steps.scan.outputs.sarif != '' }}
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: ${{ steps.scan.outputs.sarif }}

      -
        name: Build and push Docker image
        id: build-and-push
        uses: docker/build-push-action@0565240e2d4ab88bba5387d719585280857ece09 # v5.0.0
        with:
          context: ./operator
          file: ./operator/Dockerfile
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
6 changes: 6 additions & 0 deletions operator/.env_example
@@ -0,0 +1,6 @@
NAMESPACE=mlaas
TRAINING_IMAGE=hello-world
SERVING_IMAGE=nginx
SERVING_PORT=80
PERSISTENCE_SERVICE_URI=http://persistence-service.mlaas.svc.cluster.local:5000
DOMAIN=mlaas.aocc.at
5 changes: 5 additions & 0 deletions operator/.gitignore
@@ -0,0 +1,5 @@
__pycache__/
.pytest_cache/
userdata/
*.xml
.env
20 changes: 20 additions & 0 deletions operator/Dockerfile
@@ -0,0 +1,20 @@
FROM cgr.dev/chainguard/python:latest-dev AS builder

WORKDIR /app

COPY requirements.txt .

RUN pip install -r requirements.txt --user



FROM cgr.dev/chainguard/python:latest

WORKDIR /app

# Keep the Python version in these paths in sync with the base image's Python version
COPY --from=builder /home/nonroot/.local/lib/python3.12/site-packages /home/nonroot/.local/lib/python3.12/site-packages

COPY main.py .

ENTRYPOINT [ "python", "/app/main.py" ]
100 changes: 100 additions & 0 deletions operator/README.md
@@ -0,0 +1,100 @@
# Flask Kubernetes Application

This application provides a Flask-based API to manage Kubernetes resources, specifically for training jobs and serving deployments. It utilizes the Kubernetes Python client to interact with a Kubernetes cluster, handling tasks like creating, monitoring, and deleting jobs and deployments.

## Prerequisites

Ensure you have the following environment variables set before running the application:

- `TRAINING_IMAGE`
- `SERVING_IMAGE`
- `NAMESPACE`
- `SERVING_PORT`
- `PERSISTENCE_SERVICE_URI`
- `DOMAIN`

These environment variables are essential for the application to interact with Kubernetes and handle the deployment configurations.
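
As an illustration only, the check for these variables could be done once at startup. The sketch below is not taken from the operator's `main.py`; the names `REQUIRED_VARS` and `load_config` are hypothetical, and converting `SERVING_PORT` to an integer is an assumption based on the example value `80` in `.env_example`.

```python
import os

# Variable names come from the list above; everything else is illustrative.
REQUIRED_VARS = (
    "TRAINING_IMAGE",
    "SERVING_IMAGE",
    "NAMESPACE",
    "SERVING_PORT",
    "PERSISTENCE_SERVICE_URI",
    "DOMAIN",
)


def load_config(environ=None):
    """Return the required settings as a dict, or raise if any are missing."""
    env = os.environ if environ is None else environ
    # Treat unset and empty variables the same way.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED_VARS}
    # Assumption: SERVING_PORT is a TCP port number, so store it as an int.
    config["SERVING_PORT"] = int(config["SERVING_PORT"])
    return config
```

Failing early with a list of every missing variable tends to be easier to debug than a `KeyError` raised later, on the first request that needs the value.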

## API Endpoints

### Create a Training Job
**Endpoint:** `/training`
**Method:** `POST`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Creates a new training job if no job is currently running for the tenant. If a job is already running, it returns an error.

**Response:**
- `202 Accepted`: Training job created.
- `400 Bad Request`: Training job is already running.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.
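
A client call to this endpoint might look like the following sketch, which uses only the Python standard library. The base URL is an assumption (the README does not say where the API listens), and the function names `build_training_request` and `start_training` are hypothetical. The only details taken from the description above are the path, the method, and the `Authorization` header carrying the tenant identifier.

```python
import urllib.error
import urllib.request


def build_training_request(base_url, tenant):
    """Build (but do not send) a POST /training request for one tenant."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/training",
        method="POST",
        # The tenant identifier travels in the Authorization header.
        headers={"Authorization": tenant},
    )


def start_training(base_url, tenant, timeout=10):
    """Send the request and return the HTTP status code."""
    request = build_training_request(base_url, tenant)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return err.code
```

With a reachable service, `start_training("http://localhost:5000", "tenant-a")` would return one of the status codes listed above; the host and port here are placeholders.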

### Get Status of All Training Jobs
**Endpoint:** `/training`
**Method:** `GET`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Retrieves the status of all training jobs for the tenant.

**Response:**
- `200 OK`: List of jobs and their statuses.
- `404 Not Found`: No jobs found for the tenant.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Get Status of a Specific Training Job
**Endpoint:** `/training/<id>`
**Method:** `GET`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Retrieves the status of a specific training job by its ID.

**Response:**
- `200 OK`: Status of the job.
- `404 Not Found`: Job not found.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Create a Serving Deployment
**Endpoint:** `/serving`
**Method:** `POST`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Creates a new serving deployment for the tenant. If a deployment already exists, it returns an error.

**Response:**
- `201 Created`: Serving deployment created.
- `400 Bad Request`: Serving deployment already exists.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Get Status of Serving Deployment
**Endpoint:** `/serving`
**Method:** `GET`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Retrieves the status of the serving deployment for the tenant.

**Response:**
- `200 OK`: Status of the deployment.
- `404 Not Found`: Deployment not found.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Delete Serving Deployment
**Endpoint:** `/serving`
**Method:** `DELETE`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Deletes the serving deployment for the tenant.

**Response:**
- `200 OK`: Deployment deleted.
- `404 Not Found`: Deployment not found.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.
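
For client code, the documented responses of the three `/serving` operations can be kept in one lookup table. This is a minimal sketch: the status codes and messages are transcribed from the sections above, while `SERVING_RESPONSES` and `describe_response` are hypothetical names, and the actual response bodies returned by the API are not specified here.

```python
# (method, path) -> {status code: meaning}, transcribed from the endpoint list above.
SERVING_RESPONSES = {
    ("POST", "/serving"): {
        201: "Serving deployment created.",
        400: "Serving deployment already exists.",
        500: "An error occurred with the Kubernetes API.",
    },
    ("GET", "/serving"): {
        200: "Status of the deployment.",
        404: "Deployment not found.",
        500: "An error occurred with the Kubernetes API.",
    },
    ("DELETE", "/serving"): {
        200: "Deployment deleted.",
        404: "Deployment not found.",
        500: "An error occurred with the Kubernetes API.",
    },
}


def describe_response(method, path, status):
    """Return the documented meaning of a status code, or flag it as undocumented."""
    codes = SERVING_RESPONSES.get((method.upper(), path), {})
    fallback = f"Undocumented status {status} for {method.upper()} {path}"
    return codes.get(status, fallback)
```

Keeping the table keyed by method and path makes the difference between `POST` (which answers `201` or `400`) and `GET`/`DELETE` (which answer `200` or `404`) explicit in one place.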

## Installation

Apply the `manifest.yaml` file to your Kubernetes cluster.
Empty file added operator/__init__.py
Empty file.