Add operator #12

Merged
merged 12 commits into from
May 19, 2024
87 changes: 87 additions & 0 deletions .github/workflows/build_push_operator.yml
@@ -0,0 +1,87 @@
name: build-push-operator

# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.

on:
  # schedule:
  #   - cron: '40 16 * * *'
  push:
    branches: [ "feature/operator" ]
    # Publish semver tags as releases.
    #tags: [ 'v*.*.*' ]
  pull_request:
    branches: [ "dev","main" ]

env:
  # Use docker.io for Docker Hub if empty
  REGISTRY: ghcr.io
  # github.repository as <account>/<repo>
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-scan-push:
    runs-on: ubuntu-latest
    steps:
      -
        name: Checkout
        uses: actions/checkout@v3
      -
        name: Set up QEMU
        uses: docker/setup-qemu-action@v2
      -
        name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      -
        name: Log into registry ${{ env.REGISTRY }}
        uses: docker/login-action@343f7c4344506bcbf9b4de18042ae17996df046d # v3.0.0
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      -
        name: Extract Docker metadata
        id: meta
        uses: docker/metadata-action@96383f45573cb7f253c731d3b3ab81c87ef81934 # v5.0.0
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
      -
        name: Build and load
        uses: docker/build-push-action@v4
        with:
          load: true
          context: ./operator
          file: ./operator/Dockerfile
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
      -
        name: Scan for vulnerabilities
        id: scan
        uses: crazy-max/ghaction-container-scan@v3
        with:
          image: ${{ steps.meta.outputs.tags }}
          dockerfile: ./operator/Dockerfile

      -
        name: Upload SARIF file
        if: ${{ steps.scan.outputs.sarif != '' }}
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: ${{ steps.scan.outputs.sarif }}

      -
        name: Build and push Docker image
        id: build-and-push
        uses: docker/build-push-action@0565240e2d4ab88bba5387d719585280857ece09 # v5.0.0
        with:
          context: ./operator
          file: ./operator/Dockerfile
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
6 changes: 6 additions & 0 deletions operator/.env_example
@@ -0,0 +1,6 @@
NAMESPACE=mlaas
TRAINING_IMAGE=hello-world
SERVING_IMAGE=nginx
SERVING_PORT=80
PERSISTENCE_SERVICE_URI=http://persistence-service.mlaas.svc.cluster.local:5000
DOMAIN=mlaas.aocc.at
5 changes: 5 additions & 0 deletions operator/.gitignore
@@ -0,0 +1,5 @@
__pycache__/
.pytest_cache/
userdata/
*.xml
.env
20 changes: 20 additions & 0 deletions operator/Dockerfile
@@ -0,0 +1,20 @@
FROM cgr.dev/chainguard/python:latest-dev AS builder

WORKDIR /app

COPY requirements.txt .

RUN pip install -r requirements.txt --user



FROM cgr.dev/chainguard/python:latest

WORKDIR /app

# Keep python3.12 in both paths below in sync with the Python version of the base image
COPY --from=builder /home/nonroot/.local/lib/python3.12/site-packages /home/nonroot/.local/lib/python3.12/site-packages

COPY main.py .

ENTRYPOINT [ "python", "/app/main.py" ]
100 changes: 100 additions & 0 deletions operator/README.md
@@ -0,0 +1,100 @@
# Flask Kubernetes Application

This application provides a Flask-based API to manage Kubernetes resources, specifically for training jobs and serving deployments. It utilizes the Kubernetes Python client to interact with a Kubernetes cluster, handling tasks like creating, monitoring, and deleting jobs and deployments.

## Prerequisites

Ensure you have the following environment variables set before running the application:

- `TRAINING_IMAGE`
- `SERVING_IMAGE`
- `NAMESPACE`
- `SERVING_PORT`
- `PERSISTENCE_SERVICE_URI`
- `DOMAIN`

The application reads these values to configure the Kubernetes resources it creates. `.env_example` shows a sample value for each variable.
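
As an illustration of the prerequisite check, the following is a minimal sketch of validating these variables at startup. The helper name `load_settings` and the choice to convert `SERVING_PORT` to an integer are assumptions made for this example; they are not taken from `main.py`.

```python
import os

# Variable names are taken from the Prerequisites list above.
REQUIRED_VARS = (
    "TRAINING_IMAGE",
    "SERVING_IMAGE",
    "NAMESPACE",
    "SERVING_PORT",
    "PERSISTENCE_SERVICE_URI",
    "DOMAIN",
)


def load_settings(environ=None):
    """Return the required settings as a dict, or raise if any are missing.

    `load_settings` is a hypothetical helper, not a function from main.py.
    """
    environ = os.environ if environ is None else environ
    # Treat unset and empty variables the same way: both count as missing.
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: environ[name] for name in REQUIRED_VARS}
    # Assumption: SERVING_PORT is a port number, so convert it once here.
    settings["SERVING_PORT"] = int(settings["SERVING_PORT"])
    return settings
```

Calling `load_settings()` once before the application starts serving requests would report every missing variable in a single error instead of failing later on the first request that needs one.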

## API Endpoints

### Create a Training Job
**Endpoint:** `/training`
**Method:** `POST`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Creates a new training job if no job is currently running for the tenant. If a job is already running, it returns an error.

**Response:**
- `202 Accepted`: Training job created.
- `400 Bad Request`: Training job is already running.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.
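
To make the request shape concrete, here is a small client-side sketch that builds a `POST /training` request with the tenant identifier in the `Authorization` header. The base URL, the tenant value `tenant-a`, the helper name `build_training_request`, and the empty request body are all assumptions for illustration; the README does not state where the API is exposed or whether a payload is expected.

```python
from urllib.request import Request

# Hypothetical base URL; the README does not state where the API is exposed.
BASE_URL = "http://localhost:8080"


def build_training_request(tenant, base_url=BASE_URL):
    """Build (but do not send) a POST /training request for one tenant."""
    return Request(
        url=base_url.rstrip("/") + "/training",
        data=b"",  # assumed empty body; the README documents no payload
        method="POST",
        headers={"Authorization": tenant},  # tenant identifier, as documented
    )


request = build_training_request("tenant-a")
print(request.get_method(), request.full_url)
```

The request can be sent with `urllib.request.urlopen(request)`. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so the documented `400` and `500` cases arrive as exceptions rather than as return values.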

### Get Status of All Training Jobs
**Endpoint:** `/training`
**Method:** `GET`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Retrieves the status of all training jobs for the tenant.

**Response:**
- `200 OK`: List of jobs and their statuses.
- `404 Not Found`: No jobs found for the tenant.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Get Status of a Specific Training Job
**Endpoint:** `/training/<id>`
**Method:** `GET`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Retrieves the status of a specific training job by its ID.

**Response:**
- `200 OK`: Status of the job.
- `404 Not Found`: Job not found.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Create a Serving Deployment
**Endpoint:** `/serving`
**Method:** `POST`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Creates a new serving deployment for the tenant. If a deployment already exists, it returns an error.

**Response:**
- `201 Created`: Serving deployment created.
- `400 Bad Request`: Serving deployment already exists.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Get Status of Serving Deployment
**Endpoint:** `/serving`
**Method:** `GET`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Retrieves the status of the serving deployment for the tenant.

**Response:**
- `200 OK`: Status of the deployment.
- `404 Not Found`: Deployment not found.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Delete Serving Deployment
**Endpoint:** `/serving`
**Method:** `DELETE`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Deletes the serving deployment for the tenant.

**Response:**
- `200 OK`: Deployment deleted.
- `404 Not Found`: Deployment not found.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.
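
The status codes documented above can be collected into one lookup table, which a client might use to turn a response into a readable message. This is only a sketch: the names `DOCUMENTED_RESPONSES` and `explain` are hypothetical, and the table simply restates the responses listed in this README rather than anything read from the application code.

```python
# Status codes and meanings copied from the endpoint descriptions above.
DOCUMENTED_RESPONSES = {
    ("POST", "/training"): {
        202: "Training job created.",
        400: "Training job is already running.",
    },
    ("GET", "/training"): {
        200: "List of jobs and their statuses.",
        404: "No jobs found for the tenant.",
    },
    ("GET", "/training/<id>"): {
        200: "Status of the job.",
        404: "Job not found.",
    },
    ("POST", "/serving"): {
        201: "Serving deployment created.",
        400: "Serving deployment already exists.",
    },
    ("GET", "/serving"): {
        200: "Status of the deployment.",
        404: "Deployment not found.",
    },
    ("DELETE", "/serving"): {
        200: "Deployment deleted.",
        404: "Deployment not found.",
    },
}

# Every documented endpoint lists the same meaning for a 500 response.
KUBERNETES_ERROR = "An error occurred with the Kubernetes API."


def explain(method, path, status):
    """Return the documented meaning of a response, if there is one."""
    # Map a concrete job path such as /training/abc onto /training/<id>.
    route = "/training/<id>" if path.startswith("/training/") else path
    meanings = DOCUMENTED_RESPONSES.get((method.upper(), route))
    if meanings is None:
        return "Undocumented response."
    if status == 500:
        return KUBERNETES_ERROR
    return meanings.get(status, "Undocumented response.")
```

For example, `explain("GET", "/training/abc", 404)` returns the message documented for a missing job, while any method and path combination not listed above falls through to `"Undocumented response."`.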

## Installation

Apply the `manifest.yaml` file to your Kubernetes cluster.
Empty file added operator/__init__.py
Empty file.