Skip to content

Commit

Permalink
add readme
Browse files Browse the repository at this point in the history
  • Loading branch information
winklermichael committed May 19, 2024
1 parent 832aa09 commit 87c9e47
Show file tree
Hide file tree
Showing 2 changed files with 105 additions and 1 deletion.
6 changes: 5 additions & 1 deletion operator/.env_example
Original file line number Diff line number Diff line change
@@ -1,2 +1,6 @@
NAMESPACE=mlaas
TRAINING_IMAGE=hello-world
SERVING_IMAGE=alpine
SERVING_IMAGE=nginx
SERVING_PORT=80
PERSISTENCE_SERVICE_URI=http://persistence-service.mlaas.svc.cluster.local:5000
DOMAIN=mlaas.aocc.at
100 changes: 100 additions & 0 deletions operator/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
# Flask Kubernetes Application

This application provides a Flask-based API to manage Kubernetes resources, specifically for training jobs and serving deployments. It utilizes the Kubernetes Python client to interact with a Kubernetes cluster, handling tasks like creating, monitoring, and deleting jobs and deployments.

## Prerequisites

Ensure you have the following environment variables set before running the application:

- `TRAINING_IMAGE`
- `SERVING_IMAGE`
- `NAMESPACE`
- `SERVING_PORT`
- `PERSISTENCE_SERVICE_URI`
- `DOMAIN`

These environment variables are essential for the application to interact with Kubernetes and handle the deployment configurations.

## API Endpoints

### Create a Training Job
**Endpoint:** `/training`
**Method:** `POST`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Creates a new training job if no job is currently running for the tenant. If a job is already running, it returns an error.

**Response:**
- `202 Accepted`: Training job created.
- `400 Bad Request`: Training job is already running.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Get Status of All Training Jobs
**Endpoint:** `/training`
**Method:** `GET`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Retrieves the status of all training jobs for the tenant.

**Response:**
- `200 OK`: List of jobs and their statuses.
- `404 Not Found`: No jobs found for the tenant.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Get Status of a Specific Training Job
**Endpoint:** `/training/<id>`
**Method:** `GET`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Retrieves the status of a specific training job by its ID.

**Response:**
- `200 OK`: Status of the job.
- `404 Not Found`: Job not found.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Create a Serving Deployment
**Endpoint:** `/serving`
**Method:** `POST`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Creates a new serving deployment for the tenant. If a deployment already exists, it returns an error.

**Response:**
- `201 Created`: Serving deployment created.
- `400 Bad Request`: Serving deployment already exists.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Get Status of Serving Deployment
**Endpoint:** `/serving`
**Method:** `GET`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Retrieves the status of the serving deployment for the tenant.

**Response:**
- `200 OK`: Status of the deployment.
- `404 Not Found`: Deployment not found.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

### Delete Serving Deployment
**Endpoint:** `/serving`
**Method:** `DELETE`
**Headers:**
- `Authorization`: Tenant identifier

**Description:** Deletes the serving deployment for the tenant.

**Response:**
- `200 OK`: Deployment deleted.
- `404 Not Found`: Deployment not found.
- `500 Internal Server Error`: An error occurred with the Kubernetes API.

## Installation

Apply the manifest.yaml file to your kubernetes cluster.

0 comments on commit 87c9e47

Please sign in to comment.