
Configurable autoscaling #203

Merged: 17 commits from autoscaling-config into main on Sep 17, 2024

Conversation

@nstogner (Contributor) commented Sep 11, 2024

  • Add configurable scale-down delay, autoscaling interval and window, and per-Pod request target (fixes Model expose scale down delay #202)
  • Add a full scale-up-to-scale-down integration test
  • Update Helm chart values
  • Update docs (add autoscaling docs, remove how-to info from concepts, and break the docs into separate sections)
  • Remove resource-profile-override fields from the Model spec
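The per-Pod request target described above implies the usual target-based replica calculation: divide the total active requests by the per-Pod target, round up, and clamp to the replica bounds. A minimal Go sketch of that calculation (hypothetical `desiredReplicas` helper; not KubeAI's actual implementation):

```go
package main

import "math"

// desiredReplicas sketches the target-based calculation: total active
// requests divided by the per-Pod target, rounded up, then clamped to
// [minReplicas, maxReplicas]. Hypothetical helper, not KubeAI's actual code.
func desiredReplicas(activeRequests, targetPerPod, minReplicas, maxReplicas int32) int32 {
	n := int32(math.Ceil(float64(activeRequests) / float64(targetPerPod)))
	if n < minReplicas {
		n = minReplicas
	}
	if n > maxReplicas {
		n = maxReplicas
	}
	return n
}
```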

@nstogner nstogner changed the title WIP: Add basic autoscaling config options Configurable autoscaling with integration tests Sep 12, 2024
@nstogner nstogner changed the title Configurable autoscaling with integration tests Configurable autoscaling Sep 12, 2024
Review threads on charts/kubeai/values.yaml and api/v1/model_types.go (outdated; resolved)
@samos123 (Contributor) left a comment:

Looks good except for some minor nits and a question on behavior.

Review threads on api/v1/model_types.go and internal/config/system.go (outdated; resolved)
@samos123 (Contributor) commented Sep 14, 2024

Alternative:

apiVersion: kubeai.org/v1
kind: Model
metadata:
  name: faster-whisper-medium-en-cpu
spec:
  features: [SpeechToText]
  owner: Systran
  url: hf://Systran/faster-whisper-medium.en
  engine: FasterWhisper
  minReplicas: 0 # defaults to 0 if not set like before
  maxReplicas: 3 # defaults to 3 if not set like before
  concurrentRequests: 100 # defaults to 100 if not set
  scaleDownDelay: 60s # defaults to 60s if not set
  resourceProfile: cpu:1

Benefits: simple and backwards compatible.

Keeping things backwards compatible also means our existing tutorials and docs don't all have to be updated.

// TargetRequests is the target number of active requests per Pod.
// +kubebuilder:validation:Minimum=1
// +kubebuilder:default=100
TargetRequests int32 `json:"targetRequests"`
Review comment:

I think I prefer calling it concurrentRequests.

@nstogner nstogner merged commit 91f1d15 into main Sep 17, 2024
5 checks passed
@nstogner nstogner deleted the autoscaling-config branch September 17, 2024 01:26
Successfully merging this pull request may close these issues.

Model expose scale down delay
2 participants