Update readme #357
base: main
Conversation
nstogner
commented
Dec 22, 2024
- Remove references to Open Web UI (soon to be removed)
- Update feature list formatting and wording
- Simplify wording in sections
- Get rid of star chart
## Key Features

🚀 **LLM Operator** - Manages vLLM and Ollama servers
Maybe we should add that we support VLMs too as part of this line.
🖥 **Hardware Flexible** - Runs on CPU, GPU, or TPU
💾 **Efficient Caching** - Supports EFS, Filestore, and more
🎙️ **Speech Processing** - Transcribe audio via FasterWhisper
🔢 **Vector Operations** - Generate embeddings via Infinity
Rename to Embedding Operator - manages Infinity servers.
I think "Vector Operations" isn't what I would think of when looking for a vector or embedding server.
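To make the suggestion concrete, below is a rough sketch of what an Infinity-backed embedding Model might look like. Everything in it is an assumption for illustration only: the `kubeai.org/v1` API group, the `features`/`engine`/`url`/`resourceProfile` field names, and the `example-embedder` name are not taken from this PR and may not match the actual CRD.

```bash
# Hedged sketch only: the API group and every spec field below are assumptions
# about the Model CRD, not confirmed by this PR; adjust to the real schema.
kubectl apply -f - <<EOF
apiVersion: kubeai.org/v1
kind: Model
metadata:
  name: example-embedder            # hypothetical name
spec:
  features: [TextEmbedding]         # assumed feature flag for embedding workloads
  engine: Infinity                  # assumed engine value for the Infinity server
  url: hf://BAAI/bge-small-en-v1.5  # assumed model reference format
  resourceProfile: cpu:1            # assumed resource profile syntax
  minReplicas: 0                    # scale from zero, as in the quickstart example
EOF
```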
## Local Quickstart

<video controls src="https://github.com/user-attachments/assets/711d1279-6af9-4c6c-a052-e59e7730b757" width="800"></video>
We should have a video demo, IMO, so people can quickly see what they would get after installing.
I am hoping to get rid of OpenWebUI, which is showcased in the video.
We can keep the video showcasing an example chat UI (e.g., OpenWebUI) even though KubeAI doesn't include installation of it. I'd prefer to keep the current video until we have a new one, though.
@@ -119,50 +117,13 @@ Now open your browser to [localhost:8000](http://localhost:8000) and select the

If you go back to the browser and start a chat with Qwen2, you will notice that it will take a while to respond at first. This is because we set `minReplicas: 0` for this model and KubeAI needs to spin up a new Pod (you can verify with `kubectl get models -oyaml qwen2-500m-cpu`).

## Documentation
I think we should highlight our full documentation before the Local Quickstart guide.
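As a side note on the quickstart passage quoted in the hunk above: the scale-from-zero behavior it describes can be checked from the command line. The first command is the one quoted in the diff; the `kubectl get pods --watch` step is a generic addition for illustration, not something this PR specifies.

```bash
# Inspect the Model quoted in the quickstart; minReplicas: 0 means no Pod
# exists until the first request arrives.
kubectl get models -oyaml qwen2-500m-cpu

# In another terminal, watch the backing Pod get created while the first
# chat request is pending; later requests skip this cold start.
kubectl get pods --watch
```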