Update readme #357

Open
wants to merge 2 commits into
base: main

Conversation

nstogner
Contributor

  • Remove references to Open WebUI (soon to be removed)
  • Update feature list formatting and wording
  • Simplify wording in sections
  • Get rid of star chart

@nstogner requested a review from samos123 on December 22, 2024, 17:10

## Key Features

🚀 **LLM Operator** - Manages vLLM and Ollama servers
Contributor

Maybe we should add that we support VLMs too as part of this line.

🖥 **Hardware Flexible** - Runs on CPU, GPU, or TPU
💾 **Efficient Caching** - Supports EFS, Filestore, and more
🎙️ **Speech Processing** - Transcribe audio via FasterWhisper
🔢 **Vector Operations** - Generate embeddings via Infinity
Contributor

Rename to Embedding Operator - manages Infinity servers.

I don't think "Vector Operations" is what I would think of when looking for a vector or embedding server.


## Local Quickstart


<video controls src="https://github.com/user-attachments/assets/711d1279-6af9-4c6c-a052-e59e7730b757" width="800"></video>
Contributor

We should have a video demo imo, so people can quickly see what they would get after installing.

Contributor Author

I am hoping to get rid of OpenWebUI, which is showcased in the video.

Contributor

We can keep the video showcasing an example chat UI (e.g. OpenWebUI) even though KubeAI doesn't include installation of it. I'd prefer to keep the current video until we have a new one, though.

@@ -119,50 +117,13 @@ Now open your browser to [localhost:8000](http://localhost:8000) and select the

If you go back to the browser and start a chat with Qwen2, you will notice that it will take a while to respond at first. This is because we set `minReplicas: 0` for this model and KubeAI needs to spin up a new Pod (you can verify with `kubectl get models -oyaml qwen2-500m-cpu`).
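
For reference, here is a minimal sketch of the part of the Model resource behind this scale-from-zero behavior. Only the resource name `qwen2-500m-cpu` and `minReplicas: 0` come from the text above. The `apiVersion` and the placement of the field under `spec` are assumptions about the KubeAI Model schema, and the real output of the `kubectl` command will contain more fields than shown here.

```yaml
# Sketch only: everything except metadata.name and minReplicas is assumed.
apiVersion: kubeai.org/v1   # assumed API group/version
kind: Model
metadata:
  name: qwen2-500m-cpu
spec:
  minReplicas: 0   # no Pods run until the first request arrives
```

With `minReplicas` set to a value above zero, a Pod would presumably stay running and the first chat response would not wait on startup.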

## Documentation
Contributor

I think we should highlight our full documentation before the Local Quickstart guide.
