Showing 10 changed files with 347 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
# Private Deep Chat | ||
|
||
In this tutorial you are going to deploy a custom, multitenant, private chat application. The Chat UI is powered by <a href="https://deepchat.dev/" target="_blank">Deep Chat</a> - an open source web component that is easy to embed into any frontend web app framework or simple HTML page. KubeAI will be used to ensure that all chat interactions are kept private within the cluster. | ||
|
||
![Screenshot](../screenshots/private-deep-chat.png) | ||
|
||
In this example, we will deploy a custom Go server that will authenticate users using <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme" target="_blank">Basic Authentication</a>. When a webpage is requested, a simple HTML page with the `<deep-chat>` web component will be served. We will configure Deep Chat and KubeAI to communicate using the OpenAI API format: | ||
|
||
```html | ||
<deep-chat | ||
  connect='{"url":"/openai/v1/chat/completions", ... }' | ||
  directConnection='{"openAI": ... }'> | ||
</deep-chat> | ||
``` | ||
|
||
When the HTML page loads, we will use JavaScript to make an initial request to fetch the available models. The Go server will proxy this request to KubeAI: | ||
|
||
```go | ||
proxyHandler := httputil.NewSingleHostReverseProxy(kubeAIURL) | ||
|
||
http.Handle("/openai/", authUserToKubeAI(proxyHandler)) | ||
``` | ||
|
||
After checking the username and password from the basic auth header, the server will translate the user's tenancy groups into a label selector, sent in the `X-Label-Selector` header, that tells KubeAI which models to return. The same header is set on inference requests, so access is also enforced at inference time: | ||
|
||
```go | ||
r.Header.Set("X-Label-Selector", fmt.Sprintf("tenancy in (%s)", | ||
	strings.Join(tenancy, ","), | ||
)) | ||
``` | ||
|
||
While this is a simple example application, this overall architecture can be used when incorporating chat into a production application. | ||
|
||
![Architecture](../diagrams/private-deep-chat.excalidraw.png) | ||
|
||
## Guide | ||
|
||
Create a local cluster with <a href="https://kind.sigs.k8s.io/" target="_blank">kind</a> and install KubeAI. | ||
|
||
```bash | ||
kind create cluster | ||
|
||
helm repo add kubeai https://www.kubeai.org && helm repo update | ||
helm install kubeai kubeai/kubeai --set openwebui.enabled=false --wait --timeout 5m | ||
``` | ||
|
||
Clone the KubeAI repo and navigate to the example directory. | ||
|
||
```bash | ||
git clone https://github.com/substratusai/kubeai | ||
cd ./kubeai/examples/private-deep-chat | ||
``` | ||
|
||
Build the private chat application and load the image into the local kind cluster. | ||
|
||
```bash | ||
docker build -t private-deep-chat:latest . | ||
kind load docker-image private-deep-chat:latest | ||
``` | ||
|
||
Deploy the private chat application along with some KubeAI Models. | ||
|
||
```bash | ||
kubectl apply -f ./manifests | ||
``` | ||
|
||
Start a port-forward. | ||
|
||
```bash | ||
kubectl port-forward svc/private-deep-chat 8000:80 | ||
``` | ||
|
||
In your browser, navigate to <a href="http://localhost:8000/" target="_blank">localhost:8000</a>. | ||
|
||
Log in as any of the following users: | ||
|
||
|User|Password | | ||
|----|---------| | ||
|nick|nickspass| | ||
|sam |samspass | | ||
|joe |joespass | | ||
|
||
These users each have access to different KubeAI Models. You can see this assignment by looking at the user mapping in `./main.go` and the associated `tenancy` label on the Models in `./manifests/models.yaml`. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
FROM golang:1.23 AS builder | ||
|
||
WORKDIR /workspace | ||
COPY go.* . | ||
|
||
RUN go mod download | ||
|
||
COPY main.go main.go | ||
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -o server ./main.go | ||
|
||
FROM gcr.io/distroless/static:nonroot | ||
|
||
WORKDIR /app | ||
COPY --from=builder /workspace/server /app/ | ||
COPY ./static /app/static | ||
USER 65532:65532 | ||
|
||
ENTRYPOINT ["/app/server"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
module private-chat | ||
|
||
go 1.22.0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
package main | ||
|
||
import ( | ||
	"fmt" | ||
	"log" | ||
	"net/http" | ||
	"net/http/httputil" | ||
	"net/url" | ||
	"os" | ||
	"strings" | ||
) | ||
|
||
// Run a web server that serves static content and proxies inference | ||
// requests to KubeAI. | ||
// Control access with basic auth. | ||
func main() { | ||
	kubeAIURL, err := url.Parse(os.Getenv("KUBEAI_ADDR")) | ||
	if err != nil { | ||
		log.Fatalf("failed to parse KubeAI address: %v", err) | ||
	} | ||
	// url.Parse accepts an empty string, so reject an unset or relative address explicitly. | ||
	if kubeAIURL.Scheme == "" || kubeAIURL.Host == "" { | ||
		log.Fatalf("KUBEAI_ADDR must be an absolute URL, got %q", os.Getenv("KUBEAI_ADDR")) | ||
	} | ||
|
||
	staticHandler := http.FileServer(http.Dir("static")) | ||
	proxyHandler := httputil.NewSingleHostReverseProxy(kubeAIURL) | ||
|
||
	http.Handle("/", authUser(staticHandler)) | ||
	http.Handle("/openai/", authUserToKubeAI(proxyHandler)) | ||
|
||
	listenAddr := os.Getenv("LISTEN_ADDR") | ||
	log.Printf("listening on %s", listenAddr) | ||
	log.Fatal(http.ListenAndServe(listenAddr, nil)) | ||
} | ||
|
||
func authUser(h http.Handler) http.Handler { | ||
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { | ||
		user, pass, ok := r.BasicAuth() | ||
		if _, matches := authenticate(user, pass); !ok || !matches { | ||
			w.Header().Set("WWW-Authenticate", `Basic realm="Restricted"`) | ||
			http.Error(w, "Unauthorized", http.StatusUnauthorized) | ||
			return | ||
		} | ||
		h.ServeHTTP(w, r) | ||
	}) | ||
} | ||
|
||
func authUserToKubeAI(h http.Handler) http.Handler { | ||
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { | ||
		user, pass, basicAuthProvided := r.BasicAuth() | ||
|
||
		tenancy, authenticated := authenticate(user, pass) | ||
|
||
		if !basicAuthProvided || !authenticated || len(tenancy) == 0 { | ||
			w.Header().Set("WWW-Authenticate", `Basic realm="Restricted"`) | ||
			http.Error(w, "Unauthorized", http.StatusUnauthorized) | ||
			return | ||
		} | ||
|
||
		r.Header.Set("X-Label-Selector", fmt.Sprintf("tenancy in (%s)", | ||
			strings.Join(tenancy, ","), | ||
		)) | ||
|
||
		h.ServeHTTP(w, r) | ||
	}) | ||
} | ||
|
||
// authenticate checks the provided username and password. | ||
// If the user is authenticated, it returns the tenancy groups the user belongs to. | ||
func authenticate(user, pass string) ([]string, bool) { | ||
	// In a real application, this would be a database lookup. | ||
	userTable := map[string]struct { | ||
		password string | ||
		tenancy  []string | ||
	}{ | ||
		"nick": {"nickspass", []string{"group-a"}}, | ||
		"sam":  {"samspass", []string{"group-b"}}, | ||
		"joe":  {"joespass", []string{"group-a", "group-b"}}, | ||
	} | ||
|
||
	row, ok := userTable[user] | ||
	if !ok { | ||
		return nil, false | ||
	} | ||
	if row.password != pass { | ||
		return nil, false | ||
	} | ||
|
||
	return row.tenancy, true | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
apiVersion: apps/v1 | ||
kind: Deployment | ||
metadata: | ||
  name: private-deep-chat | ||
  labels: | ||
    app: private-deep-chat | ||
spec: | ||
  replicas: 1 | ||
  selector: | ||
    matchLabels: | ||
      app: private-deep-chat | ||
  template: | ||
    metadata: | ||
      labels: | ||
        app: private-deep-chat | ||
    spec: | ||
      containers: | ||
        - name: server | ||
          image: private-deep-chat:latest | ||
          imagePullPolicy: IfNotPresent | ||
          ports: | ||
            - containerPort: 8000 | ||
          env: | ||
            - name: LISTEN_ADDR | ||
              value: ":8000" | ||
            - name: KUBEAI_ADDR | ||
              value: "http://kubeai" | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
apiVersion: kubeai.org/v1 | ||
kind: Model | ||
metadata: | ||
  name: gemma2-a | ||
  labels: | ||
    tenancy: group-a | ||
spec: | ||
  features: [TextGeneration] | ||
  owner: google | ||
  url: ollama://gemma2:2b | ||
  engine: OLlama | ||
  resourceProfile: cpu:2 | ||
--- | ||
apiVersion: kubeai.org/v1 | ||
kind: Model | ||
metadata: | ||
  name: gemma2-b | ||
  labels: | ||
    tenancy: group-b | ||
spec: | ||
  features: [TextGeneration] | ||
  owner: google | ||
  url: ollama://gemma2:2b | ||
  engine: OLlama | ||
  resourceProfile: cpu:2 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
  name: private-deep-chat | ||
  labels: | ||
    app: private-deep-chat | ||
spec: | ||
  ports: | ||
    - port: 80 | ||
      protocol: TCP | ||
      targetPort: 8000 | ||
  selector: | ||
    app: private-deep-chat |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
<!DOCTYPE html> | ||
<html> | ||
|
||
<head> | ||
<meta charset="UTF-8" /> | ||
</head> | ||
<script type="module" src="https://unpkg.com/deep-chat@2.0.1/dist/deepChat.bundle.js"></script> | ||
|
||
<body style="font-family: Inter, sans-serif, Avenir, Helvetica, Arial"> | ||
<div style="padding: 10px; text-align: center"> | ||
Model: | ||
<select id="modelDropdown"> | ||
<option value="">Loading...</option> | ||
</select> | ||
</div> | ||
<deep-chat id="chat" style="border-radius: 10px; width: 96vw; height: calc(100vh - 70px); padding-top: 10px" | ||
messageStyles='{"default": {"shared": {"innerContainer": {"fontSize": "1rem"}}}}' | ||
inputAreaStyle='{"fontSize": "1rem"}' | ||
connect='{"url":"/openai/v1/chat/completions", "credentials": "same-origin", "stream": true}' | ||
directConnection='{"openAI":{"chat": {"model": ""}, "key": "placeholder", "validateKeyProperty": false}}' | ||
textInput='{"placeholder":{"text": "Chat with a model!"}}'> | ||
</deep-chat> | ||
</body> | ||
<script type="module"> | ||
    const chatElementRef = document.getElementById('chat'); | ||
    const modelDropdownElementRef = document.getElementById('modelDropdown'); | ||
|
||
    var selectedModelId = ""; | ||
|
||
    chatElementRef.requestInterceptor = (requestDetails) => { | ||
        // Remove the placeholder API key from the request headers | ||
        // and allow the "same-origin" credentials to be sent | ||
        // (the logged-in basic-auth user/pass). | ||
        console.log("Request interceptor: ", requestDetails); | ||
        delete requestDetails.headers['Authorization']; | ||
|
||
        // Set the selected model ID. | ||
        // NOTE: modifying the model in the directConnection attribute on the <deep-chat> | ||
        // element does not appear to take effect. | ||
        requestDetails.body.model = selectedModelId; | ||
|
||
        return requestDetails; | ||
    }; | ||
|
||
    function selectModel(modelId) { | ||
        console.log("Selected model: ", modelId); | ||
        selectedModelId = modelId; | ||
    } | ||
|
||
    fetch('/openai/v1/models') | ||
        .then(response => { | ||
            if (!response.ok) { | ||
                throw new Error('Network response was not ok'); | ||
            } | ||
            return response.json(); | ||
        }) | ||
        .then(data => { | ||
            const dropdown = document.getElementById('modelDropdown'); | ||
|
||
            // Clear the dropdown | ||
            dropdown.innerHTML = ''; | ||
|
||
            // Populate the dropdown with the model IDs returned by the server. | ||
            // Select the first option by default. | ||
            data.data.forEach((model, index) => { | ||
                const option = document.createElement('option'); | ||
                option.value = model.id; // Use the model ID as the value | ||
                option.textContent = model.id; // Display the model ID | ||
                dropdown.appendChild(option); | ||
                if (index === 0) { | ||
                    selectModel(model.id); | ||
                } | ||
            }); | ||
        }) | ||
        .catch(error => { | ||
            console.error('There was a problem with the fetch operation:', error); | ||
        }); | ||
|
||
    // Update the selected model (and log it) when the dropdown changes. | ||
    modelDropdownElementRef.addEventListener('change', (event) => { | ||
        const selectedValue = event.target.value; | ||
        const selectedText = event.target.options[event.target.selectedIndex].text; | ||
        console.log(`Selected Value: ${selectedValue}, Selected Text: ${selectedText}`); | ||
        selectModel(selectedValue); | ||
    }); | ||
</script> | ||
|
||
</html> |