Update k8s user guide to use deployments (#474)
edrevo authored Jun 2, 2021
1 parent c3fc0c7 commit 01b57f7
Showing 10 changed files with 50 additions and 54 deletions.
2 changes: 1 addition & 1 deletion ballista/rust/executor/executor_config_spec.toml
@@ -53,7 +53,7 @@ doc = "Host name or IP address to register with scheduler so that other executor

  [[param]]
  abbr = "p"
- name = "port"
+ name = "bind_port"
  type = "u16"
  default = "50051"
  doc = "bind port"
2 changes: 1 addition & 1 deletion ballista/rust/executor/src/main.rs
@@ -75,7 +75,7 @@ async fn main() -> Result<()> {

  let external_host = opt.external_host;
  let bind_host = opt.bind_host;
- let port = opt.port;
+ let port = opt.bind_port;

  let addr = format!("{}:{}", bind_host, port);
  let addr = addr
2 changes: 1 addition & 1 deletion ballista/rust/scheduler/scheduler_config_spec.toml
@@ -54,7 +54,7 @@ doc = "Local host name or IP address to bind to. Default: 0.0.0.0"

  [[param]]
  abbr = "p"
- name = "port"
+ name = "bind_port"
  type = "u16"
  default = "50050"
  doc = "bind port. Default: 50050"
2 changes: 1 addition & 1 deletion ballista/rust/scheduler/src/main.rs
@@ -116,7 +116,7 @@ async fn main() -> Result<()> {

  let namespace = opt.namespace;
  let bind_host = opt.bind_host;
- let port = opt.port;
+ let port = opt.bind_port;

  let addr = format!("{}:{}", bind_host, port);
  let addr = addr.parse()?;
6 changes: 3 additions & 3 deletions benchmarks/README.md
@@ -122,7 +122,7 @@ RUST_LOG=info RUSTFLAGS='-C target-cpu=native -C lto -C codegen-units=1 -C embed
  To run the benchmarks:

  ```bash
- cd $ARROW_HOME/ballista/rust/benchmarks/tpch
+ cd $ARROW_HOME/benchmarks
  cargo run --release benchmark ballista --host localhost --port 50050 --query 1 --path $(pwd)/data --format tbl
  ```

@@ -131,9 +131,9 @@ cargo run --release benchmark ballista --host localhost --port 50050 --query 1 -
  To start a Rust scheduler and executor using Docker Compose:

  ```bash
- cd $BALLISTA_HOME
+ cd $ARROW_HOME
  ./dev/build-rust.sh
- cd $BALLISTA_HOME/rust/benchmarks/tpch
+ cd $ARROW_HOME/benchmarks
  docker-compose up
  ```

4 changes: 2 additions & 2 deletions benchmarks/docker-compose.yaml
@@ -21,7 +21,7 @@ services:
  command: "etcd -advertise-client-urls http://etcd:2379 -listen-client-urls http://0.0.0.0:2379"
  ballista-scheduler:
  image: ballista:0.5.0-SNAPSHOT
- command: "/scheduler --config-backend etcd --etcd-urls etcd:2379 --bind-host 0.0.0.0 --port 50050"
+ command: "/scheduler --config-backend etcd --etcd-urls etcd:2379 --bind-host 0.0.0.0 --bind-port 50050"
  environment:
  - RUST_LOG=ballista=debug
  volumes:
@@ -30,7 +30,7 @@ services:
  - etcd
  ballista-executor:
  image: ballista:0.5.0-SNAPSHOT
- command: "/executor --bind-host 0.0.0.0 --port 50051 --scheduler-host ballista-scheduler"
+ command: "/executor --bind-host 0.0.0.0 --bind-port 50051 --scheduler-host ballista-scheduler"
  scale: 2
  environment:
  - RUST_LOG=info
2 changes: 1 addition & 1 deletion docs/user-guide/src/distributed/docker-compose.md
@@ -33,7 +33,7 @@ services:
  - "2379:2379"
  ballista-executor:
  image: ballistacompute/ballista-rust:0.4.2-SNAPSHOT
- command: "/executor --bind-host 0.0.0.0 --port 50051 --local"
+ command: "/executor --bind-host 0.0.0.0 --bind-port 50051 --local"
  environment:
  - RUST_LOG=info
  ports:
70 changes: 33 additions & 37 deletions docs/user-guide/src/distributed/kubernetes.md
@@ -24,8 +24,8 @@ you are already comfortable with managing Kubernetes deployments.

  The k8s deployment consists of:

- - k8s stateful set for one or more scheduler processes
- - k8s stateful set for one or more executor processes
+ - k8s deployment for one or more scheduler processes
+ - k8s deployment for one or more executor processes
  - k8s service to route traffic to the schedulers
  - k8s persistent volume and persistent volume claims to make local data accessible to Ballista

@@ -38,6 +38,14 @@ Ballista is at an early stage of development and therefore has some significant
  - Only a single scheduler instance is currently supported unless the scheduler is configured to use `etcd` as a
  backing store.

+ ## Publishing your images
+
+ Currently there are no official Ballista images that work with the instructions in this guide. For the time being,
+ you will need to build and publish your own images. You can do that by running the `dev/build-ballista-docker.sh` script.
+
+ Once the images have been built, you can retag them with `docker tag ballista:0.5.0-SNAPSHOT <new-image-name>` so you
+ can push them to your favourite Docker registry.
+
  ## Create Persistent Volume and Persistent Volume Claim

  Copy the following yaml to a `pv.yaml` file and apply to the cluster to create a persistent volume and a persistent
@@ -88,7 +96,7 @@ persistentvolumeclaim/data-pv-claim created

  ## Deploying Ballista Scheduler and Executors

- Copy the following yaml to a `cluster.yaml` file.
+ Copy the following yaml to a `cluster.yaml` file and replace `<your-image>` with the name of your Ballista Docker image.

  ```yaml
  apiVersion: v1
@@ -101,16 +109,14 @@ spec:
  ports:
  - port: 50050
  name: scheduler
- clusterIP: None
  selector:
  app: ballista-scheduler
  ---
  apiVersion: apps/v1
- kind: StatefulSet
+ kind: Deployment
  metadata:
  name: ballista-scheduler
  spec:
- serviceName: "ballista-scheduler"
  replicas: 1
  selector:
  matchLabels:
@@ -122,27 +128,26 @@ spec:
  ballista-cluster: ballista
  spec:
  containers:
- - name: ballista-scheduler
- image: ballistacompute/ballista-rust:0.4.2-SNAPSHOT
- command: ["/scheduler"]
- args: ["--port=50050"]
- ports:
- - containerPort: 50050
- name: flight
- volumeMounts:
- - mountPath: /mnt
- name: data
+ - name: ballista-scheduler
+ image: <your-image>
+ command: ["/scheduler"]
+ args: ["--bind-port=50050"]
+ ports:
+ - containerPort: 50050
+ name: flight
+ volumeMounts:
+ - mountPath: /mnt
+ name: data
  volumes:
  - name: data
  persistentVolumeClaim:
  claimName: data-pv-claim
  ---
  apiVersion: apps/v1
- kind: StatefulSet
+ kind: Deployment
  metadata:
  name: ballista-executor
  spec:
- serviceName: "ballista-scheduler"
  replicas: 2
  selector:
  matchLabels:
@@ -155,20 +160,12 @@ spec:
  spec:
  containers:
  - name: ballista-executor
- image: ballistacompute/ballista-rust:0.4.2-SNAPSHOT
+ image: <your-image>
  command: ["/executor"]
  args:
- [
- "--port=50051",
- "--scheduler-host=ballista-scheduler",
- "--scheduler-port=50050",
- "--external-host=$(MY_POD_IP)",
- ]
- env:
- - name: MY_POD_IP
- valueFrom:
- fieldRef:
- fieldPath: status.podIP
+ - "--bind-port=50051"
+ - "--scheduler-host=ballista-scheduler"
+ - "--scheduler-port=50050"
  ports:
  - containerPort: 50051
  name: flight
@@ -189,19 +186,18 @@ This should show the following output:

  ```
  service/ballista-scheduler created
- statefulset.apps/ballista-scheduler created
- statefulset.apps/ballista-executor created
+ deployment.apps/ballista-scheduler created
+ deployment.apps/ballista-executor created
  ```

  You can also check status by running `kubectl get pods`:

  ```bash
  $ kubectl get pods
- NAME READY STATUS RESTARTS AGE
- busybox 1/1 Running 0 16m
- ballista-scheduler-0 1/1 Running 0 42s
- ballista-executor-0 1/1 Running 2 42s
- ballista-executor-1 1/1 Running 0 26s
+ NAME READY STATUS RESTARTS AGE
+ ballista-executor-78cc5b6486-4rkn4 0/1 Pending 0 42s
+ ballista-executor-78cc5b6486-7crdm 0/1 Pending 0 42s
+ ballista-scheduler-879f874c5-rnbd6 0/1 Pending 0 42s
  ```

  You can view the scheduler logs with `kubectl logs ballista-scheduler-0`:
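
The publish-and-deploy steps described in the updated Kubernetes guide could be scripted roughly as follows. This is a sketch under assumptions and is not part of the commit: `REGISTRY_IMAGE` and the `registry.example.com/myteam` path are hypothetical, the `run` helper and `DRY_RUN` variable exist only for illustration, and `cluster.yaml` is assumed to have been edited already so that `<your-image>` names the pushed image. Pods managed by a Deployment get generated name suffixes (as in the `kubectl get pods` output above), so the last command reads the scheduler logs through the Deployment instead of a fixed pod name such as `ballista-scheduler-0`.

```bash
#!/bin/sh
# Hypothetical registry path; replace it with an image name you can push to.
REGISTRY_IMAGE="${REGISTRY_IMAGE:-registry.example.com/myteam/ballista:0.5.0-SNAPSHOT}"

# Illustrative helper: prints each command by default.
# Set DRY_RUN=0 to execute the commands for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Build the local image, then retag and push it (see "Publishing your images").
run ./dev/build-ballista-docker.sh
run docker tag ballista:0.5.0-SNAPSHOT "$REGISTRY_IMAGE"
run docker push "$REGISTRY_IMAGE"

# Deploy the service and both Deployments, then check on them.
# Assumes cluster.yaml already references the pushed image.
run kubectl apply -f cluster.yaml
run kubectl get pods
run kubectl logs deployment/ballista-scheduler
```

Run from the repository root, the script only echoes the commands until `DRY_RUN=0` is set, which makes it easy to review the image name before anything is pushed.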
2 changes: 1 addition & 1 deletion docs/user-guide/src/distributed/raspberrypi.md
@@ -116,7 +116,7 @@ Run the benchmarks:
  ```bash
  docker run -it myrepo/ballista-arm64 \
  /tpch benchmark datafusion --query=1 --path=/path/to/data --format=parquet \
- --concurrency=24 --iterations=1 --debug --host=ballista-scheduler --port=50050
+ --concurrency=24 --iterations=1 --debug --host=ballista-scheduler --bind-port=50050
  ```

  Note that it will be necessary to mount appropriate volumes into the containers and also configure networking
12 changes: 6 additions & 6 deletions docs/user-guide/src/distributed/standalone.md
@@ -26,15 +26,15 @@ Start a scheduler using the following syntax:
  ```bash
  docker run --network=host \
  -d ballistacompute/ballista-rust:0.4.2-SNAPSHOT \
- /scheduler --port 50050
+ /scheduler --bind-port 50050
  ```

  Run `docker ps` to check that the process is running:

  ```
  $ docker ps
  CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
- 59452ce72138 ballistacompute/ballista-rust:0.4.2-SNAPSHOT "/scheduler --port 5…" 6 seconds ago Up 5 seconds affectionate_hofstadter
+ 59452ce72138 ballistacompute/ballista-rust:0.4.2-SNAPSHOT "/scheduler --bind-p…" 6 seconds ago Up 5 seconds affectionate_hofstadter
  ```

  Run `docker logs CONTAINER_ID` to check the output from the process:
@@ -51,7 +51,7 @@ Start one or more executor processes. Each executor process will need to listen
  ```bash
  docker run --network=host \
  -d ballistacompute/ballista-rust:0.4.2-SNAPSHOT \
- /executor --external-host localhost --port 50051
+ /executor --external-host localhost --bind-port 50051
  ```

  Use `docker ps` to check that both the scheduler and executor(s) are now running:
@@ -60,14 +60,14 @@ Use `docker ps` to check that both the scheduler and executor(s) are now running:
  $ docker ps
  CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  0746ce262a19 ballistacompute/ballista-rust:0.4.2-SNAPSHOT "/executor --externa…" 2 seconds ago Up 1 second naughty_mclean
- 59452ce72138 ballistacompute/ballista-rust:0.4.2-SNAPSHOT "/scheduler --port 5…" 4 minutes ago Up 4 minutes affectionate_hofstadter
+ 59452ce72138 ballistacompute/ballista-rust:0.4.2-SNAPSHOT "/scheduler --bind-p…" 4 minutes ago Up 4 minutes affectionate_hofstadter
  ```

  Use `docker logs CONTAINER_ID` to check the output from the executor(s):

  ```
  $ docker logs 0746ce262a19
- [2021-02-14T18:36:25Z INFO executor] Running with config: ExecutorConfig { host: "localhost", port: 50051, work_dir: "/tmp/.tmpVRFSvn", concurrent_tasks: 4 }
+ [2021-02-14T18:36:25Z INFO executor] Running with config: ExecutorConfig { host: "localhost", bind_port: 50051, work_dir: "/tmp/.tmpVRFSvn", concurrent_tasks: 4 }
  [2021-02-14T18:36:25Z INFO executor] Ballista v0.4.2-SNAPSHOT Rust Executor listening on 0.0.0.0:50051
  [2021-02-14T18:36:25Z INFO executor] Starting registration with scheduler
  ```
@@ -84,7 +84,7 @@ Ballista can optionally use [etcd](https://etcd.io/) as a backing store for the
  ```bash
  docker run --network=host \
  -d ballistacompute/ballista-rust:0.4.2-SNAPSHOT \
- /scheduler --port 50050 \
+ /scheduler --bind-port 50050 \
  --config-backend etcd \
  --etcd-urls etcd:2379
  ```
