🚀 Kubernetes Scheduler Simulator

The simulator evaluates different scheduling policies in GPU-sharing clusters. It includes the Fragmentation Gradient Descent (FGD) policy proposed in the USENIX ATC 2023 paper "Beware of Fragmentation: Scheduling GPU-Sharing Workloads with Fragmentation Gradient Descent", along with other baseline policies (e.g., Best-fit, Dot-product, GPU Packing, GPU Clustering, Random-fit).
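To make the idea of a scheduling policy concrete, here is a minimal sketch of a Best-fit style node selection for pods that request a fraction of one GPU. It is a toy illustration, not the simulator's implementation: the `Node` and `Pod` classes, the `fits` and `best_fit` helpers, and the leftover-based scoring are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_free: float        # free CPU cores (hypothetical field)
    gpu_free: list[float]  # free fraction of each GPU, 0.0 to 1.0 (hypothetical field)


@dataclass
class Pod:
    cpu: float  # requested CPU cores
    gpu: float  # requested fraction of ONE GPU, 0 < gpu <= 1


def fits(node: Node, pod: Pod) -> bool:
    """A pod fits if the node has enough CPU and a single GPU with enough free share."""
    return node.cpu_free >= pod.cpu and any(g >= pod.gpu for g in node.gpu_free)


def best_fit(nodes: list[Node], pod: Pod):
    """Pick the feasible node that would have the least resources left after placement.

    Toy scoring (an assumption): compare leftover GPU share first, then leftover CPU.
    Returns None when no node can host the pod.
    """
    def leftover(node: Node):
        return (sum(node.gpu_free) - pod.gpu, node.cpu_free - pod.cpu)

    feasible = [n for n in nodes if fits(n, pod)]
    return min(feasible, key=leftover) if feasible else None


nodes = [
    Node("a", cpu_free=8, gpu_free=[1.0, 1.0]),
    Node("b", cpu_free=4, gpu_free=[0.5, 0.0]),
]
chosen = best_fit(nodes, Pod(cpu=2, gpu=0.5))
print(chosen.name if chosen else "unschedulable")
```

Packing the half-GPU pod onto the nearly full node keeps the two whole GPUs on the other node available for larger requests, which is the intuition behind packing-style baselines.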

🎬 Demo

Step 1: Init and Run Experiments	Step 2: Result Analysis

🚧 Environment Setup

Build from scratch

Please ensure that Go is installed.

go mod vendor installs the dependencies required for the simulator.

$ go mod vendor

make generates the compiled binary files in the bin directory.

$ make

Run within Docker

To save time on environment setup, we provide a Docker image with Golang 1.20.4, Python 3.10.11, and the required libraries installed.

The image also contains a copy of the GitHub repo under the home directory, with the executable binary (bin/simon) already compiled, so the go mod vendor and make commands are not needed.

If you are not familiar with Docker, please refer to the official installation guide for Linux, Mac, or Windows. Otherwise, the following commands are provided for reference.

# step 1: pull image
sudo docker pull qzweng/kubernetes-scheduler-simulator:atc23

# step 2: launch the docker container
sudo docker run -d --name=kss qzweng/kubernetes-scheduler-simulator:atc23 bash -c "sleep infinity"

# step 3: execute commands inside the container
sudo docker exec -it kss bash

# step 4: go to the project folder and conduct experiments
cd ~/kubernetes-scheduler-simulator

🔥 Quickstart Example

The following example schedules 6 pods onto a cluster with 2 nodes; the expected output shows the allocation ratio of each resource dimension (CPU, memory, GPU). The default scheduling policy is Fragmentation Gradient Descent (FGD).

$ bin/simon apply --extended-resources "gpu" \
                  -f example/test-cluster-config.yaml \
                  -s example/test-scheduler-config.yaml
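As a rough illustration of what an allocation ratio per resource dimension means, the sketch below sums allocated and total capacity across nodes and divides the two. It does not parse or reproduce the output of bin/simon; the `allocation_ratio` function, the dictionary layout, and all numbers are made up for this example.

```python
def allocation_ratio(nodes):
    """Return allocated/capacity per resource, aggregated over all nodes.

    `nodes` is a list of dicts with 'capacity' and 'allocated' maps keyed by
    resource name (a hypothetical layout chosen for this sketch).
    """
    totals = {}
    for node in nodes:
        for res, cap in node["capacity"].items():
            used = node["allocated"].get(res, 0)
            cap_sum, used_sum = totals.get(res, (0, 0))
            totals[res] = (cap_sum + cap, used_sum + used)
    # Guard against a zero-capacity resource to avoid division by zero.
    return {res: (used / cap if cap else 0.0) for res, (cap, used) in totals.items()}


# Made-up two-node cluster; these values are NOT taken from the example config.
cluster = [
    {"capacity": {"cpu": 32, "memory_gib": 128, "gpu": 4},
     "allocated": {"cpu": 16, "memory_gib": 32, "gpu": 3}},
    {"capacity": {"cpu": 32, "memory_gib": 128, "gpu": 4},
     "allocated": {"cpu": 8, "memory_gib": 32, "gpu": 1}},
]
ratios = allocation_ratio(cluster)
for res, ratio in ratios.items():
    print(f"{res}: {ratio:.1%}")
```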

🔮 Experiments on Production Traces

Install the required Python dependencies.

$ pip install -r requirements.txt

Please refer to the README under the data directory to prepare the production traces.
Then refer to the README under the experiments directory to reproduce the results reported in the paper.

📝 Paper

Please cite our paper if it is helpful to your research.

@inproceedings{FGD2023,
    title = {Beware of Fragmentation: Scheduling GPU-Sharing Workloads with Fragmentation Gradient Descent},
    author = {Qizhen Weng and Lingyun Yang and Yinghao Yu and Wei Wang and Xiaochuan Tang and Guodong Yang and Liping Zhang},
    booktitle = {2023 {USENIX} Annual Technical Conference},
    year = {2023},
    series = {{USENIX} {ATC} '23},
    url = {https://www.usenix.org/conference/atc23/presentation/weng},
    publisher = {{USENIX} Association},
}

🙏🏻 Acknowledgments

Our simulator is built on open-simulator by Alibaba, a simulator used for cluster capacity planning. This repository primarily evaluates the performance of different scheduling policies on production traces. The GPU-related plugin has been merged into the main branch of open-simulator.

⏳ TODO

Add a minikube running example to demonstrate how the simulator schedules pods in a real Kubernetes cluster.
