The `dstack` platform consists of six major components:
- Server
- Python API
- CLI
- Runner
- Shim
- Gateway (optional)

The server provides an HTTP API for submitting runs and managing all `dstack` functionality, including users, projects, backends, repos, secrets, and gateways.
The Python API consists of a low-level and a high-level API. The low-level API is a Python wrapper around the server's HTTP API. It's available as `dstack.api.server`. The high-level API provides a more convenient interface for working with `dstack` programmatically. It's available as `dstack.api`. The `dstack` CLI is implemented on top of the high-level API.
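
The layering described above can be illustrated with a minimal sketch. It does not use the real `dstack.api` or `dstack.api.server` modules: the class names, the endpoint path, and the `fake_transport` function are all hypothetical and only show how a convenience layer can sit on top of a thin HTTP wrapper.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical transport: takes (path, JSON body) and returns decoded JSON.
Transport = Callable[[str, Dict[str, Any]], Any]


class LowLevelClient:
    """Thin wrapper: one method per (hypothetical) server endpoint."""

    def __init__(self, transport: Transport, project: str) -> None:
        self._transport = transport
        self._project = project

    def list_runs(self) -> List[Dict[str, Any]]:
        # The path is illustrative, not the real server route.
        return self._transport(f"/api/project/{self._project}/runs/list", {})


@dataclass
class Run:
    name: str
    status: str


class HighLevelClient:
    """Convenience layer: returns typed objects instead of raw JSON."""

    def __init__(self, low_level: LowLevelClient) -> None:
        self._api = low_level

    def running(self) -> List[Run]:
        runs = [Run(r["name"], r["status"]) for r in self._api.list_runs()]
        return [r for r in runs if r.status == "running"]


def fake_transport(path: str, body: Dict[str, Any]) -> Any:
    # Stand-in for an HTTP POST so the sketch runs without a server.
    return [
        {"name": "train-1", "status": "running"},
        {"name": "eval-1", "status": "done"},
    ]


client = HighLevelClient(LowLevelClient(fake_transport, project="main"))
running_names = [run.name for run in client.running()]
```

A CLI built on top of `HighLevelClient` would only deal with `Run` objects and never with raw request paths, which is the same separation the text describes.
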
When the server provisions a cloud instance for a run, it launches a Docker image with the runner inside the image. The runner provides an HTTP API that the server uses for submitting the run, uploading the code, fetching logs and so on.
The shim may or may not be present, depending on which type of cloud is used. If it's a GPU cloud that provides an API for running Docker images, no shim is required. If it's a traditional cloud that provisions VMs, the shim is started when the VM launches. It pulls and runs the Docker image, controls its execution, and implements any cloud-specific functionality such as terminating the instance.
The gateway makes jobs available via a public URL. It works like a reverse proxy that forwards requests to the job instance via an SSH tunnel.
When a user invokes `dstack run`, the CLI first sends the run configuration and other profile parameters to the server to get the run plan. The server iterates over the configured backends to collect all instance offers that match the requirements, along with their availability. If the user chooses to proceed with the suggested offers, the CLI uploads the code from the user's machine to the server and submits the run configuration.
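
As a rough illustration of how a run plan could be assembled, the sketch below gathers offers from several backends and orders them. The `Requirements`, `Offer`, and `StaticBackend` types are hypothetical, and sorting available offers first and then by price is an assumption; the text only says that matching offers and their availability are collected.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol


@dataclass(frozen=True)
class Requirements:
    min_gpus: int
    max_price: float  # per hour, hypothetical unit


@dataclass(frozen=True)
class Offer:
    backend: str
    instance: str
    gpus: int
    price: float
    available: bool


class Backend(Protocol):
    def get_offers(self, req: Requirements) -> Iterable[Offer]: ...


class StaticBackend:
    """Hypothetical backend that serves offers from a fixed catalog."""

    def __init__(self, offers: Iterable[Offer]) -> None:
        self._offers = list(offers)

    def get_offers(self, req: Requirements) -> List[Offer]:
        return [
            o for o in self._offers
            if o.gpus >= req.min_gpus and o.price <= req.max_price
        ]


def get_run_plan(backends: Iterable[Backend], req: Requirements) -> List[Offer]:
    # Gather matching offers from every configured backend.
    offers = [o for b in backends for o in b.get_offers(req)]
    # Assumed ordering: available offers first, then cheapest first.
    return sorted(offers, key=lambda o: (not o.available, o.price))
```
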
Note: If a git repository is used, `dstack` only uploads the code diff. The runner then pulls the repository and applies the diff to get a copy of the user's files. The `dstack init` command uploads git credentials to the server so that the runner can access private repositories.
Submitted runs are stored in the server database. For each run, the server also creates one or more jobs (multiple jobs allow for distributed runs). For each job, it creates an initial job submission. If a submission fails, the server may create new submissions.
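
The relationship between runs, jobs, and job submissions can be sketched with plain dataclasses. This is not the server's actual sqlalchemy schema: the class fields, the `nodes` parameter, and the status strings are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JobSubmission:
    number: int
    status: str = "submitted"  # hypothetical status name


@dataclass
class Job:
    job_num: int
    submissions: List[JobSubmission] = field(default_factory=list)

    def new_submission(self) -> JobSubmission:
        # Each retry appends a fresh submission; older ones are kept as history.
        sub = JobSubmission(number=len(self.submissions))
        self.submissions.append(sub)
        return sub


@dataclass
class Run:
    name: str
    jobs: List[Job] = field(default_factory=list)


def create_run(name: str, nodes: int = 1) -> Run:
    """Create a run with one job per node and an initial submission for each."""
    run = Run(name)
    for i in range(nodes):
        job = Job(job_num=i)
        job.new_submission()
        run.jobs.append(job)
    return run
```
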
A background worker fetches a job submission and iterates over the configured backends to provision an instance. It tries the best offers first until provisioning succeeds. The instance is instructed to run the shim on launch. On "Docker-only" clouds, the Docker image is run directly.
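
The "try the best offers first until provisioning succeeds" loop might look roughly like the sketch below. `ProvisioningError` and the `create_instance` callback are hypothetical stand-ins for backend-specific provisioning code, and offers are assumed to arrive already sorted from best to worst.

```python
from typing import Callable, Iterable, Optional, Tuple


class ProvisioningError(Exception):
    """Hypothetical error raised when a backend cannot fulfil an offer."""


def provision_first_available(
    offers: Iterable[str],
    create_instance: Callable[[str], str],
) -> Optional[Tuple[str, str]]:
    """Return (offer, instance_id) for the first offer that can be provisioned."""
    for offer in offers:
        try:
            instance_id = create_instance(offer)
        except ProvisioningError:
            # No capacity or quota for this offer: fall through to the next one.
            continue
        return offer, instance_id
    # Every offer failed; the caller decides whether to retry later.
    return None
```
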
A successfully provisioned job enters the provisioning state. Another background worker processes such jobs. It waits for the runner to become available and submits the job.
Note: The runner HTTP API is not exposed publicly. To use it, the server establishes an SSH connection to the instance. The runner HTTP API then becomes available via port forwarding.
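
To make the port-forwarding idea concrete, the sketch below builds an OpenSSH command line that forwards a local port to a port on the instance. It only illustrates the mechanism and is not how the server actually opens its tunnels; the function name, the user, the host, and the port numbers are all assumptions.

```python
from typing import List, Optional


def ssh_forward_command(
    host: str,
    user: str,
    remote_port: int,
    local_port: int,
    identity_file: Optional[str] = None,
) -> List[str]:
    """Build an `ssh` argv that exposes a remote port on localhost."""
    cmd = [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:localhost:{remote_port}",  # local port forwarding
    ]
    if identity_file is not None:
        cmd += ["-i", identity_file]
    cmd.append(f"{user}@{host}")
    return cmd


# Hypothetical values: with the tunnel open, an API listening on port 10999
# of the instance would be reachable at http://localhost:8001 on the server.
command = ssh_forward_command("203.0.113.7", "ubuntu", 10999, 8001)
```

The resulting list can be passed to `subprocess.Popen` to keep the tunnel open for as long as the job is being tracked.
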
After the job is submitted, it enters the running state. A background worker periodically pings the runner for job status and log updates.
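
A periodic status-and-logs poll could be sketched as follows. The `pull` callback stands in for a request to the runner's HTTP API; its signature (log offset in, status plus new log lines out) and the status strings are assumptions, not the real runner protocol.

```python
import time
from typing import Callable, List, Tuple

# Hypothetical runner call: given how many log lines we already have,
# return the current status and any new lines.
PullFn = Callable[[int], Tuple[str, List[str]]]


def follow_job(
    pull: PullFn,
    interval: float = 1.0,
    max_polls: int = 100,
) -> Tuple[str, List[str]]:
    """Poll the runner until the job leaves the running state."""
    logs: List[str] = []
    status = "running"
    for _ in range(max_polls):
        status, new_lines = pull(len(logs))
        logs.extend(new_lines)
        if status != "running":
            break
        time.sleep(interval)
    return status, logs
```

Passing the current log offset on each call lets the runner return only new lines, so the worker never re-downloads logs it has already stored.
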
When all of the job's commands have been executed, the runner marks the job as done, the container exits, and the shim terminates the instance. The job may also be interrupted by `dstack stop`, which asks the runner to shut down gracefully. The `--abort` flag tells the server to force instance shutdown without notifying the runner, which may be useful if the runner becomes unavailable.
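
The difference between a graceful stop and an abort can be summarized in a small sketch. The two callbacks and the returned status strings are hypothetical; the point is only that the abort path never contacts the runner.

```python
from typing import Callable


def stop_job(
    stop_runner: Callable[[], None],
    terminate_instance: Callable[[], None],
    abort: bool = False,
) -> str:
    """Stop a job gracefully via the runner, or forcefully via the backend."""
    if abort:
        # Forced path: skip the runner entirely, which still works
        # when the runner is unreachable.
        terminate_instance()
        return "aborted"
    # Graceful path: the runner shuts the job down; the instance is then
    # terminated as part of the normal shutdown sequence, not by this call.
    stop_runner()
    return "terminating"
```
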
The server is a FastAPI app backed by SQLite. The runner and shim are written in Go.

- `docker/` – Dockerfiles for `dstack` images
- `docs/` – source files for mkdocs generated documentation
- `runner/` – source code for the runner and the shim
- `scripts/` – dev and CI/CD scripts
- `src/` – source code for the `dstack` Python package that includes the server, the CLI and the Python API
    - `dstack/`
        - `_internal/` – modules hidden from the users of the `dstack` Python API
            - `cli/` – CLI source code
            - `core/` – core `dstack` business logic that is not API, CLI or server specific, although most of it is used only on the server side as of now (e.g. backends)
                - `backends/` – core backends logic (e.g. compute provisioning, pricing, etc.)
                - `models/` – core `dstack` pydantic models. For simplicity, server-specific models also live here. Put the model here if unsure.
                    - `backends/` – backend-specific models such as configs used by the server
                - `services/` – other business logic implemented on top of `models/`
            - `server/` – server source code
                - `background/` – server background workers
                - `migrations/` – alembic migrations
                - `routers/` – API endpoints implementation, a thin wrapper around `services/`
                - `schemas/` – request/response-specific pydantic models. Other server models live in `dstack._internal.core.models`.
                - `security/` – permissions
                - `services/` – core server business logic
                    - `backends/`
                        - `configurators/` – backend configurators responsible for configuring and creating backends from API configs
                    - `jobs/`
                        - `configurators/` – job configurators responsible for making `JobSpec` from `RunSpec`
                - `utils/` – server-specific utils
                - `alembic.ini`
                - `db.py` – db class and utils
                - `main.py` – server entrypoint
                - `models.py` – sqlalchemy models
                - `settings.py` – server global settings
            - `utils/` – utils common for all modules
        - `api/` – public Python API
            - `_public` – the implementation of the high-level Python API
            - `server` – the low-level Python API (a Python wrapper around the server's HTTP API)
        - `core/` – core Python API modules (e.g. `dstack` errors)
    - `tests/`
- `gateway/src/dstack/gateway` – source code for the gateway application
    - `openai/` – OpenAI API proxy
    - `registry/` – gateway services registry
    - `systemd/` – systemd service files