This repo houses the work-in-progress Oxide Rack control plane.
Omicron is open-source. But we’re pretty focused on our own goals for the foreseeable future and not able to help external contributors. Please see CONTRIBUTING.md for more information.
Docs are automatically generated for the public (externally-facing) API based on the OpenAPI spec that itself is automatically generated from the server implementation. You can generate your own docs for either the public API or any of the internal APIs by feeding the corresponding OpenAPI specs (in ./openapi) into an OpenAPI doc generator.
There are some internal design docs in the ./docs directory.
For more design documentation and internal Rust API docs, see the generated Rust documentation. You can generate this yourself with:
$ cargo doc --document-private-items
Note that --document-private-items is configured by default, so you can actually just use cargo doc.
Folks with access to Oxide RFDs may find RFD 48 ("Control Plane Requirements") and other control plane RFDs relevant. These are not currently publicly available.
Omicron has two modes of operation: "simulated" and "non-simulated".
The simulated version of Omicron allows the high-level control plane logic to run without actually managing any sled-local resources. This version can be executed on Linux, Mac, and illumos. This mode of operation is provided for development and testing only.
To build and run the simulated version of Omicron, see: docs/how-to-run-simulated.adoc.
The non-simulated version of Omicron actually manages sled-local resources, and may only be executed on hosts running Helios. This mode of operation will be used in production.
To build and run the non-simulated version of Omicron, see: docs/how-to-run.adoc.
You can format the code using cargo fmt. Make sure to run this before pushing changes. The CI checks that the code is correctly formatted.
You can run the Clippy linter using cargo clippy --all-targets -- --deny warnings --allow clippy::style. Make sure to run this before pushing changes. The CI checks that the code is clippy-clean.
This repo includes a Dockerfile that builds an image containing the Nexus and sled agent. There’s a GitHub Actions workflow that builds and publishes the Docker image. This is used by cli for testing. This is not the way Omicron will be deployed on production systems, but it’s a useful vehicle for working with it.
nexus requires a TOML configuration file. There’s an example in nexus/examples/config.toml.
Supported config properties include:
| Name | Example | Required? | Description |
|---|---|---|---|
| | | Yes | URL identifying the CockroachDB instance(s) to connect to. CockroachDB is used for all persistent data. |
| | | Yes | Dropshot configuration for the external server (i.e., the one that operators and developers using the Oxide rack will use). Specific properties are documented below, but see the Dropshot README for details. Note that this is an array of external address configurations; multiple may be supplied. |
| | | Yes | Specifies that the server should bind to the given IP address and TCP port for the external API (i.e., the one that operators and developers using the Oxide rack will use). In general, servers can bind to more than one IP address and port, but this is not (yet?) supported. |
| | | Yes | Specifies the maximum request body size for the external API. |
| | | Yes | Dropshot configuration for the internal server (i.e., the one used by the sled agent). Specific properties are documented below, but see the Dropshot README for details. |
| | | Yes | Specifies that the server should bind to the given IP address and TCP port for the internal API (i.e., the one used by the sled agent). In general, servers can bind to more than one IP address and port, but this is not (yet?) supported. |
| | | Yes | Specifies the maximum request body size for the internal API. |
| | | Yes | Unique identifier for this Nexus instance |
| | | Yes | Logging configuration. Specific properties are documented below, but see the Dropshot README for details. |
| | | Yes | Controls where server logging will go. Valid modes are |
| | | Yes | Specifies what severity of log messages should be included in the log. Valid values include |
| | | Only if | If |
| | | Only if | If |
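As a rough illustration of the "IP address and TCP port" form that the bind-address properties describe, the sketch below parses such a string with the Rust standard library. The helper name parse_bind_address and the value 127.0.0.1:12220 are assumptions made up for this example; they are not taken from the Nexus configuration or from Dropshot.

```rust
use std::net::SocketAddr;

/// Hypothetical helper: parse an "IP:port" string into a socket address.
fn parse_bind_address(raw: &str) -> Result<SocketAddr, String> {
    raw.parse::<SocketAddr>()
        .map_err(|e| format!("invalid bind address {raw:?}: {e}"))
}

fn main() {
    // Illustrative value only (an assumption, not from the real config).
    let addr = parse_bind_address("127.0.0.1:12220").expect("valid address");
    println!("ip = {}, port = {}", addr.ip(), addr.port());

    // A value without a port is rejected.
    assert!(parse_bind_address("127.0.0.1").is_err());
}
```

A single address per server matches the note above that binding to more than one address and port is not yet supported.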
The ClickHouse binary uses several sources for its configuration. The binary expects an XML config file, usually named config.xml, to be available, or one may be specified with the -C command-line flag. The binary also includes a minimal configuration embedded within it, which will be used if no configuration file is given or present in the current directory. The server also accepts command-line flags for overriding the values of the configuration parameters.
The packages downloaded by ci_download_clickhouse include a config.xml file.
You should probably run ClickHouse via the omicron-dev tool, but if you decide to run it manually, you can start the server with:
$ /path/to/clickhouse server --config-file /path/to/config.xml
The configuration file contains a large number of parameters, but most of them are described with comments in the included config.xml, or you may learn more about them here and here. Parameters may be updated in the config.xml, and the server will automatically reload them. You may also specify many of them on the command-line with:
$ /path/to/clickhouse server --config-file /path/to/config.xml -- --param_name param_value ...
Each service is a Dropshot server that presents an HTTP API. The description of that API is serialized as an OpenAPI document which we store in omicron/openapi and check in to this repo. In order to ensure that changes to those APIs are made intentionally, each service contains a test that validates that the current API matches the checked-in document. This allows us (1) to catch accidental changes as test failures and (2) to explicitly observe API changes during code review (and in the git history).
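The check described above can be pictured as a snapshot comparison. The sketch below is a simplified, standard-library-only stand-in and not the test code in this repo: check_api_document, the Check enum, and the sample JSON strings are all hypothetical, and the checked-in document is modeled as an in-memory string rather than a file on disk.

```rust
/// Outcome of comparing the generated API description with the checked-in one.
#[derive(Debug, PartialEq)]
enum Check {
    Unchanged,
    Overwritten,
    Mismatch,
}

/// Hypothetical helper: compare `generated` against `checked_in`.
/// With `overwrite` set, a differing document is replaced instead of failing.
fn check_api_document(checked_in: &mut String, generated: &str, overwrite: bool) -> Check {
    if checked_in.as_str() == generated {
        Check::Unchanged
    } else if overwrite {
        *checked_in = generated.to_string();
        Check::Overwritten
    } else {
        Check::Mismatch
    }
}

fn main() {
    // Made-up documents standing in for a stored and a freshly generated spec.
    let mut checked_in = String::from(r#"{"paths":{"/instances":{}}}"#);
    let generated = r#"{"paths":{"/instances":{},"/disks":{}}}"#;

    // An unintended API change surfaces as a mismatch (a test failure).
    println!("{:?}", check_api_document(&mut checked_in, generated, false));

    // An intentional change is accepted by re-running with overwrite set,
    // after which the updated document is visible in code review.
    println!("{:?}", check_api_document(&mut checked_in, generated, true));
    println!("{:?}", check_api_document(&mut checked_in, generated, false));
}
```

The point of the design is that the stored document only changes through a deliberate overwrite, so every API change leaves a diff to review.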
We also use these OpenAPI documents as the source for the clients we generate using Progenitor. Clients are automatically updated when the corresponding OpenAPI document is modified.
Note that Omicron contains a nominally circular dependency:
- Nexus depends on the Sled Agent client
- The Sled Agent client is derived from the OpenAPI document emitted by Sled Agent
- Sled Agent depends on the Nexus client
- The Nexus client is derived from the OpenAPI document emitted by Nexus
We effectively "break" this circular dependency by virtue of the OpenAPI documents being checked in.
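To see why checking in the documents breaks the loop, the dependency chain can be modeled as a small graph and tested for a valid build order. This is only an illustrative sketch: the node names and the build_order function are hypothetical, and the real build is driven by Cargo, not by code like this.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Returns a build order if the dependency graph is acyclic, else None.
/// `deps` maps each node to the nodes it needs first (no duplicates assumed).
fn build_order(deps: &BTreeMap<&str, Vec<&str>>) -> Option<Vec<String>> {
    // Count the unmet dependencies of every node (Kahn's algorithm).
    let mut pending: BTreeMap<&str, usize> = BTreeMap::new();
    for (node, needs) in deps {
        pending.entry(*node).or_insert(0);
        for need in needs {
            pending.entry(*need).or_insert(0);
        }
    }
    for (node, needs) in deps {
        *pending.get_mut(node).unwrap() += needs.len();
    }

    // Start with the nodes that need nothing.
    let mut ready: VecDeque<&str> = pending
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(k, _)| *k)
        .collect();

    let mut order = Vec::new();
    while let Some(done) = ready.pop_front() {
        order.push(done.to_string());
        for (node, needs) in deps {
            if needs.contains(&done) {
                let n = pending.get_mut(node).unwrap();
                *n -= 1;
                if *n == 0 {
                    ready.push_back(*node);
                }
            }
        }
    }
    // If some node was never released, the graph contains a cycle.
    (order.len() == pending.len()).then_some(order)
}

fn main() {
    // Hypothetical node names. If each document had to be generated during
    // the build, it would depend on its service, closing the loop.
    let mut generated = BTreeMap::new();
    generated.insert("nexus", vec!["sled-agent-client"]);
    generated.insert("sled-agent-client", vec!["sled-agent-openapi"]);
    generated.insert("sled-agent-openapi", vec!["sled-agent"]);
    generated.insert("sled-agent", vec!["nexus-client"]);
    generated.insert("nexus-client", vec!["nexus-openapi"]);
    generated.insert("nexus-openapi", vec!["nexus"]);
    println!("generated during build: {:?}", build_order(&generated));

    // Checked-in documents are plain files with no build dependencies.
    let mut checked_in = generated.clone();
    checked_in.insert("sled-agent-openapi", vec![]);
    checked_in.insert("nexus-openapi", vec![]);
    println!("checked in: {:?}", build_order(&checked_in));
}
```

In the first graph no node is ever ready to build; in the second, the two documents act as starting points and everything else follows from them.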
In general, changes to any service API require the following set of build steps:
- Make changes to the service API
- Build the package for the modified service alone. This can be done by building from within that package's directory, or with cargo build -p <package>. This step is important, to avoid the circular dependency at this point. One needs to update this one OpenAPI document, without rebuilding the other components that depend on a now-outdated spec.
- Update the OpenAPI document by running the relevant test with overwrite set: EXPECTORATE=overwrite cargo test test_nexus_openapi_internal (changing the test name as necessary)
- This will cause the generated client to be updated which may break the build for dependent consumers
- Modify any dependent services to fix calls to the generated client
Note that if you make changes to both Nexus and Sled Agent simultaneously, you may end up in a spot where neither can build and therefore neither OpenAPI document can be generated. In this case, revert or comment out changes in one so that the OpenAPI document can be generated.