diff --git a/main/search/search_index.json b/main/search/search_index.json index f78d9b9b1..d6bac3d9e 100644 --- a/main/search/search_index.json +++ b/main/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome","text":""},{"location":"#eclipse-ankaios","title":"Eclipse Ankaios","text":"Watch Eclipse Ankaios presentation at Eclipse SDV community day on July 6, 2023 on Youtube"},{"location":"#scope","title":"Scope","text":"
Eclipse Ankaios provides workload and container orchestration for automotive High Performance Computing (HPC) software. While it can be used in various fields of application, it is developed from scratch for automotive use cases and provides a slim yet powerful solution for managing containerized applications. It supports various container runtimes, with Podman as the first one, but other container runtimes and even native applications can be supported. Eclipse Ankaios is independent of existing communication frameworks like SOME/IP, DDS, or REST API.
Eclipse Ankaios manages multiple nodes and virtual machines with a single unique API in order to start, stop, configure, and update containers and workloads. It provides a central place to manage automotive applications with a setup consisting of one server and multiple agents. Usually one agent per node connects to one or more runtimes that are running the workloads.
"},{"location":"#next-steps","title":"Next steps","text":"Eclipse Ankaios follows the UNIX philosophy to have one tool for one job and do that job well. It does not depend on a specific init system like systemd but can be started with any init system. It also does not handle persistency but can use an existing automotive persistency handling, e.g. provided by AUTOSAR Adaptive.
Workloads are granted access to the Eclipse Ankaios API, guarded by access control, and are thus able to dynamically reconfigure the system. One possible use case is the dynamic startup of an application that is only required in a particular situation, such as a parking assistant. When the driver wants to park the car, a control workload can start the parking assistant application. When the parking is finished, the parking assistant workload is stopped again.
Eclipse Ankaios also provides a CLI that allows developers to develop and test configurations. In order to gain compatibility with Kubernetes, Eclipse Ankaios accepts pod specifications.
An optional fleet connector can use the Eclipse Ankaios API to connect to a cloud-based software update system, which allows an OEM to manage a fleet of vehicles and provide new states to Eclipse Ankaios in order to update single or all applications.
In order to support the Automotive SPICE process, Eclipse Ankaios comes with requirements tracing supported by OpenFastTrace.
"},{"location":"architecture/","title":"Architecture","text":""},{"location":"architecture/#overview","title":"Overview","text":"Two executables are used for each Ankaios deployment: the Ankaios server and the Ankaios agent:
When started, the Ankaios server loads the configured startup manifest file of the cluster and stores it as the desired state. To reach this desired state, the server instructs the Ankaios agents to start and stop workloads. Each Ankaios cluster runs exactly one instance of the Ankaios server, making the server the single source of truth.
A running instance of the Ankaios agent is present on every node where Ankaios needs to execute workloads. The Ankaios agent is responsible for starting and stopping workloads, according to the commands it gets from the Ankaios server.
The Ankaios server itself does not run workloads directly, so in order to start workloads on the node running the server, an Ankaios agent shall be started there too.
Ankaios also allows workloads to change the state stored in the Ankaios server via the control interface. Workloads access this interface by sending their requests to the Ankaios agent managing them. Each request is checked by the Ankaios agent and, on successful authorization, forwarded to the Ankaios server. This interface can be used to, e.g.:
In the diagram above one of the workloads on node 1 acts as fleet connector. It accesses a backend and forwards commands to the Ankaios server. In the example below the fleet connector gets an update from the backend, which adds a workload to node 2.
"},{"location":"architecture/#notes","title":"Notes","text":"Join our developer mailing list for up to date information or sending questions.
"},{"location":"support/#discussion-forum","title":"Discussion forum","text":"If you have a general question, an idea or want to show how you use Ankaios, the discussion forum might be the right place for you.
"},{"location":"support/#issue","title":"Issue","text":"For reporting bugs or suggesting enhancements a new issue should be created using one of the templates if possible.
"},{"location":"support/#slack","title":"Slack","text":"Join the conversion with the community in the Ankaios Slack workspace.
"},{"location":"development/build/","title":"Build","text":""},{"location":"development/build/#dev-container","title":"Dev container","text":"The repo provides a Visual Studio Code dev container which includes all necessary tools to build all components and the documentation. It also contains Podman, which is needed to run the system tests for Ankaios. In case you want to extend the dev container see extending the dev container.
"},{"location":"development/build/#prerequisites","title":"Prerequisites","text":"As prerequisites, you need to have the following tools set up:
The following steps assume an x86_64 host. For Mac with Apple silicon, see chapter Build for arm64 target.
To build and test the Ankaios agent and server, run the following command inside the dev container:
cargo build\n
and for release
cargo build --release\n
As Ankaios uses musl for static linking, the binaries will be located in target/x86_64-unknown-linux-musl
.
The dev container adds required tools for arm64
architecture. To build Ankaios for arm64
, run the following command inside the dev container:
cargo build --target aarch64-unknown-linux-musl --release\n
Info
When using a dev container on Mac with Apple silicon and the build fails, change the file sharing implementation in Docker Desktop. Go to Docker Desktop and Settings
, then General
and change the file sharing implementation from VirtioFS
to gRPC FUSE
. See also eclipse-ankaios/ankaios#147.
A release shall be built directly using the CI/CD environment GitHub Actions. The release build creates and uploads all necessary artifacts that are required for a release.
"},{"location":"development/ci-cd-release/#release-branches","title":"Release branches","text":"In order to stabilize an upcoming release or to create a patch release, a release branch can be created. The naming convention for such a branch is:
release-<major>.<minor>\n
For example release-0.4
.
For building a release a separate workflow exists inside .github/workflows/release.yml
. The release workflow reuses the complete build workflow from .github/workflows/build.yml
and its artifacts.
This avoids duplicating the steps of the build workflow in the release workflow and thus provides a single point of change for the build workflow.
The release workflow executes the build workflow, exports the build artifacts into an archive for each supported platform and uploads it to the GitHub release.
As an example the following release artifacts are created for linux-amd64:
The tar.gz archive contains the pre-built binaries for the Ankaios CLI, Ankaios server and Ankaios agent. The *.sha512sum.txt file contains the SHA-512 hash of the archive.
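The checksum file allows users to verify a downloaded archive before unpacking it. A minimal sketch using the standard coreutils `sha512sum` tool (a dummy archive stands in for a real download here; exact artifact names vary per release):

```shell
# Sketch: verify a release archive against its .sha512sum.txt file.
# For a real release, download both files from the GitHub release page first;
# here a dummy archive stands in for the download.
echo "demo content" > ankaios-linux-amd64.tar.gz
sha512sum ankaios-linux-amd64.tar.gz > ankaios-linux-amd64.tar.gz.sha512sum.txt

# Verification succeeds only if the archive matches the recorded hash:
sha512sum -c ankaios-linux-amd64.tar.gz.sha512sum.txt   # prints: ankaios-linux-amd64.tar.gz: OK
```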
"},{"location":"development/ci-cd-release/#release-scripts","title":"Release scripts","text":"To package the desired release artifacts a separate script tools/create_release.sh
is called inside the release job. The script calls another script tools/create_artifacts.sh
for each platform that creates the artifacts mentioned above.
In addition, it exports the following:
Within the release workflow the build artifacts are downloaded into a temporary folder called dist
which has the following structure:
\u251c\u2500\u2500 coverage\n\u2502 \u251c\u2500\u2500 index.html\n\u2502 \u2514\u2500\u2500 style.css\n\u251c\u2500\u2500 linux-amd64\n\u2502 \u2514\u2500\u2500 bin\n\u2502 \u251c\u2500\u2500 ank\n\u2502 \u251c\u2500\u2500 ank-agent\n\u2502 \u2514\u2500\u2500 ank-server\n\u251c\u2500\u2500 linux-arm64\n\u2502 \u2514\u2500\u2500 bin\n\u2502 \u251c\u2500\u2500 ank\n\u2502 \u251c\u2500\u2500 ank-agent\n\u2502 \u2514\u2500\u2500 ank-server\n\u2514\u2500\u2500 req_tracing_report.html\n
The platform specific files are downloaded into a sub-folder dist/<os>-<platform>/bin
. Reports and shared artifacts are placed into the dist
folder directly.
The scripts expect this folder structure to create final release artifacts.
"},{"location":"development/ci-cd-release/#adding-a-new-platform","title":"Adding a new Platform","text":"If a new platform shall be supported the following steps must be done:
.github/workflows/build.yml
and configure the upload of the artifacts, see CI/CD section..github/workflows/release.yml
to download the new artifacts. Under jobs.release.steps
add a new step after the existing download steps and replace the parameters <os>-<platform>
with the correct text (e.g. linux-amd64): jobs:\n ...\n release:\n steps:\n ...\n - name: Download artifacts for ankaios-<os>-<platform>-bin\n uses: actions/download-artifact@v4.1.7\n with:\n name: ankaios-<os>-<platform>-bin\n path: dist/<os>-<platform>/bin\n ...\n
The name ankaios-<os>-<platform>-bin
must match the used name in the upload artifact action defined inside the build workflow (.github/workflows/build.yml
). 3. Inside tools/create_release.sh
script add a new call to the script tools/create_artifacts.sh
like the following:
...\n \"${SCRIPT_DIR}\"/create_artifacts.sh -p <os>-<platform>\n...\n
The <os>-<platform>
string must match the name of the sub-folder inside the dist folder. The called script expects the pre-built binaries inside <os>-<platform>/bin
.
.github/workflows/release.yml
. Inside the step that uploads the release artifacts add the new artifact(s) to the github upload command:...\nrun: |\n gh release upload ${{ github.ref_name }}\n ...\n <os>-<platform>/ankaios-<os>-<platform>.tar.gz \\\n <os>-<platform>/ankaios-<os>-<platform>.tar.gz.sha512sum.txt\n ...\n
tools/install.sh
and update the script if needed.The release notes are generated automatically if a release is created via the GitHub web frontend by clicking on the Generate release notes
button.
The procedure uses the filters for pull request labels configured inside .github/release.yml
.
The following steps shall be done before the actual release build is triggered.
tools/update_version.sh --release <new version>
).Before building the release, all preparation steps shall be finished.
The release shall be created directly via the GitHub web frontend.
When creating a release, a tag with the following naming convention must be provided: vX.Y.Z
(e.g. v0.1.0).
Draft a new release
.Generate release notes
to generate the release notes automatically based on the filter settings for pull requests inside .github/release.yml
configuration. In case unwanted pull requests are listed, label them correctly, delete the description field and generate the release notes again (The correction of the labels and the regeneration of the release notes can also be done after the release build.).Set as the latest release
is enabled. This setting is important; otherwise the provided link for the installation script in the installation chapter will still point to the previous release marked as latest.Publish release
.Note
There is a GitHub Action available to automatically roll back the created release and tag. This action is not used, in order to retain better control over the cleanup procedure before the next release build is triggered. For instance, without auto-rollback a manually entered release description is still available after a failed release build.
"},{"location":"development/ci-cd/","title":"CI/CD","text":"As CI/CD environment GitHub Actions is used. Merge verifications in case of opening a pull request and release builds are fully covered into GitHub Action workflows. For information about release builds, see CI/CD - Release section.
"},{"location":"development/ci-cd/#merge-verification","title":"Merge verification","text":"When a pull request is opened, the following pipeline jobs run:
After a pull request was merged into the main branch, the jobs listed above are executed again to validate stable branch behavior.
The steps for the build workflow are defined inside .github/workflows/build.yml
.
The produced artifacts of the build workflow are uploaded and can be downloaded from GitHub for debugging or testing purposes.
"},{"location":"development/ci-cd/#adding-a-new-merge-verification-job","title":"Adding a new merge verification job","text":"To add a new merge verification job adjust the workflow defined inside .github/workflows/build.yml
.
Select a GitHub runner image matching your purposes. In case of adding a cross-build, first make sure that the build works locally within the dev container.
jobs
jobs section and define a job name.ankaios-<os>-<platform>-bin
(e.g. ankaios-linux-amd64-bin), otherwise define a custom name. If the artifact is needed in a release, it is referenced by this name inside the release workflow. ...\n    - uses: actions/upload-artifact@v4.3.3\n      with:\n        name: ankaios-<os>-<platform>-bin\n        path: dist/\n    ...\n
Note
GitHub Actions only runs workflow definitions from the main (default) branch. That means that when a workflow has been changed and a PR has been created for it, the change will not become effective before the PR is merged into the main branch. For local testing the act tool can be used.
"},{"location":"development/ci-cd/#adding-a-new-github-action","title":"Adding a new GitHub action","text":"When introducing a new GitHub action, do not use a generic major version tag (e.g. vX
). Specify a specific release tag (e.g. vX.Y.Z
) instead. Using the generic tag might lead to an unstable CI/CD environment, whenever the authors of the GitHub action update the generic tag to point to a newer version that contains bugs or incompatibilities with the Ankaios project.
Example:
Bad:
...\n - uses: actions/checkout@v4\n...\n
Good:
...\n - uses: actions/checkout@v4.1.1\n...\n
"},{"location":"development/ci-cd/#adding-github-action-jobs","title":"Adding GitHub action jobs","text":"When creating a new job inside a workflow, specify a job name for each job.
Example:
...\n\njobs:\n test_and_build_linux_amd64:\n name: Test and Build Linux amd64\n...\n
Note
Besides being a best practice, giving a job a name is needed to reference it from the self-service repository in order to configure the job as a required status check.
"},{"location":"development/documentation-guidelines/","title":"Documentation guidelines","text":"These guidelines apply to all documentation which is created in the Ankaios project like this website, software design documents or README files. The aim is to support the creators of documents by enforcing a common look and feel.
"},{"location":"development/documentation-guidelines/#capitalization","title":"Capitalization","text":"As 'Ankaios' is a proper noun it shall be written with a capital 'A'. Other words which are not proper nouns shall be in lower case when they are not the first word in a sentence.
Examples:
Correct Incorrect Ankaios ankaios Ankaios server Ankaios-Server, Ankaios-server, Ankaios Server Ankaios agent Ankaios-Agent, Ankaios-agent, Ankaios Agent workload Workload control interface Control InterfaceThe same rule also applies to headlines, i.e. only the first word of a headline is in upper case.
"},{"location":"development/extending-dev-container/","title":"Extending the dev container","text":"The dev container is relatively large. If there is a need to include additional items in the dev container, please note that it is split into two parts due to its size:
A base container available from ghcr.io/eclipse-ankaios/devcontainer
which, in case of a change, needs to be build manually from .devcontainer/Dockerfile.base
(see below for instructions).
A docker container which derives from the base image mentioned above is specified in .devcontainer/Dockerfile
(so don't forget to reference your new version there once you build one).
If you want to add some additional tools, you can initially do it in .devcontainer/Dockerfile
, but later on they need to be pulled into the base image in order to speed up the initial dev container build.
The base container is available for amd64 and arm64/v8 architectures. There are two options to build the base container:
In case the multiplatform build is used, one image can be built natively on the host platform (usually amd64) while the other needs to be emulated.
Build the base container by running the following commands outside of the dev container:
# Prepare the build with buildx. Depending on your environment\n# the following steps might be necessary:\ndocker run --rm --privileged multiarch/qemu-user-static --reset -p yes --credential yes\n\n# Create and use a new builder. This needs to be called only once:\ndocker buildx create --name mybuilder --driver docker-container --bootstrap\ndocker buildx use mybuilder\n\n# Now build the new base image for the dev container\ncd .devcontainer\ndocker buildx build -t ghcr.io/eclipse-ankaios/devcontainer-base:<version> --platform linux/amd64,linux/arm64 -f Dockerfile.base .\n
In order to push the base image append --push
to the previous command.
Note: If you wish to locally test the base image in VSCode before proceeding, utilize the default builder and exclusively build for the default platform like
docker buildx use default\ndocker buildx build -t ghcr.io/eclipse-ankaios/devcontainer-base:<version> -f Dockerfile.base --load .\n
"},{"location":"development/extending-dev-container/#separate-builds-for-different-architectures","title":"Separate builds for different architectures","text":"Due to the emulation for the non-host architecture, the previous multiplatform build might take some time. An alternative is to build the two images separately on different hosts matching the target architecture. For arm64 for example cloud instances with ARM architecture (like AWS Graviton) can be used.
To build the base image this way, perform the following steps:
# On arm64 host: Build arm64 image\ncd .devcontainer\ndocker buildx build -t ghcr.io/eclipse-ankaios/devcontainer-base:<version>-arm64 -f Dockerfile.base --push .\n\n# On amd64 host: Build amd64 image\ncd .devcontainer\ndocker buildx build -t ghcr.io/eclipse-ankaios/devcontainer-base:<version>-amd64 -f Dockerfile.base --push .\n\n# On any host: Create manifest list referencing both images\ndocker buildx imagetools create \\\n -t ghcr.io/eclipse-ankaios/devcontainer-base:<version> \\\n ghcr.io/eclipse-ankaios/devcontainer-base:<version>-amd64 \\\n ghcr.io/eclipse-ankaios/devcontainer-base:<version>-arm64\n
"},{"location":"development/requirement-template/","title":"Requirement description template","text":"All requirements in Ankaios shall be written in the following format:
<Requirement title>\n`swdd~<component>-<descriptive requirement id>~<version>`\n\nStatus: approved\n\n[When <condition separated by and>], <object> shall <do something | be in state | execute a list of actions in order/in parallel | \u2026>\n\nComment:\n<comment body>\n\nRationale:\n<rationale body>\n\nTags:\n- <tag1>\n- <tag2>\n- \u2026\n\nNeeds:\n- [impl/utest/stest]\n
NOTE:
Here is an example of the requirement from the Ankaios agent:
#### AgentManager listens for requests from the Server\n`swdd~agent-manager-listens-requests-from-server~1`\n\nStatus: approved\n\nThe AgentManager shall listen for requests from the Server.\n\nTags:\n- AgentManager\n\nNeeds:\n- impl\n- utest\n- itest\n
This requirement template has been inspired by:
https://aaltodoc.aalto.fi/server/api/core/bitstreams/d518c3cc-4d7d-4c69-b7db-25d2da9e847f/content
"},{"location":"development/requirement-tracing/","title":"Requirement tracing","text":""},{"location":"development/requirement-tracing/#introduction","title":"Introduction","text":"The Eclipse Ankaios project provides requirement tracing using the OpenFastTrace requirement tracing suite. The dev container already includes the required tooling. To generate a requirement tracing report call:
just trace-requirements\n
Afterwards the HTML report is available under build/req/req_tracing_report.html
and shows the current coverage state.
For details on the OpenFastTrace tool, please consult OFT's user documentation or execute oft help
.
Eclipse Ankaios traces requirements between
**/doc/README.md
)**/src/**
)**/src/**
, tests/**
)Thus, for new features:
swdd
)impl
, e.g., // [impl->swdd~this-is-a-requirement~1]
utest
, itest
or stest
depending on the type of the test, e.g., // [utest->swdd~this-is-a-requirement~1]
for a unit testThe format of a requirement is described in the next section Requirement description template.
"},{"location":"development/run-unit-tests/","title":"Unit tests with cargo-nextest","text":"We use test runner cargo-nextest because of the following reasons:
cargo test
.If you want to run all unit tests without traces, call in the root of the project:
cargo nextest run\n
Some unit tests can print trace logs. If you want to see them, you have to set the RUST_LOG
environment variable before running unit tests.
RUST_LOG=debug cargo nextest run\n
Cargo-nextest also allows running only a subset of unit tests. You have to set the \"filter string\" in the command:
cargo nextest run <filter string>\n
Where the filter string
is part of the unit test name. For example, we have a unit test with the name:
test podman::workload::container_create_success\n
If you want to call only this test, you can call:
cargo nextest run workload::container_create_success\n
If you want to call all tests in workload.rs
, you have to call:
cargo nextest run podman::workload\n
You can also call only tests in workload.rs
, which have a name starting with container
:
cargo nextest run podman::workload::container\n
"},{"location":"development/rust-coding-guidelines/","title":"Rust coding guidelines","text":"When engaging in collaborative software projects, it is crucial to ensure that the code is well-organized and comprehensible. This facilitates ease of maintenance and allows for seamless extension of the project. To accomplish this objective, it is essential to establish shared guidelines that the entire development team adheres to.
The goal is to get a harmonized code-base which appears to come from the same hands. This simplifies reading and understanding the intention of the code and helps maintaining the development speed.
The following chapters describe rules and concepts to fit clean code expectations.
"},{"location":"development/rust-coding-guidelines/#clean-code","title":"Clean code","text":"We like our code clean and thus use the \"Clean Code\" rules from \"uncle Bob\". A short summary can be found here.
As rust could get a bit messy, feel free to add some additional code comments to blocks that cannot be made readable using the clean code rules.
"},{"location":"development/rust-coding-guidelines/#naming-conventions","title":"Naming conventions","text":"We follow the standard Rust naming conventions.
Names of components, classes , functions, etc. in code should also follow the prescriptions in SW design. Before thinking of new names, please make sure that we have not named the beast already.
Names of unit tests within a file shall be hierarchical. Tests which belong together shall have the same prefix. For example the file workload.rs
contains following tests:
container_create_success
container_create_failed
container_start_success
container_start_failure_no_id
So if you want to call tests which work with container, you can write
cargo nextest run container\n
If you want to call tests of the \"container create\" function, you can call:
cargo nextest run container_create\n
More information about calling unit tests is in The Rust Programming Language.
"},{"location":"development/rust-coding-guidelines/#logging-conventions","title":"Logging conventions","text":"The following chapters describe rules for creating log messages.
"},{"location":"development/rust-coding-guidelines/#log-format-of-internal-objects","title":"Log format of internal objects","text":"When writing log messages that reference internal objects, the objects shall be surrounded in single quotes, e.g.:
log::info!(\"This is about object '{}'.\", object.name)\n
This helps differentiate static from dynamic data in the log message.
"},{"location":"development/rust-coding-guidelines/#log-format-of-multiline-log-messages","title":"Log format of multiline log messages","text":"Multi line log messages shall be created with the concat!
macro, e.g.:
log::debug!(concat!(\n \"First line of a log message that lists something:\\n\",\n \" flowers are: '{}'\\n\",\n \" weather is: {}\")\n color, current_weather);\n
This ensures that the log messages are formatted correctly and simplifies writing the message.
"},{"location":"development/rust-coding-guidelines/#choose-a-suitable-log-severity","title":"Choose a suitable log severity","text":"Severity Use Case Trace A log that is useful for diagnostic purposes and/or more granular than severity debug. Debug A log that is useful for developers meant for debugging purposes or hit very often. Info A log communicating important information like important states of an application suitable for any kind of user and that does not pollute the output. Warn A log communicating wrong preconditions or occurrences of something unexpected but do not lead to a panic of the application. Error A log communicating failures and consequences causing a potential panic of the application."},{"location":"development/rust-coding-guidelines/#unit-test-convenience-rules","title":"Unit test convenience rules","text":"The following chapter describes important rules about how to write unit tests.
"},{"location":"development/rust-coding-guidelines/#test-mockobject-generation","title":"Test mock/object generation","text":"When writing tests, one of the most tedious task is to setup the environment and create the necessary objects and/or mocks to be able to test the desired functionality. Following the DRY principle and trying to save some effort, we shall always place the code that generates a test or mock object in the same module/file where the mock of the object is defined.
For example, when you would like to generate and reuse a mock for the Directory
structure located in the agent/src/control_interface/directory.rs
file, you shall
pub fn generate_test_directory_mock() -> __mock_MockDirectory::__new::Context;\n
The <datatype_name>
in __mock_Mock<datatype_name>::__new::Context
must be replaced with the name of the type the mock is created for.
#[cfg(test)]
(or #[cfg(feature = \"test_utils\")]
in case of a library) before the function to restrict its compilation to test onlyAll object/mock generation functions shall start with generate_test_
.
Bad:
let numbers = vec![1, 2, 3, 4, 5, 6, 7, 8];\n\nlet mut filtered_numbers = Vec::new();\n// filter numbers smaller then 3\nfor number in numbers {\n if number < 3 {\n filtered_numbers.push(number);\n }\n}\n
Good:
Prefer standard library algorithms over own implementations to avoid error prone code.
let numbers = vec![1, 2, 3, 4, 5, 6, 7, 8];\nlet filtered_numbers: Vec<i32> = numbers.into_iter().filter(|x| x < &3).collect();\n
"},{"location":"development/rust-coding-guidelines/#prefer-error-propagation","title":"Prefer error propagation","text":"Bad:
A lot of conditionals for opening and reading a file.
use std::fs::File;\nuse std::io;\nuse std::io::Read;\n\nfn read_from_file(filepath: &str) -> Result<String, io::Error> {\n let file_handle = File::open(filepath);\n let mut file_handle = match file_handle {\n Ok(file) => file,\n Err(e) => return Err(e),\n };\n\n let mut buffer = String::new();\n\n match file_handle.read_to_string(&mut buffer) {\n Ok(_) => Ok(buffer),\n Err(e) => Err(e)\n }\n}\n
Good:
Prefer error propagation over exhaustive match and conditionals.
Error propagation shortens and cleans up the code path by replacing complex and exhaustive conditionals with the ?
operator without loosing the failure checks.
The refactored variant populates the error and success case the same way to the caller like in the bad example above, but is more readable:
fn read_from_file(filepath: &str) -> Result<String, io::Error> {\n let mut buffer = String::new();\n File::open(filepath)?.read_to_string(&mut buffer)?;\n Ok(buffer)\n}\n
In case of mismatching error types, provide a custom From-Trait implementation to convert between error types to keep the benefits of using the ?
operator. But keep in mind that error conversion shall be used wisely (e.g. for abstracting third party library error types or if there is a benefit to introduce a common and reusable error type). The code base shall not be spammed with From-Trait implementations to replace each single match or conditional.
Error propagation shall also be preferred when converting between Result<T,E>
and Option<T>
.
Bad:
fn string_to_percentage(string: &str) -> Option<f32> {\n // more error handling\n match string.parse::<f32>() {\n Ok(value) => Some(value * 100.),\n _ => None,\n }\n}\n
Good:
fn string_to_percentage(string: &str) -> Option<f32> {\n // more error handling\n let value = string.parse::<f32>().ok()?; // returns None on parsing error\n Some(value * 100.)\n}\n
"},{"location":"development/rust-coding-guidelines/#avoid-unwrap-and-expect","title":"Avoid unwrap and expect","text":"Unwrap
or expect
return the value in success case or call the panic!
macro if the operation has failed. Applications that are often terminated directly in case of errors are considered as unprofessional and not useful.
Bad:
let value = division(10, 0).unwrap(); // panics, because of a simple division!!!\n
Good:
Replace unwrap
or expect
with a conditional check, e.g. match expression:
let value = division(10, 0); // division 10 / 0 not allowed, returns Err\n\n// conditional check before accessing the value\nmatch value {\n Ok(value) => println!(\"{value}\"),\n Err(e) => eprintln!(\"{e}\")\n}\n
or with if-let condition when match is awkward:
// access value only on success\nif let Ok(value) = division(10, 0) {\n println!(\"{value}\")\n}\n
or if possible continue with some default value in case of an error:
let result = division(10, 0).unwrap_or(0.);\n
Exceptions:
In some cases terminating a program might be necessary. To make a good decision when to panic a program or not, the official rust book might help: To panic! or Not to panic!
When writing unit tests using unwrap
helps to keep tests short and to concentrate on the assert!
statements:
Bad:
let container: Option<HashMap<i32, String>> = operation_under_test();\nmatch container {\n Some(container) => {\n match container.get(&0) {\n Some(value_of_0) => assert_eq!(value_of_0, &\"hello world\".to_string()),\n _ => { panic!(\"Test xy failed, no entry.\") }\n }\n },\n _ => { panic!(\"Test xy failed, no container.\") }\n}\n
Good:
Prefer direct unwrap
calls over assert!
statements nested in complex conditional clauses. It is shorter and the assert!
statement is directly eye-catching.
let container: Option<HashMap<i32, String>> = operation_under_test();\nlet value_of_0 = container.unwrap().remove(&0).unwrap(); // the test is failing on error\n\nassert_eq!(value_of_0, \"hello world\".to_string());\n
"},{"location":"development/rust-coding-guidelines/#prefer-while-let-over-match-in-loops","title":"Prefer while-let over match in loops","text":"Use the shorter and cleaner while-let expression to eliminate exhaustive match sequences in loops:
Bad:
loop {\n match generate() {\n Some(value) => println!(\"{value}\"),\n _ => { break; },\n }\n}\n
Good:
// if success use the value else break\n// ...or while let Ok(value) in case of Result<T,E> instead of Option<T>\nwhile let Some(value) = generate() {\n println!(\"{value}\")\n}\n
"},{"location":"development/rust-coding-guidelines/#prefer-lazily-evaluated-functional-chaining","title":"Prefer lazily evaluated functional chaining","text":"Bad:
Eagerly evaluated functions are always evaluated regardless of the success or error case. If the alternative is not taken, potentially costly operations are performed unnecessarily.
let value = division(2., 10.);\nlet result = value.and(to_percentage(value)); // eagerly evaluated\n\nlet value = division(2., 10.);\nlet result = value.or(provide_complex_alternative()); // eagerly evaluated\n\nlet value = division(2., 10.);\nlet result = value.unwrap_or(generate_complex_default()); // eagerly evaluated\n
Good:
Lazily evaluated functions are only evaluated if the case actually occurs and are preferred if the alternatives involve costly operations.
let result = division(2., 10.).and_then(to_percentage); // lazily evaluated\n\nlet result = division(2., 10.).or_else(provide_complex_alternative); // lazily evaluated\n\nlet result = division(2., 10.).unwrap_or_else(generate_complex_default); // lazily evaluated\n
"},{"location":"development/rust-coding-guidelines/#avoid-exhaustive-nested-code","title":"Avoid exhaustive nested code","text":"Bad:
The code is hard to read and the interesting code path does not stand out.
fn list_books(&self) -> Option<Vec<String>> {\n if self.wifi {\n if self.login {\n if self.admin {\n return Some(get_list_of_books());\n } else {\n eprintln!(\"Expected login as admin.\");\n }\n } else {\n eprintln!(\"Expected login.\");\n }\n } else {\n eprintln!(\"Expected connection.\");\n }\n None\n}\n
Good:
Nest code only one or two levels deep. Use the early-exit pattern to reduce the nesting level and to separate error handling code from the code doing the actual logic.
fn list_books(&self) -> Option<Vec<String>> {\n if !self.wifi {\n eprintln!(\"Expected connection.\");\n return None;\n }\n\n if !self.login {\n eprintln!(\"Expected login.\");\n return None;\n }\n\n if !self.admin {\n eprintln!(\"Expected login as admin.\");\n return None;\n }\n\n // interesting part\n Some(get_list_of_books())\n}\n
As an alternative, when dealing with Option<T>
or Result<T,E>
use Rust's powerful combinators to keep the code readable.
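For the example above, such a combinator-based variant could look like the following sketch (the `Library` struct and its fields mirror the earlier snippet and are assumptions; note that the per-check error messages are dropped here for brevity):

```rust
// Context assumed from the earlier example.
struct Library {
    wifi: bool,
    login: bool,
    admin: bool,
}

fn get_list_of_books() -> Vec<String> {
    vec!["The Rust Programming Language".to_string()]
}

impl Library {
    // bool::then turns the combined precondition into an Option and
    // calls get_list_of_books only when all checks pass.
    fn list_books(&self) -> Option<Vec<String>> {
        (self.wifi && self.login && self.admin).then(get_list_of_books)
    }
}

fn main() {
    let library = Library { wifi: true, login: true, admin: true };
    println!("{:?}", library.list_books());
}
```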
Understanding and practicing important Rust idioms helps to write code in an idiomatic way, meaning resolving a task by following the conventions of a given language. Writing idiomatic Rust code ensures a clean and consistent code base. Thus, please follow the guidelines of Idiomatic Rust.
"},{"location":"development/rust-coding-guidelines/#avoid-common-anti-patterns","title":"Avoid common anti-patterns","text":"There are a lot of Rust anti-patterns that shall not be used in general. To get more details about anti-patterns, see here.
"},{"location":"development/rust-coding-guidelines/#dont-make-sync-code-async","title":"Don't make sync code async","text":"Async code is mainly used for I/O intensive, network or background tasks (Databases, Servers) to allow executing such tasks in a non-blocking way, so that waiting times can be used reasonably for executing other operations. However, operations that do not fit async use cases and are called synchronously shall not be made async, because there is no real benefit. Async code is more difficult to understand than synchronous code.
Bad:
No need to make those operations async, because they are exclusively called synchronously. It just adds more syntax, and the code raises more questions to the reader about its intent.
let result1 = operation1().await;\nlet result2 = operation2().await;\nlet result3 = operation3().await;\n
Good:
Keep it synchronous and thus simple.
let result1 = operation1();\nlet result2 = operation2();\nlet result3 = operation3();\n
"},{"location":"development/rust-coding-guidelines/#dont-mix-sync-and-async-code-without-proper-consideration","title":"Don\u2019t mix sync and async code without proper consideration","text":"Mixing sync and async code can lead to a number of problems, including performance issues, deadlocks, and race conditions. Avoid mixing async with sync code unless there is a good reason to do so.
"},{"location":"development/rust-coding-guidelines/#further-readings","title":"Further Readings","text":"The Eclipse Foundation offers self-service of GitHub resources. We use this self-service to customize GitHub settings, for example to change branch protection rules or other important settings of the Ankaios project. The current GitHub configuration is hosted as code inside a separate repository called .eclipsefdn.
The settings are in jsonnet format and can be modified by contributors.
For a detailed overview of the self-service, please have a look at the self-service handbook.
"},{"location":"development/self-service/#process-of-changing-the-settings","title":"Process of changing the settings","text":"If a configuration needs to be changed, the process is the following:
System tests are a critical phase of software testing, aimed at evaluating the entire software system as a whole to ensure that it meets its specified requirements and functions correctly in its intended environment. These tests are conducted after unit and integration testing and serve as a comprehensive validation of the software's readiness for deployment.
Here are key aspects of system tests:
End-to-End Evaluation: System tests assess the software's performance, functionality, and reliability in a real-world scenario, simulating the complete user journey. They cover all aspects of the system, from the user interface to the backend processes.
Functional and Non-Functional Testing: These tests not only verify that the software's features work as intended (functional testing) but also assess non-functional attributes like performance, scalability, security, and usability.
Scenario-Based Testing: Test scenarios are designed to replicate various user interactions, use cases, and business workflows. This includes testing different paths, inputs, and error conditions to ensure the system handles them correctly.
Interoperability Testing: In cases where the software interacts with external systems or components, system tests evaluate its compatibility and ability to communicate effectively with these external entities.
Data Integrity and Security: Ensuring the protection of sensitive data and the integrity of information is a critical part of system testing. This includes checking for vulnerabilities and ensuring compliance with security standards.
Performance Testing: Assessing the system's response times, resource utilization, and scalability under various load conditions to ensure it can handle expected levels of usage.
Regression Testing: System tests often include regression testing to ensure that new features or changes do not introduce new defects or disrupt existing functionality.
The Robot test framework, often referred to as just \"Robot Framework,\" is a popular open-source test automation framework used for automating test cases in various software applications. It is designed to be easy to use, highly readable, and adaptable for both beginners and experienced testers. It employs a keyword-driven approach, which means that test cases are written using a combination of keywords that represent actions, objects, and verifications. These keywords can be custom-defined using the Python programming language or come from libraries specific to the application under test. One of the standout features of Robot Framework is its human-readable syntax. Test cases are written in plain text composed of defined keywords, making it accessible to non-programmers and allowing stakeholders to understand and contribute to test case creation. Because of the ability to create custom keywords, a pool of domain-specific and generic keywords can be defined to form an Ankaios project specific language for writing test cases. This makes it possible to directly use test specifications written in natural language, or the same wording, to write automated test cases. This is the main reason why we use this test framework for system tests in Ankaios.
"},{"location":"development/system-tests/#system-tests-structure","title":"System tests structure","text":"ankaios # Ankaios root\n |--tests # Location for system tests and their resources\n | |--resources # Location for test resources\n | | |--configs # Location for test case specific start-up configuration files\n | | | |--default.yaml # A start-up configuration file\n | | | |--... <---------------- # Add more configuration files here!\n | | |\n | | |--ankaios_library.py # Ankaios keywords implementations\n | | |--ankaios.resource # Ankaios keywords\n | | |--variables.resource # Ankaios variables\n | | |--... <------------------- # Add more keywords and keywords implementation resources here!\n | |\n | |--stests # Location for system tests\n | | |--workloads # Location for tests with specific test subject focus e.g. \"workloads\" for tests related \"workloads\"\n | | | |--list_workloads.robot # A test suite testing \"list workloads\"\n | | | |--... <---------------- # Add more tests related to \"workloads\" here!\n | | |... <--------------------- # Add test subject focus here!\n
"},{"location":"development/system-tests/#system-test-creation","title":"System test creation","text":""},{"location":"development/system-tests/#a-generic-ankaios-system-test-structure","title":"A generic Ankaios system test structure","text":"The most common approach to create a Robot test is using the space-separated format, where pieces of the data, such as keywords and their arguments, are separated from each other by two or more spaces. A basic Ankaios system test consists of the following sections:
# ./tests/stests/workloads/my_workload_stest.robot\n\n*** Settings ***\nDocumentation Add test suite documentation here. # Test suite documentation\nResource ../../resources/ankaios.resource # Ankaios specific keywords that form the Ankaios domain language\nResource ../../resources/variables.resource # Ankaios variables e.g. CONFIGS_DIR\n\n*** Test Cases ***\n[Setup] Setup Ankaios\n# ADD YOUR SYSTEM TEST HERE!\n[Teardown] Clean up Ankaios\n
For more best practices on writing tests with the Robot framework, see here.
"},{"location":"development/system-tests/#behavior-driven-system-test","title":"Behavior-driven system test","text":"Behavior-driven tests (BDT) use natural language specifications to describe expected system behavior, fostering collaboration between teams and facilitating both manual and automated testing. It's particularly valuable for user-centric and acceptance testing, ensuring that software aligns with user expectations. The Robot test framework supports BDT, and this approach shall be preferred for writing system tests in the Ankaios project.
Generic structure of BDT:
*** Test Cases ***\n[Setup] Setup Ankaios\nGiven <preconditions>\nWhen <actions>\nThen <asserts>\n[Teardown] Clean up Ankaios\n
Example: System test testing listing of workloads.
*** Settings ***\nDocumentation Tests to verify that ank cli lists workloads correctly.\nResource ../../resources/ankaios.resource\nResource ../../resources/variables.resource\n\n*** Test Cases ***\nTest Ankaios CLI get workloads\n [Setup] Setup Ankaios\n # Preconditions\n Given Ankaios server is started with \"ank-server --startup-config ${CONFIGS_DIR}/default.yaml\"\n And Ankaios agent is started with \"ank-agent --name agent_B\"\n And all workloads of agent \"agent_B\" have an initial execution state\n And Ankaios agent is started with \"ank-agent --name agent_A\"\n And all workloads of agent \"agent_A\" have an initial execution state\n # Actions\n When user triggers \"ank -k get workloads\"\n # Asserts\n Then the workload \"nginx\" shall have the execution state \"Running\" on agent \"agent_A\"\n And the workload \"hello1\" shall have the execution state \"Removed\" from agent \"agent_B\"\n And the workload \"hello2\" shall have the execution state \"Succeeded\" on agent \"agent_B\"\n And the workload \"hello3\" shall have the execution state \"Succeeded\" on agent \"agent_B\"\n [Teardown] Clean up Ankaios\n
Note
For Ankaios manifests that are used for system tests, only images from ghcr.io should be used. A lot of other registries (docker.io, quay.io) apply rate limits which might cause failures when executing the system tests.
"},{"location":"development/system-tests/#run-long-runtime-system-tests-upon-merge-into-main","title":"Run long-runtime system tests upon merge into main","text":"To keep the pull request status check runtime short, system tests with a longer runtime (> 30-40 seconds) shall be excluded from the pull request CI/CD verification by assigning the tag \"non_execution_during_pull_request_verification\" directly to the test case. When the pull request is merged into the main branch, the system test is executed. A contributor shall check the test results of those system tests afterwards.
Example system test that runs only on merge into main:
...\n\n*** Test Cases ***\n...\n\nTest Ankaios Podman stops retries after reaching the retry attempt limit\n [Tags] non_execution_during_pull_request_verification\n [Setup] Run Keywords Setup Ankaios\n\n...\n
"},{"location":"development/system-tests/#system-test-execution","title":"System test execution","text":"Warning
The system tests will delete all Podman containers, pods, and volumes. We recommend executing the system tests only in the dev container.
A shell script is provided for the easy execution of the system tests. The script does the following:
ank
, ank-server
and ank-agent
) are available at the specified path.{Ankaios root folder}/target/robot_tests_result
.Generic syntax:
/workspaces/ankaios$ [ANK_BIN_DIR=path_to_ankaios_executables] tools/run_robot_tests <options> <directory or robot file>\n
If ANK_BIN_DIR is not provided, the script looks in the path {Ankaios root folder}/target/x86_64-unknown-linux-musl/debug
for the Ankaios executables. The supported options are the same as of robot
cli, so for a more detailed description see here.
Note: In order to be able to start podman
runtime in the dev container properly, the dev container needs to be run in privileged
mode.
/workspaces/ankaios$ tools/run_robot_tests.sh tests\n
Example output:
Use default executable directory: /workspaces/ankaios/tools/../target/x86_64-unknown-linux-musl/debug\nFound ank 0.1.0\nFound ank-server 0.1.0\nFound ank-agent 0.1.0\n==============================================================================\nTests\n==============================================================================\nTests.Stests\n==============================================================================\nTests.Stests.Workloads\n==============================================================================\nTests.Stests.Workloads.List Workloads :: List workloads test cases.\n==============================================================================\nTest Ankaios CLI get workloads | PASS |\n------------------------------------------------------------------------------\nTests.Stests.Workloads.List Workloads :: List workloads test cases. | PASS |\n1 test, 1 passed, 0 failed\n==============================================================================\nTests.Stests.Workloads.Update Workload :: Update workload test cases.\n==============================================================================\nTest Ankaios CLI update workload | PASS |\n------------------------------------------------------------------------------\nTests.Stests.Workloads.Update Workload :: Update workload test cases. 
| PASS |\n1 test, 1 passed, 0 failed\n==============================================================================\nTests.Stests.Workloads | PASS |\n2 tests, 2 passed, 0 failed\n==============================================================================\nTests.Stests | PASS |\n2 tests, 2 passed, 0 failed\n==============================================================================\nTests | PASS |\n2 tests, 2 passed, 0 failed\n==============================================================================\nOutput: /workspaces/ankaios/target/robot_tests_result/output.xml\nLog: /workspaces/ankaios/target/robot_tests_result/log.html\nReport: /workspaces/ankaios/target/robot_tests_result/report.html\n
"},{"location":"development/system-tests/#example-run-a-single-test-file","title":"Example: Run a single test file","text":"/workspaces/ankaios$ tools/run_robot_tests.sh tests/stests/workloads/list_workloads.robot\n
Example output:
Use default executable directory: /workspaces/ankaios/tools/../target/x86_64-unknown-linux-musl/debug\nFound ank 0.1.0\nFound ank-server 0.1.0\nFound ank-agent 0.1.0\n==============================================================================\nList Workloads :: List workloads test cases.\n==============================================================================\nTest Ankaios CLI get workloads | PASS |\n------------------------------------------------------------------------------\nList Workloads :: List workloads test cases. | PASS |\n1 test, 1 passed, 0 failed\n==============================================================================\nOutput: /workspaces/ankaios/target/robot_tests_result/output.xml\nLog: /workspaces/ankaios/target/robot_tests_result/log.html\nReport: /workspaces/ankaios/target/robot_tests_result/report.html\n
"},{"location":"development/system-tests/#integration-in-github-workflows","title":"Integration in GitHub workflows","text":"The execution of the system tests is integrated in the GitHub workflow build step and will be triggered on each commit on a pull request.
"},{"location":"development/test-coverage/","title":"Test coverage","text":"To generate the test coverage report, run the following commands in the ankaios
workspace, which is /home/vscode/workspaces/ankaios/
:
To print out directly into the console:
cov test\n
Or to produce a report in html:
cov test --html\n
The script outputs where to find the report html:
...\nFinished report saved to /workspaces/ankaios/target/llvm-cov/html\n
Note: On first usage you might be asked for confirmation to install the llvm-tools-preview
tool.
While writing tests, you may want to execute only the tests in a certain file and check the reached coverage. To do so you can execute:
To print out directly into the console:
cov test ankaios_server\n
Or to produce a report in html:
cov test ankaios_server --html\n
Once the run is complete, you can check the report to see which lines are not covered yet.
"},{"location":"development/unit-verification/","title":"Unit verification","text":"This page defines which tools and processes are used in this project for the purposes of software unit verification. The unit verification process is performed during the implementation phase and is as automated as possible; one exception is the code review, which cannot be done automatically. Automated unit test runs are executed by the CI build system as well as during the regular releasing process.
"},{"location":"development/unit-verification/#verification-tools-and-procedures","title":"Verification tools and procedures","text":"Ankaios development follows the guidelines specified in the Rust coding guidelines.
"},{"location":"development/unit-verification/#code-review","title":"Code review","text":"Code reviews are part of the implementation process and are performed before code is merged to the main branch. Contributors create pull requests and request a review so that the process can be started. The review is performed by at least one committer who has good knowledge of the area under review. When all applicable review criteria and checklists are passed and the reviewer(s) have accepted the change, the code can be merged to the main branch.
"},{"location":"development/unit-verification/#verification-by-unit-test","title":"Verification by unit test","text":""},{"location":"development/unit-verification/#test-focus-and-goal","title":"Test focus and goal","text":"The objective of the unit test is to confirm the correct internal behavior of a software unit according to the design aspects documented in the SW design. A unit test will test the unit in the target environment by triggering unit methods/functions and verifying the behavior. Stubbed interfaces/mocking techniques can be used to meet the code coverage requirements. This means that unit tests shall be written according to the detailed requirements. Requirement source is SW design.
"},{"location":"development/unit-verification/#unit-test-case-naming-convention","title":"Unit test case naming convention","text":"By introducing a naming convention for unit test cases, a harmonized test code base can be achieved. This simplifies reading and understanding the intention of the unit test case. Please see the naming convention defined in the Rust coding guidelines.
"},{"location":"development/unit-verification/#unit-test-organization","title":"Unit test organization","text":"The unit tests shall be written in the same file as the source code like suggested in the Rust Language Book and shall be prefixed with utest_
.
At the end of the file e.g. my_module/src/my_component.rs
:
...\nfn my_algorithm(input: i32) -> Vec<u8> {\n ...\n}\n\nasync fn my_async_function(input: i32) -> Vec<u8> {\n ...\n}\n...\n#[cfg(test)]\nmod tests {\n ...\n #[test]\n fn utest_my_algorithm_returns_empty_array_when_input_is_0_or_negative() {\n ...\n }\n\n #[tokio::test]\n async fn utest_my_async_function_returns_empty_array_when_input_is_0_or_negative() {\n ...\n }\n}\n
"},{"location":"development/unit-verification/#test-execution-and-reports","title":"Test Execution and Reports","text":"Unit test cases are executed manually by the developer during the implementation phase and later automatically in CI builds. Unit test and coverage reports are generated and stored automatically by the CI build system. If a unit test case fails before the code is merged to the main branch (merge verification), the merge is not allowed until the issue is fixed. If a unit test case fails after the code is merged to the main branch, it is reported via email and fixed via an internal Jira ticket reported by the developer.
Regression testing is done by the CI build system.
"},{"location":"development/unit-verification/#goals-and-metrics","title":"Goals and Metrics","text":"The following table shows how test coverage is currently shown in the coverage report:
Goal Metric Red Yellow Green Code coverage <80% >80% 100% Currently there is no proper way of explicitly excluding parts of the code from the test coverage report in order to get to an easily observable value of 100%. The explicitly excluded code would have a corresponding comment stating the reason for excluding it. As this is not possible, we would initially target at least 80% line coverage in each file.
"},{"location":"reference/_ankaios.proto/","title":"Protocol Documentation","text":""},{"location":"reference/_ankaios.proto/#table-of-contents","title":"Table of Contents","text":"control_api.proto
ank_base.proto
WorkloadStatesMap.AgentStateMapEntry
AddCondition
Scalar Value Types
Top
"},{"location":"reference/_ankaios.proto/#control_apiproto","title":"control_api.proto","text":"The Ankaios Control Interface is used in the communcation between a workload and Ankaios
The protocol consists of the following top-level message types:
ToAnkaios: workload -> ankaios
FromAnkaios: ankaios -> workload
This message informs the user of the Control Interface that the connection was closed by Ankaios. No more messages will be processed by Ankaios after this message is sent.
Field Type Label Description reason string A string containing the reason for closing the connection. "},{"location":"reference/_ankaios.proto/#fromankaios","title":"FromAnkaios","text":"Messages from the Ankaios server to e.g. the Ankaios agent.
Field Type Label Description response ank_base.Response A message containing a response to a previous request. connectionClosed ConnectionClosed A message sent by Ankaios to inform a workload that the connection to Anakios was closed. "},{"location":"reference/_ankaios.proto/#hello","title":"Hello","text":"This message is the first one that needs to be sent when a new connection to the Ankaios cluster is established. Without this message being sent all further request are rejected.
Field Type Label Description protocolVersion string The protocol version used by the calling component. "},{"location":"reference/_ankaios.proto/#toankaios","title":"ToAnkaios","text":"Messages to the Ankaios server.
Field Type Label Description hello Hello The fist message sent when a connection is established. The message is needed to make sure the connected components are compatible. request ank_base.Request A request to AnkaiosTop
"},{"location":"reference/_ankaios.proto/#ank_baseproto","title":"ank_base.proto","text":""},{"location":"reference/_ankaios.proto/#accessrightsrule","title":"AccessRightsRule","text":"A message containing an allow or deny rule.
Field Type Label Description stateRule StateRule Rule for getting or setting the state "},{"location":"reference/_ankaios.proto/#agentattributes","title":"AgentAttributes","text":"A message that contains attributes of the agent.
Field Type Label Description cpu_usage CpuUsage The cpu usage of the agent. free_memory FreeMemory The amount of free memory of the agent. "},{"location":"reference/_ankaios.proto/#agentmap","title":"AgentMap","text":"A nested map that provides the names of the connected agents and their optional attributes. The first level allows searches by agent name.
Field Type Label Description agents AgentMap.AgentsEntry repeated "},{"location":"reference/_ankaios.proto/#agentmapagentsentry","title":"AgentMap.AgentsEntry","text":"Field Type Label Description key string value AgentAttributes"},{"location":"reference/_ankaios.proto/#completestate","title":"CompleteState","text":"A message containing the complete state of the Ankaios system. This is a response to the CompleteStateRequest message.
Field Type Label Description desiredState State The state the user wants to reach. workloadStates WorkloadStatesMap The current execution states of the workloads. agents AgentMap The agents currently connected to the Ankaios cluster. "},{"location":"reference/_ankaios.proto/#completestaterequest","title":"CompleteStateRequest","text":"A message containing a request for the complete/partial state of the Ankaios system. This is usually answered with a CompleteState message.
Field Type Label Description fieldMask string repeated A list of symbolic field paths within the State message structure e.g. 'desiredState.workloads.nginx'. "},{"location":"reference/_ankaios.proto/#configarray","title":"ConfigArray","text":"Field Type Label Description values ConfigItem repeated"},{"location":"reference/_ankaios.proto/#configitem","title":"ConfigItem","text":"An enum type describing possible configuration objects.
Field Type Label Description String string array ConfigArray object ConfigObject "},{"location":"reference/_ankaios.proto/#configmap","title":"ConfigMap","text":"This is a workaround for proto not supporing optional maps
Field Type Label Description configs ConfigMap.ConfigsEntry repeated "},{"location":"reference/_ankaios.proto/#configmapconfigsentry","title":"ConfigMap.ConfigsEntry","text":"Field Type Label Description key string value ConfigItem"},{"location":"reference/_ankaios.proto/#configmappings","title":"ConfigMappings","text":"This is a workaround for proto not supporing optional maps
Field Type Label Description configs ConfigMappings.ConfigsEntry repeated "},{"location":"reference/_ankaios.proto/#configmappingsconfigsentry","title":"ConfigMappings.ConfigsEntry","text":"Field Type Label Description key string value string"},{"location":"reference/_ankaios.proto/#configobject","title":"ConfigObject","text":"Field Type Label Description fields ConfigObject.FieldsEntry repeated"},{"location":"reference/_ankaios.proto/#configobjectfieldsentry","title":"ConfigObject.FieldsEntry","text":"Field Type Label Description key string value ConfigItem"},{"location":"reference/_ankaios.proto/#controlinterfaceaccess","title":"ControlInterfaceAccess","text":"A message containing the parts of the control interface the workload as authorized to access. By default, all access is denied. Only if a matching allow rule is found, and no matching deny rules is found, the access is allowed.
Field Type Label Description allowRules AccessRightsRule repeated Rules allow the access denyRules AccessRightsRule repeated Rules denying the access "},{"location":"reference/_ankaios.proto/#cpuusage","title":"CpuUsage","text":"A message containing the CPU usage information of the agent.
Field Type Label Description cpu_usage uint32 expressed in percent, the formula for calculating: cpu_usage = (new_work_time - old_work_time) / (new_total_time - old_total_time) * 100 "},{"location":"reference/_ankaios.proto/#dependencies","title":"Dependencies","text":"This is a workaround for proto not supporing optional maps
Field Type Label Description dependencies Dependencies.DependenciesEntry repeated "},{"location":"reference/_ankaios.proto/#dependenciesdependenciesentry","title":"Dependencies.DependenciesEntry","text":"Field Type Label Description key string value AddCondition"},{"location":"reference/_ankaios.proto/#error","title":"Error","text":"Field Type Label Description message string"},{"location":"reference/_ankaios.proto/#executionstate","title":"ExecutionState","text":"A message containing information about the detailed state of a workload in the Ankaios system.
Field Type Label Description additionalInfo string The additional info contains more detailed information from the runtime regarding the execution state. agentDisconnected AgentDisconnected The exact state of the workload cannot be determined, e.g., because of a broken connection to the responsible agent. pending Pending The workload is going to be started eventually. running Running The workload is operational. stopping Stopping The workload is scheduled for stopping. succeeded Succeeded The workload has successfully finished its operation. failed Failed The workload has failed or is in a degraded state. notScheduled NotScheduled The workload is not scheduled to run at any agent. This is signalized with an empty agent in the workload specification. removed Removed The workload was removed from Ankaios. This state is used only internally in Ankaios. The outside world removed states are just not there. "},{"location":"reference/_ankaios.proto/#executionsstatesforid","title":"ExecutionsStatesForId","text":"A map providing the execution state of a specific workload for a given id. This level is needed as a workload could be running more than once on one agent in different versions.
Field Type Label Description idStateMap ExecutionsStatesForId.IdStateMapEntry repeated "},{"location":"reference/_ankaios.proto/#executionsstatesforididstatemapentry","title":"ExecutionsStatesForId.IdStateMapEntry","text":"Field Type Label Description key string value ExecutionState"},{"location":"reference/_ankaios.proto/#executionsstatesofworkload","title":"ExecutionsStatesOfWorkload","text":"A map providing the execution state of a workload for a given name.
Field Type Label Description wlNameStateMap ExecutionsStatesOfWorkload.WlNameStateMapEntry repeated "},{"location":"reference/_ankaios.proto/#executionsstatesofworkloadwlnamestatemapentry","title":"ExecutionsStatesOfWorkload.WlNameStateMapEntry","text":"Field Type Label Description key string value ExecutionsStatesForId"},{"location":"reference/_ankaios.proto/#freememory","title":"FreeMemory","text":"A message containing the amount of free memory of the agent.
Field Type Label Description free_memory uint64 expressed in bytes "},{"location":"reference/_ankaios.proto/#request","title":"Request","text":"A message containing a request to the Ankaios server to update the state or to request the complete state of the Ankaios system.
Field Type Label Description requestId string updateStateRequest UpdateStateRequest A message to the Ankaios server to update the state of one or more agents. completeStateRequest CompleteStateRequest A message to the Ankaios server to request the complete state by the given request id and the optional field mask. "},{"location":"reference/_ankaios.proto/#response","title":"Response","text":"A message containing a response from the Ankaios server to a particular request. The response content depends on the request content previously sent to the Ankaios server.
Field Type Label Description requestId string error Error completeState CompleteState UpdateStateSuccess UpdateStateSuccess "},{"location":"reference/_ankaios.proto/#state","title":"State","text":"A message containing the state information.
Field Type Label Description apiVersion string The current version of the API. workloads WorkloadMap A mapping from workload names to workload configurations. configs ConfigMap Configuration values which can be referenced in workload configurations. "},{"location":"reference/_ankaios.proto/#staterule","title":"StateRule","text":"Message containing a rule for getting or setting the state
Field Type Label Description operation ReadWriteEnum Defines which actions are allowed. filterMasks string repeated Paths defining what can be accessed. Segments of a path can be the wildcard \"*\". "},{"location":"reference/_ankaios.proto/#tag","title":"Tag","text":"A message to store a tag.
Field Type Label Description key string The key of the tag. value string The value of the tag. "},{"location":"reference/_ankaios.proto/#tags","title":"Tags","text":"This is a workaround for proto not supporting optional repeated values.
Field Type Label Description tags Tag repeated "},{"location":"reference/_ankaios.proto/#updatestaterequest","title":"UpdateStateRequest","text":"A message containing a request to update the state of the Ankaios system. The new state is provided as a state object. To specify which part(s) of the new state object should be updated, a list of update mask (same as field mask) paths needs to be provided.
Field Type Label Description newState CompleteState The new state of the Ankaios system. updateMask string repeated A list of symbolic field paths within the state message structure, e.g. 'desiredState.workloads.nginx', to specify what is to be updated. "},{"location":"reference/_ankaios.proto/#updatestatesuccess","title":"UpdateStateSuccess","text":"A message from the server containing the ids of the workloads that have been started and stopped in response to a previously sent UpdateStateRequest.
Field Type Label Description addedWorkloads string repeated Workload instance names of workloads which will be started deletedWorkloads string repeated Workload instance names of workloads which will be stopped "},{"location":"reference/_ankaios.proto/#workload","title":"Workload","text":"A message containing the configuration of a workload.
Field Type Label Description agent string optional The name of the owning Agent. restartPolicy RestartPolicy optional An enum value that defines the condition under which a workload is restarted. dependencies Dependencies A map of workload names and expected states to enable a synchronized start of the workload. tags Tags A list of tag names. runtime string optional The name of the runtime, e.g. podman. runtimeConfig string optional The configuration information specific to the runtime. controlInterfaceAccess ControlInterfaceAccess configs ConfigMappings A mapping containing the configurations assigned to the workload. "},{"location":"reference/_ankaios.proto/#workloadinstancename","title":"WorkloadInstanceName","text":"Field Type Label Description workloadName string The name of the workload. agentName string The name of the owning Agent. id string A unique identifier of the workload."},{"location":"reference/_ankaios.proto/#workloadmap","title":"WorkloadMap","text":"This is a workaround for proto not supporting optional maps. Workload names shall not be shorter than 1 symbol or longer than 63 symbols and can contain only regular characters, digits, and the \"-\" and \"_\" symbols.
Field Type Label Description workloads WorkloadMap.WorkloadsEntry repeated "},{"location":"reference/_ankaios.proto/#workloadmapworkloadsentry","title":"WorkloadMap.WorkloadsEntry","text":"Field Type Label Description key string value Workload"},{"location":"reference/_ankaios.proto/#workloadstate","title":"WorkloadState","text":"A message containing the information about the workload state.
Field Type Label Description instanceName WorkloadInstanceName executionState ExecutionState The workload execution state. "},{"location":"reference/_ankaios.proto/#workloadstatesmap","title":"WorkloadStatesMap","text":"A nested map that provides the execution state of a workload in a structured way. The first level allows searches by agent.
Field Type Label Description agentStateMap WorkloadStatesMap.AgentStateMapEntry repeated "},{"location":"reference/_ankaios.proto/#workloadstatesmapagentstatemapentry","title":"WorkloadStatesMap.AgentStateMapEntry","text":"Field Type Label Description key string value ExecutionsStatesOfWorkload"},{"location":"reference/_ankaios.proto/#addcondition","title":"AddCondition","text":"An enum type describing the expected workload state. Used for dependency management.
Name Number Description ADD_COND_RUNNING 0 The workload is operational. ADD_COND_SUCCEEDED 1 The workload has successfully exited. ADD_COND_FAILED 2 The workload has exited with an error or could not be started. "},{"location":"reference/_ankaios.proto/#agentdisconnected","title":"AgentDisconnected","text":"The exact state of the workload cannot be determined, e.g., because of a broken connection to the responsible agent.
Name Number Description AGENT_DISCONNECTED 0 "},{"location":"reference/_ankaios.proto/#failed","title":"Failed","text":"The workload has failed or is in a degraded state.
Name Number Description FAILED_EXEC_FAILED 0 The workload has failed during operation FAILED_UNKNOWN 1 The workload is in a runtime state not supported by Ankaios. The workload was possibly altered outside of Ankaios. FAILED_LOST 2 The workload cannot be found anymore. The workload was possibly altered outside of Ankaios or was auto-removed by the runtime. "},{"location":"reference/_ankaios.proto/#notscheduled","title":"NotScheduled","text":"The workload is not scheduled to run at any agent. This is signaled by an empty agent name in the workload specification.
Name Number Description NOT_SCHEDULED 0 "},{"location":"reference/_ankaios.proto/#pending","title":"Pending","text":"The workload is going to be started eventually.
Name Number Description PENDING_INITIAL 0 The workload specification has not yet been scheduled PENDING_WAITING_TO_START 1 The start of the workload will be triggered once all its dependencies are met. PENDING_STARTING 2 Starting the workload was scheduled at the corresponding runtime. PENDING_STARTING_FAILED 8 The starting of the workload by the runtime failed. "},{"location":"reference/_ankaios.proto/#readwriteenum","title":"ReadWriteEnum","text":"An enum type describing which action is allowed.
Name Number Description RW_NOTHING 0 Allow nothing RW_READ 1 Allow read RW_WRITE 2 Allow write RW_READ_WRITE 5 Allow read and write "},{"location":"reference/_ankaios.proto/#removed","title":"Removed","text":"The workload was removed from Ankaios. This state is used only internally in Ankaios. To the outside world, removed workloads simply no longer appear.
Name Number Description REMOVED 0 "},{"location":"reference/_ankaios.proto/#restartpolicy","title":"RestartPolicy","text":"An enum type describing the restart behavior of a workload.
Name Number Description NEVER 0 The workload is never restarted. Once the workload exits, it remains in the exited state. ON_FAILURE 1 If the workload exits with a non-zero exit code, it will be restarted. ALWAYS 2 The workload is restarted upon termination, regardless of the exit code. "},{"location":"reference/_ankaios.proto/#running","title":"Running","text":"The workload is operational.
Name Number Description RUNNING_OK 0 The workload is operational. "},{"location":"reference/_ankaios.proto/#stopping","title":"Stopping","text":"The workload is scheduled for stopping.
Name Number Description STOPPING 0 The workload is being stopped. STOPPING_WAITING_TO_STOP 1 The deletion of the workload will be triggered once no 'pending' or 'running' workload depending on it exists. STOPPING_REQUESTED_AT_RUNTIME 2 This is an Ankaios-generated state returned when the stopping was explicitly triggered by the user and the request was sent to the runtime. STOPPING_DELETE_FAILED 8 The deletion of the workload by the runtime failed.
Name Number Description SUCCEEDED_OK 0 The workload has successfully finished operation."},{"location":"reference/_ankaios.proto/#scalar-value-types","title":"Scalar Value Types","text":".proto Type Notes C++ Java Python Go C# PHP Ruby double double double float float64 double float Float float float float float float32 float float Float int32 Uses variable-length encoding. Inefficient for encoding negative numbers \u2013 if your field is likely to have negative values, use sint32 instead. int32 int int int32 int integer Bignum or Fixnum (as required) int64 Uses variable-length encoding. Inefficient for encoding negative numbers \u2013 if your field is likely to have negative values, use sint64 instead. int64 long int/long int64 long integer/string Bignum uint32 Uses variable-length encoding. uint32 int int/long uint32 uint integer Bignum or Fixnum (as required) uint64 Uses variable-length encoding. uint64 long int/long uint64 ulong integer/string Bignum or Fixnum (as required) sint32 Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. int32 int int int32 int integer Bignum or Fixnum (as required) sint64 Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. int64 long int/long int64 long integer/string Bignum fixed32 Always four bytes. More efficient than uint32 if values are often greater than 2^28. uint32 int int uint32 uint integer Bignum or Fixnum (as required) fixed64 Always eight bytes. More efficient than uint64 if values are often greater than 2^56. uint64 long int/long uint64 ulong integer/string Bignum sfixed32 Always four bytes. int32 int int int32 int integer Bignum or Fixnum (as required) sfixed64 Always eight bytes. int64 long int/long int64 long integer/string Bignum bool bool boolean boolean bool bool boolean TrueClass/FalseClass string A string must always contain UTF-8 encoded or 7-bit ASCII text. 
string String str/unicode string string string String (UTF-8) bytes May contain any arbitrary sequence of bytes. string ByteString str []byte ByteString string String (ASCII-8BIT)"},{"location":"reference/complete-state/","title":"Working with CompleteState","text":""},{"location":"reference/complete-state/#completestate","title":"CompleteState","text":"The complete state data structure CompleteState is used for building a request to Ankaios server to change or receive the state of the Ankaios system. It contains the desiredState
which describes the state of the Ankaios system the user wants to have, the workloadStates
which gives the information about the execution state of all the workloads and the agents
field containing the names of the Ankaios agents that are currently connected to the Ankaios server. By using CompleteState in conjunction with the object field mask, specific parts of the Ankaios state can be retrieved or updated.
Example: ank -k get state
returns the complete state of Ankaios system:
Note
The instructions assume the default installation without mutual TLS (mTLS) for communication. With -k
or --insecure
the ank
CLI will connect without mTLS. Alternatively, set the environment variable ANK_INSECURE=true
to avoid passing the argument to each ank
CLI command. For an Ankaios setup with mTLS, see here.
desiredState:\n apiVersion: v0.1\n workloads:\n hello-pod:\n agent: agent_B\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n restartPolicy: NEVER\n runtime: podman-kube\n runtimeConfig: |\n manifest: |\n apiVersion: v1\n kind: Pod\n metadata:\n name: hello-pod\n spec:\n restartPolicy: Never\n containers:\n - name: looper\n image: alpine:latest\n command:\n - sleep\n - 50000\n - name: greater\n image: alpine:latest\n command:\n - echo\n - \"Hello from a container in a pod\"\n configs: {}\n hello1:\n agent: agent_B\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n runtime: podman\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--rm\"]\n commandArgs: [ \"echo\", \"Hello Ankaios\"]\n configs: {}\n hello2:\n agent: agent_B\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n restartPolicy: ALWAYS\n runtime: podman\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"echo 'Always restarted.'; sleep 2\"]\n configs: {}\n nginx:\n agent: agent_A\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n restartPolicy: ON_FAILURE\n runtime: podman\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"8081:80\"]\n configs: {}\n configs: {}\nworkloadStates: []\nagents: {}\n
It is not necessary to provide the whole structure of the CompleteState data structure when using it in conjunction with the object field mask. It is sufficient to provide the relevant branch of the CompleteState object. As an example, to change the restart behavior of the nginx workload, only the relevant branch of the CompleteState needs to be provided:
desiredState:\n workloads:\n nginx:\n restartPolicy: ALWAYS\n
Note
In case of workload names, the naming convention states that their names shall: - contain only regular upper and lowercase characters (a-z and A-Z), numbers and the symbols \"-\" and \"_\" - have a minimal length of 1 character - have a maximal length of 63 characters Also, agent names shall contain only regular upper and lowercase characters (a-z and A-Z), numbers and the symbols \"-\" and \"_\".
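The naming rules above amount to the regular expression `^[a-zA-Z0-9_-]{1,63}$`. A minimal sketch of such a check in Rust (`is_valid_name` is a hypothetical helper for illustration, not part of the Ankaios API):

```rust
// Hypothetical helper illustrating the documented naming rules;
// not part of the Ankaios API.
fn is_valid_name(name: &str) -> bool {
    (1..=63).contains(&name.len())
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

fn main() {
    assert!(is_valid_name("hello-pod"));
    assert!(is_valid_name("agent_A"));
    assert!(!is_valid_name(""));              // shorter than 1 character
    assert!(!is_valid_name(&"x".repeat(64))); // longer than 63 characters
    assert!(!is_valid_name("hello.pod"));     // '.' is not an allowed symbol
}
```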
"},{"location":"reference/complete-state/#object-field-mask","title":"Object field mask","text":"With the object field mask, only specific parts of the Ankaios state can be retrieved or updated. The object field mask can be constructed using the field names of the CompleteState data structure:
<top level field name>.<second level field name>.<third level field name>.<...>\n
Example: ank -k get state desiredState.workloads.nginx
returns only the information about the nginx workload:
desiredState:\n apiVersion: v0.1\n workloads:\n nginx:\n agent: agent_A\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n restartPolicy: ALWAYS\n runtime: podman\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"8081:80\"]\n configs: {}\n
Example ank -k get state desiredState.workloads.nginx.runtimeConfig
returns only the runtime configuration of the nginx workload:
desiredState:\n apiVersion: v0.1\n workloads:\n nginx:\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"8081:80\"]\n
Example ank -k set state desiredState.workloads.nginx.restartPolicy new-state.yaml
changes the restart behavior of the nginx workload to NEVER
:
desiredState:\n workloads:\n nginx:\n restartPolicy: NEVER\n
The control interface allows workload developers to easily integrate communication between the Ankaios system and their applications.
Note
The control interface is currently only available for workloads using the podman
runtime and not for the podman-kube
runtime.
flowchart TD\n a1(Ankaios Agent 1)\n w1(Workload 1)\n w2(Workload 2)\n a2(Ankaios Agent 2)\n w3(Workload 3)\n w4(Workload 4)\n s(Ankaios server)\n\n\n s <--> a1 <-->|Control Interface| w1 & w2\n s <--> a2 <-->|Control Interface| w3 & w4
The control interface enables a workload to communicate with the Ankaios system by interacting with the Ankaios server through writing/reading communication data to/from the provided FIFO files in the FIFO mount point.
"},{"location":"reference/control-interface/#authorization","title":"Authorization","text":"Ankaios checks for each request from a workload to the control interface, if the workload is authorized. The authorization is configured for each workload using controlInterfaceAccess
. A workload without controlInterfaceAccess
configuration is denied all actions on the control interface. The authorization configuration consists of allow and deny rules. Each rule defines the operation (e.g. read) the workload is allowed to execute and with which filter masks it is allowed to execute this operation.
A filter mask describes a path in the CompleteState object. The segments of the path are divided by the '.' symbol. Segments can also be the wildcard character '*', indicating this segment shall match every possible field. E.g. desiredState.workloads.*.tags
allows access to the tags of all workloads.
In an allow rule the path gives access to the exact path and also all subfields. E.g. an allow rule with desiredState.workloads.example
would also give access to desiredState.workloads.example.tags
. In a deny rule the path prohibits access to the exact path and also all parent fields. E.g. a deny rule with desiredState.workloads.example
would also deny access to desiredState.workloads
, but has no effect on desiredState.workloads.other_example
.
Every request not allowed by a rule in controlInterfaceAccess
is prohibited. Every request allowed by a rule, but denied by another rule is also prohibited. E.g. with an allow rule for path desiredState.workloads.*.agent
and a deny rule for desiredState.workloads.controller
, a workload would be allowed to change the agent of each workload, except for the controller
workload.
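The allow/deny combination described above could be written in a workload's configuration roughly as follows. This is a sketch only: the allowRules/denyRules/filterMask field names are assumed for illustration based on the StateRule and ReadWriteEnum definitions above; consult the Ankaios manifest reference for the exact schema.

```yaml
# Sketch only: field names assumed, verify against the manifest reference.
controlInterfaceAccess:
  allowRules:
    - type: StateRule
      operation: ReadWrite
      filterMask:
        - "desiredState.workloads.*.agent"    # allow changing any workload's agent...
  denyRules:
    - type: StateRule
      operation: Write
      filterMask:
        - "desiredState.workloads.controller" # ...except for the 'controller' workload
```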
flowchart TD\n a1(Ankaios Agent 1)\n w1(Workload 1)\n w2(Workload 2)\n s(Ankaios server)\n\n\n s <--> a1 <-->|\"/run/ankaios/control_interface/{input,output}\"| w1 & w2
The control interface relies on FIFOs (also known as named pipes) to enable a workload to communicate with the Ankaios system. For that purpose, Ankaios creates a mount point for each workload to store the FIFO files. At the mount point /run/ankaios/control_interface/
the workload developer can find the FIFO files input
and output
and use them for the communication with the Ankaios server. Ankaios uses its own communication protocol, described in the protocol documentation as a protobuf IDL, which allows the client code to be generated in any programming language supported by the protobuf compiler. The generated client code can then be integrated and used in a workload.
flowchart TD\n proto(\"ankaios.proto\")\n gen_code(\"Generated Client Code\")\n workload(\"Workload\")\n\n proto -->|generate code with protoc| gen_code\n workload-->|uses| gen_code
In order to enable the communication between a workload and the Ankaios system, the workload needs to make use of the control interface by sending and processing serialized messages defined in ankaios.proto
via writing to and reading from the provided FIFO files output
and input
found in the mount point /run/ankaios/control_interface/
. By using the protobuf compiler (protoc), code in any programming language supported by the protobuf compiler can be generated. The generated code contains functions for serializing and deserializing the messages to and from the Protocol Buffers binary format.
The messages are encoded using the length-delimited wire type format and laid out inside the FIFO file according to the following visualization:
Every protobuf message is prefixed with its byte length, telling the reader how many bytes to read to consume the protobuf message. The byte length itself has a variable size and is encoded as a VARINT.
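To make the length prefix concrete, here is a minimal, dependency-free sketch of the VARINT encoding. In real code the prost crate's encode_length_delimited_to_vec and decode_varint, used in the Rust examples in this document, handle this for you.

```rust
// Minimal sketch of protobuf VARINT encoding for the length prefix:
// 7 bits of payload per byte; the most significant bit is set on
// every byte except the last one.
fn encode_varint(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
    out
}

fn main() {
    // A 5-byte message is prefixed with the single byte 0x05.
    assert_eq!(encode_varint(5), vec![0x05]);
    // A 300-byte message needs a two-byte prefix: 0xAC carries the low
    // 7 bits with the continuation bit set, 0x02 carries the rest.
    assert_eq!(encode_varint(300), vec![0xac, 0x02]);
}
```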
"},{"location":"reference/control-interface/#control-interface-examples","title":"Control interface examples","text":"The subfolder examples
inside the Ankaios repository contains example workload applications in various programming languages that are using the control interface. They demonstrate how to easily use the control interface in self-developed workloads. All examples share the same behavior regardless of the programming language and are simplified to focus on the usage of the control interface. Please note that the examples are not optimized for production usage.
The following sections showcase in Rust some important parts of the communication with the Ankaios cluster using the control interface. The same concepts are also used in all of the example workload applications.
"},{"location":"reference/control-interface/#sending-request-message-from-a-workload-to-ankaios-server","title":"Sending request message from a workload to Ankaios server","text":"To send out a request message from the workload to the Ankaios server, the request message needs to be serialized using the generated serializing function, then encoded as a length-delimited protobuf message and then written directly into the output
FIFO file. The type of request message is ToAnkaios.
flowchart TD\n begin([Start])\n req_msg(Fill ToAnkaios message)\n ser_msg(Serialize ToAnkaios message using the generated serializing function)\n enc_bytes(Encode as length-delimited varint)\n output(\"Write encoded bytes to /run/ankaios/control_interface/output\")\n fin([end])\n\n begin --> req_msg\n req_msg --> ser_msg\n ser_msg -->enc_bytes\n enc_bytes --> output\n output --> fin
Send request message via control interface Code snippet in Rust for sending request message via control interface:
use api::ank_base::{Workload, RestartPolicy, Tag, UpdateStateRequest, Request, request::RequestContent, CompleteState, State};\nuse api::control_api::{ToAnkaios, to_ankaios::ToAnkaiosEnum};\nuse prost::Message;\nuse std::{collections::HashMap, fs::File, io::Write, path::Path};\n\nconst ANKAIOS_CONTROL_INTERFACE_BASE_PATH: &str = \"/run/ankaios/control_interface\";\n\nfn create_update_workload_request() -> ToAnkaios {\n let new_workloads = HashMap::from([(\n \"dynamic_nginx\".to_string(),\n Workload {\n runtime: \"podman\".to_string(),\n agent: \"agent_A\".to_string(),\n restart_policy: RestartPolicy::Never.into(),\n tags: vec![Tag {\n key: \"owner\".to_string(),\n value: \"Ankaios team\".to_string(),\n }],\n runtime_config: \"image: docker.io/library/nginx\\ncommandOptions: [\\\"-p\\\", \\\"8080:80\\\"]\"\n .to_string(),\n dependencies: HashMap::new(),\n },\n )]);\n\n ToAnkaios {\n to_ankaios_enum: Some(ToAnkaiosEnum::Request(Request {\n request_id: \"request_id\".to_string(),\n request_content: Some(RequestContent::UpdateStateRequest(\n UpdateStateRequest {\n new_state: Some(CompleteState {\n desired_state: Some(State {\n api_version: \"v0.1\".to_string(),\n workloads: new_workloads,\n }),\n ..Default::default()\n }),\n update_mask: vec![\"desiredState.workloads.dynamic_nginx\".to_string()],\n },\n )),\n })),\n }\n}\n\nfn write_to_control_interface() {\n let pipes_location = Path::new(ANKAIOS_CONTROL_INTERFACE_BASE_PATH);\n let sc_req_fifo = pipes_location.join(\"output\");\n\n let mut sc_req = File::create(&sc_req_fifo).unwrap();\n\n let protobuf_update_workload_request = create_update_workload_request();\n\n println!(\"{}\", &format!(\"Sending UpdateStateRequest containing details for adding the dynamic workload \\\"dynamic_nginx\\\": {:#?}\", protobuf_update_workload_request));\n\n sc_req\n .write_all(&protobuf_update_workload_request.encode_length_delimited_to_vec())\n .unwrap();\n}\n\nfn main() {\n write_to_control_interface();\n}\n
"},{"location":"reference/control-interface/#processing-response-message-from-ankaios-server","title":"Processing response message from Ankaios server","text":"To process a response message from the Ankaios server, the workload needs to read out the bytes from the input
FIFO file. As the bytes are encoded as a length-delimited protobuf message with a variable length, the length needs to be decoded and extracted first. Then the length can be used to decode and deserialize the read bytes to a response message object for further processing. The type of the response message is FromAnkaios.
flowchart TD\n begin([Start])\n input(\"Read bytes from /run/ankaios/control_interface/input\")\n dec_length(Get length from read length delimited varint encoded bytes)\n deser_msg(Decode and deserialize FromAnkaios message using decoded length and the generated functions)\n further_processing(Process FromAnkaios message object)\n fin([end])\n\n begin --> input\n input --> dec_length\n dec_length --> deser_msg\n deser_msg --> further_processing\n further_processing --> fin
Read response message via control interface Code Snippet in Rust for reading response message via control interface:
use api::control_api::FromAnkaios;\nuse prost::Message;\nuse std::{fs::File, io, io::Read, path::Path};\n\nconst ANKAIOS_CONTROL_INTERFACE_BASE_PATH: &str = \"/run/ankaios/control_interface\";\nconst MAX_VARINT_SIZE: usize = 19;\n\nfn read_varint_data(file: &mut File) -> Result<[u8; MAX_VARINT_SIZE], io::Error> {\n let mut res = [0u8; MAX_VARINT_SIZE];\n let mut one_byte_buffer = [0u8; 1];\n for item in res.iter_mut() {\n file.read_exact(&mut one_byte_buffer)?;\n *item = one_byte_buffer[0];\n // check if most significant bit is set to 0 if so it is the last byte to be read\n if *item & 0b10000000 == 0 {\n break;\n }\n }\n Ok(res)\n}\n\nfn read_protobuf_data(file: &mut File) -> Result<Box<[u8]>, io::Error> {\n let varint_data = read_varint_data(file)?;\n let mut varint_data = Box::new(&varint_data[..]);\n\n // determine the exact size for exact reading of the bytes later by decoding the varint data\n let size = prost::encoding::decode_varint(&mut varint_data)? as usize;\n\n let mut buf = vec![0; size];\n file.read_exact(&mut buf[..])?; // read exact bytes from file\n Ok(buf.into_boxed_slice())\n}\n\nfn read_from_control_interface() {\n let pipes_location = Path::new(ANKAIOS_CONTROL_INTERFACE_BASE_PATH);\n let ex_req_fifo = pipes_location.join(\"input\");\n\n let mut ex_req = File::open(&ex_req_fifo).unwrap();\n\n loop {\n if let Ok(binary) = read_protobuf_data(&mut ex_req) {\n let proto = FromAnkaios::decode(&mut Box::new(binary.as_ref()));\n\n println!(\"{}\", &format!(\"Received FromAnkaios message containing the response from the server: {:#?}\", proto));\n }\n }\n}\n\nfn main() {\n read_from_control_interface();\n}\n
"},{"location":"reference/glossary/","title":"Glossary","text":"This glossary is intended to be a comprehensive, uniform list of Ankaios terminology. It consists of technical terms specific to Ankaios, as well as more general terms that provide useful context.
"},{"location":"reference/glossary/#node","title":"Node","text":"A machine, either physical or virtual, that provides the necessary prerequisites (e.g. OS) to run an Ankaios server and/or agent.
"},{"location":"reference/glossary/#runtime","title":"Runtime","text":"The base on which a workload can be started. For OCI containers this is a container runtime or engine. For native applications the runtime is the OS itself.
"},{"location":"reference/glossary/#workload","title":"Workload","text":"A functionality that the Ankaios orchestrator can manage (e.g. start, stop). A workload could be packed inside an OCI container (e.g. Podman container) or could also be just a native program (native workload). Ankaios is built to be extensible for different workload types by adding support for other runtimes.
"},{"location":"reference/glossary/#container","title":"Container","text":"A container is a lightweight, standalone, executable software package that includes everything needed to run an application, including the binaries, runtime, system libraries and dependencies. Containers provide a consistent and isolated environment for applications to run, ensuring that they behave consistently across different computing environments, from development to testing to production.
"},{"location":"reference/glossary/#podman-container","title":"Podman container","text":"A Podman container refers to a container managed by Podman, which is an open-source container engine similar to Docker. Podman aims to provide a simple and secure container management solution for developers and system administrators.
"},{"location":"reference/glossary/#native-workload","title":"Native workload","text":"An application developed specifically for a particular platform or operating system (OS). It is designed to run directly on the target platform without the need for bringing in any additional translation or emulation layers.
"},{"location":"reference/inter-workload-dependencies/","title":"Inter-workload dependencies","text":"Ankaios enables users to configure dependencies between workloads.
There are two types of inter-workload dependencies supported by Ankaios:
The user configures explicit inter-workload dependencies within a workload's configuration, which Ankaios considers when starting the workload. Ankaios starts workloads with dependencies only when all dependencies are met, allowing the user to define a specific sequence for starting workloads.
Ankaios defines implicit inter-workload dependencies internally and takes them into account when a dependency is deleted.
"},{"location":"reference/inter-workload-dependencies/#explicit-inter-workload-dependencies","title":"Explicit inter-workload dependencies","text":"Ankaios supports the following dependency types:
Dependency type AddCondition Description running ADD_COND_RUNNING The dependency must be operational. succeeded ADD_COND_SUCCEEDED The dependency must have exited successfully. failed ADD_COND_FAILED The dependency must have exited with a non-zero return code. The user configures the AddCondition
for each dependency in the dependencies
field to define one or multiple dependencies for a workload.
apiVersion: v0.1\nworkloads:\n logger:\n agent: agent_A\n runtime: podman\n dependencies:\n storage_provider: ADD_COND_RUNNING\n ...\n
When the storage_provider
is operational, Ankaios starts the logger
workload. The ExecutionState of the workload remains Pending(WaitingToStart)
until all dependencies are met.
Note
Ankaios rejects manifests and workload configurations with cyclic dependencies. A manifest is valid only when its workloads and dependencies form a directed acyclic graph.
This example demonstrates how to use dependency types to configure inter-workload dependencies:
---\ntitle:\n---\nflowchart RL\n logger(logger)\n init(init_storage)\n storage(storage_provider)\n err_handler(error_handler)\n\n\n logger-- running -->storage\n err_handler-- failed -->storage\n storage-- succeeded -->init
The logging service requires an operational storage provider to write logs. Therefore, the storage provider must be started first and its initialization (init_storage) must be completed before starting the provider itself. In case of a failure, an error handler is started to manage errors.
The Ankaios manifest below includes the configuration of each workload and its dependencies:
apiVersion: v0.1\nworkloads:\n logger:\n runtime: podman\n agent: agent_A\n dependencies:\n storage_provider: ADD_COND_RUNNING # (1)!\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sleep\" ]\n commandArgs: [ \"3\" ]\n storage_provider:\n runtime: podman\n agent: agent_B\n dependencies:\n init_storage: ADD_COND_SUCCEEDED # (2)!\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"sleep 5; exit 1\" ]\n init_storage: # (3)!\n runtime: podman\n agent: agent_B\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sleep\" ]\n commandArgs: [ \"2\" ]\n error_handler:\n runtime: podman\n agent: agent_A\n dependencies:\n storage_provider: ADD_COND_FAILED # (4)!\n runtimeConfig: |\n image: alpine:latest\n commandArgs: [ \"echo\", \"report failed storage provider\"]\n
Workloads may have dependencies that do not currently exist in the Ankaios state.
Assuming Ankaios is started with a manifest containing all previous workloads except for the error_handler
, a user can update the desired state by adding the restart_service
workload. This workload restarts certain workloads and should run after the error_handler
has completed. The following Ankaios manifest includes the restart_service
workload, which depends on the non-existent error_handler
in the current desired state:
workloads:\n restart_service:\n runtime: podman\n agent: agent_B\n dependencies:\n error_handler: ADD_COND_SUCCEEDED\n runtimeConfig: |\n image: alpine:latest\n commandArgs: [ \"echo\", \"restart of storage workloads\"]\n
Ankaios delays the restart_service
until the error_handler
reaches the specified state.
Ankaios automatically defines implicit dependencies to prevent a workload from failing or entering an undesired state when one of its dependencies is deleted. These implicit dependencies cannot be configured by the user and are only defined for workloads that other workloads depend on with the running
dependency type.
Ankaios does not explicitly delete a workload when its dependency is deleted. Instead, Ankaios delays the deletion of a dependency until all dependent workloads have been deleted. The dependency will have the ExecutionState Stopping(WaitingToStop)
as long as it cannot be deleted.
In the previous example, the workload logger
depends on the storage_provider
with a running
dependency type. When the user updates or deletes the storage_provider
dependency, Ankaios delays the deletion until the dependent workload logger
is neither pending nor running.
If an update meets the delete conditions but not the add conditions, Ankaios will execute the delete operation directly without delaying the entire update.
Note
Ankaios does not define implicit dependencies for workloads that have dependencies with the succeeded
and failed
types.
Ankaios offers two ways of dynamically interacting with a running cluster - the ank
CLI and the control interface.
The ank
CLI is targeted at integrators or workload developers who want to interact with the cluster during development or for manual intervention. It is designed for ergonomics rather than automation. If required, an external application can connect to the interface used by the CLI, but this is not the standard way of automating dynamic reconfiguration of the cluster at runtime.
The Ankaios control interface is provided to workloads managed by Ankaios and allows implementing the so-called \"operator pattern\". The control interface allows each workload to send messages to the agent managing it. After successful authorization, the Ankaios agent forwards the request to the Ankaios server and provides the response to the requesting workload. Through the control interface, a workload can obtain the complete state of the Ankaios cluster or administer the cluster by declaratively adjusting its state, e.g. to add or remove other workloads.
"},{"location":"reference/resource-usage/","title":"Resource usage","text":"The following table shows the resource usage of Ankaios v0.2.0 with the setup:
The restart policy of a workload enables the user to determine whether a workload is automatically restarted when it terminates. By default, workloads are not restarted. However, the restart policy can be configured to always restart the workload, or to restart the workload under certain conditions.
"},{"location":"reference/restart-policy/#supported-restart-policies","title":"Supported Restart Policies","text":"The following restart policies are available for a workload:
Restart Policy Description Restart on ExecutionState NEVER The workload is never restarted. Once the workload exits, it remains in the exited state. - ON_FAILURE If the workload exits with a non-zero exit code, it will be restarted. Failed(ExecFailed) ALWAYS The workload is restarted upon termination, regardless of the exit code. Succeeded(Ok) or Failed(ExecFailed) Ankaios restarts the workload when it has exited and the configured restart policy matches the workload's ExecutionState
, as detailed in the table above. Ankaios does not restart a workload if the user explicitly deletes it via the Ankaios CLI or if Ankaios receives a delete request for that workload via the Control Interface.
Note
Ankaios does not consider inter-workload dependencies when restarting a workload, because the workload was already running before it exited.
"},{"location":"reference/restart-policy/#configure-restart-policies","title":"Configure Restart Policies","text":"The field restartPolicy
enables the user to define the restart policy for each workload within the Ankaios manifest. The field is optional. If the field is not provided, the default restart policy NEVER
is applied.
The following Ankaios manifest contains workloads with different restart policies:
apiVersion: v0.1\nworkloads:\n restarted_always:\n runtime: podman\n agent: agent_A\n restartPolicy: ALWAYS # (1)!\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"echo 'Always restarted.'; sleep 2\"]\n restarted_never:\n runtime: podman\n agent: agent_A\n restartPolicy: NEVER # (2)!\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"echo 'Explicitly never restarted.'; sleep 2\"]\n default_restarted_never: # default restart policy is NEVER\n runtime: podman\n agent: agent_A\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"echo 'Implicitly never restarted.'; sleep 2\"]\n restarted_on_failure:\n runtime: podman\n agent: agent_A\n restartPolicy: ON_FAILURE # (3)!\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"echo 'Restarted on failure.'; sleep 2; exit 1\"]\n
Depending on the use case, the Ankaios cluster can be started with an optional predefined list of workloads - the startup configuration. Currently, the startup configuration is provided as a YAML file that can be passed to the Ankaios server through a command line argument. If Ankaios is started without a startup configuration or with an empty one, workloads can still be added to the cluster dynamically during runtime.
Note: To be able to run a workload, an Ankaios agent must be started on the same or on a different node.
"},{"location":"reference/startup-configuration/#configuration-structure","title":"Configuration structure","text":"The startup configuration is composed of a list of workload specifications within the workloads
object. A workload specification must contain the following information:
workload name
(via field key), specify the workload name to identify the workload in the Ankaios system.runtime
, specify the type of the runtime. Currently supported values are podman
and podman-kube
.agent
, specify the name of the owning agent which is going to execute the workload. Supports templated strings.restartPolicy
, specify how the workload should be restarted upon exiting.tags
, specify a list of key
value
pairs.runtimeConfig
, specify as a string the configuration for the runtime whose configuration structure is specific for each runtime, e.g., for podman
runtime the PodmanRuntimeConfig is used. Supports templated strings.configs
: assign configuration items defined in the state's configs
field to the workloadcontrolInterfaceAccess
, specify the access rights of the workload for the control interface.Example startup-config.yaml
file:
apiVersion: v0.1\nworkloads:\n nginx: # this is used as the workload name which is 'nginx'\n runtime: podman\n agent: agent_A\n restartPolicy: ALWAYS\n tags:\n - key: owner\n value: Ankaios team\n configs:\n port: web_server_port\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"{{port.access_port}}:80\"]\n controlInterfaceAccess:\n allowRules:\n - type: StateRule\n operation: Read\n filterMask:\n - \"workloadStates\"\nconfigs:\n web_server_port:\n access_port: \"8081\"\n
Ankaios supports templated strings and essential control directives in the handlebars templating language for the following workload fields:
agent
runtimeConfig
Ankaios renders a templated state at startup or when the state is updated. The rendering replaces the templated strings with the configuration items associated with each workload. The configuration items themselves are defined in a configs
field, which contains several key-value pairs. The key specifies the name of the configuration item and the value is a string, list or associative data structure. To see templated workload configurations in action, follow the tutorial about sending and receiving vehicle data.
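As a small sketch based on the startup configuration example above (the names web_server, web_server_port and access_port are illustrative), a templated runtimeConfig referencing a configuration item could look like this:

```yaml
# The workload assigns the config item "web_server_port" under the alias
# "port"; the template "{{port.access_port}}" is rendered to "8081"
# when Ankaios renders the state.
apiVersion: v0.1
workloads:
  web_server:
    runtime: podman
    agent: agent_A
    configs:
      port: web_server_port
    runtimeConfig: |
      image: docker.io/nginx:latest
      commandOptions: ["-p", "{{port.access_port}}:80"]
configs:
  web_server_port:
    access_port: "8081"
```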
Note
The name of a configuration item can only contain regular characters, digits, and the \"-\" and \"_\" symbols. The same applies to the keys and values of the workload's configs
field when assigning configuration items to a workload.
The runtime configuration for the podman
runtime is specified as follows:
generalOptions: [<comma>, <separated>, <options>]\nimage: <registry>/<image name>:<version>\ncommandOptions: [<comma>, <separated>, <options>]\ncommandArgs: [<comma>, <separated>, <arguments>]\n
where each attribute is passed directly to podman run
.
If we take as an example the podman run
command:
podman --events-backend file run --env VAR=able docker.io/alpine:latest echo Hello!
it would translate to the following runtime configuration:
generalOptions: [\"--events-backend\", \"file\"]\nimage: docker.io/alpine:latest\ncommandOptions: [\"--env\", \"VAR=able\"]\ncommandArgs: [\"echo\", \"Hello!\"]\n
"},{"location":"reference/startup-configuration/#podmankuberuntimeconfig","title":"PodmanKubeRuntimeConfig","text":"The runtime configuration for the podman-kube
runtime is specified as follows:
generalOptions: [<comma>, <separated>, <options>]\nplayOptions: [<comma>, <separated>, <options>]\ndownOptions: [<comma>, <separated>, <options>]\nmanifest: <string containing the K8s manifest>\n
where each attribute is passed directly to podman play kube
.
If we take as an example the podman play kube
command:
podman --events-backend file play kube --userns host manifest.yaml
and the corresponding command for deleting the manifest file:
podman --events-backend file play kube manifest.yaml --down --force
they would translate to the following runtime configuration:
generalOptions: [\"--events-backend\", \"file\"]\nplayOptions: [\"--userns\", \"host\"]\ndownOptions: [\"--force\"]\nmanifest: <contents of manifest.yaml>\n
"},{"location":"usage/awesome-ankaios/","title":"Awesome Ankaios","text":"Here you find a curated list of awesome things related to Ankaios.
If you know of missing resources, please feel free to open a pull request and add them.
"},{"location":"usage/awesome-ankaios/#extensions-for-ankaios","title":"Extensions for Ankaios","text":"Ankaios has been tested with the following Linux distributions. Others might work as well but have not been tested.
Ankaios currently requires a Linux OS and is available for x86_64 and arm64 targets.
The minimum system requirements are (tested with EB corbos Linux \u2013 built on Ubuntu):
Resource Min CPU 1 core RAM 128 MB Podman needs to be installed as it is used as the container runtime (see Podman installation instructions). For using the podman
runtime, Podman version 3.4.2 is sufficient but the podman-kube
runtime requires at least Podman version 4.3.1.
Note
On Ubuntu 24.04 there is a known problem with Podman stopping containers. The following workaround disables AppArmor for Podman. Run the following steps as root after installation of Podman:
mkdir -p /etc/containers/containers.conf.d\nprintf '[CONTAINERS]\\napparmor_profile=\"\"\\n' > /etc/containers/containers.conf.d/disable-apparmor.conf\n
"},{"location":"usage/installation/#installation-methods","title":"Installation methods","text":"There are two ways to install Ankaios, depending on your specific needs and focus. If you are new to Ankaios or TLS is not a top priority, we recommend following the setup instructions in Setup with script without enabling mutual transport layer security (mTLS) for communication. On the other hand, if you want to set up Ankaios in a production environment, follow the setup instructions in Setting up Ankaios with mTLS.
"},{"location":"usage/installation/#setup-with-script","title":"Setup with script","text":"The recommended way to install Ankaios is using the installation script. To install the latest release version of Ankaios, please run the following command:
curl -sfL https://github.com/eclipse-ankaios/ankaios/releases/latest/download/install.sh | bash -\n
Note
Please note that installing the latest version of Ankaios in an automated workflow is discouraged. If you want to install Ankaios during an automated workflow, please install a specific version as described below.
The installation process automatically detects the platform and downloads the appropriate binaries. The default installation path for the binaries is /usr/local/bin
but can be changed. The installation also creates systemd unit files and an uninstall script.
Supported platforms: linux/amd64
, linux/arm64
Note
The script requires root privileges to install the pre-built binaries into the default installation path /usr/local/bin
and also for systemd integration. You can set a custom installation path and disable systemd unit file generation if only non-root privileges are available.
The following table shows the optional arguments that can be passed to the script:
Supported parameters Description -v <version> e.g.v0.1.0
, default: latest version -i <install-path> File path where Ankaios will be installed, default: /usr/local/bin
-t <install-type> Installation type for systemd integration: server
, agent
, none
or both
(default) -s <server-options> Options which will be passed to the Ankaios server. Default --insecure --startup-config /etc/ankaios/state.yaml
-a <agent-options> Options which will be passed to the Ankaios agent. Default --insecure --name agent_A
To install a specific version run the following command and substitute <version>
with a specific version tag e.g. v0.1.0
:
curl -sfL https://github.com/eclipse-ankaios/ankaios/releases/download/<version>/install.sh | bash -s -- -v <version>\n
For available versions see the list of releases.
"},{"location":"usage/installation/#set-the-log-level-for-ank-server-and-ank-agent-services","title":"Set the log level forank-server
and ank-agent
services","text":"To configure the log levels for ank-server
and ank-agent
during the installation process using the provided environment variables, follow these steps:
Set the desired log levels for each service by assigning valid values to the environment variables INSTALL_ANK_SERVER_RUST_LOG
and INSTALL_ANK_AGENT_RUST_LOG
. For the syntax see the documentation for RUST_LOG
.
Run the installation script, making sure to pass these environment variables as arguments if needed:
For a specific version:
curl -sfL https://github.com/eclipse-ankaios/ankaios/releases/download/<version>/install.sh | INSTALL_ANK_SERVER_RUST_LOG=debug INSTALL_ANK_AGENT_RUST_LOG=info bash -s -- -i /usr/local/bin -t both -v <version>\n
For the latest version:
curl -sfL https://github.com/eclipse-ankaios/ankaios/releases/download/latest/install.sh | INSTALL_ANK_SERVER_RUST_LOG=debug INSTALL_ANK_AGENT_RUST_LOG=info bash -s -- -i /usr/local/bin -t both\n
Now, both services will output logs according to the specified log levels. If no explicit value was provided during installation, both services will default to info
log level. You can always change the log level by updating the environment variables and reinstalling the services.
If Ankaios has been installed with the installation script, it can be uninstalled with:
ank-uninstall.sh\n
The folder /etc/ankaios
will remain.
As an alternative to the installation script, the pre-built binaries can be downloaded manually from the Ankaios repository here. This is useful if the automatic platform detection fails, e.g. because the uname
system command is not allowed or not supported on the target.
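When downloading manually, the same information the installation script derives from uname can be checked directly; a minimal sketch for picking the matching pre-built binary:

```shell
# Print the values the install script would detect automatically.
uname -s   # kernel name, e.g. Linux
uname -m   # machine architecture, e.g. x86_64 or aarch64
```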
For building Ankaios from source see Build.
"},{"location":"usage/mtls-setup/","title":"Setting up Ankaios with mTLS","text":"Mutual TLS (mTLS) is a security protocol that verifies both the client and server identities before establishing a connection. In Ankaios mTLS can be used to secure communication between the server, agent and ank CLI.
"},{"location":"usage/mtls-setup/#prerequisites","title":"Prerequisites","text":"To set up mTLS with OpenSSL, perform the following actions:
First we need to create a folder to keep certificates and keys for ank-server
and ank-agent
:
sudo mkdir -p /etc/ankaios/certs\n
Then we need to create a folder to keep certificates and keys for the ank
CLI:
mkdir -p \"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios\"\n
"},{"location":"usage/mtls-setup/#generate-ca-keys-and-certificate","title":"Generate CA keys and certificate","text":"Construct an OpenSSL configuration file named ca.cnf
. You are welcome to include additional fields if necessary:
[req]\ndistinguished_name = req_distinguished_name\nprompt = no\n\n[req_distinguished_name]\nCN = ankaios-ca\n
Generate CA key:
sudo openssl genpkey -algorithm ED25519 -out \"./ca-key.pem\"\n
Generate CA certificate:
sudo openssl req -config \"./ca.cnf\" -new -x509 -key \"./ca-key.pem\" -out \"/etc/ankaios/certs/ca.pem\"\n
"},{"location":"usage/mtls-setup/#generate-key-and-certificate-for-ank-server","title":"Generate key and certificate for ank-server
","text":"Construct an OpenSSL configuration file named ank-server.cnf
. You are welcome to include additional fields if necessary:
[req]\ndistinguished_name = req_distinguished_name\nreq_extensions = v3_req\nprompt = no\n\n[req_distinguished_name]\nCN = ank-server\n\n[v3_req]\nsubjectAltName = @alt_names\nextendedKeyUsage = serverAuth\n\n[alt_names]\nDNS.1 = ank-server\n
Generate ank-server key:
sudo openssl genpkey -algorithm ED25519 -out \"/etc/ankaios/certs/ank-server-key.pem\"\n
Generate ank-server certificate signing request:
sudo openssl req -config \"./ank-server.cnf\" -new -key \"/etc/ankaios/certs/ank-server-key.pem\" -out \"./ank-server.csr\"\n
Generate ank-server certificate:
sudo openssl x509 -req -in \"./ank-server.csr\" -CA \"/etc/ankaios/certs/ca.pem\" -CAkey \"./ca-key.pem\" -extensions v3_req -extfile \"./ank-server.cnf\" -out \"/etc/ankaios/certs/ank-server.pem\"\n
"},{"location":"usage/mtls-setup/#generate-key-and-certificate-for-ank-agent","title":"Generate key and certificate for ank-agent
","text":"Construct an OpenSSL configuration file named ank-agent.cnf
. You are welcome to include additional fields if necessary:
[req]\ndistinguished_name = req_distinguished_name\nreq_extensions = v3_req\nprompt = no\n\n[req_distinguished_name]\nCN = ank-agent\n\n[v3_req]\nsubjectAltName = @alt_names\nextendedKeyUsage = clientAuth\n\n[alt_names]\n# This certificate can only be used for agents with the names 'agent_A' or 'agent_B'\n# To allow the usage for any agent use the character '*'\n# like: DNS.1 = *\nDNS.1 = agent_A\nDNS.2 = agent_B\n
Generate ank-agent key:
sudo openssl genpkey -algorithm ED25519 -out \"/etc/ankaios/certs/ank-agent-key.pem\"\n
Generate ank-agent certificate signing request:
sudo openssl req -config \"./ank-agent.cnf\" -new -key \"/etc/ankaios/certs/ank-agent-key.pem\" -out \"./ank-agent.csr\"\n
Generate ank-agent certificate:
sudo openssl x509 -req -in \"./ank-agent.csr\" -CA \"/etc/ankaios/certs/ca.pem\" -CAkey \"./ca-key.pem\" -extensions v3_req -extfile \"./ank-agent.cnf\" -out \"/etc/ankaios/certs/ank-agent.pem\"\n
"},{"location":"usage/mtls-setup/#generate-key-and-certificate-for-the-cli-ank","title":"Generate key and certificate for the CLI ank
","text":"Construct an OpenSSL configuration file named ank.cnf
. You are welcome to include additional fields if necessary:
[req]\ndistinguished_name = req_distinguished_name\nreq_extensions = v3_req\nprompt = no\n[req_distinguished_name]\nCN = ank\n\n[v3_req]\nsubjectAltName = @alt_names\nextendedKeyUsage = clientAuth\n\n[alt_names]\nDNS.1 = ank\n
Generate ank key:
openssl genpkey -algorithm ED25519 -out \"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank-key.pem\"\n
Generate ank certificate signing request:
openssl req -config \"./ank.cnf\" -new -key \"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank-key.pem\" -out \"./ank.csr\"\n
Generate ank certificate:
sudo openssl x509 -req -in \"./ank.csr\" -CA \"/etc/ankaios/certs/ca.pem\" -CAkey \"./ca-key.pem\" -extensions v3_req -extfile \"./ank.cnf\" -out \"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank.pem\"\n
"},{"location":"usage/mtls-setup/#perform-ankaios-installation-with-mtls-support","title":"Perform Ankaios installation with mTLS support","text":"To set up Ankaios with mTLS support, you need to supply the necessary mTLS certificates to the ank-server
, ank-agent
, and ank
CLI components. Here's a step-by-step guide:
ank-server
and ank-agent
with mTLS certificates","text":"curl -sfL https://github.com/eclipse-ankaios/ankaios/releases/latest/download/install.sh | bash -s -- -s \"--startup-config /etc/ankaios/state.yaml --ca_pem /etc/ankaios/certs/ca.pem --crt_pem /etc/ankaios/certs/ank-server.pem --key_pem /etc/ankaios/certs/ank-server-key.pem\" -a \"--name agent_A --ca_pem /etc/ankaios/certs/ca.pem --crt_pem /etc/ankaios/certs/ank-agent.pem --key_pem /etc/ankaios/certs/ank-agent-key.pem\"\n
Start the Ankaios server and an Ankaios agent as described in the Quickstart and continue below to configure the CLI with mTLS.
"},{"location":"usage/mtls-setup/#configure-the-ank-cli-with-mtls-certificates","title":"Configure theank
CLI with mTLS certificates","text":"To make it easier, we will set the mTLS certificates for the ank
CLI by using environment variables:
export ANK_CA_PEM=/etc/ankaios/certs/ca.pem\nexport ANK_CRT_PEM=${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank.pem\nexport ANK_KEY_PEM=${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank-key.pem\n
Now you can use the ank
CLI as follows:
ank get workloads\n
Or in a single line call:
ANK_CA_PEM=/etc/ankaios/certs/ca.pem ANK_CRT_PEM=${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank.pem ANK_KEY_PEM=${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank-key.pem ank get workloads\n
Alternatively, you can pass the mTLS certificates as command line arguments:
ank --ca_pem=/etc/ankaios/certs/ca.pem --crt_pem=\"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank.pem\" --key_pem=\"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank-key.pem\" get workloads\n
"},{"location":"usage/quickstart/","title":"Quickstart","text":"If you have not installed Ankaios, please follow the instructions here. The following examples assume that the installation script has been used with default options.
You can start workloads in Ankaios in a number of ways. For example, you can define a file with the startup configuration and use systemd to start Ankaios. The startup configuration file contains all of the workloads and their configuration that you want to be started by Ankaios.
Let's modify the default config which is stored in /etc/ankaios/state.yaml
:
apiVersion: v0.1\nworkloads:\n nginx:\n runtime: podman\n agent: agent_A\n restartPolicy: ALWAYS\n tags:\n - key: owner\n value: Ankaios team\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"8081:80\"]\n
Then we can start the Ankaios server:
sudo systemctl start ank-server\n
The Ankaios server will read the config but detect that no agent with the name agent_A
is available that could start the workload, see logs with:
journalctl -t ank-server\n
Now let's start an agent:
sudo systemctl start ank-agent\n
This Ankaios agent will run the workload that has been assigned to it. We can use the Ankaios CLI to check the current state:
ank -k get state\n
Note
The instructions assume the default installation without mutual TLS (mTLS) for communication. With -k
or --insecure
the ank
CLI will connect without mTLS. Alternatively, set the environment variable ANK_INSECURE=true
to avoid passing the argument to each ank
CLI command. For an Ankaios setup with mTLS, see here.
which outputs:
desiredState:\n apiVersion: v0.1\n workloads:\n nginx:\n agent: agent_A\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n restartPolicy: ALWAYS\n runtime: podman\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"8081:80\"]\n configs: {}\n configs: {}\nworkloadStates:\n agent_A:\n nginx:\n cc74dd34189ef3181a2f15c6c5f5b0e76aaefbcd55397e15314e7a25bad0864b:\n state: Running\n subState: Ok\n additionalInfo: ''\nagents:\n agent_A:\n cpuUsage: 2\n freeMemory: 7989682176\n
or
ank -k get workloads\n
which results in:
WORKLOAD NAME AGENT RUNTIME EXECUTION STATE ADDITIONAL INFO\nnginx agent_A podman Running(Ok)\n
Ankaios also supports adding and removing workloads dynamically. To add another workload call:
ank -k run workload \\\nhelloworld \\\n--runtime podman \\\n--agent agent_A \\\n--config 'image: docker.io/busybox:1.36\ncommandOptions: [ \"-e\", \"MESSAGE=Hello World\"]\ncommandArgs: [ \"sh\", \"-c\", \"echo $MESSAGE\"]'\n
We can check the state again with ank -k get state
and see, that the workload helloworld
has been added to desiredState.workloads
and the execution state is available in workloadStates
.
As the workload ran a one-time job, its state is Succeeded(Ok)
and we can delete it from the state again with:
ank -k delete workload helloworld\n
Note
Workload names must not be longer than 63 characters and can contain only regular characters, digits, and the \"-\" and \"_\" symbols.
For next steps follow the tutorial on sending and receiving vehicle data with workloads orchestrated by Ankaios. Then also check the reference documentation for the startup configuration including the podman-kube
runtime and also working with the complete state data structure.
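For repeated CLI usage in a setup without mTLS, the -k flag mentioned in the note above can be avoided by exporting the corresponding environment variable once per shell session; a minimal sketch:

```shell
# Assumption: a default Ankaios installation without mTLS.
export ANK_INSECURE=true
# Subsequent calls no longer need the -k flag, e.g.:
# ank get workloads
```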
Ankaios supports command completion for the ank
CLI in various shells.
Note
For dynamic completion (workloads etc.) to work, the ank
CLI must be configured via environment variables. To use a non-default server URL, provide ANK_SERVER_URL
. Also provide either ANK_INSECURE=true
or ANK_CA_PEM
, ANK_CRT_PEM
and ANK_KEY_PEM
.
Add the following lines to your ~/.bashrc
:
if command -v ank &> /dev/null; then\n source <(COMPLETE=bash ank)\nfi\n
"},{"location":"usage/shell-completion/#z-shell-zsh","title":"Z shell (zsh)","text":"Add the following lines to your ~/.zshrc
:
if command -v ank &> /dev/null; then\n source <(COMPLETE=zsh ank)\nfi\n
"},{"location":"usage/shell-completion/#fish","title":"Fish","text":"Add the following lines to your ~/.config/fish/config.fish
:
if type -q ank\n source (COMPLETE=fish ank | psub)\nend\n
"},{"location":"usage/shell-completion/#elvish","title":"Elvish","text":"echo \"eval (COMPLETE=elvish ank)\" >> ~/.elvish/rc.elv\n
"},{"location":"usage/shell-completion/#powershell","title":"Powershell","text":"echo \"COMPLETE=powershell ank | Invoke-Expression\" >> $PROFILE\n
"},{"location":"usage/tutorial-vehicle-signals/","title":"Tutorial: Sending and receiving vehicle signals","text":""},{"location":"usage/tutorial-vehicle-signals/#introduction","title":"Introduction","text":"In this tutorial, we will show you how to use Ankaios to set up workloads that publish and subscribe to vehicle signals in accordance with the Vehicle Signal Specification (VSS). The central workload will be a databroker from the Kuksa.val project. It will receive vehicle speed signals published by a speed provider workload. Finally, a speed consumer workload will consume those speed values.
Overview of workloads
To run this tutorial you will need a Linux platform, which can be a RaspberryPi or a Linux PC or virtual machine. Additionally, it's assumed that the Ankaios setup is done with mutual TLS (mTLS) disabled or using its default installation settings.
"},{"location":"usage/tutorial-vehicle-signals/#start-the-databroker","title":"Start the databroker","text":"If you have not yet installed Ankaios, please follow the instructions here. The following examples assume that the installation script has been used with the default options.
Make sure that Ankaios server and agent are started:
sudo systemctl start ank-server\nsudo systemctl start ank-agent\n
Now we have Ankaios up and running with a server and an agent. To run the databroker we need to create an Ankaios manifest:
databroker.yamlapiVersion: v0.1\nworkloads:\n databroker:\n runtime: podman\n agent: agent_A\n runtimeConfig: |\n image: ghcr.io/eclipse/kuksa.val/databroker:0.4.1\n commandArgs: [\"--insecure\"]\n commandOptions: [\"--net=host\"]\n
This defines a workload databroker
to be scheduled on agent agent_A
(default agent name when using standard installation procedure) using the runtime podman
. See the reference documentation for the other attributes.
Let's have a look at the runtimeConfig
which in this case is specific for the podman
runtime.
image: ghcr.io/eclipse/kuksa.val/databroker:0.4.1
specifies the container image according to the OCI image format. Fortunately, the Kuksa.val project already provides an image for the databroker that we can use here.commandArgs: [\"--insecure\"]
: These are command arguments which are passed to the container, in this case to the container's entrypoint. As we are not using authentication for the databroker we pass the argument --insecure
.commandOptions: [\"--net=host\"]
: These options are passed to the podman run
command. We want to use the host network for the databroker.Store the Ankaios manifest listed above in a file databroker.yaml
.
Then start the workload:
ank -k apply databroker.yaml\n
Note
The instructions assume the default installation without mutual TLS (mTLS) for communication. With -k
or --insecure
the ank
CLI will connect without mTLS. Alternatively, set the environment variable ANK_INSECURE=true
to avoid passing the argument to each ank
CLI command. For an Ankaios setup with mTLS, see here.
The Ankaios agent agent_A
will now instruct podman to start the workload. The command waits until the databroker is running. It should finally print:
WORKLOAD NAME AGENT RUNTIME EXECUTION STATE ADDITIONAL INFO\n databroker agent_A podman Running(Ok)\n
"},{"location":"usage/tutorial-vehicle-signals/#start-the-speed-provider","title":"Start the speed provider","text":"Now we want to start a workload that publishes vehicle speed values and call that speed-provider
.
apiVersion: v0.1\nworkloads:\n speed-provider:\n runtime: podman\n agent: agent_A\n runtimeConfig: |\n image: ghcr.io/eclipse-ankaios/speed-provider:0.1.1\n commandOptions:\n - \"--net=host\"\n
The source code for that image is available in the Anakios repo.
Start the workload with:
ank -k apply speed-provider.yaml\n
The command waits until the speed-provider is running. It should finally print:
WORKLOAD NAME AGENT RUNTIME EXECUTION STATE ADDITIONAL INFO\n speed-provider agent_A podman Running(Ok)\n
The speed-provider workload provides a web UI that allows the user to enter a speed value that is then sent to the databroker. The web UI is available on http://127.0.0.1:5000. If your web browser is running on a different host than the Ankaios agent, replace 127.0.0.1 with the IP address of the host running the Ankaios agent.
Speed provider web UI"},{"location":"usage/tutorial-vehicle-signals/#add-an-agent","title":"Add an agent","text":"
We currently have an agent running as part of the Ankaios cluster, running the databroker and the speed provider. The next workload we want to start is a speed consumer that consumes vehicle speed values. A speed consumer such as a navigation system typically runs on a separate node for infotainment. A separate node requires a new Ankaios agent. Let's create another Ankaios agent to connect to the existing server. For this tutorial we can either use a separate Linux host or use the existing one. Start a new agent with:
ank-agent -k --name infotainment --server-url http://<SERVER_IP>:25551\n
If the agent is started on the same host as the existing Ankaios server and agent, then we will call it as follows:
ank-agent -k --name infotainment --server-url http://127.0.0.1:25551\n
As the first agent was started by systemd, it runs as root and therefore calls podman as root. The second agent is started by a non-root user and therefore uses podman in rootless mode. Ankaios does not need root privileges and can be started as any user.
Now we have two agents running in the Ankaios cluster, agent_A
and infotainment
.
For the next steps we need to keep this terminal untouched in order to keep the agent running.
"},{"location":"usage/tutorial-vehicle-signals/#list-the-connected-agents","title":"List the connected agents","text":"Let's verify that the new infotainment
agent has connected to the Ankaios server by running the following command, which will list all Ankaios agents currently connected to the Ankaios server, along with their number of workloads:
ank -k get agents\n
It should print:
NAME WORKLOADS CPU USAGE FREE MEMORY\nagent_A 2 42.42% 42B\ninfotainment 0 42.42% 42B\n
Since agent_A
is already managing the databroker
and the speed-provider
workloads, the WORKLOADS
column contains the number 2
. The Ankaios agent infotainment
has recently been started and does not yet manage any workloads.
Note
The currently connected Ankaios agents are part of the CompleteState and can also be retrieved by working with the CompleteState.
"},{"location":"usage/tutorial-vehicle-signals/#start-the-speed-consumer","title":"Start the speed consumer","text":"Now we can start a speed-consumer workload on the new agent:
speed-consumer.yamlapiVersion: v0.1\nworkloads:\n speed-consumer:\n runtime: podman\n runtimeConfig: |\n image: ghcr.io/eclipse-ankaios/speed-consumer:0.1.2\n commandOptions:\n - \"--net=host\"\n - \"-e\"\n - \"KUKSA_DATA_BROKER_ADDR=127.0.0.1\"\n
In case the speed-consumer workload is not running on the same host as the databroker, you need to adjust the KUKSA_DATA_BROKER_ADDR
.
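For example, if the databroker were running on a host reachable at 192.168.1.10 (a placeholder address for illustration), the environment variable in the speed-consumer manifest would be adjusted like this:

```yaml
commandOptions:
  - "--net=host"
  - "-e"
  - "KUKSA_DATA_BROKER_ADDR=192.168.1.10"
```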
Note that this time the manifest does not specify the agent. While we could add agent: infotainment
, this time we pass the agent name when applying the workload:
ank -k apply --agent infotainment speed-consumer.yaml\n
Note
If you are running the ank command on a host that is different from the host on which the Ankaios server is running, you need to add a parameter -s <SERVER_URL>
like:
ank -k apply -s http://127.0.0.1:25551 --agent infotainment speed-consumer.yaml\n
Optionally the server URL can also be provided via environment variable:
export ANK_SERVER_URL=http://127.0.0.1:25551\nank -k apply --agent infotainment speed-consumer.yaml\n
The command waits until the speed-consumer is running. It should print:
WORKLOAD NAME AGENT RUNTIME EXECUTION STATE ADDITIONAL INFO\n speed-consumer infotainment podman Running(Ok)\n
We can check all running workloads with
ank -k get workloads\n
The output should be:
WORKLOAD NAME AGENT RUNTIME EXECUTION STATE ADDITIONAL INFO\n databroker agent_A podman Running(Ok)\n speed-consumer infotainment podman Running(Ok)\n speed-provider agent_A podman Running(Ok)\n
Optionally, you can re-run the previous ank -k get agents
command to verify that the number of workloads managed by the infotainment
agent has now increased.
The speed-consumer workload subscribes to the vehicle speed signal and prints it to stdout. Use the web UI of the speed-provider to send a few vehicle speed values and watch the log messages of the speed-consumer. As the logs are specific for a runtime, we use Podman to read the logs:
podman logs -f $(podman ps -a | grep speed-consumer | awk '{print $1}')\n
Info
If you want to see the logs of the databroker or speed-provider you need to use sudo podman
instead of podman
(two occurrences) as those workloads run on podman as root on agent_A.
Now, we want to change the existing Ankaios manifest of the speed-provider to use auto mode, which sends a new speed value every second.
speed-provider.yamlapiVersion: v0.1\nworkloads:\n speed-provider:\n runtime: podman\n agent: agent_A\n runtimeConfig: |\n image: ghcr.io/eclipse-ankaios/speed-provider:0.1.1\n commandOptions:\n - \"--net=host\"\n - \"-e\"\n - \"SPEED_PROVIDER_MODE=auto\"\n
We apply the changes with:
ank -k apply speed-provider.yaml\n
and observe that we now get a new speed value every second.
"},{"location":"usage/tutorial-vehicle-signals/#ankaios-state","title":"Ankaios state","text":"Previously we have used ank -k get workloads
to get a list of running workloads. Ankaios also maintains the current state, which can be retrieved with:
ank -k get state\n
Let's delete all workloads and check the state again:
ank -k delete workload databroker speed-provider speed-consumer\nank -k get state\n
If we want to start the three workloads on startup of the Ankaios server and agents we need to create a startup manifest file. In the default installation this file is /etc/ankaios/state.yaml
as we can see in the systemd unit file of the Ankaios server:
$ systemctl cat ank-server\n# /etc/systemd/system/ank-server.service\n[Unit]\nDescription=Ankaios server\n\n[Service]\nEnvironment=\"RUST_LOG=info\"\nExecStart=/usr/local/bin/ank-server --insecure --startup-config /etc/ankaios/state.yaml\n\n[Install]\nWantedBy=default.target\n
Now we create a startup manifest file containing all three workloads:
/etc/ankaios/state.yamlapiVersion: v0.1\nworkloads:\n databroker:\n runtime: podman\n agent: agent_A\n runtimeConfig: |\n image: ghcr.io/eclipse/kuksa.val/databroker:0.4.1\n commandArgs: [\"--insecure\"]\n commandOptions: [\"--net=host\"]\n speed-provider:\n runtime: podman\n agent: agent_A\n dependencies:\n databroker: ADD_COND_RUNNING\n runtimeConfig: |\n image: ghcr.io/eclipse-ankaios/speed-provider:0.1.1\n commandOptions:\n - \"--net=host\"\n - \"-e\"\n - \"SPEED_PROVIDER_MODE=auto\"\n speed-consumer:\n runtime: podman\n agent: infotainment\n dependencies:\n databroker: ADD_COND_RUNNING\n runtimeConfig: |\n image: ghcr.io/eclipse-ankaios/speed-consumer:0.1.2\n commandOptions:\n - \"--net=host\"\n - \"-e\"\n - \"KUKSA_DATA_BROKER_ADDR=127.0.0.1\"\n
As the speed-provider and the speed-consumer shall only be started after the databroker is running, we have added dependencies:
dependencies:\n databroker: ADD_COND_RUNNING\n
The next time the Ankaios server and the two agents are started, this startup config will be applied.
"},{"location":"usage/tutorial-vehicle-signals/#define-re-usable-configuration","title":"Define re-usable configuration","text":"Let's improve the previous startup manifest by introducing a templated configuration for workloads to avoid configuration repetition and have a single point of change. The supported fields and syntax are described here.
/etc/ankaios/state.yamlapiVersion: v0.1\nworkloads:\n databroker:\n runtime: podman\n agent: \"{{agent.name}}\" # (1)!\n configs:\n agent: agents # (2)!\n network: network # (3)!\n runtimeConfig: | # (4)!\n image: ghcr.io/eclipse/kuksa.val/databroker:0.4.1\n commandArgs: [\"--insecure\"]\n commandOptions: [\"--net={{network}}\"]\n speed-provider:\n runtime: podman\n agent: \"{{agent.name}}\"\n dependencies:\n databroker: ADD_COND_RUNNING\n configs:\n agent: agents\n net: network\n env: env_provider # (5)!\n runtimeConfig: | # (6)!\n image: ghcr.io/eclipse-ankaios/speed-provider:0.1.1\n commandOptions:\n - \"--net={{net}}\"\n {{#each env}}\n - \"-e {{this.key}}={{this.value}}\"\n {{/each}}\n speed-consumer:\n runtime: podman\n agent: infotainment\n dependencies:\n databroker: ADD_COND_RUNNING\n configs:\n network: network\n env: env_consumer # (7)!\n runtimeConfig: | # (8)!\n image: ghcr.io/eclipse-ankaios/speed-consumer:0.1.2\n commandOptions:\n - \"--net={{network}}\"\n {{#each env}}\n - \"-e {{this.key}}={{this.value}}\"\n {{/each}}\nconfigs: # (9)!\n network: host\n env_provider:\n - key: SPEED_PROVIDER_MODE\n value: auto\n env_consumer:\n - key: KUKSA_DATA_BROKER_ADDR\n value: \"127.0.0.1\"\n agents:\n name: agent_A\n
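To illustrate how the templating works: with the configs section above, the speed-provider's runtimeConfig renders to the following, since network resolves to host and the {{#each env}} loop expands one -e option per key/value pair of env_provider:

```yaml
image: ghcr.io/eclipse-ankaios/speed-provider:0.1.1
commandOptions:
  - "--net=host"
  - "-e SPEED_PROVIDER_MODE=auto"
```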
Start the Ankaios cluster again, by executing the following command:
sudo systemctl start ank-server\nsudo systemctl start ank-agent\n
Start the infotainment
agent, remembering to change the server URL if the agent is not running on the same host:
ank-agent -k --name infotainment --server-url http://127.0.0.1:25551\n
Verify again that all workloads are up and running.
"},{"location":"usage/tutorial-vehicle-signals/#update-configuration-items","title":"Update configuration items","text":"Let's update the content of a configuration item with the ank apply
command.
Using ank apply
:
apiVersion: v0.1\nconfigs:\n env_provider:\n - key: SPEED_PROVIDER_MODE\n value: webui\n
ank -k apply new-manifest.yaml\n
Ankaios will update all workloads that reference an updated configuration item. After running this command, the speed-provider
workload has been updated to run in the 'webui' mode.
You can verify this by re-opening the web UI on http://127.0.0.1:5000.
"},{"location":"usage/tutorial-vehicle-signals/#list-configuration-items","title":"List configuration items","text":"Let's list the configuration items present in the current state with the ank get configs
command.
Using ank -k get configs
, it should print:
CONFIGS\nnetwork\nenv_provider\nenv_consumer\nagents\n
"},{"location":"usage/tutorial-vehicle-signals/#delete-configuration-items","title":"Delete configuration items","text":"Let's try to delete a configuration item that is still referenced by a workload in its configs
field, re-using the previous manifest content.
ank -k delete config env_provider\n
The command returns an error that the rendering of the new state fails due to a missing configuration item.
Ankaios will always reject a new state if it fails to render. The speed-provider
still references the configuration item in its configs
field which would no longer exist.
Running the ank -k get state
command afterwards will show that Ankaios still has the previous state in memory.
To remove configuration items, remove the configuration references for the desired configuration items in the workload's configs
field, and remove the desired configuration items from the state.
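As a sketch, the speed-provider entry from the earlier manifest would be changed so that it no longer references env_provider (the env reference and the expanded -e option are dropped):

```yaml
speed-provider:
  runtime: podman
  agent: "{{agent.name}}"
  dependencies:
    databroker: ADD_COND_RUNNING
  configs:
    agent: agents
    net: network
  runtimeConfig: |
    image: ghcr.io/eclipse-ankaios/speed-provider:0.1.1
    commandOptions:
      - "--net={{net}}"
```

After applying such a manifest, deleting the env_provider configuration item should succeed, since nothing references it anymore.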
When upgrading from v0.2 to v0.3, the installation script simply needs to be run again. However, due to breaking changes, some manual adjustments are required for existing configurations and workloads.
"},{"location":"usage/upgrading/v0_2_to_v0_3/#configurations","title":"Configurations","text":"CompleteState
currentState
has been renamed to desiredState
State
apiVersion
was added to avoid incompatibility issues.restart
has been supplemented with a restartPolicy
enum.configs
and cronjobs
have been removed for now as they are not implemented yet.Workload
accessRights
and updateStrategy
have been removed for now as they are not implemented yet.Applications using the control interface or communicating directly with the Ankaios server (custom CLIs) need to be adapted.
The two main messages have been renamed:
StateChangeRequest
-> ToServer
ExecutionRequest
-> FromServer
A new type of ToServer
message, Request
, has been introduced. Every Request
to the server requires a requestId
which is used by the server for the response message. Request IDs allow sending multiple parallel requests to the server. The two messages UpdateStateRequest
and CompleteStateRequest
have been moved to the new Request
message.
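The resulting structure can be sketched in protobuf as follows (the message and field names are taken from the description above, but the field numbers are illustrative and not taken from the actual proto files):

```
message Request {
    string requestId = 1; /// Used by the server to tag the corresponding response.
    oneof RequestContent {
        UpdateStateRequest updateStateRequest = 2;
        CompleteStateRequest completeStateRequest = 3;
    }
}
```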
A new type of FromServer
message, Response
, has been introduced. A Response
message is always an answer from the Server to a Request
message. The Response
message contains the same requestId
as the answered Request
message. This makes it possible to identify the correct Response
. The CompleteState
message has been moved to the new Response
message. Additionally, the Ankaios server now responds to an UpdateStateRequest
with an UpdateStateSuccess
or Error
message, which are both of type Response
.
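Analogously to the Request message, the Response structure can be sketched in protobuf as follows (names taken from the description above; field numbers are illustrative and not taken from the actual proto files):

```
message Response {
    string requestId = 1; /// Same ID as in the answered Request message.
    oneof ResponseContent {
        CompleteState completeState = 2;
        UpdateStateSuccess updateStateSuccess = 3;
        Error error = 4;
    }
}
```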
When upgrading from v0.3 to v0.4, the installation script simply needs to be run again. However, due to breaking changes, some manual adjustments are required for existing workloads using the control interface and applications directly using the gRPC API of the Ankaios server.
"},{"location":"usage/upgrading/v0_3_to_v0_4/#optional-attributes-of-the-complete-state","title":"Optional attributes of the Complete State","text":"Ankaios allows filtering the Complete State at request level and setting only certain fields of the Complete State while updating the desired state of the cluster. To make this process more transparent and remove the need of returning or requiring default values for fields not targeted by the filter masks, Ankaios now explicitly handles all fields (beside versions) of the Complete State as optional. This allows returning only portions of the Complete State, e.g., when filtering with desiredState.workloads.nginx.tags
the response from the server will be:
desiredState:\n apiVersion: v0.1\n workloads:\n nginx:\n tags:\n - key: owner\n value: Ankaios team\n
These changes also require some additional handling when pushing data over the Control Interface, as some fields must now be enclosed in wrapper objects, e.g., the Rust code for creating a workload object now looks as follows:
Workload {\n runtime: Some(\"podman\".to_string()),\n agent: Some(\"agent_A\".to_string()),\n restart_policy: Some(RestartPolicy::Never.into()),\n tags: Some(Tags {\n tags: vec![Tag {\n key: \"owner\".to_string(),\n value: \"Ankaios team\".to_string(),\n }],\n }),\n runtime_config: Some(\n \"image: docker.io/library/nginx\\ncommandOptions: [\\\"-p\\\", \\\"8080:80\\\"]\"\n .to_string(),\n ),\n dependencies: Some(Dependencies {\n dependencies: HashMap::new(),\n }),\n control_interface_access: None,\n}\n
Please review the examples from the Ankaios repository for more information on the topic.
"},{"location":"usage/upgrading/v0_3_to_v0_4/#removed-top-level-attribute-startupstate","title":"Removed top level attributestartupState
","text":"The top-level attribute startupState
was removed from the Ankaios configuration. Initially, we aimed to allow a modification of the startup state of the cluster via Ankaios' control interface. As the requirements towards persistent storage in embedded environments can be quite different, e.g., due to flash wear-out protection, it is best to let a dedicated application perform the updates of the startup state. The startup state update app could run as an Ankaios workload, but would be written specifically for the distinct use case, obeying its particular requirements.
The control interface has been decoupled from the API for server-agent communication, now exclusively handling essential messages with newly named identifiers for clarity.
To upgrade to the new version v0.4, use the new control_api.proto
file and the two new messages:
ToAnkaios
FromAnkaios
The new messages currently support requests and responses to and from Ankaios and will later support other functionality. The Request
and Response
messages and their content remain the same, but are now located in the ank_base.proto
file.
A sample how the new definition of the Control Interface is used can be found in the examples from the Ankaios repository.
The reason for splitting some messages into the dedicated file ank_base.proto
, is that they are also used for the gRPC API of the Ankaios server. This API is mainly used by the Ankaios agents and the ank
CLI, but could also be used by third party applications to directly communicate with the Ankaios server. The following chapter details the changes needed to upgrade to v0.4 in case you are using this API.
The usage of the control interface now requires an explicit authorization at the workload configuration. The authorization is done via the new controlInterfaceAccess
attribute.
The following configuration shows an example where the workload composer
can update all other workloads beside the workload watchdog
for which an explicit deny rule is added:
desiredState:\n workloads:\n composer:\n runtime: podman\n ...\n controlInterfaceAccess:\n allowRules:\n - type: StateRule\n operation: ReadWrite\n filterMask:\n - \"desiredState.workloads.*\"\n denyRules:\n - type: StateRule\n operation: Write\n filterMask:\n - \"desiredState.workloads.watchdog\"\n
More information on the control interface authorization can be found in the reference documentation.
"},{"location":"usage/upgrading/v0_3_to_v0_4/#grpc-api-of-the-ankaios-server","title":"gRPC API of the Ankaios server","text":"Ankaios facilitates server-agent-CLI communication through an interchangeable middleware, currently implemented using gRPC. By segregating the gRPC API into a distinct grpc_api.proto
file, we clearly show the target and purpose of this interface.
If you are using the gRPC API of the Ankaios server directly (and not the CLI), you would need to cope with the splitting of the messages into grpc_api.proto
and ank_base.proto
. Apart from that, the API itself is exactly the same.
The structure of the workload execution states field in the Complete State was changed both for the proto and the textual (yaml/json) representations. The change was needed to make the filtering and authorization of getting workload states more intuitive. The old flat vector was supplemented with a new hierarchical structure. Here is an example how the workload states look now in YAML format:
workloadStates:\n agent_A:\n nginx:\n 7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d:\n state: Pending\n subState: Initial\n additionalInfo: ''\n agent_B:\n hello1:\n 9f4dce2c90669cdcbd2ef8eddb4e38d6238abf721bbebffd820121ce1633f705:\n state: Failed\n subState: Lost\n additionalInfo: ''\n
"},{"location":"usage/upgrading/v0_3_to_v0_4/#authentication-and-encryption","title":"Authentication and encryption","text":"Starting from v0.4.0 Ankaios supports mutual TLS (mTLS) for communication between server, agent and ank
CLI. The default installation script will install Ankaios without mTLS. When using the ank
CLI with such an installation, the arguments --insecure
or -k
have to be passed.
So
ank get workloads\n
will have to be changed to
ank -k get workloads\n
Alternatively, set the environment variable ANK_INSECURE=true
to avoid passing the -k
argument to each ank
CLI command.
When upgrading from v0.4 to v0.5, the installation script simply needs to be run again. However, due to breaking changes, some manual adjustments are required for existing workloads using the control interface.
"},{"location":"usage/upgrading/v0_4_to_v0_5/#initial-hello-message-for-the-control-interface","title":"InitialHello
message for the Control Interface","text":"In order to ensure version compatibility and avoid undefined behavior resulting from version mismatch, a new obligatory Hello
message was added to the Control Interface protocol. The Hello
must be sent by a workload communicating over the Control Interface at the start of the session as a first message. It is part of the ToAnkaios
message and has the following format:
message Hello {\n string protocolVersion = 2; /// The protocol version used by the calling component.\n}\n
Failing to send the message before any other communication is done, or providing an unsupported version, results in a premature closing of the Control Interface session by Ankaios. The required protocolVersion
string is the current Ankaios release version. As Ankaios is currently in the initial development (no official major release), minor version differences are also handled as incompatible. After the official major release, only the major versions will be compared.
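The compatibility rule described above can be sketched as a small check. This is an illustrative sketch only, not the actual Ankaios implementation; Ankaios performs this check internally against the received Hello message:

```shell
# Illustrative sketch of the version compatibility rule described above;
# NOT the actual Ankaios implementation.
is_compatible() {
  # $1 and $2 are semantic version strings like "0.5.0".
  a_major="${1%%.*}"; a_rest="${1#*.}"; a_minor="${a_rest%%.*}"
  b_major="${2%%.*}"; b_rest="${2#*.}"; b_minor="${b_rest%%.*}"
  # Major versions must always match.
  [ "$a_major" = "$b_major" ] || return 1
  # During initial development (major version 0), differing minor
  # versions are also treated as incompatible.
  if [ "$a_major" = "0" ] && [ "$a_minor" != "$b_minor" ]; then
    return 1
  fi
  return 0
}
```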
To inform the workload of this, a ConnectionClosed
is sent as part of the FromAnkaios
message. The ConnectionClosed
message contains the reason for closing the session as a string:
message ConnectionClosed {\n string reason = 1; /// A string containing the reason for closing the connection.\n}\n
After the ConnectionClosed
, no more messages would be read or sent by Ankaios on the input and output pipes.
The Control Interface instance cannot be reopened, but a new instance would be created if the workload is restarted.
"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome","text":""},{"location":"#eclipse-ankaios","title":"Eclipse Ankaios","text":"Watch Eclipse Ankaios presentation at Eclipse SDV community day on July 6, 2023 on Youtube"},{"location":"#scope","title":"Scope","text":"Eclipse Ankaios provides workload and container orchestration for automotive High Performance Computing (HPC) software. While it can be used for various fields of applications, it is developed from scratch for automotive use cases and provides a slim, yet powerful solution to manage containerized applications. It supports various container runtimes with Podman as the first one, but other container runtimes and even native applications can be supported. Eclipse Ankaios is independent of existing communication frameworks like SOME/IP, DDS, or REST API.
Eclipse Ankaios manages multiple nodes and virtual machines with a single unique API in order to start, stop, configure, and update containers and workloads. It provides a central place to manage automotive applications with a setup consisting of one server and multiple agents. Usually one agent per node connects to one or more runtimes that are running the workloads.
"},{"location":"#next-steps","title":"Next steps","text":"Eclipse Ankaios follows the UNIX philosophy to have one tool for one job and do that job well. It does not depend on a specific init system like systemd but can be started with any init system. It also does not handle persistency but can use an existing automotive persistency handling, e.g. provided by AUTOSAR Adaptive.
The workloads are provided access to the Eclipse Ankaios API using access control and thus are able to dynamically reconfigure the system. One possible use case is the dynamic startup of an application that is only required in a particular situation such as a parking assistant. When the driver wants to park the car, a control workload can start the parking assistant application. When the parking is finished, the parking assistant workload is stopped again.
Eclipse Ankaios also provides a CLI that allows developers to develop and test configurations. In order to gain compatibility with Kubernetes, Eclipse Ankaios accepts pod specifications.
An optional fleet connector can use the Eclipse Ankaios API to connect to a cloud-based software update system, which allows an OEM to manage a fleet of vehicles and provide new states to Eclipse Ankaios in order to update single or all applications.
In order to support the Automotive SPICE process, Eclipse Ankaios comes with requirements tracing supported by OpenFastTrace.
"},{"location":"architecture/","title":"Architecture","text":""},{"location":"architecture/#overview","title":"Overview","text":"Two executables are used for each Ankaios deployment: the Ankaios server and the Ankaios agent:
When started, the Ankaios server loads the configured startup manifest file of the cluster and stores it as a desired state. To reach this desired state, the server instructs the Ankaios agents to start and stop workloads. Each Ankaios cluster runs exactly one instance of the Ankaios server making the server the single source of truth.
A running instance of the Ankaios agent is present on every node where Ankaios needs to execute workloads. The Ankaios agent is responsible for starting and stopping workloads, according to the commands it gets from the Ankaios server.
The Ankaios server itself does not run workloads directly so in order to start workloads on the node running the server, an Ankaios agent shall be started there too.
Ankaios also allows workloads to change the state stored in the Ankaios server via the control interface. Workloads access this interface by sending their requests to the Ankaios agent managing them. Each request is checked by the Ankaios agent and, on successful authorization, forwarded to the Ankaios server. This interface can be used to, e.g.:
In the diagram above one of the workloads on node 1 acts as fleet connector. It accesses a backend and forwards commands to the Ankaios server. In the example below the fleet connector gets an update from the backend, which adds a workload to node 2.
"},{"location":"architecture/#notes","title":"Notes","text":"Join our developer mailing list for up to date information or sending questions.
"},{"location":"support/#discussion-forum","title":"Discussion forum","text":"If you have a general question, an idea or want to show how you use Ankaios, the discussion forum might be the right place for you.
"},{"location":"support/#issue","title":"Issue","text":"For reporting bugs or suggesting enhancements a new issue should be created using one of the templates if possible.
"},{"location":"support/#slack","title":"Slack","text":"Join the conversion with the community in the Ankaios Slack workspace.
"},{"location":"development/build/","title":"Build","text":""},{"location":"development/build/#dev-container","title":"Dev container","text":"The repo provides a Visual Studio Code dev container which includes all necessary tools to build all components and the documentation. It also contains Podman, which is needed to run the system tests for Ankaios. In case you want to extend the dev container see extending the dev container.
"},{"location":"development/build/#prerequisites","title":"Prerequisites","text":"As prerequisites, you need to have the following tools set up:
The following steps assume an x86_64 host. For Mac with Apple silicon, see chapter Build for arm64 target.
To build and test the Ankaios agent and server, run the following command inside the dev container:
cargo build\n
and for release
cargo build --release\n
As Ankaios uses musl for static linking, the binaries will be located in target/x86_64-unknown-linux-musl
.
The dev container adds required tools for arm64
architecture. To build Ankaios for arm64
, run the following command inside the dev container:
cargo build --target aarch64-unknown-linux-musl --release\n
Info
When using a dev container on Mac with Apple silicon and the build fails, change the file sharing implementation in Docker Desktop. Goto Docker Desktop and Settings
, then General
and change the file sharing implementation from VirtioFS
to gRPC FUSE
. See also eclipse-ankaios/ankaios#147.
A release shall be built directly using the CI/CD environment GitHub Actions. The release build creates and uploads all necessary artifacts that are required for a release.
"},{"location":"development/ci-cd-release/#release-branches","title":"Release branches","text":"In order to stabilize an upcoming release or to create a patch release, a release branch can be created. The naming convention for such a branch is:
release-<major>.<minor>\n
For example release-0.4
.
For building a release a separate workflow exists inside .github/workflows/release.yml
. The release workflow reuses the complete build workflow from .github/workflows/build.yml
and its artifacts.
This allows to avoid having to duplicate the steps of the build workflow into the release workflow and thus have a single point of change for the build workflow.
The release workflow executes the build workflow, exports the build artifacts into an archive for each supported platform and uploads it to the GitHub release.
As an example the following release artifacts are created for linux-amd64:
The tar.gz archive contains the pre-built binaries for the Ankaios CLI, Ankaios server and Ankaios agent. The *.sha512sum.txt file contains the sha-512 hash of the archive.
"},{"location":"development/ci-cd-release/#release-scripts","title":"Release scripts","text":"To package the desired release artifacts a separate script tools/create_release.sh
is called inside the release job. The script calls another script tools/create_artifacts.sh
for each platform that creates the artifacts mentioned above.
In addition, it exports the following:
Within the release workflow the build artifacts are downloaded into a temporary folder called dist
which has the following structure:
\u251c\u2500\u2500 coverage\n\u2502 \u251c\u2500\u2500 index.html\n\u2502 \u2514\u2500\u2500 style.css\n\u251c\u2500\u2500 linux-amd64\n\u2502 \u2514\u2500\u2500 bin\n\u2502 \u251c\u2500\u2500 ank\n\u2502 \u251c\u2500\u2500 ank-agent\n\u2502 \u2514\u2500\u2500 ank-server\n\u251c\u2500\u2500 linux-arm64\n\u2502 \u2514\u2500\u2500 bin\n\u2502 \u251c\u2500\u2500 ank\n\u2502 \u251c\u2500\u2500 ank-agent\n\u2502 \u2514\u2500\u2500 ank-server\n\u2514\u2500\u2500 req_tracing_report.html\n
The platform specific files are downloaded into a sub-folder dist/<os>-<platform>/bin
. Reports and shared artifacts are placed into the dist
folder directly.
The scripts expect this folder structure to create final release artifacts.
"},{"location":"development/ci-cd-release/#adding-a-new-platform","title":"Adding a new Platform","text":"If a new platform shall be supported the following steps must be done:
.github/workflows/build.yml
and configure the upload of the artifacts, see CI/CD section..github/workflows/release.yml
to download the new artifacts. Under jobs.release.steps
add a new step after the existing download steps and replace the parameters <os>-<platform>
with the correct text (e.g. linux-amd64): jobs:\n ...\n release:\n steps:\n ...\n - name: Download artifacts for ankaios-<os>-<platform>-bin\n uses: actions/download-artifact@v4.1.7\n with:\n name: ankaios-<os>-<platform>-bin\n path: dist/<os>-<platform>/bin\n ...\n
The name ankaios-<os>-<platform>-bin
must match the used name in the upload artifact action defined inside the build workflow (.github/workflows/build.yml
). 3. Inside tools/create_release.sh
script add a new call to the script tools/create_artifacts.sh
like the following:
...\n \"${SCRIPT_DIR}\"/create_artifacts.sh -p <os>-<platform>\n...\n
The <os>-<platform>
string must match the name of the sub-folder inside the dist folder. The called script expects the pre-built binaries inside <os>-<platform>/bin
.
.github/workflows/release.yml
. Inside the step that uploads the release artifacts add the new artifact(s) to the github upload command:...\nrun: |\n gh release upload ${{ github.ref_name }}\n ...\n <os>-<platform>/ankaios-<os>-<platform>.tar.gz \\\n <os>-<platform>/ankaios-<os>-<platform>.tar.gz.sha512sum.txt\n ...\n
tools/install.sh
and update the script if needed.The release notes are generated automatically if a release is created via the GitHub web frontend by clicking on the Generate release notes
button.
The procedure uses the filters for pull request labels configured inside .github/release.yml
.
The following steps shall be done before the actual release build is triggered.
tools/update_version.sh --release <new version>
).Before building the release, all preparation steps shall be finished before.
The release shall be created directly via the GitHub web frontend.
When creating a release a tag with the following naming convention must be provided: vX.Y.Z
(e.g. v0.1.0).
Draft a new release
.Generate release notes
to generate the release notes automatically based on the filter settings for pull requests inside .github/release.yml
configuration. In case of unwanted pull requests are listed, label the pull requests correctly, delete the description field and generate the release notes again (The correction of the labels and the regeneration of the release notes can also be done after the release build.).Set as the latest release
is enabled. This setting is important otherwise the provided link for the installation script in chapter installation is still pointing to the previous release marked as latest.Publish release
.Note
There is a GitHub Action available to automatically rollback the created release and tag. This action is not used to have a better control over the cleanup procedure before a next release build is triggered. For instance, without auto-rollback a manually entered release description is still available after a failing release build.
"},{"location":"development/ci-cd/","title":"CI/CD","text":"GitHub Actions is used as the CI/CD environment. Merge verification when a pull request is opened and release builds are fully covered by GitHub Actions workflows. For information about release builds, see the CI/CD - Release section.
"},{"location":"development/ci-cd/#merge-verification","title":"Merge verification","text":"When a pull request is opened, the following pipeline jobs run:
After a pull request has been merged into the main branch, the jobs listed above are executed again to validate the stability of the main branch.
The steps for the build workflow are defined inside .github/workflows/build.yml
.
The produced artifacts of the build workflow are uploaded and can be downloaded from GitHub for debugging or testing purposes.
"},{"location":"development/ci-cd/#adding-a-new-merge-verification-job","title":"Adding a new merge verification job","text":"To add a new merge verification job, adjust the workflow defined inside .github/workflows/build.yml
.
Select a GitHub runner image matching your purposes or, in case of adding a cross-build, first make sure that the build works locally within the dev container.
jobs
jobs section and define a job name.ankaios-<os>-<platform>-bin
(e.g. ankaios-linux-amd64-bin), otherwise define a custom name. If the artifact is needed in a release, it is referenced by this name inside the release workflow. ...\n - uses: actions/upload-artifact@v4.3.3\n with:\n name: ankaios-<os>-<platform>-bin\n path: dist/\n ...\n
Note
GitHub Actions only runs workflow definitions from the main (default) branch. That means when a workflow has been changed and a PR has been created for that change, it will not become effective before the PR is merged into the main branch. For local testing the act tool can be used.
"},{"location":"development/ci-cd/#adding-a-new-github-action","title":"Adding a new GitHub action","text":"When introducing a new GitHub action, do not use a generic major version tag (e.g. vX
). Specify a specific release tag (e.g. vX.Y.Z
) instead. Using the generic tag might lead to an unstable CI/CD environment, whenever the authors of the GitHub action update the generic tag to point to a newer version that contains bugs or incompatibilities with the Ankaios project.
Example:
Bad:
...\n - uses: actions/checkout@v4\n...\n
Good:
...\n - uses: actions/checkout@v4.1.1\n...\n
"},{"location":"development/ci-cd/#adding-github-action-jobs","title":"Adding GitHub action jobs","text":"When creating a new job inside a workflow, specify a job name for each job.
Example:
...\n\njobs:\n test_and_build_linux_amd64:\n name: Test and Build Linux amd64\n...\n
Note
Besides being a best practice, giving a job a name is needed to reference it from the self-service repository in order to configure the job as a required status check.
"},{"location":"development/documentation-guidelines/","title":"Documentation guidelines","text":"These guidelines apply to all documentation which is created in the Ankaios project like this website, software design documents or README files. The aim is to support the creators of documents by enforcing a common look and feel.
"},{"location":"development/documentation-guidelines/#capitalization","title":"Capitalization","text":"As 'Ankaios' is a proper noun it shall be written with a capital 'A'. Other words which are not proper nouns shall be in lower case when they are not the first word in a sentence.
Examples:
Correct Incorrect Ankaios ankaios Ankaios server Ankaios-Server, Ankaios-server, Ankaios Server Ankaios agent Ankaios-Agent, Ankaios-agent, Ankaios Agent workload Workload control interface Control InterfaceThe same rule also applies to headlines, i.e. only the first word of a headline is in upper case.
"},{"location":"development/extending-dev-container/","title":"Extending the dev container","text":"The dev container is relatively large. If there is a need to include additional items in the dev container, please note that it is split into two parts due to its size:
A base container available from ghcr.io/eclipse-ankaios/devcontainer
which, in case of a change, needs to be build manually from .devcontainer/Dockerfile.base
(see below for instructions).
A docker container which derives from the base image mentioned above is specified in .devcontainer/Dockerfile
(so don't forget to reference your new version there once you build one).
If you want to add some additional tools, you can initially do it in .devcontainer/Dockerfile
, but later on they need to be moved into the base image in order to speed up the initial dev container build.
The base container is available for amd64 and arm64/v8 architectures. There are two options to build the base container:
In case the multiplatform build is used, one image can be built natively on the host platform (usually amd64) while the other needs to be emulated.
Build the base container by running the following commands outside of the dev container:
# Prepare the build with buildx. Depending on your environment\n# the following steps might be necessary:\ndocker run --rm --privileged multiarch/qemu-user-static --reset -p yes --credential yes\n\n# Create and use a new builder. This needs to be called only once:\ndocker buildx create --name mybuilder --driver docker-container --bootstrap\ndocker buildx use mybuilder\n\n# Now build the new base image for the dev container\ncd .devcontainer\ndocker buildx build -t ghcr.io/eclipse-ankaios/devcontainer-base:<version> --platform linux/amd64,linux/arm64 -f Dockerfile.base .\n
In order to push the base image append --push
to the previous command.
Note: If you wish to locally test the base image in VSCode before proceeding, utilize the default builder and exclusively build for the default platform like
docker buildx use default\ndocker buildx build -t ghcr.io/eclipse-ankaios/devcontainer-base:<version> -f Dockerfile.base --load .\n
"},{"location":"development/extending-dev-container/#separate-builds-for-different-architectures","title":"Separate builds for different architectures","text":"Due to the emulation for the non-host architecture, the previous multiplatform build might take some time. An alternative is to build the two images separately on different hosts matching the target architecture. For arm64 for example cloud instances with ARM architecture (like AWS Graviton) can be used.
To build the base image this way, perform the following steps:
# On arm64 host: Build arm64 image\ncd .devcontainer\ndocker buildx build -t ghcr.io/eclipse-ankaios/devcontainer-base:<version>-arm64 -f Dockerfile.base --push .\n\n# On amd64 host: Build amd64 image\ncd .devcontainer\ndocker buildx build -t ghcr.io/eclipse-ankaios/devcontainer-base:<version>-amd64 -f Dockerfile.base --push .\n\n# On any host: Create manifest list referencing both images\ndocker buildx imagetools create \\\n -t ghcr.io/eclipse-ankaios/devcontainer-base:<version> \\\n ghcr.io/eclipse-ankaios/devcontainer-base:<version>-amd64 \\\n ghcr.io/eclipse-ankaios/devcontainer-base:<version>-arm64\n
"},{"location":"development/requirement-template/","title":"Requirement description template","text":"All requirements in Ankaios shall be written in the following format:
<Requirement title>\n`swdd~<component>-<descriptive requirement id>~<version>`\n\nStatus: approved\n\n[When <condition separated by and>], <object> shall <do something | be in state | execute a list of actions in order/in parallel | \u2026>\n\nComment:\n<comment body>\n\nRationale:\n<rationale body>\n\nTags:\n- <tag1>\n- <tag2>\n- \u2026\n\nNeeds:\n- [impl/utest/stest]\n
NOTE:
Here is an example of the requirement from the Ankaios agent:
#### AgentManager listens for requests from the Server\n`swdd~agent-manager-listens-requests-from-server~1`\n\nStatus: approved\n\nThe AgentManager shall listen for requests from the Server.\n\nTags:\n- AgentManager\n\nNeeds:\n- impl\n- utest\n- itest\n
This requirement template has been inspired by:
https://aaltodoc.aalto.fi/server/api/core/bitstreams/d518c3cc-4d7d-4c69-b7db-25d2da9e847f/content
"},{"location":"development/requirement-tracing/","title":"Requirement tracing","text":""},{"location":"development/requirement-tracing/#introduction","title":"Introduction","text":"The Eclipse Ankaios project provides requirement tracing using the OpenFastTrace requirement tracing suite. The dev container already includes the required tooling. To generate a requirement tracing report call:
just trace-requirements\n
Afterwards the HTML report is available under build/req/req_tracing_report.html
and shows the current coverage state.
For details on the OpenFastTrace tool, please consult OFT's user documentation or execute oft help
.
Eclipse Ankaios traces requirements between
**/doc/README.md
)**/src/**
)**/src/**
, tests/**
)Thus, for new features:
swdd
)impl
, e.g., // [impl->swdd~this-is-a-requirement~1]
utest
, itest
or stest
depending on the type of the test, e.g., // [utest->swdd~this-is-a-requirement~1]
for a unit testThe format of a requirement is described in the next section Requirement description template.
"},{"location":"development/run-unit-tests/","title":"Unit tests with cargo-nextest","text":"We use test runner cargo-nextest because of the following reasons:
cargo test
.If you want to run all unit tests without traces, call in the root of the project:
cargo nextest run\n
Some unit tests can print trace logs. If you want to see them, you have to set the RUST_LOG
environment variable before running unit tests.
RUST_LOG=debug cargo nextest run\n
Cargo-nextest also allows to run only a subset of unit tests. You have to set the \"filter string\" in the command:
cargo nextest run <filter string>\n
Where the filter string
is part of the unit test name. For example, we have a unit test with the name:
test podman::workload::container_create_success\n
If you want to call only this test, you can call:
cargo nextest run workload::container_create_success\n
If you want to call all tests in workload.rs
, you have to call:
cargo nextest run podman::workload\n
You can also call only tests in workload.rs
, which have a name starting with container
:
cargo nextest run podman::workload::container\n
"},{"location":"development/rust-coding-guidelines/","title":"Rust coding guidelines","text":"When engaging in collaborative software projects, it is crucial to ensure that the code is well-organized and comprehensible. This facilitates ease of maintenance and allows for seamless extension of the project. To accomplish this objective, it is essential to establish shared guidelines that the entire development team adheres to.
The goal is to get a harmonized code-base which appears to come from the same hands. This simplifies reading and understanding the intention of the code and helps maintaining the development speed.
The following chapters describe rules and concepts to fit clean code expectations.
"},{"location":"development/rust-coding-guidelines/#clean-code","title":"Clean code","text":"We like our code clean and thus use the \"Clean Code\" rules from \"uncle Bob\". A short summary can be found here.
As rust could get a bit messy, feel free to add some additional code comments to blocks that cannot be made readable using the clean code rules.
"},{"location":"development/rust-coding-guidelines/#naming-conventions","title":"Naming conventions","text":"We follow the standard Rust naming conventions.
Names of components, classes , functions, etc. in code should also follow the prescriptions in SW design. Before thinking of new names, please make sure that we have not named the beast already.
Names of unit tests within a file shall be hierarchical. Tests which belong together shall have the same prefix. For example the file workload.rs
contains following tests:
container_create_success
container_create_failed
container_start_success
container_start_failure_no_id
So if you want to call tests which work with container, you can write
cargo nextest run container\n
If you want to call tests of the \"container create\" function, you can call:
cargo nextest run container_create\n
More information about calling unit tests is in The Rust Programming Language.
"},{"location":"development/rust-coding-guidelines/#logging-conventions","title":"Logging conventions","text":"The following chapters describe rules for creating log messages.
"},{"location":"development/rust-coding-guidelines/#log-format-of-internal-objects","title":"Log format of internal objects","text":"When writing log messages that reference internal objects, the objects shall be surrounded in single quotes, e.g.:
log::info!(\"This is about object '{}'.\", object.name)\n
This helps differentiate static from dynamic data in the log message.
"},{"location":"development/rust-coding-guidelines/#log-format-of-multiline-log-messages","title":"Log format of multiline log messages","text":"Multi line log messages shall be created with the concat!
macro, e.g.:
log::debug!(concat!(\n \"First line of a log message that lists something:\\n\",\n \" flowers are: '{}'\\n\",\n \" weather is: {}\")\n color, current_weather);\n
This ensures that the log messages are formatted correctly and simplifies writing the message.
"},{"location":"development/rust-coding-guidelines/#choose-a-suitable-log-severity","title":"Choose a suitable log severity","text":"Severity Use Case Trace A log that is useful for diagnostic purposes and/or more granular than severity debug. Debug A log that is useful for developers meant for debugging purposes or hit very often. Info A log communicating important information like important states of an application suitable for any kind of user and that does not pollute the output. Warn A log communicating wrong preconditions or occurrences of something unexpected but do not lead to a panic of the application. Error A log communicating failures and consequences causing a potential panic of the application."},{"location":"development/rust-coding-guidelines/#unit-test-convenience-rules","title":"Unit test convenience rules","text":"The following chapter describes important rules about how to write unit tests.
"},{"location":"development/rust-coding-guidelines/#test-mockobject-generation","title":"Test mock/object generation","text":"When writing tests, one of the most tedious task is to setup the environment and create the necessary objects and/or mocks to be able to test the desired functionality. Following the DRY principle and trying to save some effort, we shall always place the code that generates a test or mock object in the same module/file where the mock of the object is defined.
For example, when you would like to generate and reuse a mock for the Directory
structure located in the agent/src/control_interface/directory.rs
file, you shall
pub fn generate_test_directory_mock() -> __mock_MockDirectory::__new::Context;\n
The <datatype_name>
in __mock_Mock<datatype_name>::__new::Context
must be replaced with the name of the type the mock is created for.
#[cfg(test)]
(or #[cfg(feature = \"test_utils\")]
in case of a library) before the function to restrict its compilation to test onlyAll object/mock generation functions shall start with generate_test_
.
Bad:
let numbers = vec![1, 2, 3, 4, 5, 6, 7, 8];\n\nlet mut filtered_numbers = Vec::new();\n// filter numbers smaller than 3\nfor number in numbers {\n if number < 3 {\n filtered_numbers.push(number);\n }\n}\n
Good:
Prefer standard library algorithms over own implementations to avoid error prone code.
let numbers = vec![1, 2, 3, 4, 5, 6, 7, 8];\nlet filtered_numbers: Vec<i32> = numbers.into_iter().filter(|x| x < &3).collect();\n
"},{"location":"development/rust-coding-guidelines/#prefer-error-propagation","title":"Prefer error propagation","text":"Bad:
A lot of conditionals for opening and reading a file.
use std::fs::File;\nuse std::io;\nuse std::io::Read;\n\nfn read_from_file(filepath: &str) -> Result<String, io::Error> {\n let file_handle = File::open(filepath);\n let mut file_handle = match file_handle {\n Ok(file) => file,\n Err(e) => return Err(e),\n };\n\n let mut buffer = String::new();\n\n match file_handle.read_to_string(&mut buffer) {\n Ok(_) => Ok(buffer),\n Err(e) => Err(e)\n }\n}\n
Good:
Prefer error propagation over exhaustive match and conditionals.
Error propagation shortens and cleans up the code path by replacing complex and exhaustive conditionals with the ?
operator without loosing the failure checks.
The refactored variant populates the error and success case the same way to the caller like in the bad example above, but is more readable:
fn read_from_file(filepath: &str) -> Result<String, io::Error> {\n let mut buffer = String::new();\n File::open(filepath)?.read_to_string(&mut buffer)?;\n Ok(buffer)\n}\n
In case of mismatching error types, provide a custom From-Trait implementation to convert between error types to keep the benefits of using the ?
operator. But keep in mind that error conversion shall be used wisely (e.g. for abstracting third party library error types or if there is a benefit to introduce a common and reusable error type). The code base shall not be spammed with From-Trait implementations to replace each single match or conditional.
Error propagation shall also be preferred when converting between Result<T,E>
and Option<T>
.
Bad:
fn string_to_percentage(string: &str) -> Option<f32> {\n // more error handling\n match string.parse::<f32>() {\n Ok(value) => Some(value * 100.),\n _ => None,\n }\n}\n
Good:
fn string_to_percentage(string: &str) -> Option<f32> {\n // more error handling\n let value = string.parse::<f32>().ok()?; // returns None on parsing error\n Some(value * 100.)\n}\n
"},{"location":"development/rust-coding-guidelines/#avoid-unwrap-and-expect","title":"Avoid unwrap and expect","text":"Unwrap
or expect
return the value in success case or call the panic!
macro if the operation has failed. Applications that are often terminated directly in case of errors are considered as unprofessional and not useful.
Bad:
let value = division(10, 0).unwrap(); // panics, because of a simple division!!!\n
Good:
Replace unwrap
or expect
with a conditional check, e.g. match expression:
let value = division(10, 0); // division 10 / 0 not allowed, returns Err\n\n// conditional check before accessing the value\nmatch value {\n Ok(value) => println!(\"{value}\"),\n Err(e) => eprintln!(\"{e}\")\n}\n
or with if-let condition when match is awkward:
// access value only on success\nif let Ok(value) = division(10, 0) {\n println!(\"{value}\")\n}\n
or if possible continue with some default value in case of an error:
let result = division(10, 0).unwrap_or(0.);\n
Exceptions:
In some cases terminating a program might be necessary. To make a good decision when to panic a program or not, the official rust book might help: To panic! or Not to panic!
When writing unit tests using unwrap
helps to keep tests short and to concentrate on the assert!
statements:
Bad:
let container: Option<HashMap<i32, String>> = operation_under_test();\nmatch container {\n Some(container) => {\n match container.get(&0) {\n Some(value_of_0) => assert_eq!(value_of_0, &\"hello world\".to_string()),\n _ => { panic!(\"Test xy failed, no entry.\") }\n }\n },\n _ => { panic!(\"Test xy failed, no container.\") }\n}\n
Good:
Prefer direct unwrap
calls over assert!
statements nested in complex conditional clauses. It is shorter and the assert!
statement is directly eye-catching.
let container: Option<HashMap<i32, String>> = operation_under_test();\nlet value_of_0 = container.unwrap().remove(&0).unwrap(); // the test is failing on error\n\nassert_eq!(value_of_0, \"hello world\".to_string());\n
"},{"location":"development/rust-coding-guidelines/#prefer-while-let-over-match-in-loops","title":"Prefer while-let over match in loops","text":"Use the shorter and cleaner while-let expression to eliminate exhaustive match sequences in loops:
Bad:
loop {\n match generate() {\n Some(value) => println!(\"{value}\"),\n _ => { break; },\n }\n}\n
Good:
// if success use the value else break\n// ...or while let Ok(value) in case of Result<T,E> instead of Option<T>\nwhile let Some(value) = generate() {\n println!(\"{value}\")\n}\n
"},{"location":"development/rust-coding-guidelines/#prefer-lazily-evaluated-functional-chaining","title":"Prefer lazily evaluated functional chaining","text":"Bad:
Eagerly evaluated functions are always evaluated regardless of the success or error case. If the alternative is not taken potentially costly operations are performed unnecessarily.
let value = division(2., 10.);\nlet result = value.and(to_percentage(value)); // eagerly evaluated\n\nlet value = division(2., 10.);\nlet result = value.or(provide_complex_alternative()); // eagerly evaluated\n\nlet value = division(2., 10.);\nlet result = value.unwrap_or(generate_complex_default()); // eagerly evaluated\n
Good:
Lazily evaluated functions are only evaluated if the case actually occurs and are preferred if the alternatives provide costly operations.
let result = division(2., 10.).and_then(to_percentage); // lazily evaluated\n\nlet result = division(2., 10.).or_else(provide_complex_alternative); // lazily evaluated\n\nlet result = division(2., 10.).unwrap_or_else(generate_complex_default); // lazily evaluated\n
"},{"location":"development/rust-coding-guidelines/#avoid-exhaustive-nested-code","title":"Avoid exhaustive nested code","text":"Bad:
The code is hard to read and the interesting code path is not an eye-catcher.
fn list_books(&self) -> Option<Vec<String>> {\n if self.wifi {\n if self.login {\n if self.admin {\n return Some(get_list_of_books());\n } else {\n eprintln!(\"Expected login as admin.\");\n }\n } else {\n eprintln!(\"Expected login.\");\n }\n } else {\n eprintln!(\"Expected connection.\");\n }\n None\n}\n
Good:
Nest code only into 1 or 2 levels. Use early-exit pattern to reduce the nest level and to separate error handling code from code doing the actual logic.
fn list_books(&self) -> Option<Vec<String>> {\n if !self.wifi {\n eprintln!(\"Expected connection.\");\n return None;\n }\n\n if !self.login {\n eprintln!(\"Expected login.\");\n return None;\n }\n\n if !self.admin {\n eprintln!(\"Expected login as admin.\");\n return None;\n }\n\n // interesting part\n Some(get_list_of_books())\n}\n
As an alternative, when dealing with Option<T>
or Result<T,E>
use Rust's powerful combinators to keep the code readable.
Understanding and practicing important Rust idioms help to write code in an idiomatic way, meaning resolving a task by following the conventions of a given language. Writing idiomatic Rust code ensures a clean and consistent code base. Thus, please follow the guidelines of Idiomatic Rust.
"},{"location":"development/rust-coding-guidelines/#avoid-common-anti-patterns","title":"Avoid common anti-patterns","text":"There are a lot of Rust anti-patterns that shall not be used in general. To get more details about anti-patterns, see here.
"},{"location":"development/rust-coding-guidelines/#dont-make-sync-code-async","title":"Don't make sync code async","text":"Async code is mainly used for I/O intensive, network or background tasks (Databases, Servers) to allow executing such tasks in a non-blocking way, so that waiting times can be used reasonably for executing other operations. However operations that do not fit to async use cases and are called synchronously shall not be made async because there is no real benefit. Async code is more difficult to understand than synchronous code.
Bad:
No need for making those operations async, because they are exclusively called synchronously. It is just more syntax and the code raises more questions about the intent to the reader.
let result1 = operation1().await;\nlet result2 = operation2().await;\nlet result3 = operation3().await;\n
Good:
Keep it synchronous and thus simple.
let result1 = operation1();\nlet result2 = operation2();\nlet result3 = operation3();\n
"},{"location":"development/rust-coding-guidelines/#dont-mix-sync-and-async-code-without-proper-consideration","title":"Don\u2019t mix sync and async code without proper consideration","text":"Mixing sync and async code can lead to a number of problems, including performance issues, deadlocks, and race conditions. Avoid mixing async with sync code unless there is a good reason to do so.
"},{"location":"development/rust-coding-guidelines/#further-readings","title":"Further Readings","text":"The Eclipse Foundation offers self-service of GitHub resources. We are using this self-service to customize Github settings, for example to change branch protection rules or other important settings of the Ankaios project. The current GitHub configuration is hosted as code inside a separate repository called .eclipsefdn.
The settings are in jsonnet format and can be modified by contributors.
A detailed overview of the self-service please have a look into the self-service handbook.
"},{"location":"development/self-service/#process-of-changing-the-settings","title":"Process of changing the settings","text":"If a configuration needs to be changed the process is the following:
System tests are a critical phase of software testing, aimed at evaluating the entire software system as a whole to ensure that it meets its specified requirements and functions correctly in its intended environment. These tests are conducted after unit and integration testing and serve as a comprehensive validation of the software's readiness for deployment.
Here are key aspects of system tests:
End-to-End Evaluation: System tests assess the software's performance, functionality, and reliability in a real-world scenario, simulating the complete user journey. They cover all aspects of the system, from the user interface to the backend processes.
Functional and Non-Functional Testing: These tests not only verify that the software's features work as intended (functional testing) but also assess non-functional attributes like performance, scalability, security, and usability.
Scenario-Based Testing: Test scenarios are designed to replicate various user interactions, use cases, and business workflows. This includes testing different paths, inputs, and error conditions to ensure the system handles them correctly.
Interoperability Testing: In cases where the software interacts with external systems or components, system tests evaluate its compatibility and ability to communicate effectively with these external entities.
Data Integrity and Security: Ensuring the protection of sensitive data and the integrity of information is a critical part of system testing. This includes checking for vulnerabilities and ensuring compliance with security standards.
Performance Testing: Assessing the system's response times, resource utilization, and scalability under various load conditions to ensure it can handle expected levels of usage.
Regression Testing: System tests often include regression testing to ensure that new features or changes do not introduce new defects or disrupt existing functionality.
The Robot test framework, often referred to as just \"Robot Framework,\" is a popular open-source test automation framework used for automating test cases in various software applications. It is designed to be easy to use, highly readable, and adaptable for both beginners and experienced testers. It employs a keyword-driven approach, which means that test cases are written using a combination of keywords that represent actions, objects, and verifications. These keywords can be custom-defined by using Python programming language or come from libraries specific to the application under test. One of the standout features of Robot Framework is its human-readable syntax. Test cases are written in plain text composed with defined keywords, making it accessible to non-programmers and allowing stakeholders to understand and contribute to test case creation. Because of the ability to create custom keywords, a pool of domain specific and generic keywords could be defined to form an Ankaios project specific language for writing test cases.This makes it possible to directly use the test specifications written in natural language or the same wording of it to write automated test cases. This is the main reason why we use this test framework for system tests in Ankaios.
"},{"location":"development/system-tests/#system-tests-structure","title":"System tests structure","text":"ankaios # Ankaios root\n |--tests # Location for system tests and their resources\n | |--resources # Location for test resources\n | | |--configs # Location for test case specific start-up configuration files\n | | | |--default.yaml # A start-up configuration file\n | | | |--... <---------------- # Add more configuration files here!\n | | |\n | | |--ankaios_library.py # Ankaios keywords implementations\n | | |--ankaios.resource # Ankaios keywords\n | | |--variables.resource # Ankaios variables\n | | |--... <------------------- # Add more keywords and keywords implementation resources here!\n | |\n | |--stests # Location for system tests\n | | |--workloads # Location for tests with specific test subject focus e.g. \"workloads\" for tests related \"workloads\"\n | | | |--list_workloads.robot # A test suite testing \"list workloads\"\n | | | |--... <---------------- # Add more tests related to \"workloads\" here!\n | | |... <--------------------- # Add test subject focus here!\n
"},{"location":"development/system-tests/#system-test-creation","title":"System test creation","text":""},{"location":"development/system-tests/#a-generic-ankaios-system-test-structure","title":"A generic Ankaios system test structure","text":"The most common approach to create a robot test is using the space separated format where pieces of the data, such as keywords and their arguments, are separated from each others with two or more spaces. A basic Ankaios system test consists of the following sections:
# ./tests/stests/workloads/my_workload_stest.robot\n\n*** Settings ***\nDocumentation Add test suit documentation here. # Test suite documentation\nResource ../../resources/ankaios.resource # Ankaios specific keywords that forms the Ankaios domain language\nResource ../../resources/variables.resource # Ankaios variables e.g. CONFIGS_DIR\n\n*** Test Cases ***\n[Setup] Setup Ankaios\n# ADD YOUR SYSTEM TEST HERE!\n[Teardown] Clean up Ankaios\n
For more best practices about writing tests with Robot framework see here.
"},{"location":"development/system-tests/#behavior-driven-system-test","title":"Behavior-driven system test","text":"Behavior-driven tests (BDT) use natural language specifications to describe expected system behavior, fostering collaboration between teams and facilitating both manual and automated testing. It's particularly valuable for user-centric and acceptance testing, ensuring that software aligns with user expectations. The Robot test framework supports BDT, and this approach shall be preferred for writing system tests in Ankaios the project.
Generic structure of BDT:
*** Test Cases ***\n[Setup] Setup Ankaios\nGiven <preconditions>\nWhen <actions>\nThen <asserts>\n[Teardown] Clean up Ankaios\n
Example: System test testing listing of workloads.
*** Settings ***\nDocumentation Tests to verify that ank cli lists workloads correctly.\nResource ../../resources/ankaios.resource\nResource ../../resources/variables.resource\n\n*** Test Cases ***\nTest Ankaios CLI get workloads\n [Setup] Setup Ankaios\n # Preconditions\n Given Ankaios server is started with \"ank-server --startup-config ${CONFIGS_DIR}/default.yaml\"\n And Ankaios agent is started with \"ank-agent --name agent_B\"\n And all workloads of agent \"agent_B\" have an initial execution state\n And Ankaios agent is started with \"ank-agent --name agent_A\"\n And all workloads of agent \"agent_A\" have an initial execution state\n # Actions\n When user triggers \"ank -k get workloads\"\n # Asserts\n Then the workload \"nginx\" shall have the execution state \"Running\" on agent \"agent_A\"\n And the workload \"hello1\" shall have the execution state \"Removed\" from agent \"agent_B\"\n And the workload \"hello2\" shall have the execution state \"Succeeded\" on agent \"agent_B\"\n And the workload \"hello3\" shall have the execution state \"Succeeded\" on agent \"agent_B\"\n [Teardown] Clean up Ankaios\n
Note
For Ankaios manifests that are used for system tests, only images from ghcr.io should be used. A lot of other registries (docker.io, quay.io) apply rate limits which might cause failures when executing the system tests.
"},{"location":"development/system-tests/#run-long-runtime-system-tests-upon-merge-into-main","title":"Run long-runtime system tests upon merge into main","text":"To keep the pull request status check runtime short, system tests with a longer runtime (> 30-40 seconds) shall be excluded from the pull request CI/CD verification by assigning the tag \"non_execution_during_pull_request_verification\" directly to the test case. When the pull request is merged into the main branch, the system test is executed. A contributor shall check the test results of those system tests afterwards.
Example system test that runs only on merge into main:
...\n\n*** Test Cases ***\n...\n\nTest Ankaios Podman stops retries after reaching the retry attempt limit\n [Tags] non_execution_during_pull_request_verification\n [Setup] Run Keywords Setup Ankaios\n\n...\n
"},{"location":"development/system-tests/#system-test-execution","title":"System test execution","text":"Warning
The system tests will delete all Podman containers, pods and volumes. We recommend executing the system tests only in the dev container.
A shell script is provided for easy execution of the system tests. The script checks that the Ankaios executables (ank, ank-server and ank-agent) are available at the specified path and stores the test results under {Ankaios root folder}/target/robot_tests_result. Generic syntax:
/workspaces/ankaios$ [ANK_BIN_DIR=path_to_ankaios_executables] tools/run_robot_tests <options> <directory or robot file>\n
If ANK_BIN_DIR is not provided, the script looks for the Ankaios executables in the path {Ankaios root folder}/target/x86_64-unknown-linux-musl/debug. The supported options are the same as those of the robot CLI; for a more detailed description see here.
Note: In order to start the podman runtime in the dev container properly, the dev container needs to be run in privileged mode.
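As an illustration only (a hypothetical snippet, not taken from the Ankaios repository), privileged mode for a VS Code dev container can be requested via the runArgs property in devcontainer.json:

```json
{
  // Hypothetical excerpt: pass --privileged to the container runtime
  // so that the podman runtime can start properly inside the dev container.
  "runArgs": ["--privileged"]
}
```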
/workspaces/ankaios$ tools/run_robot_tests.sh tests\n
Example output:
Use default executable directory: /workspaces/ankaios/tools/../target/x86_64-unknown-linux-musl/debug\nFound ank 0.1.0\nFound ank-server 0.1.0\nFound ank-agent 0.1.0\n==============================================================================\nTests\n==============================================================================\nTests.Stests\n==============================================================================\nTests.Stests.Workloads\n==============================================================================\nTests.Stests.Workloads.List Workloads :: List workloads test cases.\n==============================================================================\nTest Ankaios CLI get workloads | PASS |\n------------------------------------------------------------------------------\nTests.Stests.Workloads.List Workloads :: List workloads test cases. | PASS |\n1 test, 1 passed, 0 failed\n==============================================================================\nTests.Stests.Workloads.Update Workload :: Update workload test cases.\n==============================================================================\nTest Ankaios CLI update workload | PASS |\n------------------------------------------------------------------------------\nTests.Stests.Workloads.Update Workload :: Update workload test cases. 
| PASS |\n1 test, 1 passed, 0 failed\n==============================================================================\nTests.Stests.Workloads | PASS |\n2 tests, 2 passed, 0 failed\n==============================================================================\nTests.Stests | PASS |\n2 tests, 2 passed, 0 failed\n==============================================================================\nTests | PASS |\n2 tests, 2 passed, 0 failed\n==============================================================================\nOutput: /workspaces/ankaios/target/robot_tests_result/output.xml\nLog: /workspaces/ankaios/target/robot_tests_result/log.html\nReport: /workspaces/ankaios/target/robot_tests_result/report.html\n
"},{"location":"development/system-tests/#example-run-a-single-test-file","title":"Example: Run a single test file","text":"/workspaces/ankaios$ tools/run_robot_tests.sh tests/stests/workloads/list_workloads.robot\n
Example output:
Use default executable directory: /workspaces/ankaios/tools/../target/x86_64-unknown-linux-musl/debug\nFound ank 0.1.0\nFound ank-server 0.1.0\nFound ank-agent 0.1.0\n==============================================================================\nList Workloads :: List workloads test cases.\n==============================================================================\nTest Ankaios CLI get workloads | PASS |\n------------------------------------------------------------------------------\nList Workloads :: List workloads test cases. | PASS |\n1 test, 1 passed, 0 failed\n==============================================================================\nOutput: /workspaces/ankaios/target/robot_tests_result/output.xml\nLog: /workspaces/ankaios/target/robot_tests_result/log.html\nReport: /workspaces/ankaios/target/robot_tests_result/report.html\n
"},{"location":"development/system-tests/#integration-in-github-workflows","title":"Integration in GitHub workflows","text":"The execution of the system tests is integrated in the GitHub workflow build step and will be triggered on each commit on a pull request.
"},{"location":"development/test-coverage/","title":"Test coverage","text":"To generate the test coverage report, run the following commands in the ankaios workspace, which is /home/vscode/workspaces/ankaios/
:
To print out directly into the console:
cov test\n
Or to produce a report in html:
cov test --html\n
The script outputs where to find the report html:
...\nFinished report saved to /workspaces/ankaios/target/llvm-cov/html\n
Note: On first usage you might be asked for confirmation to install the llvm-tools-preview tool.
While writing tests, you may want to execute only the tests in a certain file and check the achieved coverage. To do so you can execute:
To print out directly into the console:
cov test ankaios_server\n
Or to produce a report in html:
cov test ankaios_server --html\n
Once the run is complete, you can check the report to see which lines are not covered yet.
"},{"location":"development/unit-verification/","title":"Unit verification","text":"This page defines which tools and processes are used in this project for software unit verification. The unit verification process is performed during the implementation phase and is automated as far as possible; one exception is the code review, which cannot be done automatically. Automated unit test runs are executed by the CI build system as well as during the regular release process.
"},{"location":"development/unit-verification/#verification-tools-and-procedures","title":"Verification tools and procedures","text":"Ankaios development follows the guidelines specified in the Rust coding guidelines.
"},{"location":"development/unit-verification/#code-review","title":"Code review","text":"Code reviews are part of the implementation process and are performed before code is merged to the main branch. Contributors create pull requests and request a review so that the process can be started. The review is performed by at least one committer who has good knowledge of the area under review. When all applicable review criteria and checklists are passed and the reviewer(s) have accepted the change, the code can be merged to the main branch.
"},{"location":"development/unit-verification/#verification-by-unit-test","title":"Verification by unit test","text":""},{"location":"development/unit-verification/#test-focus-and-goal","title":"Test focus and goal","text":"The objective of the unit test is to confirm the correct internal behavior of a software unit according to the design aspects documented in the SW design. A unit test will test the unit in the target environment by triggering unit methods/functions and verifying the behavior. Stubbed interfaces/mocking techniques can be used to meet the code coverage requirements. This means that unit tests shall be written according to the detailed requirements. Requirement source is SW design.
"},{"location":"development/unit-verification/#unit-test-case-naming-convention","title":"Unit test case naming convention","text":"By introducing a naming convention for unit test cases a harmonized test code-base can be achieved. This simplifies reading and understanding the intention of the unit test case. Please see the naming convention defined in Rust coding guidelines.
"},{"location":"development/unit-verification/#unit-test-organization","title":"Unit test organization","text":"The unit tests shall be written in the same file as the source code like suggested in the Rust Language Book and shall be prefixed with utest_
.
At the end of the file e.g. my_module/src/my_component.rs
:
...\nfn my_algorithm(input: i32) -> Vec<u8> {\n ...\n}\n\nasync fn my_async_function(input: i32) -> Vec<u8> {\n ...\n}\n...\n#[cfg(test)]\nmod tests {\n ...\n #[test]\n fn utest_my_algorithm_returns_empty_array_when_input_is_0_or_negative() {\n ...\n }\n\n #[tokio::test]\n async fn utest_my_async_function_returns_empty_array_when_input_is_0_or_negative() {\n ...\n }\n}\n
"},{"location":"development/unit-verification/#test-execution-and-reports","title":"Test Execution and Reports","text":"Unit test cases are executed manually by the developer during implementation phase and later automatically in CI builds. Unit test and coverage reports are generated and stored automatically by the CI build system. If unit test case fails before code is merged to main branch (merge verification), the merge is not allowed until the issue is fixed. If unit test case fails after the code is merged to main branch, it is reported via email and fixed via internal Jira ticket reported by the developer.
Regression testing is done by the CI build system.
"},{"location":"development/unit-verification/#goals-and-metrics","title":"Goals and Metrics","text":"The following table show how test coverage is currently shown in the coverage report:
Goal Metric Red Yellow Green Code coverage <80% >80% 100%Currently there is no proper way of explicitly excluding parts of the code from the test coverage report in order to get to an easily observable value of 100%. The explicitly excluded code would have a corresponding comment stating the reason for excluding it. As this is not possible, we would initially target at least 80% line coverage in each file.
"},{"location":"reference/_ankaios.proto/","title":"Protocol Documentation","text":""},{"location":"reference/_ankaios.proto/#table-of-contents","title":"Table of Contents","text":"control_api.proto
ank_base.proto
WorkloadStatesMap.AgentStateMapEntry
AddCondition
Scalar Value Types
Top
"},{"location":"reference/_ankaios.proto/#control_apiproto","title":"control_api.proto","text":"The Ankaios Control Interface is used in the communication between a workload and Ankaios
The protocol consists of the following top-level message types:
ToAnkaios: workload -> ankaios
FromAnkaios: ankaios -> workload
This message informs the user of the Control Interface that the connection was closed by Ankaios. No more messages will be processed by Ankaios after this message is sent.
Field Type Label Description reason string A string containing the reason for closing the connection. "},{"location":"reference/_ankaios.proto/#fromankaios","title":"FromAnkaios","text":"Messages from the Ankaios server to e.g. the Ankaios agent.
Field Type Label Description response ank_base.Response A message containing a response to a previous request. connectionClosed ConnectionClosed A message sent by Ankaios to inform a workload that the connection to Ankaios was closed. "},{"location":"reference/_ankaios.proto/#hello","title":"Hello","text":"This message is the first one that needs to be sent when a new connection to the Ankaios cluster is established. Without this message being sent, all further requests are rejected.
Field Type Label Description protocolVersion string The protocol version used by the calling component. "},{"location":"reference/_ankaios.proto/#toankaios","title":"ToAnkaios","text":"Messages to the Ankaios server.
Field Type Label Description hello Hello The first message sent when a connection is established. The message is needed to make sure the connected components are compatible. request ank_base.Request A request to Ankaios. Top
"},{"location":"reference/_ankaios.proto/#ank_baseproto","title":"ank_base.proto","text":""},{"location":"reference/_ankaios.proto/#accessrightsrule","title":"AccessRightsRule","text":"A message containing an allow or deny rule.
Field Type Label Description stateRule StateRule Rule for getting or setting the state "},{"location":"reference/_ankaios.proto/#agentattributes","title":"AgentAttributes","text":"A message that contains attributes of the agent.
Field Type Label Description cpu_usage CpuUsage The cpu usage of the agent. free_memory FreeMemory The amount of free memory of the agent. "},{"location":"reference/_ankaios.proto/#agentmap","title":"AgentMap","text":"A nested map that provides the names of the connected agents and their optional attributes. The first level allows searches by agent name.
Field Type Label Description agents AgentMap.AgentsEntry repeated "},{"location":"reference/_ankaios.proto/#agentmapagentsentry","title":"AgentMap.AgentsEntry","text":"Field Type Label Description key string value AgentAttributes"},{"location":"reference/_ankaios.proto/#completestate","title":"CompleteState","text":"A message containing the complete state of the Ankaios system. This is a response to the CompleteStateRequest message.
Field Type Label Description desiredState State The state the user wants to reach. workloadStates WorkloadStatesMap The current execution states of the workloads. agents AgentMap The agents currently connected to the Ankaios cluster. "},{"location":"reference/_ankaios.proto/#completestaterequest","title":"CompleteStateRequest","text":"A message containing a request for the complete/partial state of the Ankaios system. This is usually answered with a CompleteState message.
Field Type Label Description fieldMask string repeated A list of symbolic field paths within the State message structure e.g. 'desiredState.workloads.nginx'. "},{"location":"reference/_ankaios.proto/#configarray","title":"ConfigArray","text":"Field Type Label Description values ConfigItem repeated"},{"location":"reference/_ankaios.proto/#configitem","title":"ConfigItem","text":"An enum type describing possible configuration objects.
Field Type Label Description String string array ConfigArray object ConfigObject "},{"location":"reference/_ankaios.proto/#configmap","title":"ConfigMap","text":"This is a workaround for proto not supporting optional maps
Field Type Label Description configs ConfigMap.ConfigsEntry repeated "},{"location":"reference/_ankaios.proto/#configmapconfigsentry","title":"ConfigMap.ConfigsEntry","text":"Field Type Label Description key string value ConfigItem"},{"location":"reference/_ankaios.proto/#configmappings","title":"ConfigMappings","text":"This is a workaround for proto not supporting optional maps
Field Type Label Description configs ConfigMappings.ConfigsEntry repeated "},{"location":"reference/_ankaios.proto/#configmappingsconfigsentry","title":"ConfigMappings.ConfigsEntry","text":"Field Type Label Description key string value string"},{"location":"reference/_ankaios.proto/#configobject","title":"ConfigObject","text":"Field Type Label Description fields ConfigObject.FieldsEntry repeated"},{"location":"reference/_ankaios.proto/#configobjectfieldsentry","title":"ConfigObject.FieldsEntry","text":"Field Type Label Description key string value ConfigItem"},{"location":"reference/_ankaios.proto/#controlinterfaceaccess","title":"ControlInterfaceAccess","text":"A message containing the parts of the control interface the workload is authorized to access. By default, all access is denied. Only if a matching allow rule is found, and no matching deny rule is found, is the access allowed.
Field Type Label Description allowRules AccessRightsRule repeated Rules allow the access denyRules AccessRightsRule repeated Rules denying the access "},{"location":"reference/_ankaios.proto/#cpuusage","title":"CpuUsage","text":"A message containing the CPU usage information of the agent.
Field Type Label Description cpu_usage uint32 expressed in percent, the formula for calculating: cpu_usage = (new_work_time - old_work_time) / (new_total_time - old_total_time) * 100 "},{"location":"reference/_ankaios.proto/#dependencies","title":"Dependencies","text":"This is a workaround for proto not supporting optional maps
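The cpu_usage formula above can be sketched as follows (a minimal illustration in Rust; the function and parameter names are assumptions for this example, not Ankaios API):

```rust
// Minimal sketch of the documented formula:
// cpu_usage = (new_work_time - old_work_time) / (new_total_time - old_total_time) * 100
fn cpu_usage_percent(old_work: u64, new_work: u64, old_total: u64, new_total: u64) -> u32 {
    let delta_total = new_total.saturating_sub(old_total);
    if delta_total == 0 {
        // No time elapsed between the two samples; avoid division by zero.
        return 0;
    }
    let delta_work = new_work.saturating_sub(old_work);
    (delta_work * 100 / delta_total) as u32
}

fn main() {
    // 25 work-time units out of 100 elapsed total-time units.
    println!("{}", cpu_usage_percent(100, 125, 1000, 1100));
}
```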
Field Type Label Description dependencies Dependencies.DependenciesEntry repeated "},{"location":"reference/_ankaios.proto/#dependenciesdependenciesentry","title":"Dependencies.DependenciesEntry","text":"Field Type Label Description key string value AddCondition"},{"location":"reference/_ankaios.proto/#error","title":"Error","text":"Field Type Label Description message string"},{"location":"reference/_ankaios.proto/#executionstate","title":"ExecutionState","text":"A message containing information about the detailed state of a workload in the Ankaios system.
Field Type Label Description additionalInfo string The additional info contains more detailed information from the runtime regarding the execution state. agentDisconnected AgentDisconnected The exact state of the workload cannot be determined, e.g., because of a broken connection to the responsible agent. pending Pending The workload is going to be started eventually. running Running The workload is operational. stopping Stopping The workload is scheduled for stopping. succeeded Succeeded The workload has successfully finished its operation. failed Failed The workload has failed or is in a degraded state. notScheduled NotScheduled The workload is not scheduled to run at any agent. This is signalized with an empty agent in the workload specification. removed Removed The workload was removed from Ankaios. This state is used only internally in Ankaios. The outside world removed states are just not there. "},{"location":"reference/_ankaios.proto/#executionsstatesforid","title":"ExecutionsStatesForId","text":"A map providing the execution state of a specific workload for a given id. This level is needed as a workload could be running more than once on one agent in different versions.
Field Type Label Description idStateMap ExecutionsStatesForId.IdStateMapEntry repeated "},{"location":"reference/_ankaios.proto/#executionsstatesforididstatemapentry","title":"ExecutionsStatesForId.IdStateMapEntry","text":"Field Type Label Description key string value ExecutionState"},{"location":"reference/_ankaios.proto/#executionsstatesofworkload","title":"ExecutionsStatesOfWorkload","text":"A map providing the execution state of a workload for a given name.
Field Type Label Description wlNameStateMap ExecutionsStatesOfWorkload.WlNameStateMapEntry repeated "},{"location":"reference/_ankaios.proto/#executionsstatesofworkloadwlnamestatemapentry","title":"ExecutionsStatesOfWorkload.WlNameStateMapEntry","text":"Field Type Label Description key string value ExecutionsStatesForId"},{"location":"reference/_ankaios.proto/#freememory","title":"FreeMemory","text":"A message containing the amount of free memory of the agent.
Field Type Label Description free_memory uint64 expressed in bytes "},{"location":"reference/_ankaios.proto/#request","title":"Request","text":"A message containing a request to the Ankaios server to update the state or to request the complete state of the Ankaios system.
Field Type Label Description requestId string updateStateRequest UpdateStateRequest A message to Ankaios server to update the state of one or more agent(s). completeStateRequest CompleteStateRequest A message to Ankaios server to request the complete state by the given request id and the optional field mask. "},{"location":"reference/_ankaios.proto/#response","title":"Response","text":"A message containing a response from the Ankaios server to a particular request. The response content depends on the request content previously sent to the Ankaios server.
Field Type Label Description requestId string error Error completeState CompleteState UpdateStateSuccess UpdateStateSuccess "},{"location":"reference/_ankaios.proto/#state","title":"State","text":"A message containing the state information.
Field Type Label Description apiVersion string The current version of the API. workloads WorkloadMap A mapping from workload names to workload configurations. configs ConfigMap Configuration values which can be referenced in workload configurations. "},{"location":"reference/_ankaios.proto/#staterule","title":"StateRule","text":"Message containing a rule for getting or setting the state
Field Type Label Description operation ReadWriteEnum Defines which actions are allowed filterMasks string repeated Paths defining what can be accessed. Segments of the path can be the wildcard \"*\". "},{"location":"reference/_ankaios.proto/#tag","title":"Tag","text":"A message to store a tag.
Field Type Label Description key string The key of the tag. value string The value of the tag. "},{"location":"reference/_ankaios.proto/#tags","title":"Tags","text":"This is a workaround for proto not supporting optional repeated values
Field Type Label Description tags Tag repeated "},{"location":"reference/_ankaios.proto/#updatestaterequest","title":"UpdateStateRequest","text":"A message containing a request to update the state of the Ankaios system. The new state is provided as a state object. To specify which part(s) of the new state object should be updated, a list of update mask (same as field mask) paths needs to be provided.
Field Type Label Description newState CompleteState The new state of the Ankaios system. updateMask string repeated A list of symbolic field paths within the state message structure e.g. 'desiredState.workloads.nginx' to specify what to be updated. "},{"location":"reference/_ankaios.proto/#updatestatesuccess","title":"UpdateStateSuccess","text":"A message from the server containing the ids of the workloads that have been started and stopped in response to a previously sent UpdateStateRequest.
Field Type Label Description addedWorkloads string repeated Workload instance names of workloads which will be started deletedWorkloads string repeated Workload instance names of workloads which will be stopped "},{"location":"reference/_ankaios.proto/#workload","title":"Workload","text":"A message containing the configuration of a workload.
Field Type Label Description agent string optional The name of the owning Agent. restartPolicy RestartPolicy optional An enum value that defines the condition under which a workload is restarted. dependencies Dependencies A map of workload names and expected states to enable a synchronized start of the workload. tags Tags A list of tag names. runtime string optional The name of the runtime e.g. podman. runtimeConfig string optional The configuration information specific to the runtime. controlInterfaceAccess ControlInterfaceAccess configs ConfigMappings A mapping containing the configurations assigned to the workload. "},{"location":"reference/_ankaios.proto/#workloadinstancename","title":"WorkloadInstanceName","text":"Field Type Label Description workloadName string The name of the workload. agentName string The name of the owning Agent. id string A unique identifier of the workload."},{"location":"reference/_ankaios.proto/#workloadmap","title":"WorkloadMap","text":"This is a workaround for proto not supporting optional maps. Workload names shall not be shorter than 1 symbol or longer than 63 symbols and can contain only regular characters, digits, and the \"-\" and \"_\" symbols.
Field Type Label Description workloads WorkloadMap.WorkloadsEntry repeated "},{"location":"reference/_ankaios.proto/#workloadmapworkloadsentry","title":"WorkloadMap.WorkloadsEntry","text":"Field Type Label Description key string value Workload"},{"location":"reference/_ankaios.proto/#workloadstate","title":"WorkloadState","text":"A message containing the information about the workload state.
Field Type Label Description instanceName WorkloadInstanceName executionState ExecutionState The workload execution state. "},{"location":"reference/_ankaios.proto/#workloadstatesmap","title":"WorkloadStatesMap","text":"A nested map that provides the execution state of a workload in a structured way. The first level allows searches by agent.
Field Type Label Description agentStateMap WorkloadStatesMap.AgentStateMapEntry repeated "},{"location":"reference/_ankaios.proto/#workloadstatesmapagentstatemapentry","title":"WorkloadStatesMap.AgentStateMapEntry","text":"Field Type Label Description key string value ExecutionsStatesOfWorkload"},{"location":"reference/_ankaios.proto/#addcondition","title":"AddCondition","text":"An enum type describing the expected workload state. Used for dependency management.
Name Number Description ADD_COND_RUNNING 0 The workload is operational. ADD_COND_SUCCEEDED 1 The workload has successfully exited. ADD_COND_FAILED 2 The workload has exited with an error or could not be started. "},{"location":"reference/_ankaios.proto/#agentdisconnected","title":"AgentDisconnected","text":"The exact state of the workload cannot be determined, e.g., because of a broken connection to the responsible agent.
Name Number Description AGENT_DISCONNECTED 0 "},{"location":"reference/_ankaios.proto/#failed","title":"Failed","text":"The workload has failed or is in a degraded state.
Name Number Description FAILED_EXEC_FAILED 0 The workload has failed during operation FAILED_UNKNOWN 1 The workload is in an unsupported by Ankaios runtime state. The workload was possibly altered outside of Ankaios. FAILED_LOST 2 The workload cannot be found anymore. The workload was possibly altered outside of Ankaios or was auto-removed by the runtime. "},{"location":"reference/_ankaios.proto/#notscheduled","title":"NotScheduled","text":"The workload is not scheduled to run at any agent. This is signalized with an empty agent in the workload specification.
Name Number Description NOT_SCHEDULED 0 "},{"location":"reference/_ankaios.proto/#pending","title":"Pending","text":"The workload is going to be started eventually.
Name Number Description PENDING_INITIAL 0 The workload specification has not yet been scheduled PENDING_WAITING_TO_START 1 The start of the workload will be triggered once all its dependencies are met. PENDING_STARTING 2 Starting the workload was scheduled at the corresponding runtime. PENDING_STARTING_FAILED 8 The starting of the workload by the runtime failed. "},{"location":"reference/_ankaios.proto/#readwriteenum","title":"ReadWriteEnum","text":"An enum type describing which action is allowed.
Name Number Description RW_NOTHING 0 Allow nothing RW_READ 1 Allow read RW_WRITE 2 Allow write RW_READ_WRITE 5 Allow read and write "},{"location":"reference/_ankaios.proto/#removed","title":"Removed","text":"The workload was removed from Ankaios. This state is used only internally in Ankaios. The outside world removed states are just not there.
Name Number Description REMOVED 0 "},{"location":"reference/_ankaios.proto/#restartpolicy","title":"RestartPolicy","text":"An enum type describing the restart behavior of a workload.
Name Number Description NEVER 0 The workload is never restarted. Once the workload exits, it remains in the exited state. ON_FAILURE 1 If the workload exits with a non-zero exit code, it will be restarted. ALWAYS 2 The workload is restarted upon termination, regardless of the exit code. "},{"location":"reference/_ankaios.proto/#running","title":"Running","text":"The workload is operational.
Name Number Description RUNNING_OK 0 The workload is operational. "},{"location":"reference/_ankaios.proto/#stopping","title":"Stopping","text":"The workload is scheduled for stopping.
Name Number Description STOPPING 0 The workload is being stopped. STOPPING_WAITING_TO_STOP 1 The deletion of the workload will be triggered once neither 'pending' nor 'running' workload depending on it exists. STOPPING_REQUESTED_AT_RUNTIME 2 This is an Ankaios-generated state returned when the stopping was explicitly triggered by the user and the request was sent to the runtime. STOPPING_DELETE_FAILED 8 The deletion of the workload by the runtime failed. "},{"location":"reference/_ankaios.proto/#succeeded","title":"Succeeded","text":"The workload has successfully finished operation.
Name Number Description SUCCEEDED_OK 0 The workload has successfully finished operation."},{"location":"reference/_ankaios.proto/#scalar-value-types","title":"Scalar Value Types","text":".proto Type Notes C++ Java Python Go C# PHP Ruby double double double float float64 double float Float float float float float float32 float float Float int32 Uses variable-length encoding. Inefficient for encoding negative numbers \u2013 if your field is likely to have negative values, use sint32 instead. int32 int int int32 int integer Bignum or Fixnum (as required) int64 Uses variable-length encoding. Inefficient for encoding negative numbers \u2013 if your field is likely to have negative values, use sint64 instead. int64 long int/long int64 long integer/string Bignum uint32 Uses variable-length encoding. uint32 int int/long uint32 uint integer Bignum or Fixnum (as required) uint64 Uses variable-length encoding. uint64 long int/long uint64 ulong integer/string Bignum or Fixnum (as required) sint32 Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. int32 int int int32 int integer Bignum or Fixnum (as required) sint64 Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. int64 long int/long int64 long integer/string Bignum fixed32 Always four bytes. More efficient than uint32 if values are often greater than 2^28. uint32 int int uint32 uint integer Bignum or Fixnum (as required) fixed64 Always eight bytes. More efficient than uint64 if values are often greater than 2^56. uint64 long int/long uint64 ulong integer/string Bignum sfixed32 Always four bytes. int32 int int int32 int integer Bignum or Fixnum (as required) sfixed64 Always eight bytes. int64 long int/long int64 long integer/string Bignum bool bool boolean boolean bool bool boolean TrueClass/FalseClass string A string must always contain UTF-8 encoded or 7-bit ASCII text. 
string String str/unicode string string string String (UTF-8) bytes May contain any arbitrary sequence of bytes. string ByteString str []byte ByteString string String (ASCII-8BIT)"},{"location":"reference/complete-state/","title":"Working with CompleteState","text":""},{"location":"reference/complete-state/#completestate","title":"CompleteState","text":"The complete state data structure CompleteState is used for building a request to Ankaios server to change or receive the state of the Ankaios system. It contains the desiredState
which describes the state of the Ankaios system the user wants to have, the workloadStates
which gives the information about the execution state of all the workloads and the agents
field containing the names of the Ankaios agents that are currently connected to the Ankaios server. By using the CompleteState in conjunction with the object field mask, specific parts of the Ankaios state can be retrieved or updated.
Example: ank -k get state
returns the complete state of the Ankaios system:
Note
The instructions assume the default installation without mutual TLS (mTLS) for communication. With -k
or --insecure
the ank
CLI will connect without mTLS. Alternatively, set the environment variable ANK_INSECURE=true
to avoid passing the argument to each ank
CLI command. For an Ankaios setup with mTLS, see here.
desiredState:\n apiVersion: v0.1\n workloads:\n hello-pod:\n agent: agent_B\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n restartPolicy: NEVER\n runtime: podman-kube\n runtimeConfig: |\n manifest: |\n apiVersion: v1\n kind: Pod\n metadata:\n name: hello-pod\n spec:\n restartPolicy: Never\n containers:\n - name: looper\n image: alpine:latest\n command:\n - sleep\n - 50000\n - name: greater\n image: alpine:latest\n command:\n - echo\n - \"Hello from a container in a pod\"\n configs: {}\n hello1:\n agent: agent_B\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n runtime: podman\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--rm\"]\n commandArgs: [ \"echo\", \"Hello Ankaios\"]\n configs: {}\n hello2:\n agent: agent_B\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n restartPolicy: ALWAYS\n runtime: podman\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"echo 'Always restarted.'; sleep 2\"]\n configs: {}\n nginx:\n agent: agent_A\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n restartPolicy: ON_FAILURE\n runtime: podman\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"8081:80\"]\n configs: {}\n configs: {}\nworkloadStates: []\nagents: {}\n
It is not necessary to provide the whole CompleteState data structure when using it in conjunction with the object field mask. It is sufficient to provide the relevant branch of the CompleteState object. As an example, to change the restart behavior of the nginx workload, only the relevant branch of the CompleteState needs to be provided:
desiredState:\n workloads:\n nginx:\n restartPolicy: ALWAYS\n
Note
In case of workload names, the naming convention states that they shall: - contain only regular upper and lowercase characters (a-z and A-Z), numbers and the symbols \"-\" and \"_\" - have a minimal length of 1 character - have a maximal length of 63 characters Also, agent names shall contain only regular upper and lowercase characters (a-z and A-Z), numbers and the symbols \"-\" and \"_\".
"},{"location":"reference/complete-state/#object-field-mask","title":"Object field mask","text":"With the object field mask, only specific parts of the Ankaios state can be retrieved or updated. The object field mask can be constructed using the field names of the CompleteState data structure:
<top level field name>.<second level field name>.<third level field name>.<...>\n
Example: ank -k get state desiredState.workloads.nginx
returns only the information about nginx workload:
desiredState:\n apiVersion: v0.1\n workloads:\n nginx:\n agent: agent_A\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n restartPolicy: ALWAYS\n runtime: podman\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"8081:80\"]\n configs: {}\n
Example ank -k get state desiredState.workloads.nginx.runtimeConfig
returns only the runtime configuration of nginx workload:
desiredState:\n apiVersion: v0.1\n workloads:\n nginx:\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"8081:80\"]\n
Example ank -k set state desiredState.workloads.nginx.restartPolicy new-state.yaml
changes the restart behavior of nginx workload to NEVER
:
desiredState:\n workloads:\n nginx:\n restartPolicy: NEVER\n
The control interface allows the workload developers to easily integrate the communication between the Ankaios system and their applications.
Note
The control interface is currently only available for workloads using the podman
runtime and not for the podman-kube
runtime.
flowchart TD\n a1(Ankaios Agent 1)\n w1(Workload 1)\n w2(Workload 2)\n a2(Ankaios Agent 2)\n w3(Workload 3)\n w4(Workload 4)\n s(Ankaios server)\n\n\n s <--> a1 <-->|Control Interface| w1 & w2\n s <--> a2 <-->|Control Interface| w3 & w4
The control interface enables a workload to communicate with the Ankaios system by interacting with the Ankaios server through writing/reading communication data to/from the provided FIFO files in the FIFO mount point.
"},{"location":"reference/control-interface/#authorization","title":"Authorization","text":"For each request from a workload to the control interface, Ankaios checks whether the workload is authorized. The authorization is configured for each workload using controlInterfaceAccess
. A workload without controlInterfaceAccess
configuration is denied all actions on the control interface. The authorization configuration consists of allow and deny rules. Each rule defines the operation (e.g. read) the workload is allowed to execute and with which filter masks it is allowed to execute this operation.
A filter mask describes a path in the CompleteState object. The segments of the path are separated by the '.' symbol. Segments can also be the wildcard character '*', indicating this segment shall match every possible field. E.g. desiredState.workloads.*.tags
allows access to the tags of all workloads.
In an allow rule the path gives access to the exact path and also all subfields. E.g. an allow rule with desiredState.workloads.example
would also give access to desiredState.workloads.example.tags
. In a deny rule the path prohibits access to the exact path and also all parent fields. E.g. a deny rule with desiredState.workloads.example
would also deny access to desiredState.workloads
, but has no effect on desiredState.workloads.other_example
.
Every request not allowed by a rule in controlInterfaceAccess
is prohibited. Every request allowed by a rule, but denied by another rule is also prohibited. E.g. with an allow rule for path desiredState.workloads.*.agent
and a deny rule for desiredState.workloads.controller
, a workload would be allowed to change the agent of each workload, except for the controller
workload.
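The allow/deny combination described above could be written in a workload's configuration roughly as follows. This is a sketch based on the controlInterfaceAccess structure shown later in the startup configuration example; the denyRules field name and the Write operation value are assumptions by analogy to the allowRules and Read values shown there:

```yaml
controlInterfaceAccess:
  allowRules:
    - type: StateRule
      operation: Write          # assumed operation value, analogous to Read
      filterMask:
        - "desiredState.workloads.*.agent"
  denyRules:                    # field name assumed by analogy to allowRules
    - type: StateRule
      operation: Write
      filterMask:
        - "desiredState.workloads.controller"
```

With such a configuration, the workload could change the agent of every workload except the controller workload.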
flowchart TD\n a1(Ankaios Agent 1)\n w1(Workload 1)\n w2(Workload 2)\n s(Ankaios server)\n\n\n s <--> a1 <-->|\"/run/ankaios/control_interface/{input,output}\"| w1 & w2
The control interface relies on FIFO files (also known as named pipes) to enable a workload to communicate with the Ankaios system. For that purpose, Ankaios creates a mount point for each workload to store the FIFO files. At the mount point /run/ankaios/control_interface/
the workload developer can find the FIFO files input
and output
and use them for communication with the Ankaios server. Ankaios uses its own communication protocol, described in the protocol documentation as a protobuf IDL, which allows client code to be generated in any programming language supported by the protobuf compiler. The generated client code can then be integrated and used in a workload.
flowchart TD\n proto(\"ankaios.proto\")\n gen_code(\"Generated Client Code\")\n workload(\"Workload\")\n\n proto -->|generate code with protoc| gen_code\n workload-->|uses| gen_code
In order to enable the communication between a workload and the Ankaios system, the workload needs to make use of the control interface by sending and processing serialized messages defined in ankaios.proto
via writing to and reading from the provided FIFO files output
and input
found in the mount point /run/ankaios/control_interface/
. Using the protobuf compiler (protoc), code can be generated in any programming language the compiler supports. The generated code contains functions for serializing and deserializing the messages to and from the Protocol Buffers binary format.
The messages are encoded using the length-delimited wire type format and laid out inside the FIFO file according to the following visualization:
Every protobuf message is prefixed with its byte length, telling the reader how many bytes to read to consume the protobuf message. The byte length itself has a variable size and is encoded as a VARINT.
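As an illustration of this framing, the length prefix can be produced with a few lines of plain Rust. The examples in this section use the prost crate for the same purpose; this standalone sketch exists only to make the byte layout concrete:

```rust
// Illustrative framing helpers using only the Rust standard library.

// Encode a length as a protobuf VARINT: 7 payload bits per byte, the most
// significant bit of each byte signals that more bytes follow.
fn encode_varint(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte); // MSB 0: this is the last byte
            break;
        }
        out.push(byte | 0x80); // MSB 1: more bytes follow
    }
    out
}

// Prefix a serialized protobuf message with its VARINT-encoded length,
// producing the byte layout written into the FIFO file.
fn frame_message(payload: &[u8]) -> Vec<u8> {
    let mut framed = encode_varint(payload.len() as u64);
    framed.extend_from_slice(payload);
    framed
}

fn main() {
    // A 300-byte message gets a two-byte length prefix: 300 -> 0xAC 0x02.
    let framed = frame_message(&[0u8; 300]);
    assert_eq!(framed[..2], [0xAC, 0x02]);
    assert_eq!(framed.len(), 2 + 300);
    println!("length prefix: {:02X?}", &framed[..2]);
}
```

Reading a message back reverses the process: consume bytes until one with a cleared most significant bit is found, decode the accumulated VARINT into a length, then read exactly that many payload bytes, as the Rust reading example below does with prost.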
"},{"location":"reference/control-interface/#control-interface-examples","title":"Control interface examples","text":"The subfolder examples
inside the Ankaios repository contains example workload applications in various programming languages that use the control interface. They demonstrate how to easily use the control interface in self-developed workloads. All examples share the same behavior regardless of the programming language and are simplified to focus on the usage of the control interface. Please note that the examples are not optimized for production usage.
The following sections showcase in Rust some important parts of the communication with the Ankaios cluster using the control interface. The same concepts are also used in all of the example workload applications.
"},{"location":"reference/control-interface/#sending-request-message-from-a-workload-to-ankaios-server","title":"Sending request message from a workload to Ankaios server","text":"To send a request message from the workload to the Ankaios server, the message needs to be serialized using the generated serializing function, encoded as a length-delimited protobuf message, and written directly into the output
FIFO file. The type of request message is ToAnkaios.
flowchart TD\n begin([Start])\n req_msg(Fill ToAnkaios message)\n ser_msg(Serialize ToAnkaios message using the generated serializing function)\n enc_bytes(Encode as length-delimited varint)\n output(\"Write encoded bytes to /run/ankaios/control_interface/output\")\n fin([end])\n\n begin --> req_msg\n req_msg --> ser_msg\n ser_msg -->enc_bytes\n enc_bytes --> output\n output --> fin
Send request message via control interface Code snippet in Rust for sending request message via control interface:
use api::ank_base::{Workload, RestartPolicy, Tag, UpdateStateRequest, Request, request::RequestContent, CompleteState, State};\nuse api::control_api::{ToAnkaios, to_ankaios::ToAnkaiosEnum};\nuse prost::Message;\nuse std::{collections::HashMap, fs::File, io::Write, path::Path};\n\nconst ANKAIOS_CONTROL_INTERFACE_BASE_PATH: &str = \"/run/ankaios/control_interface\";\n\nfn create_update_workload_request() -> ToAnkaios {\n let new_workloads = HashMap::from([(\n \"dynamic_nginx\".to_string(),\n Workload {\n runtime: \"podman\".to_string(),\n agent: \"agent_A\".to_string(),\n restart_policy: RestartPolicy::Never.into(),\n tags: vec![Tag {\n key: \"owner\".to_string(),\n value: \"Ankaios team\".to_string(),\n }],\n runtime_config: \"image: docker.io/library/nginx\\ncommandOptions: [\\\"-p\\\", \\\"8080:80\\\"]\"\n .to_string(),\n dependencies: HashMap::new(),\n },\n )]);\n\n ToAnkaios {\n to_ankaios_enum: Some(ToAnkaiosEnum::Request(Request {\n request_id: \"request_id\".to_string(),\n request_content: Some(RequestContent::UpdateStateRequest(\n UpdateStateRequest {\n new_state: Some(CompleteState {\n desired_state: Some(State {\n api_version: \"v0.1\".to_string(),\n workloads: new_workloads,\n }),\n ..Default::default()\n }),\n update_mask: vec![\"desiredState.workloads.dynamic_nginx\".to_string()],\n },\n )),\n })),\n }\n}\n\nfn write_to_control_interface() {\n let pipes_location = Path::new(ANKAIOS_CONTROL_INTERFACE_BASE_PATH);\n let sc_req_fifo = pipes_location.join(\"output\");\n\n let mut sc_req = File::create(&sc_req_fifo).unwrap();\n\n let protobuf_update_workload_request = create_update_workload_request();\n\n println!(\"{}\", &format!(\"Sending UpdateStateRequest containing details for adding the dynamic workload \\\"dynamic_nginx\\\": {:#?}\", protobuf_update_workload_request));\n\n sc_req\n .write_all(&protobuf_update_workload_request.encode_length_delimited_to_vec())\n .unwrap();\n}\n\nfn main() {\n write_to_control_interface();\n}\n
"},{"location":"reference/control-interface/#processing-response-message-from-ankaios-server","title":"Processing response message from Ankaios server","text":"To process a response message from the Ankaios server, the workload needs to read the bytes from the input
FIFO file. As the bytes are encoded as a length-delimited protobuf message with a variable length, the length needs to be decoded and extracted first. Then the length can be used to decode and deserialize the read bytes into a response message object for further processing. The type of the response message is FromAnkaios.
flowchart TD\n begin([Start])\n input(\"Read bytes from /run/ankaios/control_interface/input\")\n dec_length(Get length from read length delimited varint encoded bytes)\n deser_msg(Decode and deserialize FromAnkaios message using decoded length and the generated functions)\n further_processing(Process FromAnkaios message object)\n fin([end])\n\n begin --> input\n input --> dec_length\n dec_length --> deser_msg\n deser_msg --> further_processing\n further_processing --> fin
Read response message via control interface Code Snippet in Rust for reading response message via control interface:
use api::control_api::FromAnkaios;\nuse prost::Message;\nuse std::{fs::File, io, io::Read, path::Path};\n\nconst ANKAIOS_CONTROL_INTERFACE_BASE_PATH: &str = \"/run/ankaios/control_interface\";\nconst MAX_VARINT_SIZE: usize = 19;\n\nfn read_varint_data(file: &mut File) -> Result<[u8; MAX_VARINT_SIZE], io::Error> {\n let mut res = [0u8; MAX_VARINT_SIZE];\n let mut one_byte_buffer = [0u8; 1];\n for item in res.iter_mut() {\n file.read_exact(&mut one_byte_buffer)?;\n *item = one_byte_buffer[0];\n // check if most significant bit is set to 0 if so it is the last byte to be read\n if *item & 0b10000000 == 0 {\n break;\n }\n }\n Ok(res)\n}\n\nfn read_protobuf_data(file: &mut File) -> Result<Box<[u8]>, io::Error> {\n let varint_data = read_varint_data(file)?;\n let mut varint_data = Box::new(&varint_data[..]);\n\n // determine the exact size for exact reading of the bytes later by decoding the varint data\n let size = prost::encoding::decode_varint(&mut varint_data)? as usize;\n\n let mut buf = vec![0; size];\n file.read_exact(&mut buf[..])?; // read exact bytes from file\n Ok(buf.into_boxed_slice())\n}\n\nfn read_from_control_interface() {\n let pipes_location = Path::new(ANKAIOS_CONTROL_INTERFACE_BASE_PATH);\n let ex_req_fifo = pipes_location.join(\"input\");\n\n let mut ex_req = File::open(&ex_req_fifo).unwrap();\n\n loop {\n if let Ok(binary) = read_protobuf_data(&mut ex_req) {\n let proto = FromAnkaios::decode(&mut Box::new(binary.as_ref()));\n\n println!(\"{}\", &format!(\"Received FromAnkaios message containing the response from the server: {:#?}\", proto));\n }\n }\n}\n\nfn main() {\n read_from_control_interface();\n}\n
"},{"location":"reference/glossary/","title":"Glossary","text":"This glossary is intended to be a comprehensive, uniform list of Ankaios terminology. It consists of technical terms specific to Ankaios, as well as more general terms that provide useful context.
"},{"location":"reference/glossary/#node","title":"Node","text":"A machine, either physical or virtual, that provides the necessary prerequisites (e.g. OS) to run an Ankaios server and/or agent.
"},{"location":"reference/glossary/#runtime","title":"Runtime","text":"The base on which a workload can be started. For OCI containers this is a container runtime or engine. For native applications, the runtime is the OS itself.
"},{"location":"reference/glossary/#workload","title":"Workload","text":"A functionality that the Ankaios orchestrator can manage (e.g. start, stop). A workload could be packed inside an OCI container (e.g. Podman container) or could also be just a native program (native workload). Ankaios is built to be extensible for different workload types by adding support for other runtimes.
"},{"location":"reference/glossary/#container","title":"Container","text":"A container is a lightweight, standalone, executable software package that includes everything needed to run an application, including the binaries, runtime, system libraries and dependencies. Containers provide a consistent and isolated environment for applications to run, ensuring that they behave consistently across different computing environments, from development to testing to production.
"},{"location":"reference/glossary/#podman-container","title":"Podman container","text":"A Podman container refers to a container managed by Podman, which is an open-source container engine similar to Docker. Podman aims to provide a simple and secure container management solution for developers and system administrators.
"},{"location":"reference/glossary/#native-workload","title":"Native workload","text":"An application developed specifically for a particular platform or operating system (OS). It is designed to run directly on the target platform without the need for bringing in any additional translation or emulation layers.
"},{"location":"reference/inter-workload-dependencies/","title":"Inter-workload dependencies","text":"Ankaios enables users to configure dependencies between workloads.
There are two types of inter-workload dependencies supported by Ankaios:
The user configures explicit inter-workload dependencies within a workload's configuration, which Ankaios considers when starting the workload. Ankaios starts workloads with dependencies only when all dependencies are met, allowing the user to define a specific sequence for starting workloads.
Ankaios defines implicit inter-workload dependencies internally and takes them into account when a dependency is deleted.
"},{"location":"reference/inter-workload-dependencies/#explicit-inter-workload-dependencies","title":"Explicit inter-workload dependencies","text":"Ankaios supports the following dependency types:
Dependency type AddCondition Description running ADD_COND_RUNNING The dependency must be operational. succeeded ADD_COND_SUCCEEDED The dependency must be successfully exited. failed ADD_COND_FAILED The dependency must exit with a non-zero return code.The user configures the AddCondition
for each dependency in the dependencies
field to define one or multiple dependencies for a workload.
apiVersion: v0.1\nworkloads:\n logger:\n agent: agent_A\n runtime: podman\n dependencies:\n storage_provider: ADD_COND_RUNNING\n ...\n
When the storage_provider
is operational, Ankaios starts the logger
workload. The ExecutionState of the workload remains Pending(WaitingToStart)
until all dependencies are met.
Note
Ankaios rejects manifests and workload configurations with cyclic dependencies. A manifest is valid only when its workloads and dependencies form a directed acyclic graph.
This example demonstrates how to use dependency types to configure inter-workload dependencies:
---\ntitle:\n---\nflowchart RL\n logger(logger)\n init(init_storage)\n storage(storage_provider)\n err_handler(error_handler)\n\n\n logger-- running -->storage\n err_handler-- failed -->storage\n storage-- succeeded -->init
The logging service requires an operational storage provider to write logs. Therefore, the initialization workload (init_storage) must have completed successfully before the storage provider itself is started, and the provider must be running before the logger starts. In case the storage provider fails, an error handler is started to manage errors.
The Ankaios manifest below includes the configuration of each workload and its dependencies:
apiVersion: v0.1\nworkloads:\n logger:\n runtime: podman\n agent: agent_A\n dependencies:\n storage_provider: ADD_COND_RUNNING # (1)!\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sleep\" ]\n commandArgs: [ \"3\" ]\n storage_provider:\n runtime: podman\n agent: agent_B\n dependencies:\n init_storage: ADD_COND_SUCCEEDED # (2)!\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"sleep 5; exit 1\" ]\n init_storage: # (3)!\n runtime: podman\n agent: agent_B\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sleep\" ]\n commandArgs: [ \"2\" ]\n error_handler:\n runtime: podman\n agent: agent_A\n dependencies:\n storage_provider: ADD_COND_FAILED # (4)!\n runtimeConfig: |\n image: alpine:latest\n commandArgs: [ \"echo\", \"report failed storage provider\"]\n
Workloads may have dependencies that do not currently exist in the Ankaios state.
Assuming Ankaios is started with a manifest containing all previous workloads except for the error_handler
, a user can update the desired state by adding the restart_service
workload. This workload restarts certain workloads and should run after the error_handler
has completed. The following Ankaios manifest includes the restart_service
workload, which depends on the non-existent error_handler
in the current desired state:
workloads:\n restart_service:\n runtime: podman\n agent: agent_B\n dependencies:\n error_handler: ADD_COND_SUCCEEDED\n runtimeConfig: |\n image: alpine:latest\n commandArgs: [ \"echo\", \"restart of storage workloads\"]\n
Ankaios delays the restart_service
until the error_handler
reaches the specified state.
Ankaios automatically defines implicit dependencies to prevent a workload from failing or entering an undesired state when a dependency is deleted. These dependencies cannot be configured by the user. Ankaios only defines implicit dependencies for dependencies that other workloads depend on with the running
dependency type.
Ankaios does not explicitly delete a workload when its dependency is deleted. Instead, Ankaios delays the deletion of a dependency until all dependent workloads have been deleted. The dependency will have the ExecutionState Stopping(WaitingToStop)
as long as it cannot be deleted.
In the previous example, the workload logger
depends on the storage_provider
with a running
dependency type. When the user updates or deletes the storage_provider
dependency, Ankaios delays the deletion until the dependent workload logger
is neither pending nor running.
If an update meets the delete conditions but not the add conditions, Ankaios will execute the delete operation directly without delaying the entire update.
Note
Ankaios does not define implicit dependencies for workloads that have dependencies with the succeeded
and failed
types.
Ankaios offers two ways of dynamically interacting with a running cluster - the ank
CLI and the control interface.
The ank
CLI is targeted at integrators or workload developers that want to interact with the cluster during development or for a manual intervention. It is developed for ergonomics and not automation purposes. If required, an external application can connect to the interface used by the CLI, but this is not the standard way of automating a dynamic reconfiguration of the cluster during runtime.
The Ankaios control interface is provided to workloads managed by Ankaios and allows implementing the so-called \"operator pattern\". The control interface allows each workload to send messages to the agent managing it. After successful authorization, the Ankaios agent forwards the request to the Ankaios server and provides the response to the requesting workload. Through the control interface, a workload has the capability to obtain the complete state of the Ankaios cluster or administer the cluster by declaratively adjusting its state, thereby facilitating the addition or removal of other workloads.
"},{"location":"reference/resource-usage/","title":"Resource usage","text":"The following table shows the resource usage of Ankaios v0.2.0 with the setup:
The restart policy of a workload enables the user to determine whether a workload is automatically restarted when it terminates. By default, workloads are not restarted. However, the restart policy can be configured to always restart the workload, or to restart the workload under certain conditions.
"},{"location":"reference/restart-policy/#supported-restart-policies","title":"Supported Restart Policies","text":"The following restart policies are available for a workload:
Restart Policy Description Restart on ExecutionState NEVER The workload is never restarted. Once the workload exits, it remains in the exited state. - ON_FAILURE If the workload exits with a non-zero exit code, it will be restarted. Failed(ExecFailed) ALWAYS The workload is restarted upon termination, regardless of the exit code. Succeeded(Ok) or Failed(ExecFailed)Ankaios restarts the workload when the workload has exited and the configured restart policy aligns with the workload's ExecutionState
, as detailed in the aforementioned table. It does not restart the workload if the user explicitly deletes the workload via the Ankaios CLI or if Ankaios receives a delete request for that workload via the Control Interface.
Note
Ankaios does not consider inter-workload dependencies when restarting a workload because it was already running before it has exited.
"},{"location":"reference/restart-policy/#configure-restart-policies","title":"Configure Restart Policies","text":"The field restartPolicy
enables the user to define the restart policy for each workload within the Ankaios manifest. The field is optional. If the field is not provided, the default restart policy NEVER
is applied.
The following Ankaios manifest contains workloads with different restart policies:
apiVersion: v0.1\nworkloads:\n restarted_always:\n runtime: podman\n agent: agent_A\n restartPolicy: ALWAYS # (1)!\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"echo 'Always restarted.'; sleep 2\"]\n restarted_never:\n runtime: podman\n agent: agent_A\n restartPolicy: NEVER # (2)!\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"echo 'Explicitly never restarted.'; sleep 2\"]\n default_restarted_never: # default restart policy is NEVER\n runtime: podman\n agent: agent_A\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"echo 'Implicitly never restarted.'; sleep 2\"]\n restarted_on_failure:\n runtime: podman\n agent: agent_A\n restartPolicy: ON_FAILURE # (3)!\n runtimeConfig: |\n image: alpine:latest\n commandOptions: [ \"--entrypoint\", \"/bin/sh\" ]\n commandArgs: [ \"-c\", \"echo 'Restarted on failure.'; sleep 2; exit 1\"]\n
Depending on the use-case, the Ankaios cluster can be started with an optional predefined list of workloads - the startup configuration. Currently the startup configuration is provided as a file which is in YAML file format and can be passed to the Ankaios server through a command line argument. If Ankaios is started without or with an empty startup configuration, workloads can still be added to the cluster dynamically during runtime.
Note: To be able to run a workload, an Ankaios agent must be started on the same or on a different node.
"},{"location":"reference/startup-configuration/#configuration-structure","title":"Configuration structure","text":"The startup configuration is composed of a list of workload specifications within the workloads
object. A workload specification must contain the following information:
workload name
(via field key), specify the workload name to identify the workload in the Ankaios system.runtime
, specify the type of the runtime. Currently supported values are podman
and podman-kube
.agent
, specify the name of the owning agent which is going to execute the workload. Supports templated strings.restartPolicy
, specify how the workload should be restarted upon exiting (not implemented yet).tags
, specify a list of key
value
pairs.runtimeConfig
, specify as a string the configuration for the runtime whose configuration structure is specific for each runtime, e.g., for podman
runtime the PodmanRuntimeConfig is used. Supports templated strings.configs
: assign configuration items defined in the state's configs
field to the workloadcontrolInterfaceAccess
, specify the access rights of the workload for the control interface.Example startup-config.yaml
file:
apiVersion: v0.1\nworkloads:\n nginx: # this is used as the workload name which is 'nginx'\n runtime: podman\n agent: agent_A\n restartPolicy: ALWAYS\n tags:\n - key: owner\n value: Ankaios team\n configs:\n port: web_server_port\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"{{port.access_port}}:80\"]\n controlInterfaceAccess:\n allowRules:\n - type: StateRule\n operation: Read\n filterMask:\n - \"workloadStates\"\nconfigs:\n web_server_port:\n access_port: \"8081\"\n
Ankaios supports templated strings and essential control directives in the handlebars templating language for the following workload fields:
agent
runtimeConfig
Ankaios renders a templated state at startup or when the state is updated. The rendering replaces the templated strings with the configuration items associated with each workload. The configuration items themselves are defined in a configs
field, which contains several key-value pairs. The key specifies the name of the configuration item and the value is a string, list or associative data structure. To see templated workload configurations in action, follow the tutorial about sending and receiving vehicle data.
Note
The name of a configuration item can only contain regular characters, digits, the \"-\" and \"_\" symbols. The same applies to the keys and values of the workload's configs
field when assigning configuration items to a workload.
The runtime configuration for the podman
runtime is specified as follows:
generalOptions: [<comma>, <separated>, <options>]\nimage: <registry>/<image name>:<version>\ncommandOptions: [<comma>, <separated>, <options>]\ncommandArgs: [<comma>, <separated>, <arguments>]\n
where each attribute is passed directly to podman run
.
If we take as an example the podman run
command:
podman --events-backend file run --env VAR=able docker.io/alpine:latest echo Hello!
it would translate to the following runtime configuration:
generalOptions: [\"--events-backend\", \"file\"]\nimage: docker.io/alpine:latest\ncommandOptions: [\"--env\", \"VAR=able\"]\ncommandArgs: [\"echo\", \"Hello!\"]\n
"},{"location":"reference/startup-configuration/#podmankuberuntimeconfig","title":"PodmanKubeRuntimeConfig","text":"The runtime configuration for the podman-kube
runtime is specified as follows:
generalOptions: [<comma>, <separated>, <options>]\nplayOptions: [<comma>, <separated>, <options>]\ndownOptions: [<comma>, <separated>, <options>]\nmanifest: <string containing the K8s manifest>\n
where each attribute is passed directly to podman play kube
.
If we take as an example the podman play kube
command:
podman --events-backend file play kube --userns host manifest.yaml
and the corresponding command for deleting the manifest file:
podman --events-backend file play kube manifest.yaml --down --force
they would translate to the following runtime configuration:
generalOptions: [\"--events-backend\", \"file\"]\nplayOptions: [\"--userns\", \"host\"]\ndownOptions: [\"--force\"]\nmanifest: <contents of manifest.yaml>\n
"},{"location":"usage/awesome-ankaios/","title":"Awesome Ankaios","text":"Here you find a curated list of awesome things related to Ankaios.
If you have some missing resources, please feel free to open a pull request and add them.
"},{"location":"usage/awesome-ankaios/#extensions-for-ankaios","title":"Extensions for Ankaios","text":"Ankaios has been tested with the following Linux distributions. Others might work as well but have not been tested.
Ankaios currently requires a Linux OS and is available for x86_64 and arm64 targets.
The minimum system requirements are (tested with EB corbos Linux \u2013 built on Ubuntu):
Resource Min CPU 1 core RAM 128 MBPodman needs to be installed as this is used as container runtime (see Podman installation instructions). For using the podman
runtime, Podman version 3.4.2 is sufficient but the podman-kube
runtime requires at least Podman version 4.3.1.
Note
On Ubuntu 24.04 there is a known problem with Podman stopping containers. The following workaround disables AppArmor for Podman. Run the following steps as root after installation of Podman:
mkdir -p /etc/containers/containers.conf.d\nprintf '[CONTAINERS]\\napparmor_profile=\"\"\\n' > /etc/containers/containers.conf.d/disable-apparmor.conf\n
"},{"location":"usage/installation/#installation-methods","title":"Installation methods","text":"There are two ways to install Ankaios, depending on your specific needs and focus. If you are new to Ankaios or TLS is not a top priority, we recommend following the setup instructions in Setup with script without enabling mutual transport layer security (mTLS) for communication. On the other hand, if you want to set up Ankaios in a production environment, follow the setup instructions in Setting up Ankaios with mTLS.
"},{"location":"usage/installation/#setup-with-script","title":"Setup with script","text":"The recommended way to install Ankaios is using the installation script. To install the latest release version of Ankaios, please run the following command:
curl -sfL https://github.com/eclipse-ankaios/ankaios/releases/latest/download/install.sh | bash -\n
Note
Please note that installing the latest version of Ankaios in an automated workflow is discouraged. If you want to install Ankaios during an automated workflow, please install a specific version as described below.
The installation process automatically detects the platform and downloads the appropriate binaries. The default installation path for the binaries is /usr/local/bin
but can be changed. The installation also creates systemd unit files and an uninstall script.
Supported platforms: linux/amd64
, linux/arm64
Note
The script requires root privileges to install the pre-built binaries into the default installation path /usr/local/bin
and also for systemd integration. You can set a custom installation path and disable systemd unit file generation if only non-root privileges are available.
The following table shows the optional arguments that can be passed to the script:
Supported parameters Description -v <version> e.g. v0.1.0
, default: latest version -i <install-path> File path where Ankaios will be installed, default: /usr/local/bin
-t <install-type> Installation type for systemd integration: server
, agent
, none
or both
(default) -s <server-options> Options which will be passed to the Ankaios server. Default --insecure --startup-config /etc/ankaios/state.yaml
-a <agent-options> Options which will be passed to the Ankaios agent. Default --insecure --name agent_A
To install a specific version run the following command and substitute <version>
with a specific version tag e.g. v0.1.0
:
curl -sfL https://github.com/eclipse-ankaios/ankaios/releases/download/<version>/install.sh | bash -s -- -v <version>\n
For available versions see the list of releases.
"},{"location":"usage/installation/#set-the-log-level-for-ank-server-and-ank-agent-services","title":"Set the log level forank-server
and ank-agent
services","text":"To configure the log levels for ank-server
and ank-agent
during the installation process using the provided environment variables, follow these steps:
Set the desired log levels for each service by assigning valid values to the environment variables INSTALL_ANK_SERVER_RUST_LOG
and INSTALL_ANK_AGENT_RUST_LOG
. For the syntax see the documentation for RUST_LOG
.
Run the installation script, making sure to pass these environment variables as arguments if needed:
For a specific version:
curl -sfL https://github.com/eclipse-ankaios/ankaios/releases/download/<version>/install.sh | INSTALL_ANK_SERVER_RUST_LOG=debug INSTALL_ANK_AGENT_RUST_LOG=info bash -s -- -i /usr/local/bin -t both -v <version>\n
For the latest version:
curl -sfL https://github.com/eclipse-ankaios/ankaios/releases/download/latest/install.sh | INSTALL_ANK_SERVER_RUST_LOG=debug INSTALL_ANK_AGENT_RUST_LOG=info bash -s -- -i /usr/local/bin -t both\n
Now, both services will output logs according to the specified log levels. If no explicit value was provided during installation, both services will default to info
log level. You can always change the log level by updating the environment variables and reinstalling the services.
If Ankaios has been installed with the installation script, it can be uninstalled with:
ank-uninstall.sh\n
The folder /etc/ankaios
will remain.
As an alternative to the installation script, the pre-built binaries can be downloaded manually from the Ankaios repository here. This is useful if the automatic platform detection fails because the uname
system command is not allowed or supported on the target.
For building Ankaios from source see Build.
"},{"location":"usage/mtls-setup/","title":"Setting up Ankaios with mTLS","text":"Mutual TLS (mTLS) is a security protocol that verifies both the client and server identities before establishing a connection. In Ankaios mTLS can be used to secure communication between the server, agent and ank CLI.
"},{"location":"usage/mtls-setup/#prerequisites","title":"Prerequisites","text":"To set up mTLS with OpenSSL, perform the following actions:
First we need to create a folder to keep certificates and keys for ank-server
and ank-agent
:
sudo mkdir -p /etc/ankaios/certs\n
Then we need to create a folder to keep certificates and keys for the ank
CLI:
mkdir -p \"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios\"\n
"},{"location":"usage/mtls-setup/#generate-ca-keys-and-certificate","title":"Generate CA keys and certificate","text":"Construct an OpenSSL configuration file named ca.cnf
. You are welcome to include additional fields if necessary:
[req]\ndistinguished_name = req_distinguished_name\nprompt = no\n\n[req_distinguished_name]\nCN = ankaios-ca\n
Generate CA key:
sudo openssl genpkey -algorithm ED25519 -out \"./ca-key.pem\"\n
Generate CA certificate:
sudo openssl req -config \"./ca.cnf\" -new -x509 -key \"./ca-key.pem\" -out \"/etc/ankaios/certs/ca.pem\"\n
"},{"location":"usage/mtls-setup/#generate-key-and-certificate-for-ank-server","title":"Generate key and certificate for ank-server
","text":"Construct an OpenSSL configuration file named ank-server.cnf
. You are welcome to include additional fields if necessary:
[req]\ndistinguished_name = req_distinguished_name\nreq_extensions = v3_req\nprompt = no\n\n[req_distinguished_name]\nCN = ank-server\n\n[v3_req]\nsubjectAltName = @alt_names\nextendedKeyUsage = serverAuth\n\n[alt_names]\nDNS.1 = ank-server\n
Generate ank-server key:
sudo openssl genpkey -algorithm ED25519 -out \"/etc/ankaios/certs/ank-server-key.pem\"\n
Generate ank-server certificate signing request:
sudo openssl req -config \"./ank-server.cnf\" -new -key \"/etc/ankaios/certs/ank-server-key.pem\" -out \"./ank-server.csr\"\n
Generate ank-server certificate:
sudo openssl x509 -req -in \"./ank-server.csr\" -CA \"/etc/ankaios/certs/ca.pem\" -CAkey \"./ca-key.pem\" -extensions v3_req -extfile \"./ank-server.cnf\" -out \"/etc/ankaios/certs/ank-server.pem\"\n
"},{"location":"usage/mtls-setup/#generate-key-and-certificate-for-ank-agent","title":"Generate key and certificate for ank-agent
","text":"Construct an OpenSSL configuration file named ank-agent.cnf
. You are welcome to include additional fields if necessary:
[req]\ndistinguished_name = req_distinguished_name\nreq_extensions = v3_req\nprompt = no\n\n[req_distinguished_name]\nCN = ank-agent\n\n[v3_req]\nsubjectAltName = @alt_names\nextendedKeyUsage = clientAuth\n\n[alt_names]\n# This certificate can only be used for agents with the names 'agent_A' or 'agent_B'\n# To allow the usage for any agent use the character '*'\n# like: DNS.1 = *\nDNS.1 = agent_A\nDNS.2 = agent_B\n
Generate ank-agent key:
sudo openssl genpkey -algorithm ED25519 -out \"/etc/ankaios/certs/ank-agent-key.pem\"\n
Generate ank-agent certificate signing request:
sudo openssl req -config \"./ank-agent.cnf\" -new -key \"/etc/ankaios/certs/ank-agent-key.pem\" -out \"./ank-agent.csr\"\n
Generate ank-agent certificate:
sudo openssl x509 -req -in \"./ank-agent.csr\" -CA \"/etc/ankaios/certs/ca.pem\" -CAkey \"./ca-key.pem\" -extensions v3_req -extfile \"./ank-agent.cnf\" -out \"/etc/ankaios/certs/ank-agent.pem\"\n
"},{"location":"usage/mtls-setup/#generate-key-and-certificate-for-the-cli-ank","title":"Generate key and certificate for the CLI ank
","text":"Construct an OpenSSL configuration file named ank.cnf
. You are welcome to include additional fields if necessary:
[req]\ndistinguished_name = req_distinguished_name\nreq_extensions = v3_req\nprompt = no\n[req_distinguished_name]\nCN = ank\n\n[v3_req]\nsubjectAltName = @alt_names\nextendedKeyUsage = clientAuth\n\n[alt_names]\nDNS.1 = ank\n
Generate ank key:
openssl genpkey -algorithm ED25519 -out \"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank-key.pem\"\n
Generate ank certificate signing request:
openssl req -config \"./ank.cnf\" -new -key \"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank-key.pem\" -out \"./ank.csr\"\n
Generate ank certificate:
sudo openssl x509 -req -in \"./ank.csr\" -CA \"/etc/ankaios/certs/ca.pem\" -CAkey \"./ca-key.pem\" -extensions v3_req -extfile \"./ank.cnf\" -out \"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank.pem\"\n
"},{"location":"usage/mtls-setup/#perform-ankaios-installation-with-mtls-support","title":"Perform Ankaios installation with mTLS support","text":"To set up Ankaios with mTLS support, you need to supply the necessary mTLS certificates to the ank-server
, ank-agent
, and ank
CLI components. Here's a step-by-step guide:
ank-server
and ank-agent
with mTLS certificates","text":"curl -sfL https://github.com/eclipse-ankaios/ankaios/releases/latest/download/install.sh | bash -s -- -s \"--startup-config /etc/ankaios/state.yaml --ca_pem /etc/ankaios/certs/ca.pem --crt_pem /etc/ankaios/certs/ank-server.pem --key_pem /etc/ankaios/certs/ank-server-key.pem\" -a \"--name agent_A --ca_pem /etc/ankaios/certs/ca.pem --crt_pem /etc/ankaios/certs/ank-agent.pem --key_pem /etc/ankaios/certs/ank-agent-key.pem\"\n
Start the Ankaios server and an Ankaios agent as described in the Quickstart and continue below to configure the CLI with mTLS.
"},{"location":"usage/mtls-setup/#configure-the-ank-cli-with-mtls-certificates","title":"Configure theank
CLI with mTLS certificates","text":"To make it easier, we will set the mTLS certificates for the ank
CLI by using environment variables:
export ANK_CA_PEM=/etc/ankaios/certs/ca.pem\nexport ANK_CRT_PEM=${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank.pem\nexport ANK_KEY_PEM=${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank-key.pem\n
Now you can use the ank
CLI as follows:
ank get workloads\n
Or in a single line call:
ANK_CA_PEM=/etc/ankaios/certs/ca.pem ANK_CRT_PEM=${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank.pem ANK_KEY_PEM=${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank-key.pem ank get workloads\n
Alternatively, you can pass the mTLS certificates as command line arguments:
ank --ca_pem=/etc/ankaios/certs/ca.pem --crt_pem=\"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank.pem\" --key_pem=\"${XDG_CONFIG_HOME:-$HOME/.config}/ankaios/ank-key.pem\" get workloads\n
"},{"location":"usage/quickstart/","title":"Quickstart","text":"If you have not installed Ankaios, please follow the instructions here. The following examples assume that the installation script has been used with default options.
You can start workloads in Ankaios in a number of ways. For example, you can define a file with the startup configuration and use systemd to start Ankaios. The startup configuration file contains all of the workloads and their configuration that you want to be started by Ankaios.
Let's modify the default config which is stored in /etc/ankaios/state.yaml
:
apiVersion: v0.1\nworkloads:\n nginx:\n runtime: podman\n agent: agent_A\n restartPolicy: ALWAYS\n tags:\n - key: owner\n value: Ankaios team\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"8081:80\"]\n
Then we can start the Ankaios server:
sudo systemctl start ank-server\n
The Ankaios server will read the config but detect that no agent with the name agent_A
is available that could start the workload, see logs with:
journalctl -t ank-server\n
Now let's start an agent:
sudo systemctl start ank-agent\n
This Ankaios agent will run the workload that has been assigned to it. We can use the Ankaios CLI to check the current state:
ank -k get state\n
Note
The instructions assume the default installation without mutual TLS (mTLS) for communication. With -k
or --insecure
the ank
CLI will connect without mTLS. Alternatively, set the environment variable ANK_INSECURE=true
to avoid passing the argument to each ank
CLI command. For an Ankaios setup with mTLS, see here.
which returns:
desiredState:\n apiVersion: v0.1\n workloads:\n nginx:\n agent: agent_A\n tags:\n - key: owner\n value: Ankaios team\n dependencies: {}\n restartPolicy: ALWAYS\n runtime: podman\n runtimeConfig: |\n image: docker.io/nginx:latest\n commandOptions: [\"-p\", \"8081:80\"]\n configs: {}\n configs: {}\nworkloadStates:\n agent_A:\n nginx:\n cc74dd34189ef3181a2f15c6c5f5b0e76aaefbcd55397e15314e7a25bad0864b:\n state: Running\n subState: Ok\n additionalInfo: ''\nagents:\n agent_A:\n cpuUsage: 2\n freeMemory: 7989682176\n
or
ank -k get workloads\n
which results in:
WORKLOAD NAME AGENT RUNTIME EXECUTION STATE ADDITIONAL INFO\nnginx agent_A podman Running(Ok)\n
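The CompleteState returned by ank -k get state above can also be consumed programmatically once parsed. A minimal sketch, using a plain Python dict with the same shape as the YAML output (the YAML parsing step itself is left out):

```python
# A CompleteState with the same shape as the `ank -k get state` output above.
state = {
    "desiredState": {
        "apiVersion": "v0.1",
        "workloads": {"nginx": {"agent": "agent_A", "runtime": "podman"}},
    },
    "workloadStates": {
        "agent_A": {
            "nginx": {
                "cc74dd34189ef3181a2f15c6c5f5b0e76aaefbcd55397e15314e7a25bad0864b": {
                    "state": "Running",
                    "subState": "Ok",
                    "additionalInfo": "",
                }
            }
        }
    },
}

# Collect (agent, workload, state) triples from the nested workloadStates map,
# i.e. the same information the `ank -k get workloads` table shows.
rows = [
    (agent, name, inst["state"])
    for agent, workloads in state["workloadStates"].items()
    for name, instances in workloads.items()
    for inst in instances.values()
]
assert rows == [("agent_A", "nginx", "Running")]
```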
Ankaios also supports adding and removing workloads dynamically. To add another workload call:
ank -k run workload \\\nhelloworld \\\n--runtime podman \\\n--agent agent_A \\\n--config 'image: docker.io/busybox:1.36\ncommandOptions: [ \"-e\", \"MESSAGE=Hello World\"]\ncommandArgs: [ \"sh\", \"-c\", \"echo $MESSAGE\"]'\n
We can check the state again with ank -k get state
and see, that the workload helloworld
has been added to desiredState.workloads
and the execution state is available in workloadStates
.
As the workload had a one time job its state is Succeeded(Ok)
and we can delete it from the state again with:
ank -k delete workload helloworld\n
Note
Workload names must not be longer than 63 characters and can contain only letters, digits, and the \"-\" and \"_\" symbols.
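The naming rule above can be expressed as a simple check. A minimal sketch — the exact regular expression is an assumption derived from the rule as stated, not taken from the Ankaios sources:

```python
import re

# Assumed encoding of the documented rule: 1 to 63 characters, each of which
# is a letter, a digit, "-" or "_".
WORKLOAD_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,63}")

def is_valid_workload_name(name: str) -> bool:
    """Return True if `name` satisfies the documented naming rule."""
    return WORKLOAD_NAME_RE.fullmatch(name) is not None

assert is_valid_workload_name("helloworld")
assert not is_valid_workload_name("hello world")  # space is not allowed
assert not is_valid_workload_name("a" * 64)       # longer than 63 characters
```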
For next steps follow the tutorial on sending and receiving vehicle data with workloads orchestrated by Ankaios. Then also check the reference documentation for the startup configuration including the podman-kube
runtime and also working with the complete state data structure.
Ankaios supports command completion for the ank
CLI in various shells.
Note
For dynamic completion (workloads etc.) to work, the ank
CLI must be configured via environment variables. To use a non-default server URL, provide ANK_SERVER_URL
. Also provide either ANK_INSECURE=true
or ANK_CA_PEM
, ANK_CRT_PEM
and ANK_KEY_PEM
.
Add the following lines to your ~/.bashrc
:
if command -v ank &> /dev/null; then\n source <(COMPLETE=bash ank)\nfi\n
"},{"location":"usage/shell-completion/#z-shell-zsh","title":"Z shell (zsh)","text":"Add the following lines to your ~/.zshrc
:
if command -v ank &> /dev/null; then\n source <(COMPLETE=zsh ank)\nfi\n
"},{"location":"usage/shell-completion/#fish","title":"Fish","text":"Add the following lines to your ~/.config/fish/config.fish
:
if type -q ank\n source (COMPLETE=fish ank | psub)\nend\n
"},{"location":"usage/shell-completion/#elvish","title":"Elvish","text":"echo \"eval (COMPLETE=elvish ank)\" >> ~/.elvish/rc.elv\n
"},{"location":"usage/shell-completion/#powershell","title":"Powershell","text":"echo \"COMPLETE=powershell ank | Invoke-Expression\" >> $PROFILE\n
"},{"location":"usage/tutorial-vehicle-signals/","title":"Tutorial: Sending and receiving vehicle signals","text":""},{"location":"usage/tutorial-vehicle-signals/#introduction","title":"Introduction","text":"In this tutorial, we will show you how to use Ankaios to set up workloads that publish and subscribe to vehicle signals in accordance with the Vehicle Signal Specification (VSS). The central workload will be a databroker from the Kuksa.val project. It will receive vehicle speed signals published from a speed provider workload. Finally, a speed consumer workload will consume those speed values.
Overview of workloads
To run this tutorial you will need a Linux platform, which can be a Raspberry Pi, a Linux PC, or a virtual machine. Additionally, it's assumed that the Ankaios setup is done with mutual TLS (mTLS) disabled or using its default installation settings.
"},{"location":"usage/tutorial-vehicle-signals/#start-the-databroker","title":"Start the databroker","text":"If you have not yet installed Ankaios, please follow the instructions here. The following examples assume that the installation script has been used with the default options.
Make sure that Ankaios server and agent are started:
sudo systemctl start ank-server\nsudo systemctl start ank-agent\n
Now we have Ankaios up and running with a server and an agent. To run the databroker we need to create an Ankaios manifest:
databroker.yamlapiVersion: v0.1\nworkloads:\n databroker:\n runtime: podman\n agent: agent_A\n runtimeConfig: |\n image: ghcr.io/eclipse/kuksa.val/databroker:0.4.1\n commandArgs: [\"--insecure\"]\n commandOptions: [\"--net=host\"]\n
This defines a workload databroker
to be scheduled on agent agent_A
(default agent name when using standard installation procedure) using the runtime podman
. See the reference documentation for the other attributes.
Let's have a look at the runtimeConfig
which in this case is specific for the podman
runtime.
image: ghcr.io/eclipse/kuksa.val/databroker:0.4.1
specifies the container image according to the OCI image format. Fortunately, the Kuksa.val project already provides an image for the databroker that we can use here.commandArgs: [\"--insecure\"]
: These are command arguments which are passed to the container, in this case to the container's entrypoint. As we are not using authentication for the databroker we pass the argument --insecure
.commandOptions: [\"--net=host\"]
: These options are passed to the podman run
command. We want to use the host network for the databroker.Store the Ankaios manifest listed above in a file databroker.yaml
.
Then start the workload:
ank -k apply databroker.yaml\n
Note
The instructions assume the default installation without mutual TLS (mTLS) for communication. With -k
or --insecure
the ank
CLI will connect without mTLS. Alternatively, set the environment variable ANK_INSECURE=true
to avoid passing the argument to each ank
CLI command. For an Ankaios setup with mTLS, see here.
The Ankaios agent agent_A
will now instruct podman to start the workload. The command waits until the databroker is running. It should finally print:
WORKLOAD NAME AGENT RUNTIME EXECUTION STATE ADDITIONAL INFO\n databroker agent_A podman Running(Ok)\n
"},{"location":"usage/tutorial-vehicle-signals/#start-the-speed-provider","title":"Start the speed provider","text":"Now we want to start a workload that publishes vehicle speed values and call that speed-provider
.
apiVersion: v0.1\nworkloads:\n speed-provider:\n runtime: podman\n agent: agent_A\n runtimeConfig: |\n image: ghcr.io/eclipse-ankaios/speed-provider:0.1.1\n commandOptions:\n - \"--net=host\"\n
The source code for that image is available in the Ankaios repo.
Start the workload with:
ank -k apply speed-provider.yaml\n
The command waits until the speed-provider is running. It should finally print:
WORKLOAD NAME AGENT RUNTIME EXECUTION STATE ADDITIONAL INFO\n speed-provider agent_A podman Running(Ok)\n
The speed-provider workload provides a web UI that allows the user to enter a speed value that is then sent to the databroker. The web UI is available on http://127.0.0.1:5000. If your web browser is running on a different host than the Ankaios agent, replace 127.0.0.1 with the IP address of the host running the Ankaios agent.
Speed provider web UI"},{"location":"usage/tutorial-vehicle-signals/#add-an-agent","title":"Add an agent","text":"
We currently have one agent in the Ankaios cluster, which runs the databroker and the speed provider. The next workload we want to start is a speed consumer that consumes vehicle speed values. A speed consumer such as a navigation system typically runs on a separate infotainment node. A separate node requires a new Ankaios agent. Let's create another Ankaios agent that connects to the existing server. For this tutorial we can either use a separate Linux host or reuse the existing one. Start a new agent with:
ank-agent -k --name infotainment --server-url http://<SERVER_IP>:25551\n
If the agent is started on the same host as the existing Ankaios server and agent, then we will call it as follows:
ank-agent -k --name infotainment --server-url http://127.0.0.1:25551\n
As the first agent was started by systemd, it runs as root and therefore calls podman as root. The second agent is started by a non-root user and therefore also uses podman in user mode. Ankaios does not need root privileges and can be started as any user.
Now we have two agents running in the Ankaios cluster, agent_A
and infotainment
.
For the next steps we need to keep this terminal untouched in order to keep the agent running.
"},{"location":"usage/tutorial-vehicle-signals/#list-the-connected-agents","title":"List the connected agents","text":"Let's verify that the new infotainment
agent has connected to the Ankaios server by running the following command, which will list all Ankaios agents currently connected to the Ankaios server, along with their number of workloads:
ank -k get agents\n
It should print:
NAME WORKLOADS CPU USAGE FREE MEMORY\nagent_A 2 42.42% 42B\ninfotainment 0 42.42% 42B\n
Since agent_A
is already managing the databroker
and the speed-provider
workloads, the WORKLOADS
column contains the number 2
. The Ankaios agent infotainment
has recently been started and does not yet manage any workloads.
Note
The currently connected Ankaios agents are part of the CompleteState and can also be retrieved working with the CompleteState.
"},{"location":"usage/tutorial-vehicle-signals/#start-the-speed-consumer","title":"Start the speed consumer","text":"Now we can start a speed-consumer workload on the new agent:
speed-consumer.yamlapiVersion: v0.1\nworkloads:\n speed-consumer:\n runtime: podman\n runtimeConfig: |\n image: ghcr.io/eclipse-ankaios/speed-consumer:0.1.2\n commandOptions:\n - \"--net=host\"\n - \"-e\"\n - \"KUKSA_DATA_BROKER_ADDR=127.0.0.1\"\n
In case the speed-consumer workload is not running on the same host as the databroker you need to adjust the KUKSA_DATA_BROKER_ADDR
.
Note that this time the manifest does not specify the agent. While we could add agent: infotainment
, this time we pass the agent name when the workload starts:
ank -k apply --agent infotainment speed-consumer.yaml\n
Note
If you are running the ank command on a host that is different from the host on which the Ankaios server is running, you need to add a parameter -s <SERVER_URL>
like:
ank -k apply -s http://127.0.0.1:25551 --agent infotainment speed-consumer.yaml\n
Optionally the server URL can also be provided via environment variable:
export ANK_SERVER_URL=http://127.0.0.1:25551\nank -k apply --agent infotainment speed-consumer.yaml\n
The command waits until speed consumer is running. It should print:
WORKLOAD NAME AGENT RUNTIME EXECUTION STATE ADDITIONAL INFO\n speed-consumer infotainment podman Running(Ok)\n
We can check all running workloads with
ank -k get workloads\n
The output should be:
WORKLOAD NAME AGENT RUNTIME EXECUTION STATE ADDITIONAL INFO\n databroker agent_A podman Running(Ok)\n speed-consumer infotainment podman Running(Ok)\n speed-provider agent_A podman Running(Ok)\n
Optionally, you can re-run the previous ank -k get agents
command to verify that the number of workloads managed by the infotainment
agent has now increased.
The speed-consumer workload subscribes to the vehicle speed signal and prints it to stdout. Use the web UI of the speed-provider to send a few vehicle speed values and watch the log messages of the speed-consumer. As the logs are specific for a runtime, we use Podman to read the logs:
podman logs -f $(podman ps -a | grep speed-consumer | awk '{print $1}')\n
Info
If you want to see the logs of the databroker or speed-provider you need to use sudo podman
instead of podman
(two occurrences) as those workloads run on podman as root on agent_A.
Now, we want to change the existing Ankaios manifest of the speed-provider to use auto mode, which sends a new speed value every second.
speed-provider.yamlapiVersion: v0.1\nworkloads:\n speed-provider:\n runtime: podman\n agent: agent_A\n runtimeConfig: |\n image: ghcr.io/eclipse-ankaios/speed-provider:0.1.1\n commandOptions:\n - \"--net=host\"\n - \"-e\"\n - \"SPEED_PROVIDER_MODE=auto\"\n
We apply the changes with:
ank -k apply speed-provider.yaml\n
and observe that we get a new speed value every second.
"},{"location":"usage/tutorial-vehicle-signals/#ankaios-state","title":"Ankaios state","text":"Previously we have used ank -k get workloads
to get a list of running workloads. Ankaios also maintains a current state which can be retrieved with:
ank -k get state\n
Let's delete all workloads and check the state again:
ank -k delete workload databroker speed-provider speed-consumer\nank -k get state\n
If we want to start the three workloads on startup of the Ankaios server and agents we need to create a startup manifest file. In the default installation this file is /etc/ankaios/state.yaml
as we can see in the systemd unit file of the Ankaios server:
$ systemctl cat ank-server\n# /etc/systemd/system/ank-server.service\n[Unit]\nDescription=Ankaios server\n\n[Service]\nEnvironment=\"RUST_LOG=info\"\nExecStart=/usr/local/bin/ank-server --insecure --startup-config /etc/ankaios/state.yaml\n\n[Install]\nWantedBy=default.target\n
Now we create a startup manifest file containing all three workloads:
/etc/ankaios/state.yamlapiVersion: v0.1\nworkloads:\n databroker:\n runtime: podman\n agent: agent_A\n runtimeConfig: |\n image: ghcr.io/eclipse/kuksa.val/databroker:0.4.1\n commandArgs: [\"--insecure\"]\n commandOptions: [\"--net=host\"]\n speed-provider:\n runtime: podman\n agent: agent_A\n dependencies:\n databroker: ADD_COND_RUNNING\n runtimeConfig: |\n image: ghcr.io/eclipse-ankaios/speed-provider:0.1.1\n commandOptions:\n - \"--net=host\"\n - \"-e\"\n - \"SPEED_PROVIDER_MODE=auto\"\n speed-consumer:\n runtime: podman\n agent: infotainment\n dependencies:\n databroker: ADD_COND_RUNNING\n runtimeConfig: |\n image: ghcr.io/eclipse-ankaios/speed-consumer:0.1.2\n commandOptions:\n - \"--net=host\"\n - \"-e\"\n - \"KUKSA_DATA_BROKER_ADDR=127.0.0.1\"\n
As the speed-provider and the speed-consumer shall only be started after the databroker is running, we have added dependencies:
dependencies:\n databroker: ADD_COND_RUNNING\n
The next time the Ankaios server and the two agents will be started, this startup config will be applied.
"},{"location":"usage/tutorial-vehicle-signals/#define-re-usable-configuration","title":"Define re-usable configuration","text":"Let's improve the previous startup manifest by introducing a templated configuration for workloads to avoid configuration repetition and have a single point of change. The supported fields and syntax are described here.
/etc/ankaios/state.yamlapiVersion: v0.1\nworkloads:\n databroker:\n runtime: podman\n agent: \"{{agent.name}}\" # (1)!\n configs:\n agent: agents # (2)!\n network: network # (3)!\n runtimeConfig: | # (4)!\n image: ghcr.io/eclipse/kuksa.val/databroker:0.4.1\n commandArgs: [\"--insecure\"]\n commandOptions: [\"--net={{network}}\"]\n speed-provider:\n runtime: podman\n agent: \"{{agent.name}}\"\n dependencies:\n databroker: ADD_COND_RUNNING\n configs:\n agent: agents\n net: network\n env: env_provider # (5)!\n runtimeConfig: | # (6)!\n image: ghcr.io/eclipse-ankaios/speed-provider:0.1.1\n commandOptions:\n - \"--net={{net}}\"\n {{#each env}}\n - \"-e {{this.key}}={{this.value}}\"\n {{/each}}\n speed-consumer:\n runtime: podman\n agent: infotainment\n dependencies:\n databroker: ADD_COND_RUNNING\n configs:\n network: network\n env: env_consumer # (7)!\n runtimeConfig: | # (8)!\n image: ghcr.io/eclipse-ankaios/speed-consumer:0.1.2\n commandOptions:\n - \"--net={{network}}\"\n {{#each env}}\n - \"-e {{this.key}}={{this.value}}\"\n {{/each}}\nconfigs: # (9)!\n network: host\n env_provider:\n - key: SPEED_PROVIDER_MODE\n value: auto\n env_consumer:\n - key: KUKSA_DATA_BROKER_ADDR\n value: \"127.0.0.1\"\n agents:\n name: agent_A\n
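As a rough illustration of what the templated fields above expand to, the substitution can be mimicked by hand. This sketch is illustrative only — it is not Ankaios' actual handlebars-style renderer — and uses the values from the configs section of the manifest above:

```python
# Values taken from the configs section of the manifest above.
network = "host"
env_provider = [{"key": "SPEED_PROVIDER_MODE", "value": "auto"}]
agents = {"name": "agent_A"}

# Hand-rolled equivalent of the {{agent.name}} / {{net}} placeholders and the
# {{#each env}} loop in the speed-provider's runtimeConfig.
agent = agents["name"]
command_options = [f"--net={network}"] + [
    f"-e {e['key']}={e['value']}" for e in env_provider
]

assert agent == "agent_A"
assert command_options == ["--net=host", "-e SPEED_PROVIDER_MODE=auto"]
```

Changing a value once in the configs section thus propagates to every workload that references it.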
Start the Ankaios cluster again, by executing the following command:
sudo systemctl start ank-server\nsudo systemctl start ank-agent\n
Start the infotainment
agent, remembering to change the server URL if the agent is not running on the same host:
ank-agent -k --name infotainment --server-url http://127.0.0.1:25551\n
Verify again that all workloads are up and running.
"},{"location":"usage/tutorial-vehicle-signals/#update-configuration-items","title":"Update configuration items","text":"Let's update the content of a configuration item with the ank apply
command.
Using ank apply
:
apiVersion: v0.1\nconfigs:\n env_provider:\n - key: SPEED_PROVIDER_MODE\n value: webui\n
ank -k apply new-manifest.yaml\n
Ankaios will update workloads that reference an updated configuration item. After running this command, the speed-provider
workload has been updated to run in the 'webui' mode.
You can verify this by re-opening the web UI on http://127.0.0.1:5000.
"},{"location":"usage/tutorial-vehicle-signals/#list-configuration-items","title":"List configuration items","text":"Let's list the configuration items present in current state with the ank get configs
command.
Using ank -k get configs
, it should print:
CONFIGS\nnetwork\nenv_provider\nenv_consumer\nagents\n
"},{"location":"usage/tutorial-vehicle-signals/#delete-configuration-items","title":"Delete configuration items","text":"Let's try to delete a configuration item still referenced by workloads in its configs
field by re-using the previous manifest content.
ank -k delete config env_provider\n
The command returns an error that the rendering of the new state fails due to a missing configuration item.
Ankaios will always reject a new state if it fails to render. The speed-provider
still references the configuration item in its configs
field which would no longer exist.
Running the ank -k get state
command afterwards will show that Ankaios still has the previous state in memory.
To remove configuration items, remove the configuration references for the desired configuration items in the workload's configs
field, and remove the desired configuration items from the state.
When upgrading from v0.2 to v0.3, the installation script simply needs to be run again. However, due to breaking changes, some manual adjustments are required for existing configurations and workloads.
"},{"location":"usage/upgrading/v0_2_to_v0_3/#configurations","title":"Configurations","text":"CompleteState
currentState
has been renamed to desiredState
State
apiVersion
was added to avoid incompatibility issues.restart
has been supplemented with a restartPolicy
enum.configs
and cronjobs
have been removed for now as they are not implemented yet.Workload
accessRights
and updateStrategy
have been removed for now as they are not implemented yet. Applications using the control interface or communicating directly with the Ankaios server (custom CLIs) need to be adapted.
The two main messages have been renamed:
StateChangeRequest
-> ToServer
ExecutionRequest
-> FromServer
A new type of ToServer
message, Request
, has been introduced. Every Request
to the server requires a requestId
which is used by the server for the response message. Request IDs allow sending multiple parallel requests to the server. The two messages UpdateStateRequest
and CompleteStateRequest
have been moved to the new Request
message.
A new type of FromServer
message, Response
, has been introduced. A Response
message is always an answer from the Server to a Request
message. The Response
message contains the same requestId
as the answered Request
message. This makes it possible to identify the correct Response
. The CompleteState
message has been moved to the new Response
message. Additionally, the Ankaios server now responds to an UpdateStateRequest
with an UpdateStateSuccess
or Error
message, which are both of type Response
.
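The request/response correlation described above can be sketched as follows. This is a minimal, hypothetical illustration of the requestId mechanism; the Client type and the string request kinds are simplified stand-ins, not Ankaios' actual protobuf API.

```rust
use std::collections::HashMap;

// Simplified stand-in for the Response variants described above
// (UpdateStateSuccess, Error, CompleteState); not the real protobuf types.
#[derive(Debug, Clone, PartialEq)]
enum Response {
    UpdateStateSuccess,
    CompleteState(String),
    Error(String),
}

// A hypothetical client that keeps pending requests keyed by requestId,
// so several requests can be in flight at the same time.
struct Client {
    next_id: u64,
    pending: HashMap<String, &'static str>, // requestId -> kind of request
}

impl Client {
    fn new() -> Self {
        Client { next_id: 0, pending: HashMap::new() }
    }

    // Issue a request and remember its id for later correlation.
    fn send_request(&mut self, kind: &'static str) -> String {
        let id = format!("req-{}", self.next_id);
        self.next_id += 1;
        self.pending.insert(id.clone(), kind);
        id
    }

    // Match an incoming Response back to the request it answers,
    // using the requestId echoed by the server.
    fn handle_response(&mut self, request_id: &str, resp: Response) -> Option<(&'static str, Response)> {
        self.pending.remove(request_id).map(|kind| (kind, resp))
    }
}

fn main() {
    let mut client = Client::new();
    let id_a = client.send_request("UpdateStateRequest");
    let id_b = client.send_request("CompleteStateRequest");

    // Responses may arrive in any order; the id tells them apart.
    let (kind, _) = client
        .handle_response(&id_b, Response::CompleteState("...".into()))
        .unwrap();
    assert_eq!(kind, "CompleteStateRequest");
    let (kind, _) = client
        .handle_response(&id_a, Response::UpdateStateSuccess)
        .unwrap();
    assert_eq!(kind, "UpdateStateRequest");
}
```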
When upgrading from v0.3 to v0.4, the installation script simply needs to be run again. However, due to breaking changes, some manual adjustments are required for existing workloads using the control interface and for applications directly using the gRPC API of the Ankaios server.
"},{"location":"usage/upgrading/v0_3_to_v0_4/#optional-attributes-of-the-complete-state","title":"Optional attributes of the Complete State","text":"Ankaios allows filtering the Complete State at request level and setting only certain fields of the Complete State while updating the desired state of the cluster. To make this process more transparent and remove the need to return or require default values for fields not targeted by the filter masks, Ankaios now explicitly handles all fields (besides versions) of the Complete State as optional. This allows returning only portions of the Complete State, e.g., when filtering with desiredState.workloads.nginx.tags
the response from the server will be:
desiredState:\n apiVersion: v0.1\n workloads:\n nginx:\n tags:\n - key: owner\n value: Ankaios team\n
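The effect of a filter mask like the one above can be sketched with a toy tree. The Node type and filter function below are illustrative assumptions for the sketch, not Ankaios' implementation, which operates on the real Complete State messages.

```rust
use std::collections::HashMap;

// A toy nested state tree; the real Complete State is a protobuf message.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Leaf(String),
    Map(HashMap<String, Node>),
}

// Keep only the subtree selected by a dotted filter mask such as
// "desiredState.workloads.nginx.tags" (here already split into segments);
// everything outside the mask is omitted, mirroring how the server now
// returns only the requested portion of the Complete State.
fn filter(node: &Node, mask: &[&str]) -> Option<Node> {
    match (node, mask) {
        // Empty mask: keep the whole remaining subtree.
        (_, []) => Some(node.clone()),
        // Descend one level and re-wrap the kept child under its key.
        (Node::Map(children), [head, rest @ ..]) => {
            children.get(*head).and_then(|child| filter(child, rest)).map(|kept| {
                let mut m = HashMap::new();
                m.insert((*head).to_string(), kept);
                Node::Map(m)
            })
        }
        // Mask points below a leaf: nothing matches.
        _ => None,
    }
}

fn main() {
    let mut nginx = HashMap::new();
    nginx.insert("tags".to_string(), Node::Leaf("owner: Ankaios team".to_string()));
    nginx.insert("runtime".to_string(), Node::Leaf("podman".to_string()));
    let mut workloads = HashMap::new();
    workloads.insert("nginx".to_string(), Node::Map(nginx));
    let mut state = HashMap::new();
    state.insert("workloads".to_string(), Node::Map(workloads));
    let root = Node::Map(state);

    // Only the "tags" branch survives; "runtime" is dropped.
    let filtered = filter(&root, &["workloads", "nginx", "tags"]).unwrap();
    let mut expected_nginx = HashMap::new();
    expected_nginx.insert("tags".to_string(), Node::Leaf("owner: Ankaios team".to_string()));
    let mut expected_workloads = HashMap::new();
    expected_workloads.insert("nginx".to_string(), Node::Map(expected_nginx));
    let mut expected = HashMap::new();
    expected.insert("workloads".to_string(), Node::Map(expected_workloads));
    assert_eq!(filtered, Node::Map(expected));
}
```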
These changes also require additional handling when pushing data over the Control Interface, as some fields must now be enclosed in wrapper objects, e.g., the Rust code for creating a workload object now looks as follows:
Workload {\n runtime: Some(\"podman\".to_string()),\n agent: Some(\"agent_A\".to_string()),\n restart_policy: Some(RestartPolicy::Never.into()),\n tags: Some(Tags {\n tags: vec![Tag {\n key: \"owner\".to_string(),\n value: \"Ankaios team\".to_string(),\n }],\n }),\n runtime_config: Some(\n \"image: docker.io/library/nginx\\ncommandOptions: [\\\"-p\\\", \\\"8080:80\\\"]\"\n .to_string(),\n ),\n dependencies: Some(Dependencies {\n dependencies: HashMap::new(),\n }),\n control_interface_access: None,\n}\n
Please review the examples from the Ankaios repository for more information on the topic.
"},{"location":"usage/upgrading/v0_3_to_v0_4/#removed-top-level-attribute-startupstate","title":"Removed top level attribute startupState
","text":"The top-level attribute startupState
was removed from the Ankaios configuration. Initially, we aimed to allow modification of the cluster's startup state via Ankaios' control interface. As the requirements for persistent storage in embedded environments can be quite different, e.g., due to flash wear-out protection, it is best to let a dedicated application perform the updates of the startup state. The startup state update app could run as an Ankaios workload, but would be written specifically for the distinct use case, obeying its particular requirements.
The control interface has been decoupled from the API for server-agent communication, now exclusively handling essential messages with newly named identifiers for clarity.
To upgrade to the new version v0.4, use the new control_api.proto
file and the two new messages:
ToAnkaios
FromAnkaios
The new messages currently support requests and responses to and from Ankaios and will later support other functionality. The Request
and Response
messages and their content remain the same, but are now located in the ank_base.proto
file.
A sample showing how the new definition of the Control Interface is used can be found in the examples from the Ankaios repository.
The reason for splitting some messages into the dedicated file ank_base.proto
is that they are also used for the gRPC API of the Ankaios server. This API is mainly used by the Ankaios agents and the ank
CLI, but could also be used by third party applications to directly communicate with the Ankaios server. The following chapter details the changes needed to upgrade to v0.4 in case you are using this API.
Using the control interface now requires explicit authorization in the workload configuration. The authorization is done via the new controlInterfaceAccess
attribute.
The following configuration shows an example where the workload composer
can update all other workloads except the workload watchdog
for which an explicit deny rule is added:
desiredState:\n workloads:\n composer:\n runtime: podman\n ...\n controlInterfaceAccess:\n allowRules:\n - type: StateRule\n operation: ReadWrite\n filterMask:\n - \"desiredState.workloads.*\"\n denyRules:\n - type: StateRule\n operation: Write\n filterMask:\n - \"desiredState.workloads.watchdog\"\n
More information on the control interface authorization can be found in the reference documentation.
"},{"location":"usage/upgrading/v0_3_to_v0_4/#grpc-api-of-the-ankaios-server","title":"gRPC API of the Ankaios server","text":"Ankaios facilitates server-agent-CLI communication through an interchangeable middleware, currently implemented using gRPC. By segregating the gRPC API into a distinct grpc_api.proto
file, we clearly show the target and purpose of this interface.
If you are using the gRPC API of the Ankaios server directly (and not the CLI), you will need to account for the splitting of the messages into grpc_api.proto
and ank_base.proto
. Apart from that, the API itself is exactly the same.
The structure of the workload execution states field in the Complete State was changed in both the proto and the textual (yaml/json) representations. The change was needed to make the filtering and authorization of workload state requests more intuitive. The old flat vector was superseded by a new hierarchical structure. Here is an example of how the workload states now look in YAML format:
workloadStates:\n agent_A:\n nginx:\n 7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d:\n state: Pending\n subState: Initial\n additionalInfo: ''\n agent_B:\n hello1:\n 9f4dce2c90669cdcbd2ef8eddb4e38d6238abf721bbebffd820121ce1633f705:\n state: Failed\n subState: Lost\n additionalInfo: ''\n
"},{"location":"usage/upgrading/v0_3_to_v0_4/#authentication-and-encryption","title":"Authentication and encryption","text":"Starting from v0.4.0, Ankaios supports mutual TLS (mTLS) for communication between the server, the agents, and the ank
CLI. The default installation script will install Ankaios without mTLS. When using the ank
CLI with such an installation, the arguments --insecure
or -k
have to be passed.
So
ank get workloads\n
will have to be changed to
ank -k get workloads\n
Alternatively, set the environment variable ANK_INSECURE=true
to avoid passing the -k
argument to each ank
CLI command.
When upgrading from v0.4 to v0.5, the installation script simply needs to be run again. However, due to breaking changes, some manual adjustments are required for existing workloads using the control interface.
"},{"location":"usage/upgrading/v0_4_to_v0_5/#initial-hello-message-for-the-control-interface","title":"Initial Hello
message for the Control Interface","text":"In order to ensure version compatibility and avoid undefined behavior resulting from version mismatch, a new obligatory Hello
message was added to the Control Interface protocol. The Hello
must be sent by a workload communicating over the Control Interface as the first message of the session. It is part of the ToAnkaios
message and has the following format:
message Hello {\n string protocolVersion = 2; /// The protocol version used by the calling component.\n}\n
Failing to send the message before any other communication, or providing an unsupported version, results in a preliminary closing of the Control Interface session by Ankaios. The required protocolVersion
string is the current Ankaios release version. As Ankaios is currently in initial development (no official major release yet), minor version differences are also treated as incompatible. After the official major release, only the major versions will be compared.
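The compatibility rule just described can be sketched as a small check: before an official major release (major version 0), the minor versions must match exactly; afterwards, only the major versions are compared. The function below is an illustrative assumption about the rule's shape, not Ankaios' actual version-handling code.

```rust
// Parse "major.minor[.patch]" into its first two numeric components.
fn parse_major_minor(version: &str) -> Option<(u64, u64)> {
    let mut parts = version.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    Some((major, minor))
}

// Hypothetical compatibility check mirroring the documented rule.
fn is_compatible(server: &str, workload: &str) -> bool {
    match (parse_major_minor(server), parse_major_minor(workload)) {
        (Some((sj, sn)), Some((wj, wn))) => {
            if sj == 0 {
                // Initial development: minor mismatches are incompatible.
                sj == wj && sn == wn
            } else {
                // After the official major release: only majors are compared.
                sj == wj
            }
        }
        _ => false, // unparsable versions are rejected
    }
}

fn main() {
    assert!(is_compatible("0.5.0", "0.5.1"));
    assert!(!is_compatible("0.5.0", "0.4.0"));
    assert!(is_compatible("1.2.0", "1.9.3"));
    assert!(!is_compatible("1.0.0", "2.0.0"));
}
```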
To inform the workload of this, a ConnectionClosed
is sent as part of the FromAnkaios
message. The ConnectionClosed
message contains the reason for closing the session as a string:
message ConnectionClosed {\n string reason = 1; /// A string containing the reason for closing the connection.\n}\n
After the ConnectionClosed
, no further messages will be read or sent by Ankaios on the input and output pipes.
The Control Interface instance cannot be reopened, but a new instance will be created if the workload is restarted.
"}]} \ No newline at end of file diff --git a/main/sitemap.xml b/main/sitemap.xml index 0b7eca3d7..781f7e026 100644 --- a/main/sitemap.xml +++ b/main/sitemap.xml @@ -2,138 +2,138 @@