CI Environments
Tests in our CI run against live garden bosh deployments with different ops files (flavours) enabled. We also have full CF deployments. Here is a list of these deployments:
Garden deployments in the gating group in CI are all deployed on a director named `eden`. In order to target the Eden director and list the deployed environments, do the following:
- navigate to `garden-ci/directors/eden`
- execute `direnv allow` to init the environment (you need to be logged in to LastPass)
- execute `bosh deployments`
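In a script, where the direnv shell hook is not active, `direnv allow` only approves the `.envrc` and does not load it. A minimal sketch for that case is shown below; it assumes the workspace path used elsewhere on this page, and `eden_bosh` and `EDEN_DIR` are hypothetical names introduced only for illustration.

```shell
# Default path follows the workspace layout used on this page (an assumption);
# override EDEN_DIR if your checkout lives elsewhere.
EDEN_DIR="${EDEN_DIR:-$HOME/workspace/garden-ci/directors/eden}"

# eden_bosh is a hypothetical helper: run any bosh command against the Eden director.
# It still requires an active LastPass session, as noted above.
eden_bosh() {
  (
    cd "$EDEN_DIR" || exit 1
    # Approve the .envrc in this directory, then load it and run bosh with
    # the resulting environment.
    direnv allow && direnv exec . bosh "$@"
  )
}
```

With this in place, `eden_bosh deployments` lists the deployed environments without changing the caller's working directory.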
Here is a short description of each one:
Name | Description |
---|---|
baobab | This is a garden deployment with default properties that we run the periodic performance tests suite (GPATS) against. It is redeployed daily. Tests are run once a day and results are posted in the garden-ci channel in slack. |
clean-garden | This is another deployment with default properties. We run the acceptance tests (GATS) against it on each commit in garden-runc-release |
ci-boshlite | This is a lite deployment of the bosh director. We run our GATS against the garden server in that deployment to make sure it works fine. |
ci-boshlite-latest-grr | Same as above, but we are using the latest release candidate built by our pipeline |
concourse | This is concourse itself. It is deployed on Eden. |
containerd-garden | deployed with CONTAINERD_ENABLED=true. See Garden Modes |
cputhrottle-garden | deployed with experimental_cpu_throttling enabled. See CPU Entitlements (TODO: link) |
jackalberry-garden | a clean garden deployment used to run the garden-integration-tests/performance suite |
nerdful-garden | deployed with CONTAINERD_ENABLED=true and CONTAINERD_FOR_PROCESSES_ENABLED=true. See Garden-Modes |
performance-garden | a clean garden deployment used to run the garden-performance-acceptance-tests |
rootless-garden | a garden deployment with experimental_rootless_mode enabled |
treehouse-garden | a windows garden deployment |
If you want to create a new garden deployment on eden, the easiest way is to get the `clean-garden` manifest, edit it until it fits your needs and deploy it under a different name.
cd "$HOME/workspace/garden-ci/directors/eden"
direnv allow # put director connection details in the env
bosh -d clean-garden manifest > new-garden-env.yml
# edit manifest
bosh -d new-garden-env deploy new-garden-env.yml
Given that Garden is the container engine of Cloud Foundry, at some point it is natural to want to spin up a full CF deployment for testing. We tend to be quite conservative about that, because CF is a biggie and its acceptance test suite (the CATS) is quite flaky and outside the expertise and control of the Garden team. Anyway, we have a script to do just that. It is as simple as running `lite-me-up.sh create <env-name>`.
When it runs to completion, the script produces a directory called `<env-name>` that you should commit and push to GitHub, so that you can destroy the environment when you no longer need it by running `lite-me-up.sh destroy <env-name>`.
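The lifecycle described above can be sketched as a pair of helpers. This is only an illustration: it assumes `lite-me-up.sh` is on the `PATH` and that it is run from the git repository where the `<env-name>` directory is produced. The function names, the `DRY_RUN` switch and the commit message are all hypothetical.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create a CF environment, then commit and push the directory the script
# produced, so that the environment can be destroyed later.
create_cf_env() {
  env_name="$1"
  run lite-me-up.sh create "$env_name" &&
    run git add "$env_name" &&
    run git commit -m "Add ${env_name} environment" &&
    run git push
}

# Destroy an environment that was created (and committed) earlier.
destroy_cf_env() {
  run lite-me-up.sh destroy "$1"
}
```

Running `DRY_RUN=1 create_cf_env my-env` prints the commands without executing them, which is a cheap way to check the sequence before spending time on a real deployment.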
As the name suggests, the lite-me-up script deploys CF on a bosh lite director. Each deployment has its own director, and you can find some that we have been keeping around if you look in the directors dir, right next to the Eden director. These are deployed and periodically recreated in the non-gating section of the main pipeline, and some tests are then run against them. Let's introduce each one in short.
Name | Description |
---|---|
mel-b | This is meant to be a cf deployment with a "spicy" garden config. It has experimental features like cpu throttling, containerd for processes and direct IO in grootfs turned on, the idea being that those are not widely deployed in production, hence not widely tested. |
sleepygary | This one is used for benchmarking app creation time on a standard cf deployment. |
croptopmorty | This one has OCI mode turned on and is used for running the same app creation benchmarks in order to show that OCI mode is (hopefully) more efficient. |
Wavefront is a dashboard frontend which we use to monitor the health and performance of our test deployments. Its graphs are useful to detect abnormal behaviour, the performance impact of a change, etc.
Wavefront dashboards are highly customisable but passive: it is up to their users to make sure they are fed with data. Our test deployments are configured to emit three "flavours" of data: system health (load, disk usage, etc.), Garden server data (such as the number of goroutines), and performance data. As these flavours are quite different, each one is emitted in a different way.
System health data is collected from the output of various commands run on the host. As system health is an ongoing concern, a cron job on every deployment VM periodically (every hour) collects the health data and sends it to Wavefront.
The cron job is defined in the deployment manifest itself (feel free to have a look at e.g. `containerd-garden`). It is created by the os-conf release's `pre-start-script` bosh job, which writes the cron job file `/etc/cron.hourly/wavefront-metrics`. On every run, the job creates a metrics file and posts it to the `/report` endpoint of the Wavefront REST API (the so-called Direct ingestion), thus emitting multiple values in a single shot.
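As a rough illustration of what such an hourly script might do, the sketch below writes a few health values in the Wavefront data format and posts the file in one request. It is not the script from the manifest: the metric names, the `WAVEFRONT_URL` and `WAVEFRONT_TOKEN` variables, the `f=wavefront` query parameter and the helper names are assumptions made for this example.

```shell
# Hypothetical sketch of an hourly metrics script (Linux only, reads /proc/loadavg).
# WAVEFRONT_URL and WAVEFRONT_TOKEN are assumed to be set in the environment.

# Print one point in Wavefront data format:
#   <metric name> <value> <epoch seconds> source=<source>
format_point() {
  printf '%s %s %s source=%s\n' "$1" "$2" "$3" "$4"
}

# Collect a couple of health values into a metrics file.
# The metric names are made up for illustration.
write_metrics_file() {
  file="$1"
  now="$(date +%s)"
  host="$(hostname)"
  load="$(cut -d ' ' -f 1 /proc/loadavg)"
  disk="$(df -P / | awk 'NR==2 { sub("%", "", $5); print $5 }')"
  {
    format_point "garden.ci.load.1min" "$load" "$now" "$host"
    format_point "garden.ci.disk.root.percent" "$disk" "$now" "$host"
  } > "$file"
}

# Post the whole file to the /report endpoint in a single request.
post_metrics_file() {
  curl --fail --silent --show-error \
    -X POST \
    -H "Authorization: Bearer ${WAVEFRONT_TOKEN}" \
    --data-binary "@$1" \
    "${WAVEFRONT_URL}/report?f=wavefront"
}
```

A cron script built from these helpers would call `write_metrics_file` on a temporary file and then `post_metrics_file` on the same file. Batching the points in one file is what allows multiple values to go out in a single request.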
The Garden server has a special debug endpoint, `/debug/vars`, which provides Garden-related data such as the number of goroutines in the server process. VMs emit that data to Wavefront via the `metrics-adapter` bosh job, defined in the vantablackbox release.
The `metrics-adapter` periodically collects the Garden server data and emits it to the wavefront-proxy that runs within the `wavefront-proxy` bosh job.
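To get a feel for the data, the debug endpoint can also be queried by hand from the Garden VM. The sketch below is an assumption-laden illustration: `GARDEN_DEBUG_ADDR` must be set to whatever host:port the debug server listens on in your deployment, and the `numGoRoutines` key in the usage note is an assumed name for the goroutine count. The helper names are hypothetical.

```shell
# Extract a numeric value for the given key from a JSON document on stdin.
# Deliberately simple: fine for flat numeric vars, but not a real JSON parser.
json_number() {
  grep -o "\"$1\"[[:space:]]*:[[:space:]]*[0-9][0-9.]*" |
    head -n 1 |
    sed 's/.*:[[:space:]]*//'
}

# Fetch /debug/vars from the Garden server and print a single value.
# GARDEN_DEBUG_ADDR (host:port) has no default here, because the address
# depends on the deployment manifest.
garden_var() {
  curl --fail --silent \
    "http://${GARDEN_DEBUG_ADDR:?set to host:port of the Garden debug server}/debug/vars" |
    json_number "$1"
}
```

For example, `GARDEN_DEBUG_ADDR=<host:port> garden_var numGoRoutines` would print the current goroutine count, provided the key exists under that name.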
Emitting performance data is built into the performance tests. They send the performance data to Wavefront via the wavefront proxy (see above).
Container creation time is also sent by the garden performance acceptance tests, via a special Ginkgo reporter.
The Deployment System Health dashboard provides an overview on all test deployments. The dashboard can display the data for either a single deployment, or all the deployments.
The Garden CI Monitor dashboard is configured to display aggregated metrics that are important to the Garden project as a whole. For example, it displays the `container creation time` metric from the performance tests. In the past we have seen this metric increase significantly after pushing a change, which made us aware that the change had a significant performance impact.
The dashboard also displays important CI vitals such as concourse DB usage. In order to ensure smooth concourse DB migration during concourse version bumps it is advisable that the DB disk usage is below 50%.
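The 50% rule of thumb can also be checked by hand on the DB VM before a version bump. The sketch below is a minimal example, assuming you know the path of the DB data directory (not specified on this page); all function names are hypothetical.

```shell
# Succeeds when the used percentage ($1) is strictly below the threshold ($2).
usage_below() {
  [ "$1" -lt "$2" ]
}

# Print the used percentage (without the % sign) of the filesystem holding a path.
disk_used_percent() {
  df -P "$1" | awk 'NR==2 { sub("%", "", $5); print $5 }'
}

# Report whether the filesystem holding the given path is below 50% usage.
check_db_disk() {
  path="$1"
  used="$(disk_used_percent "$path")"
  if usage_below "$used" 50; then
    echo "OK: ${used}% used on ${path}"
  else
    echo "WARNING: ${used}% used on ${path}; free up space before bumping concourse"
    return 1
  fi
}
```

Calling `check_db_disk <db-data-dir>` returns a non-zero status when usage is at or above 50%, so it can gate a manual upgrade step.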