Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Garden Linux OS configuration from shoot manifest #63

Closed
wants to merge 7 commits into from

Conversation

MrBatschner
Copy link
Contributor

@MrBatschner MrBatschner commented May 23, 2022

How to categorize this PR?

/area os
/kind api-change
/kind enhancement
/os garden-linux

What this PR does / why we need it:

Garden Linux runs with AppArmor as a default Linux Security Module (lsm). While some K8S workloads require AppArmor to be present on the worker nodes, some other workloads might require SELinux. It therefore should be possible to configure the LSM to be used on worker nodes within the Shoot manifest when creating a cluster.
Likewise, Garden Linux so far runs with systemd and the (old) non-unified cgroup hierarchy, aka cgroup v1. Some users might want to use cgroup v2 instead which requires reconfiguring the operating system and switching containerd and the kubelet to use systemd as a cgroup driver.

With this PR, the Garden Linux OS extension will be able to understand the following provider configuration:

apiVersion: gardenlinux.os.extensions.gardener.cloud/v1alpha1
kind: OperatingSystemConfiguration
cgroupVersion: v2
linuxSecurityModule: SELinux

This allows configuring the linuxSecurityModule and the cgroupVersion within the Shoot manifest. The extension will place shell scripts that will take care of configuring these settings on a Garden Linux node into /opt/bin/gardener and will deploy systemd units that will run these scripts as one-shot during system boot before the kubelet comes up.

Changing these settings unfortunately requires a reboot. If any of the configuration scripts finds out that the desired setting is different from the current setting, it will place the file /var/run/gardener/restart-required. Another script that gets deployed by the extension (along with another systemd unit that will start it) will check if this file is present and will reboot the system but only if the kubelet is found not to be running.

Changing the cgroup version to v2 will be prevented for Shoots with Kubernetes version <1.19, v1 will be enforced on such systems.

Which issue(s) this PR fixes:

It does not entirely fix them but it lays the groundworks to get rid of #47, #50 and Garden Linux #310.

Special notes for your reviewer:

An admission controller that will check for valid inputs to the configuration settings will follow in a separate PR to keep this PR reasonable in size.

Release note:

- introduce configuration of the linuxSecurityModule and cgroupVersion used in Garden Linux from the Shoot manifest

* introduces API types to OperatingSystemConfiguration to
  * configure cgroup version in Garden Linux
  * configure the Linux Security Module (LSM) in Garden Linux
* deploys appropriate scripts into Garden Linux that reconfigure
  the settings given above and places systemd units to make them
  start before the kubelet comes up
* deploys a script that can reboot the system if a configuration
  script requires it (and will do nothing if it detects a running
  kubelet)
* places systemd unit drop-ins that will change the configuration
  files of containerd and kubelet (before they get started) to make
  them use systemd as cgroup driver on cgroup v2 enabled systems
@gardener-robot gardener-robot added kind/api-change API change with impact on API users needs/second-opinion Needs second review by someone else area/os Operation system related kind/enhancement Enhancement, improvement, extension os/garden-linux Related to Garden Linux OS needs/review Needs review size/xl Size of pull request is huge (see gardener-robot robot/bots/size.py) labels May 23, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 23, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 23, 2022
@MrBatschner MrBatschner marked this pull request as ready for review May 23, 2022 13:46
@MrBatschner MrBatschner requested review from a team as code owners May 23, 2022 13:46
@MrBatschner
Copy link
Contributor Author

/invite @vpnachev

@gardener-robot gardener-robot requested a review from vpnachev May 26, 2022 07:37
pkg/apis/gardenlinux/types.go Outdated Show resolved Hide resolved
pkg/apis/gardenlinux/v1alpha1/types.go Outdated Show resolved Hide resolved
Comment on lines 5 to 10
function get_fs_of_directory {
[ -z "$1" -o ! -d "$1" ] && return
echo -n "$(stat -c %T -f "$1")"
}

function check_current_cgroup {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible these functions to be moved to a library alike script which is sourced here and in the other scripts where they are used?

Copy link
Contributor Author

@MrBatschner MrBatschner Jun 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I moved these functions which were present in three different scripts into a separate file g_functions.sh which is then sourced by the other scripts. Makes total sense to avoid this kind of code duplication.

pkg/generator/gardenlinux/securitymodule.go Outdated Show resolved Hide resolved
pkg/generator/gardenlinux/securitymodule.go Outdated Show resolved Hide resolved

{{- if not .Bootstrap }}
{{- range $_, $unitname := .AdditionalValues.unitsToEnable }}
systemctl enable {{ $unitname }} && systemctl restart {{ $unitname }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already done here

Copy link
Contributor Author

@MrBatschner MrBatschner Jun 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only for units which are in extensionsv1alpha1.OperatingSystemConfig, but I am adding these units to generator.OperatingSystemConfig before feeding it into the cloud-init generator and units added here are not covered by the code you linked to.

return nil, err
}

shootKubernetesAtLeast119, err := version.CompareVersions(shoot.Spec.Kubernetes.Version, ">=", "1.19")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you will need to check also the k8s version of the worker pool as it could differ from the control plane k8s version, ref: https://github.com/gardener/gardener/blob/master/docs/usage/worker_pool_k8s_versions.md

The worker pool name is set in a label in the OSC resource (ref: gardener/gardener#5256), so you should be able to easily find the worker config for the given operatingSystemConfig.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gardener/gardener-maintainers what do you think about deprecating and removing support for k8s <=1.18 for shoot cluster?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you will need to check also the k8s version of the worker pool as it could differ from the control plane k8s version, ref: https://github.com/gardener/gardener/blob/master/docs/usage/worker_pool_k8s_versions.md

The worker pool name is set in a label in the OSC resource (ref: gardener/gardener#5256), so you should be able to easily find the worker config for the given operatingSystemConfig.

That has been taken care of.

@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 30, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 30, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 30, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 30, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 30, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 30, 2022
@danielfoehrKn
Copy link
Contributor

I have some comments on the cgroupsv2 story:
First, I think it is necessary and good to finally tackle enabling cgroupsv2 for gardenlinux-nodes in Gardener.
However, I am not sure about the current approach.

Cgroupsv2 is clearly the way forward and is gradually replacing cgroupsv1. Support for cgroupsv2 in K8s was introduced as alpha almost 2 years ago and is up for beta graduation next.
However, this PR introduces a poor-mans solution that requires an OS restart when bootstrapping a cgroupsv2 node (add to startup latency).
The common case should and will be cgroupsv2 nodes - hence we should focus on making this case as optimal as possible.

What would be better from my POV:

Ideally: build cgroupsv1 and cgroupsv2 versions of Gardenlinux so that no restart is required. Offer both versions in the Cloudprofile.
How:

  • cgroupsv2/cgroupsv1 support is advertised in the CloudProfile.
  • Only offer cgroupvs1 Gardenlinux OS versions for K8s versions < 1.19.
  • At a certain point, stop building cgroupsv1 Gardenlinux images. This is the latest point when users have to switch to cgroupvs2 enabled OS.
  • Gardener advertises the cgroups version from the Shoot's referenced image into the OperatingSystemConfig resource

Advantage:

  • less complexity is introduced (no restart mechanism).
  • no new poor man's solution that introduces a startup penalty

Also supporting the migration of an existing node from cgroupvs1 to cgroupsv2 does not seem to be necessary. It should be simpler to just force new node rollouts when switching to cgroupsv2.

If different Gardenlinux OS versions are not wanted: reverse the approach: using cgroupsv1 instead of cgroupsv2 is possible but requires a restart

@vpnachev
Copy link
Member

vpnachev commented Jun 1, 2022

  • cgroupsv2/cgroupsv1 support is advertised in the CloudProfile.

  • Only offer cgroupvs1 Gardenlinux OS versions for K8s versions < 1.19.

I think it is too much overhead to do this, especially now when k8s 1.18 is out of maintenance since long time. Such temporary enhancement of the cloudprofile API could be easily avoided by simply dropping support for shoots with k8s < 1.19.

@danielfoehrKn
Copy link
Contributor

I think it is too much overhead to do this, especially now when k8s 1.18 is out of maintenance since long time. Such temporary enhancement of the cloudprofile API could be easily avoided by simply dropping support for shoots with k8s < 1.19.

Are you implying to only build and use cgroupsv2 images for Shoots > 1.18 -
So the os-extension can determine how to configure the cloud-config based on the k8s version only?
That would also work.

@danielfoehrKn
Copy link
Contributor

Thanks for the explanation. It sounds better with the prospect of getting rid of the restart eventually.

however, code that reconfigures the CRI and the Kubelet to use the systemd cgroup driver must remain but the question is if we should have it in the OS extensions (similar scripts have been gardener/gardener-extension-os-coreos#40) or if it should be supplied by the OperatingsystemConfiguration that Gardener deploys to the Seeds

I think only if this configuration is OS-specific - which is probably not the case.

I think it is too much overhead to do this, especially now when k8s 1.18 is out of maintenance since long time.

WDYT in general about a set of attributes (map[string]string) in the Cloudprofile for each image version?
An OS image has a lot of implicit information e.g containerd version, cgroup version, any other OS property.
Advertising this information explicitly allows OS extensions or any other component to make decisions based on that - without having to rely on implicit knowledge such as the right timing of OS version deprecation / what cgroup is used per default / LSM modules / .... .

This allows for better validation for features that rely on OS image pre-installed binaries/configuration

  • only able to use a certain LSM module if it is explicitly advertised for the OS
  • Which Container runtime is installed and with what version -> cannot use rootless Cluster for containerd version < x.x
  • Is the CRI version compatible with the Kubelet version?

On top of validation, it can be used for any kind of decision that has to be made based on binary versions installed on the OS (such as the ones we see in this PR).

@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 2, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 2, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jun 2, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jun 2, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jun 2, 2022
@vpnachev
Copy link
Member

vpnachev commented Jun 3, 2022

WDYT in general about a set of attributes (map[string]string) in the Cloudprofile for each image version?
An OS image has a lot of implicit information e.g containerd version, cgroup version, any other OS property.

This works only for really immutable distributions, for example Ubuntu installs containerd and docker on start-up, ref. Maintaining such information will be cumbersome. I think the current approach checking configs and package versions directly on the node is the better approach.

@MrBatschner
Copy link
Contributor Author

I would be in line with @vpnachev. It creates the additional burden of maintaining these values and can create a lot of confusion should a mistake occur. I think it should be preferred obtaining any of these values from the OS itself as it does not involve constant maintenance effort and reduces the possibility for error.

@MrBatschner MrBatschner requested a review from vpnachev June 3, 2022 07:15
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jun 3, 2022
@MrBatschner
Copy link
Contributor Author

Could someone ptal after I incorporated the review feedback?


// LinuxSecurityModule allows to configure default Linux Security Module for Garden Linux.
// +optional
LinuxSecurityModule LinuxSecurityModule `json:"linuxSecurityModule,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional APIs have pointer types. Same below.
If it is non-optional then remove // +optional and omitempty tags

Copy link
Contributor Author

@MrBatschner MrBatschner Jun 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intended them to be optional and changed them to pointers. PTAL.

@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 13, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 13, 2022
@MrBatschner MrBatschner requested a review from rfranzke June 22, 2022 13:30
@MrBatschner
Copy link
Contributor Author

It would be cool, if the review of this PR could continue. We would like to deliver the next Garden Linux release with cgroup v2 enabled by default and without these changes, we would break the compatibility with any K8S cluster <1.19.

@MrBatschner
Copy link
Contributor Author

After some offline discussion it became apparent that changing between cgroup v1 and v2 from the Shoot manifest or as dependency from the Kubernetes version used might become irrelevant in the future as Kubernetes versions without cgroup v2 support (i.e. <1.19) are already deprecated and will expire soon.
Supporting a Linux Security Module other than AppArmor is currently a work in progress on Garden Linux level. Supporting dynamic configuration of the LSM might become important in the future and work on this PR might continue then. Until then however...
/close

@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Jul 19, 2022
LucaBernstein pushed a commit to LucaBernstein/gardener-extension-os-gardenlinux that referenced this pull request Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/os Operation system related kind/api-change API change with impact on API users kind/enhancement Enhancement, improvement, extension needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/review Needs review needs/second-opinion Needs second review by someone else os/garden-linux Related to Garden Linux OS size/xl Size of pull request is huge (see gardener-robot robot/bots/size.py) status/closed Issue is closed (either delivered or triaged)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants