Remove k8s patch and go instead with a docs patch
addyess committed Jul 16, 2024
1 parent 8c65397 commit 73e8abb
Showing 4 changed files with 50 additions and 29 deletions.
51 changes: 50 additions & 1 deletion docs/src/snap/reference/troubleshooting.md
@@ -3,7 +3,8 @@
This page provides techniques for troubleshooting common Canonical Kubernetes
issues.

## Kubectl error: "dial tcp 127.0.0.1:6443: connect: connection refused"

## Kubectl error: `dial tcp 127.0.0.1:6443: connect: connection refused`

### Problem

@@ -33,6 +34,54 @@ Use `k8s config` instead of `k8s kubectl config` to generate a kubeconfig file
that is valid for use on external machines.


## Kubelet error: `failed to initialize top level QOS containers:root container [kubepods] doesn't exist`

Check failure on line 37 in docs/src/snap/reference/troubleshooting.md

GitHub Actions / markdown-lint

Line length

docs/src/snap/reference/troubleshooting.md:37:81 MD013/line-length Line length [Expected: 80; Actual: 105] https://github.com/DavidAnson/markdownlint/blob/v0.34.0/doc/md013.md

### Problem

This error occurs when the `kubepods` cgroup does not have the cpuset controller
enabled. The kubelet requires the cpuset controller, and the kernel may not be
configured to provide it.
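
As a rough diagnostic, the sketch below checks whether `cpuset` is listed in
the root `cgroup.subtree_control` file on a cgroup v2 host. The helper name
`has_controller` is hypothetical and is not part of the `k8s` snap; the file
path is the one the kubelet wrapper script previously inspected.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the k8s snap): succeed if the named
# controller is listed in a cgroup v2 subtree_control file.
has_controller() {
  local controller="$1"
  local control_file="${2:-/sys/fs/cgroup/cgroup.subtree_control}"
  # -w matches whole words, so asking for "cpu" does not match "cpuset".
  grep -qw -- "$controller" "$control_file" 2>/dev/null
}

if has_controller cpuset; then
  echo "cpuset is enabled in the root cgroup.subtree_control"
else
  echo "cpuset is not listed in the root cgroup.subtree_control"
fi
```

A missing `cpuset` entry is only a hint that this issue may apply; a missing
file usually means the host is not using cgroup v2 at all.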


### Explanation

An excellent deep-dive of the issue exists at [kubernetes/kubernetes
#122955][kubernetes-122955].

Check failure on line 49 in docs/src/snap/reference/troubleshooting.md

GitHub Actions / markdown-lint

No space after hash on atx style heading

docs/src/snap/reference/troubleshooting.md:49:1 MD018/no-missing-space-atx No space after hash on atx style heading [Context: "#122955][kubernetes-122955]."] https://github.com/DavidAnson/markdownlint/blob/v0.34.0/doc/md018.md

Commenter [@haircommander][] [states][kubernetes-122955-2020403422]:
> basically: we've figured out that this issue happens because libcontainer
> doesn't initialize the cpuset cgroup for the kubepods slice when the kubelet
> initially calls into it to do so. This happens because there isn't a cpuset
> defined on the top level of the cgroup. however, we fail to validate all of
> the cgroup controllers we need are present. It's possible this is a
> limitation in the dbus API: how do you ask systemd to create a cgroup that
> is effectively empty?
> if we delegate: we are telling systemd to leave our cgroups alone, and not
> remove the "unneeded" cpuset cgroup.

### Solution

This is in the process of being fixed upstream via [kubernetes/kubernetes
#125923][kubernetes-125923].

Check failure on line 67 in docs/src/snap/reference/troubleshooting.md

GitHub Actions / markdown-lint

No space after hash on atx style heading

docs/src/snap/reference/troubleshooting.md:67:1 MD018/no-missing-space-atx No space after hash on atx style heading [Context: "#125923][kubernetes-125923]."] https://github.com/DavidAnson/markdownlint/blob/v0.34.0/doc/md018.md

In the meantime, the recommended workaround is to add a `Delegate=yes` drop-in
for the kubelet systemd service.

```bash
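# Run as root: create the systemd drop-in directory for the kubelet service
mkdir -p /etc/systemd/system/snap.k8s.kubelet.service.d
# Write the drop-in; the `>` redirect sends the heredoc into the file
cat > /etc/systemd/system/snap.k8s.kubelet.service.d/delegate.conf <<EOF
[Service]
Delegate=yes
EOF
# Reboot so the kubelet starts with cgroup delegation in effect
reboot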
```
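
The same workaround can be sketched as small idempotent helpers, which may be
useful when scripting it across several nodes. The function names
`write_delegate_dropin` and `dropin_is_present` are hypothetical; only the
drop-in path and its contents come from the steps above.

```bash
#!/usr/bin/env bash

# Hypothetical helper: write the Delegate=yes drop-in into the given
# directory. The default is the drop-in path used in the steps above.
write_delegate_dropin() {
  local dropin_dir="${1:-/etc/systemd/system/snap.k8s.kubelet.service.d}"
  mkdir -p "$dropin_dir"
  printf '[Service]\nDelegate=yes\n' > "$dropin_dir/delegate.conf"
}

# Hypothetical helper: succeed only if the drop-in already sets Delegate=yes.
dropin_is_present() {
  local dropin_dir="${1:-/etc/systemd/system/snap.k8s.kubelet.service.d}"
  grep -qx 'Delegate=yes' "$dropin_dir/delegate.conf" 2>/dev/null
}
```

On a real node, one might run `dropin_is_present || write_delegate_dropin` as
root and then reboot. Afterwards, `systemctl show snap.k8s.kubelet -p Delegate`
should report `Delegate=yes` if systemd picked up the drop-in. Running
`systemctl daemon-reload` followed by `snap restart k8s` instead of rebooting
is the sequence the removed `k8s/lib.sh` helper used; whether that is
sufficient on every host is an assumption.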

<!-- LINKS -->

[kubeconfig file]: https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/
[kubernetes-122955]: https://github.com/kubernetes/kubernetes/issues/122955
[kubernetes-125923]: https://github.com/kubernetes/kubernetes/pull/125923
[kubernetes-122955-2020403422]: https://github.com/kubernetes/kubernetes/issues/122955#issuecomment-2020403422
[@haircommander]: https://github.com/haircommander
25 changes: 0 additions & 25 deletions k8s/lib.sh
@@ -161,28 +161,3 @@ k8s::kubelet::ensure_shared_root_dir() {
    mount -o remount --make-rshared "$SNAP_COMMON/var/lib/kubelet" /var/lib/kubelet
  fi
}

# Ensure /etc/systemd/system/snap.k8s.kubelet.service.d/delegate.conf has delegate config
# Example: 'k8s::common::is_strict || k8s::kubelet::ensure_cgroup_delegate'
k8s::kubelet::ensure_cgroup_delegate() {
  k8s::common::setup_env

  if (systemctl show snap.k8s.kubelet -p NRestarts | grep -qv "NRestarts=0") &&
     (grep -qv cpuset /sys/fs/cgroup/cgroup.subtree_control) &&
     [ ! -e /etc/systemd/system/snap.k8s.kubelet.service.d/delegate.conf ]
  then
    mkdir -p /etc/systemd/system/snap.k8s.kubelet.service.d
    tee /etc/systemd/system/snap.k8s.kubelet.service.d/delegate.conf > /dev/null <<EOF
[Service]
Delegate=yes
EOF
    systemctl daemon-reload || true
    snap restart k8s || true
  fi
}


# Ensure /etc/systemd/system/snap.k8s.kubelet.service.d is removed
k8s::kubelet::remove() {
  rm -rf /etc/systemd/system/snap.k8s.kubelet.service.d || true
}
1 change: 0 additions & 1 deletion k8s/wrappers/services/kubelet
@@ -5,7 +5,6 @@
k8s::common::setup_env

k8s::common::is_strict && k8s::kubelet::ensure_shared_root_dir
k8s::common::is_strict || k8s::kubelet::ensure_cgroup_delegate

k8s::util::wait_containerd_socket
k8s::util::wait_kube_apiserver
2 changes: 0 additions & 2 deletions snap/hooks/remove
@@ -7,5 +7,3 @@ k8s::common::setup_env
k8s::remove::containers

k8s::remove::network

k8s::remove::kubelet
