Configure Graceful Node Shutdown and lengthen max inhibitor delay
* Configure Kubelet Graceful Node Shutdown to detect system shutdown
events and stop running containers gracefully when possible
* Allow up to 30s for critical pods to shut down gracefully
* Allow up to 15s for regular pods to shut down gracefully
* Mark the node NotReady promptly, instead of waiting for health
checks to fail
* Kubelet uses systemd inhibitor locks to delay shutdown for a limited
number of seconds
* Raise the max inhibitor delay (InhibitDelayMaxSec) from the 5s default to 45s
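
The timings above form a simple budget: shutdownGracePeriod (45s) is the total, shutdownGracePeriodCriticalPods (30s) is the share reserved for critical pods, and regular pods get the remaining 15s. The sketch below is a minimal illustration in Python, assuming the logind inhibitor delay has to be at least as long as shutdownGracePeriod for the full grace period to be honored. `ShutdownBudget` and its field names are hypothetical and are not part of Typhoon or Kubernetes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ShutdownBudget:
    """Shutdown timings in seconds (hypothetical helper, for illustration)."""

    shutdown_grace_period: int        # kubelet shutdownGracePeriod
    critical_pods_grace_period: int   # kubelet shutdownGracePeriodCriticalPods
    inhibit_delay_max: int            # logind InhibitDelayMaxSec

    @property
    def regular_pods_grace_period(self) -> int:
        # Regular pods get what remains of the total once the
        # critical-pod window is reserved.
        return self.shutdown_grace_period - self.critical_pods_grace_period

    def problems(self) -> List[str]:
        """Return human-readable inconsistencies; an empty list means OK."""
        found = []
        if self.critical_pods_grace_period > self.shutdown_grace_period:
            found.append("critical pod grace period exceeds the total grace period")
        # Assumption: the inhibitor delay must cover the whole grace period.
        if self.inhibit_delay_max < self.shutdown_grace_period:
            found.append("InhibitDelayMaxSec is shorter than shutdownGracePeriod")
        return found


# Values from this commit, and the same kubelet settings checked against
# the 5s logind default mentioned above.
configured = ShutdownBudget(45, 30, 45)
logind_default = ShutdownBudget(45, 30, 5)

print(configured.regular_pods_grace_period)
print(configured.problems())
print(logind_default.problems())
```

With the values from this commit the check passes, while the same kubelet settings against a 5s inhibitor delay are flagged, which is why the logind drop-in is shipped alongside the Kubelet configuration.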

Verify systemd inhibitor locks are present:

```
sudo systemd-inhibit --list
WHO     UID USER PID  COMM    WHAT     WHY                                        MODE
kubelet 0   root 4581 kubelet shutdown Kubelet needs time to handle node shutdown delay
```
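
The same check can be scripted. The sketch below parses text in the layout shown above and reports whether the kubelet holds a delay-mode shutdown lock. It assumes the eight-column format from the sample, where only the WHY column contains spaces; other systemd versions may format the listing differently. `parse_inhibitors` and `has_kubelet_shutdown_delay` are hypothetical helper names.

```python
from typing import Dict, List

# Sample output copied from the listing above.
SAMPLE = """\
WHO     UID USER PID  COMM    WHAT     WHY                                        MODE
kubelet 0   root 4581 kubelet shutdown Kubelet needs time to handle node shutdown delay
"""

COLUMNS = ["who", "uid", "user", "pid", "comm", "what", "why", "mode"]


def parse_inhibitors(listing: str) -> List[Dict[str, str]]:
    """Parse `systemd-inhibit --list` text into one dict per lock.

    Assumes the header is the first line and only WHY contains spaces.
    """
    rows = []
    for line in listing.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 6)       # first six columns are single tokens
        if len(fields) < 7:
            continue                       # blank line or trailing summary line
        # The last token is MODE; everything before it is WHY.
        why, _, mode = fields[6].strip().rpartition(" ")
        rows.append(dict(zip(COLUMNS, fields[:6] + [why.strip(), mode])))
    return rows


def has_kubelet_shutdown_delay(listing: str) -> bool:
    """True if kubelet holds a delay-mode lock that covers shutdown."""
    return any(
        row["who"] == "kubelet"
        and "shutdown" in row["what"].split(":")  # WHAT may list several operations
        and row["mode"] == "delay"
        for row in parse_inhibitors(listing)
    )


print(has_kubelet_shutdown_delay(SAMPLE))
```

On a node, the input could come from `subprocess.run(["systemd-inhibit", "--list"], capture_output=True, text=True).stdout` instead of the sample string.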

Tail the journal logs, then shut down a node via systemctl reboot
or via the cloud console to watch the containers shut down

Rel:

* https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta/
* https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
* kubernetes/kubernetes#107043
* coreos/fedora-coreos-tracker#821
* https://www.freedesktop.org/software/systemd/man/systemd-inhibit.html
* https://github.com/kubernetes/kubernetes/blob/release-1.24/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go
* https://github.com/godbus/dbus/blob/master/conn.go
dghubble committed Aug 28, 2022
1 parent 76d92e9 commit 2796ee8
Showing 21 changed files with 145 additions and 0 deletions.
5 changes: 5 additions & 0 deletions CHANGES.md
@@ -6,6 +6,11 @@ Notable changes between versions.

* Kubernetes [v1.25.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#v1250)
* Disable LocalStorageCapacityIsolationFSQuotaMonitoring feature gate ([#1220](https://github.com/poseidon/typhoon/pull/1220))
* Configure Kubelet Graceful Node Shutdown
* Allow up to 30s for critical pods to gracefully shutdown on node shutdown
* Allow up to 15s for regular pods to gracefully shutdown on node shutdown
* Mark node NotReady promptly on node shutdown
* Lengthen systemd inhibitor lock max delay from 5s to 45s

### Fedora CoreOS

7 changes: 7 additions & 0 deletions aws/fedora-coreos/kubernetes/butane/controller.yaml
@@ -154,6 +154,8 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
@@ -194,6 +196,11 @@ storage:
echo "Retry applying manifests"
sleep 5
done
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
contents:
inline: |
7 changes: 7 additions & 0 deletions aws/fedora-coreos/kubernetes/workers/butane/worker.yaml
@@ -122,10 +122,17 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
volumePluginDir: /var/lib/kubelet/volumeplugins
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
contents:
inline: |
7 changes: 7 additions & 0 deletions aws/flatcar-linux/kubernetes/butane/controller.yaml
@@ -153,6 +153,8 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
@@ -193,6 +195,11 @@ storage:
echo "Retry applying manifests"
sleep 5
done
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents:
7 changes: 7 additions & 0 deletions aws/flatcar-linux/kubernetes/workers/butane/worker.yaml
@@ -121,10 +121,17 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
volumePluginDir: /var/lib/kubelet/volumeplugins
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents:
7 changes: 7 additions & 0 deletions azure/fedora-coreos/kubernetes/butane/controller.yaml
@@ -149,6 +149,8 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
@@ -189,6 +191,11 @@ storage:
echo "Retry applying manifests"
sleep 5
done
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
contents:
inline: |
7 changes: 7 additions & 0 deletions azure/fedora-coreos/kubernetes/workers/butane/worker.yaml
@@ -117,10 +117,17 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
volumePluginDir: /var/lib/kubelet/volumeplugins
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
contents:
inline: |
7 changes: 7 additions & 0 deletions azure/flatcar-linux/kubernetes/butane/controller.yaml
@@ -149,6 +149,8 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
@@ -189,6 +191,11 @@ storage:
echo "Retry applying manifests"
sleep 5
done
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents:
7 changes: 7 additions & 0 deletions azure/flatcar-linux/kubernetes/workers/butane/worker.yaml
@@ -117,10 +117,17 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
volumePluginDir: /var/lib/kubelet/volumeplugins
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents:
7 changes: 7 additions & 0 deletions bare-metal/fedora-coreos/kubernetes/butane/controller.yaml
@@ -159,6 +159,8 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
@@ -199,6 +201,11 @@ storage:
echo "Retry applying manifests"
sleep 5
done
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
contents:
inline: |
7 changes: 7 additions & 0 deletions bare-metal/fedora-coreos/kubernetes/butane/worker.yaml
@@ -113,10 +113,17 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
volumePluginDir: /var/lib/kubelet/volumeplugins
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
contents:
inline: |
7 changes: 7 additions & 0 deletions bare-metal/flatcar-linux/kubernetes/butane/controller.yaml
@@ -160,6 +160,8 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
@@ -200,6 +202,11 @@ storage:
echo "Retry applying manifests"
sleep 5
done
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents:
7 changes: 7 additions & 0 deletions bare-metal/flatcar-linux/kubernetes/butane/worker.yaml
@@ -118,10 +118,17 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
volumePluginDir: /var/lib/kubelet/volumeplugins
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents:
7 changes: 7 additions & 0 deletions digital-ocean/fedora-coreos/kubernetes/butane/controller.yaml
@@ -156,6 +156,8 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
@@ -196,6 +198,11 @@ storage:
echo "Retry applying manifests"
sleep 5
done
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
contents:
inline: |
7 changes: 7 additions & 0 deletions digital-ocean/fedora-coreos/kubernetes/butane/worker.yaml
@@ -122,10 +122,17 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
volumePluginDir: /var/lib/kubelet/volumeplugins
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
contents:
inline: |
7 changes: 7 additions & 0 deletions digital-ocean/flatcar-linux/kubernetes/butane/controller.yaml
@@ -158,6 +158,8 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
@@ -198,6 +200,11 @@ storage:
echo "Retry applying manifests"
sleep 5
done
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents:
7 changes: 7 additions & 0 deletions digital-ocean/flatcar-linux/kubernetes/butane/worker.yaml
@@ -121,10 +121,17 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
volumePluginDir: /var/lib/kubelet/volumeplugins
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents:
7 changes: 7 additions & 0 deletions google-cloud/fedora-coreos/kubernetes/butane/controller.yaml
@@ -148,6 +148,8 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
@@ -188,6 +190,11 @@ storage:
echo "Retry applying manifests"
sleep 5
done
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
contents:
inline: |
@@ -116,10 +116,17 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
volumePluginDir: /var/lib/kubelet/volumeplugins
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
contents:
inline: |
7 changes: 7 additions & 0 deletions google-cloud/flatcar-linux/kubernetes/butane/controller.yaml
@@ -148,6 +148,8 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
@@ -188,6 +190,11 @@ storage:
echo "Retry applying manifests"
sleep 5
done
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents: