[node-agent] Introduce `gardener-node-init` script and unit #8726
Thanks for the change, in particular the refactoring of the commands.
Initial feedback by @oliver-goetz and myself, 2 commits still to review.
Finished review together with @oliver-goetz and found a few more things.
pkg/component/extensions/operatingsystemconfig/original/components/nodeagent/component.go
pkg/component/extensions/operatingsystemconfig/init/bootstrap_config.go
pkg/component/extensions/operatingsystemconfig/init/templates/scripts/init.tpl.sh
@ScheererJ @oliver-goetz Thanks for your reviews. I have addressed the feedback, and I have also added a few more things that I found while integrating this into
Please take another look and let me know what you think.
- client connection
- servers
- logging settings
Otherwise, GNA would restart itself too early, effectively triggering an infinite loop because it would always consider itself a "changed unit". We already follow the same approach in GNA's `node` controller.
Co-Authored-By: Oliver Götz <47362717+oliver-goetz@users.noreply.github.com>
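The ordering described above can be illustrated with a minimal sketch. It is not the code from this PR: the `node` type, its fields, and `runUntilStable` are hypothetical names, and an in-memory string stands in for the persisted "last applied OSC file". The point is only that the state is persisted before the restart, so the next run no longer sees a change.

```go
package main

import "fmt"

// node is a hypothetical stand-in for the agent's state on a worker node.
type node struct {
	lastApplied string // stands in for the persisted "last applied OSC file"
	restarts    int    // counts how often the agent restarted itself
}

// reconcile applies the desired config and reports whether the agent
// restarted itself. The restart is deliberately the very last step.
func (n *node) reconcile(desired string) bool {
	if desired == n.lastApplied {
		return false // nothing changed, no restart needed
	}
	// ... files and units would be applied here ...
	n.lastApplied = desired // persist first
	n.restarts++            // restart only afterwards
	return true
}

// runUntilStable simulates the agent being started again after each
// self-restart, up to max times, and returns the number of restarts.
func runUntilStable(n *node, desired string, max int) int {
	for i := 0; i < max; i++ {
		if !n.reconcile(desired) {
			break
		}
	}
	return n.restarts
}

func main() {
	fmt.Println(runUntilStable(&node{}, "v2", 10))
}
```

If the restart happened before `lastApplied` was written, every new run would detect the same change again and `runUntilStable` would only stop at `max`.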
This code is the very same for almost every binary, so let's extract it into a function instead of duplicating it again and again.
Co-Authored-By: Oliver Götz <47362717+oliver-goetz@users.noreply.github.com>
- OSC controller calls the root `context.CancelFunc` of the command
- Bootstrap package no longer stops gardener-node-init unit
- gardener-node-init unit no longer gets restarted every 30s but is a oneshot which only restarts on failure
Also add mutex to fake DBus to prevent flakes since the OSC controller in GNA is running systemd commands in parallel (and in the fake DBus they all write to the same slice in parallel)
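As an illustration of the kind of race this guards against, here is a minimal sketch of a fake that records actions in a slice and protects it with a `sync.Mutex`. The type `fakeDBus`, its methods, and `restartInParallel` are hypothetical and not taken from the Gardener code base.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeDBus is a hypothetical test double that only records requested actions.
type fakeDBus struct {
	mu      sync.Mutex
	actions []string
}

// Restart records a restart request. The mutex serializes the appends so
// that concurrent callers do not race on the shared slice.
func (f *fakeDBus) Restart(unit string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.actions = append(f.actions, "restart "+unit)
}

// Actions returns a copy of the recorded actions.
func (f *fakeDBus) Actions() []string {
	f.mu.Lock()
	defer f.mu.Unlock()
	return append([]string(nil), f.actions...)
}

// restartInParallel issues n restart calls concurrently and returns how
// many of them were recorded.
func restartInParallel(f *fakeDBus, n int) int {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			f.Restart(fmt.Sprintf("unit-%d.service", i))
		}(i)
	}
	wg.Wait()
	return len(f.Actions())
}

func main() {
	fmt.Println(restartInParallel(&fakeDBus{}, 100))
}
```

Without the lock, concurrent `append` calls on the same slice are a data race and can silently drop entries, which is the sort of flake the commit message refers to.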
Without this, registry mirrors in the hosts dir will not be respected. `ctr images pull` as well as the `docker.ConfigureHosts` function already ignore "not found" errors (in case the provided hosts dir does not exist), so we can safely specify it unconditionally
and then renames it - a direct copy while a process is running will not work (`device or resource busy`)
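The copy-then-rename pattern mentioned here can be sketched as follows. This is a generic illustration, not the PR's implementation: `replaceFile` and `demo` are hypothetical names, and the temporary file is created in the destination's directory so that the final `os.Rename` stays on one filesystem.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// replaceFile copies src to a temporary file next to dst and then renames
// the temporary file over dst, instead of writing into dst directly.
func replaceFile(src, dst string, mode os.FileMode) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	tmp, err := os.CreateTemp(filepath.Dir(dst), filepath.Base(dst)+".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly once the rename succeeded

	if _, err := io.Copy(tmp, in); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Chmod(mode); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), dst)
}

// demo replaces an "old" file with a "new" one in a scratch directory and
// returns the resulting content of the destination.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "replace-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	src, dst := filepath.Join(dir, "new"), filepath.Join(dir, "current")
	if err := os.WriteFile(src, []byte("new"), 0o644); err != nil {
		return "", err
	}
	if err := os.WriteFile(dst, []byte("old"), 0o755); err != nil {
		return "", err
	}
	if err := replaceFile(src, dst, 0o755); err != nil {
		return "", err
	}
	out, err := os.ReadFile(dst)
	return string(out), err
}

func main() {
	fmt.Println(demo())
}
```

Renaming swaps the directory entry rather than writing into the existing file, so a process that still has the old file open keeps its old copy.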
Found a typo.
This is much simpler compared to passing all potential arguments
/lgtm
LGTM label has been added. Git tree hash: c80b97bc54a42a109d9ae1a648163fc94dc1a014
awesome 🚀
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: oliver-goetz
How to categorize this PR?
/area dev-productivity
/kind enhancement
What this PR does / why we need it:
This PR introduces the bash script ("`gardener-node-init`") which will replace the `cloud-config-downloader` script when new shoot worker nodes are bootstrapped. It uses `ctr images {pull,mount}` to copy the `gardener-node-agent` binary from its container image to the node, and then runs `gardener-node-agent bootstrap`. This is a command which […] `gardener-node-agent.service` systemd unit file (the "real" `gardener-node-agent`) […] `gardener-node-agent.service` […] `gardener-node-init.service`)
From then on, `gardener-node-agent` takes care of reconciling the OSC secret with its `OperatingSystemConfig` controller (ref #8683). For self-upgrades, it may only restart itself at the very end of its reconciliation (after it persisted the "last applied OSC file") to prevent infinite loops.
Which issue(s) this PR fixes:
Part of #8023
Special notes for your reviewer:
On the way, I added a few more minor enhancements:
/cc @oliver-goetz
Release note: