This repository has been archived by the owner on Jun 26, 2023. It is now read-only.
HNC: very slow startup after first installation #765
There seem to be two problems here:
Here are some experiments, from the time
The difference between 2 and 3 (~22s) represents two leader elections (LE), from which we can gather that one LE takes about 11s. This suggests that disabling LE, without forcing HNC to exit, would reduce our startup time from ~95s to ~84s, which is not a huge advantage. However, the second restart of HNC takes only ~4s without LE, vs ~15s with LE. This is probably worth doing if only for the high availability aspect (e.g. validators going down for 4s vs 15s).
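As a quick sanity check, the arithmetic behind these estimates can be written out. This is only a sketch using the approximate timings quoted in this comment; the variable names are made up for illustration.

```python
# Approximate timings quoted in the comment above, in seconds.
two_elections = 22        # difference between experiments 2 and 3
startup_with_le = 95      # first startup with leader election enabled
restart_with_le = 15      # second restart with leader election
restart_without_le = 4    # second restart without leader election

# One leader election is half of the measured difference.
per_election = two_elections // 2

# Disabling LE removes one election from the first startup.
startup_without_le = startup_with_le - per_election

# The saving on a later restart should match the per-election estimate.
restart_saving = restart_with_le - restart_without_le

print(per_election, startup_without_le, restart_saving)
```

The restart saving (15s - 4s) comes out to the same ~11s as the per-election estimate, so the two measurements agree with each other.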
Here's the patch to make the cert manager exit when the secret is first written:
adrianludwin added a commit to adrianludwin/multi-tenancy that referenced this issue Jul 1, 2020
See kubernetes-retired#765. If a mounted secret changes _after_ a pod has started, it can take a fairly long time (~60s) for the kubelet to notice the change and project the new secret into the pod. Since our internal cert manager writes a secret and then has to wait for it to become available as a file, this leads to a poor onboarding experience with HNC. This change introduces a flag that exits the process as soon as the internal cert manager changes a secret, which should only occur on the initial installation of HNC or every ten years (!). The restart takes <5s, so this is a much better experience overall.

Tested: without changing the flags in the default manifest, observed no change when HNC is installed for the first time (i.e. the time from the first log message to the first reconciliation of the HNCConfiguration is 103s, and there are no restarts). When the flag is added, the startup time decreases to 10s with the one expected restart. Further restarts of HNC (e.g. deleting and recreating the deployment but not the secret) do not trigger an extra restart and complete in 4s.
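The decision this commit message describes can be sketched roughly as follows. This is not the actual patch (HNC is written in Go, and the real flag name is not given here). It is a minimal Python simulation under stated assumptions: `FakeSecretStore`, `fake_generate`, `startup_action` and the `exit_on_secret_change` parameter are all hypothetical names, and the in-memory store stands in for the Kubernetes Secret.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSecretStore:
    """Hypothetical in-memory stand-in for the Secret that holds the certs."""
    data: dict = field(default_factory=dict)


def fake_generate() -> dict:
    """Hypothetical cert generator; a real one would create a key pair."""
    return {"tls.crt": "fake-cert", "tls.key": "fake-key"}


def ensure_certs(store: FakeSecretStore, generate=fake_generate) -> bool:
    """Write certs if they are missing. Returns True if the secret changed."""
    if store.data.get("tls.crt") and store.data.get("tls.key"):
        return False
    store.data.update(generate())
    return True


def startup_action(store: FakeSecretStore, exit_on_secret_change: bool) -> str:
    """Decide what the process does after the cert manager has run.

    exit_on_secret_change is a hypothetical name for the flag described above.
    """
    changed = ensure_certs(store)
    if not changed:
        # The secret already existed when the pod started, so the mounted
        # files are already in place and the process can start serving.
        return "serve"
    if exit_on_secret_change:
        # Exit so the container is restarted and mounts the populated secret.
        return "exit-and-restart"
    # Without the flag, wait for the kubelet to project the updated secret.
    return "wait-for-projection"


store = FakeSecretStore()
first = startup_action(store, exit_on_secret_change=True)   # fresh install
second = startup_action(store, exit_on_secret_change=True)  # after the restart
print(first, second)
```

In this simulation a fresh install with the flag set exits once, and the next start finds the secret already populated and serves immediately, which mirrors the "one expected restart" reported in the test notes. Without the flag, the first start falls into the slow path of waiting for the kubelet.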
adrianludwin added a commit to adrianludwin/multi-tenancy that referenced this issue Jul 2, 2020 (commit message identical to the one above)
The builtin cert manager seems to stall the first time we install HNC. This is a poor user experience.