-
Notifications
You must be signed in to change notification settings - Fork 303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wait on startup to get permissions #1858
Conversation
/assign @swetharepakula @spencerhance just taking any of the jobs runs we can see these panics on the logs https://storage.googleapis.com/kubernetes-jenkins/logs/ci-ingress-gce-e2e/1591319036715601920/artifacts/e2e-c4f27e13ef-58614-master/glbc.log The time of 1 minute is taken from the observation of that logs |
@aojea: GitHub didn't allow me to assign the following users: spencerhance. Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
/hold missing tests |
/hold cancel now I think I got it right |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm👍
return false, fmt.Errorf("failed to verify the existence of %v CRD: %v", meta.kind, err) | ||
} | ||
// CRD exists, get current resource version and update it | ||
updateCRD = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a followup we should check if we even need to ensure. The case I am thinking is say we graduate an API, but had to rollback. We will fail here because in the rollback we try ensure a version without the graduated API.
Thanks Antonio! /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: aojea, spencerhance, swetharepakula The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The glbc component needs to get a CRD and have permission to read it, this is not inmidiate when bootstrapping a cluster so, instead of crashing and depending on a external component to restart the pod, we can active poll on this conditions before crashing.
This has the benefit of reducing the noise on the logs with panics, that may be misinterpreted by external tools processing the logs. It also improves the bootstrap latency.
We can observe this behavior in any of the jobs
where the glbc controller panics multiple times because it does not has permissions
https://storage.googleapis.com/kubernetes-jenkins/logs/ci-ingress-gce-e2e/1591319036715601920/artifacts/e2e-c4f27e13ef-58614-master/glbc.log