Removed containers are not always waiting #54593
Conversation
/assign @dchen1107 @sjenning
/unassign @feiskyer
I commented on the issue where I meant the PR... Can we also do the following prior to calling updateStatus: if it detects an invalid state transition, panic?
/approve
confirmed fix, lgtm
Removing label
Automatic merge from submit-queue (batch tested with PRs 54597, 54593, 54081, 54271, 54600). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

kubelet: check for illegal container state transition

Supersedes #54530. Puts a state transition check in the kubelet status manager to detect and block illegal transitions, namely from terminated to non-terminated.

@smarterclayton @derekwaynecarr @dashpole @joelsmith @frobware

I confirmed that the reproducer in #54499 does not work with this check in place. The erroneous kubelet status update is rejected:

```
status_manager.go:301] Status update on pod default/test aborted: terminated container test-container attempted illegal transition to non-terminated state
```

After fix #54593, I do not see the message with the above-mentioned reproducer.
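Conceptually, the check described in this PR compares each container's previously observed state with the state about to be reported and aborts the update if a terminated container would become non-terminated again. Below is a minimal, self-contained Go sketch of that idea; the types and function names are simplified stand-ins, not the kubelet's real status manager or the `v1.ContainerStatus` API.

```go
package main

import "fmt"

// Simplified container status types for illustration only; the real kubelet
// works with v1.ContainerStatus / v1.ContainerState from k8s.io/api.
type ContainerState string

const (
	StateWaiting    ContainerState = "Waiting"
	StateRunning    ContainerState = "Running"
	StateTerminated ContainerState = "Terminated"
)

type ContainerStatus struct {
	Name  string
	State ContainerState
}

// findStatus returns the status entry for the named container, if present.
func findStatus(statuses []ContainerStatus, name string) (ContainerStatus, bool) {
	for _, s := range statuses {
		if s.Name == name {
			return s, true
		}
	}
	return ContainerStatus{}, false
}

// illegalTransition reports whether any container that was terminated in
// oldStatuses would move back to a non-terminated state in newStatuses.
func illegalTransition(oldStatuses, newStatuses []ContainerStatus) (string, bool) {
	for _, oldStatus := range oldStatuses {
		if oldStatus.State != StateTerminated {
			continue
		}
		if newStatus, ok := findStatus(newStatuses, oldStatus.Name); ok && newStatus.State != StateTerminated {
			return oldStatus.Name, true
		}
	}
	return "", false
}

func main() {
	oldStatuses := []ContainerStatus{{Name: "test-container", State: StateTerminated}}
	newStatuses := []ContainerStatus{{Name: "test-container", State: StateWaiting}}

	if name, bad := illegalTransition(oldStatuses, newStatuses); bad {
		// Mirrors the spirit of the rejected update in the log line quoted above:
		// the status update is aborted instead of being sent to the API server.
		fmt.Printf("status update aborted: terminated container %s attempted illegal transition to non-terminated state\n", name)
	}
}
```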
I would like to also potentially guard or flag an error on the API server. If we prevent the transition, we could break components in unexpected ways, but if we don't prevent the transition, we can break EVERY workload controller.
I don't have a problem reopening #54530 (I actually have to create a new PR now because the branch changed 😞). Did you see @dchen1107's concern there?
I think @dchen1107's concern was not that validation was bad, but that it alone was not sufficient, as the kubelet would still have been unhappy. Now that we have the kubelet check and the fix identified, adding another guard on the apiserver itself also seems fine to me.
Automatic merge from submit-queue. UPSTREAM: 54593: Removed containers are not always waiting. xref kubernetes/kubernetes#54593, fixes #17011
+1 on #54593 (comment). Once the validation code is included, both the API server and the kubelet will perform the same check.
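As a rough sketch only (not the validation code that was actually merged), an API-server-side guard of the kind discussed above could be written as a validation helper over the pod's container statuses. The package placement, function name, and field paths below are assumptions for illustration.

```go
package validation

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/validation/field"
)

// ValidateContainerStateTransitions rejects a status update in which a
// container that was previously terminated is reported as non-terminated.
// Hypothetical helper; the real validation may differ in shape and location.
func ValidateContainerStateTransitions(oldStatuses, newStatuses []v1.ContainerStatus, fldPath *field.Path) field.ErrorList {
	allErrs := field.ErrorList{}
	for _, oldStatus := range oldStatuses {
		// Only containers that were already terminated constrain the new status.
		if oldStatus.State.Terminated == nil {
			continue
		}
		for i, newStatus := range newStatuses {
			if newStatus.Name != oldStatus.Name || newStatus.State.Terminated != nil {
				continue
			}
			detail := fmt.Sprintf("terminated container %s may not transition to a non-terminated state", newStatus.Name)
			allErrs = append(allErrs, field.Forbidden(fldPath.Index(i).Child("state"), detail))
		}
	}
	return allErrs
}
```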
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: dashpole, dchen1107, derekwaynecarr
Associated issue: 54499
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing
Automatic merge from submit-queue (batch tested with PRs 54593, 54607, 54539, 54105). If you want to cherry-pick this change to another branch, please follow the instructions here.
…93-upstream-release-1.8

Automatic merge from submit-queue. Automated cherry pick of #54593

Cherry pick of #54593 on release-1.8.

#54593: fix #54499. Removed containers are not waiting

```release-note
Fix an issue where pods were briefly transitioned to a "Pending" state during the deletion process.
```
fixes #54499
The issue was that a container that is removed (during pod deletion, for example) is assumed to be in a "waiting" state.
Instead, we should use the previous container state.
Fetching the most recent status is required to ensure that we accurately reflect the previous state. The status attached to a pod object is often stale.
I verified this by looking through the kubelet logs during a deletion and confirming that the status updates do not transition from terminated -> pending.
cc @kubernetes/sig-node-bugs @sjenning @smarterclayton @derekwaynecarr @dchen1107
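To make the fix described above concrete, here is a minimal Go sketch of the intended behavior: when the runtime no longer reports a container (e.g. it was removed during pod deletion), fall back to its most recently observed state instead of defaulting to "waiting". The types and the `containerState` helper are illustrative, not the kubelet's own code.

```go
package main

import "fmt"

type ContainerState string

const (
	StateWaiting    ContainerState = "Waiting"
	StateTerminated ContainerState = "Terminated"
)

// containerState resolves the state to report for a container. runtimeState
// is empty when the runtime no longer knows about the container.
func containerState(runtimeState, previousState ContainerState) ContainerState {
	if runtimeState != "" {
		return runtimeState // the runtime still reports the container
	}
	if previousState != "" {
		return previousState // container removed: keep its last known state
	}
	return StateWaiting // nothing known yet: genuinely waiting
}

func main() {
	// A terminated container that was removed during pod deletion should
	// stay "Terminated" instead of flipping back to "Waiting".
	fmt.Println(containerState("", StateTerminated)) // Terminated
	// A container the kubelet has never seen is still reported as waiting.
	fmt.Println(containerState("", "")) // Waiting
}
```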