Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubelet: check for illegal container state transition #54597

Merged

Conversation

sjenning
Copy link
Contributor

@sjenning sjenning commented Oct 26, 2017

supersedes #54530

Puts a state transition check in the kubelet status manager to detect and block illegal transitions; namely from terminated to non-terminated.

@smarterclayton @derekwaynecarr @dashpole @joelsmith @frobware

I confirmed that the reproducer in #54499 does not work with this check in place. The erroneous kubelet status update is rejected:

status_manager.go:301] Status update on pod default/test aborted: terminated container test-container attempted illegal transition to non-terminated state

After fix #54593, I do not see the message with the above mentioned reproducer.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 26, 2017
@sjenning sjenning force-pushed the validate-state-transition branch from a1bbc4b to 5a36fc4 Compare October 26, 2017 02:38
@sjenning sjenning changed the title WIP: kubelet: check for illegal container state transition kubelet: check for illegal container state transition Oct 26, 2017
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 26, 2017
@sjenning
Copy link
Contributor Author

/sig node
/release-note-none

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Oct 26, 2017
@derekwaynecarr
Copy link
Member

is there any benefit to doing the same check on the other side of the channel?

doing it here is definitely good.

@derekwaynecarr
Copy link
Member

/assign @derekwaynecarr

@sjenning
Copy link
Contributor Author

is there any benefit to doing the same check on the other side of the channel?

I don't think so. The only writer to the channel is updateStatusInternal.

@derekwaynecarr
Copy link
Member

fair enough.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 26, 2017
@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 26, 2017
@sjenning
Copy link
Contributor Author

fyi @dchen1107

@sjenning sjenning force-pushed the validate-state-transition branch from 420b272 to be7093c Compare October 26, 2017 02:57
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 26, 2017
@sjenning
Copy link
Contributor Author

@derekwaynecarr improved error message to include pod namespace and name. please re-lgtm.

@joelsmith
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 26, 2017
@joelsmith
Copy link
Contributor

/test pull-kubernetes-node-e2e
Is this breaking the e2e tests? It's failed 3 in a row and other PRs are generally passing this test.
(sorry for the comment spam, I misspelled the test name last time)

@sjenning
Copy link
Contributor Author

@joelsmith I think it is breaking them. Looking at it.

@sjenning sjenning force-pushed the validate-state-transition branch from be7093c to 449fc02 Compare October 26, 2017 04:06
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 26, 2017
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 26, 2017
@sjenning
Copy link
Contributor Author

@joelsmith I've restricted the check to pods with RestartPolicy set to Never.

@joelsmith
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 26, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr, joelsmith, sjenning

Associated issue: 54499

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 54597, 54593, 54081, 54271, 54600). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 443338b into kubernetes:master Oct 26, 2017
deads2k pushed a commit to openshift/kubernetes that referenced this pull request Oct 31, 2017
xref kubernetes#54597

:100644 100644 a2047a8b54... 6b9f3cffd0... M	pkg/kubelet/status/status_manager.go
k8s-github-robot pushed a commit that referenced this pull request Nov 1, 2017
Automatic merge from submit-queue (batch tested with PRs 53962, 54708). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Prevent successful containers from restarting with OnFailure restart policy

**What this PR does / why we need it**:

This is a follow-on to #54597 which makes sure that its validation
also applies to pods with a restart policy of OnFailure. This
deficiency was pointed out by @smarterclayton here:
#54530 (comment)

**Which issue this PR fixes**  This is another fix to address #54499

**Release note**:
```release-note
NONE
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants