fix: Fix using the wrong pods when validating do-not-evict pods #583

Merged

Conversation

jonathan-innis (Member) commented Oct 9, 2023

Fixes aws/karpenter-provider-aws#4310, aws/karpenter-provider-aws#4758

Description

This PR correctly validates that the candidates that we are attempting to deprovision are still eligible to be deprovisioned. It does this by mapping the old candidates onto the new candidates that we pull from state and checking that all of those candidates are still valid for deprovisioning based on PDB criteria, do-not-evict annotations, and scheduling criteria.

This PR also removes the caching mechanism for validationCandidates. IsValid is never called more than once in the code, so storing the candidates this way was unnecessary.

#247 was originally intended to fix this issue, but it uses the wrong pods for validation within the filterCandidates function. As a result, validation passes for candidate nodes that receive pods with blocking PDBs or do-not-evict pods within the validation window.
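
For readers who want a concrete picture of the re-validation step described above, here is a minimal sketch. It is not the code from this PR: the Pod and Candidate types, the evictable and revalidate functions, and the map keyed by node name are all hypothetical stand-ins, and the scheduling check is left out. It only illustrates mapping the previously selected candidates onto freshly observed state and checking the pods found there.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for a Kubernetes pod.
type Pod struct {
	Name       string
	DoNotEvict bool // stands in for the do-not-evict annotation
	PDBBlocked bool // stands in for "a PDB currently blocks eviction"
}

// Candidate is a hypothetical stand-in for a node chosen for deprovisioning.
type Candidate struct {
	NodeName string
	Pods     []Pod
}

// evictable reports whether every pod on the candidate may be evicted.
func evictable(c Candidate) bool {
	for _, p := range c.Pods {
		if p.DoNotEvict || p.PDBBlocked {
			return false
		}
	}
	return true
}

// revalidate looks up each previously selected candidate in the freshly
// observed state and checks the pods found there, not the pods that were
// recorded when the candidate was first selected.
func revalidate(selected []Candidate, fresh map[string]Candidate) error {
	for _, old := range selected {
		current, ok := fresh[old.NodeName]
		if !ok {
			return fmt.Errorf("candidate %q is no longer present in state", old.NodeName)
		}
		if !evictable(current) {
			return fmt.Errorf("candidate %q is no longer eligible for deprovisioning", old.NodeName)
		}
	}
	return nil
}

func main() {
	// The node was empty when it was selected.
	selected := []Candidate{{NodeName: "node-a"}}

	// During the validation window a do-not-evict pod landed on it.
	fresh := map[string]Candidate{
		"node-a": {NodeName: "node-a", Pods: []Pod{{Name: "job-0", DoNotEvict: true}}},
	}

	if err := revalidate(selected, fresh); err != nil {
		fmt.Println("validation failed:", err)
		return
	}
	fmt.Println("validation passed")
}
```

In this sketch, checking the pods recorded at selection time would let the node through, because it was empty then. Checking the freshly observed pods is what catches the pod that arrived during the window.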

How was this change tested?

make presubmit

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

jonathan-innis force-pushed the fix-validation-do-not-evict branch 2 times, most recently from d20c6fc to f081464, on October 9, 2023 22:24

coveralls commented Oct 9, 2023

Pull Request Test Coverage Report for Build 6473509748

  • 6 of 7 (85.71%) changed or added relevant lines in 1 file are covered.
  • 3 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.01%) to 81.881%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| pkg/controllers/deprovisioning/validation.go | 6 | 7 | 85.71% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| pkg/controllers/deprovisioning/validation.go | 1 | 71.05% |
| pkg/test/cachesyncingclient.go | 2 | 80.21% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 6466099147 | 0.01% |
| Covered Lines | 8934 |
| Relevant Lines | 10911 |

💛 - Coveralls

jonathan-innis marked this pull request as ready for review on October 9, 2023 22:56
jonathan-innis requested a review from a team as a code owner on October 9, 2023 22:56
jonathan-innis force-pushed the fix-validation-do-not-evict branch 2 times, most recently from ed5d058 to 1bf77db, on October 9, 2023 23:06
jonathan-innis force-pushed the fix-validation-do-not-evict branch 3 times, most recently from 4735d39 to a82437f, on October 9, 2023 23:35
jonathan-innis changed the title from "chore: Fix using the wrong pods when validating do-not-evict pods" to "fix: Fix using the wrong pods when validating do-not-evict pods" on Oct 9, 2023
njtran (Contributor) left a comment

Good catch! Just some nits on the tests.

jonathan-innis enabled auto-merge (squash) on October 10, 2023 18:06

njtran (Contributor) previously approved these changes on Oct 10, 2023

Approved, with a few nits in the comments.

Review threads (both outdated and resolved):
  • pkg/controllers/deprovisioning/emptynodeconsolidation.go
  • pkg/controllers/deprovisioning/validation.go
Development

Successfully merging this pull request may close these issues.

Pod is scheduled right before karpenter decides to delete a node
3 participants