Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-3333 Retroactive default StorageClass assignment #3337

Merged
merged 10 commits into from
Jun 15, 2022

Conversation

RomanBednar
Copy link
Contributor

  • One-line PR description:
    Retroactive default StorageClass assignment
  • Other comments:

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Jun 3, 2022
@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Jun 3, 2022
@jsafrane
Copy link
Member

jsafrane commented Jun 3, 2022

We will provide PRR either in a separate commit later (or in separate PR, if this one gets merged)

@wojtek-t wojtek-t self-assigned this Jun 6, 2022
Copy link
Contributor

@xing-yang xing-yang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look mostly good to me. Just some minor comments.


### Notes/Constraints/Caveats (Optional)

#### Behavior change
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the newly proposed behavior seems more intuitive, but it's also potentially a breaking change for existing users.

Before approving such change from PRR perspective, I would like to get opinion from API approvers (as even though we're not introducing a new API, we're technically changing the meaning of existing API) if we can even allow such change.

@liggitt - can you please help with evaluating it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt, please also check Alternatives at the bottom, there are ways how to keep some parts of the existing behavior, but the behavior would be quite complex from user point of view.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems Jordan is out - let me ask @thockin about this.

This may break an existing cluster that depends on the behavior described above.
We expect that the new behavior is better than the existing one. Users that
want to bind their PVC to PVs with `pv.spec.storageClassName=""` can create
PVCs explicitly requesting `pvc.spec.storageClassName=""` to provision those
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: how about "bind the PVC to a statically provisioned PV" rather than "provision those PVs statically"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds more accurate - changed.

`pvc.spec.storageClassName=nil` forever. Such PVC can be bound only to PV with
`pvc.spec.storageClassName=""`.

With the new behavior, the PV controller will wait until a default SC exists.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there going to be any changes to scenarios involving pre-binding of PVs to PVCs with claimRef field of PV set to a PVC that will be created in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there will be changes since PVC with nil storage class are no longer bound to PVs with "" and claimRef will not have any effect on this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to make this a behavior change - see my above comments. If we WANT this behavior change, I need you to explain why it is valuable (it is a breaking change, strictly speaking).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that retroactively applying a default seems sane and reasonable. I just don't see why we have to include the breaking-change part.

Copy link
Member

@jsafrane jsafrane Jun 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could argue that even retroactively applying a default SC is a breaking change.

See Alternatives at the bottom of this KEP - it's possible (and actually easier) to keep PVC with nil binding to PV with "", until a default SC is created. After a new default SC is created, all PVCs with nil will get the new default SC and never bind to "".

It was actually our first idea, but then when I saw this, I thought it's too complicated for users. What PVC with nil then means? "I want default SC, if it exists at PVC creation, if it does not exist, then I want a PV with "", and when there is no such PV, then wait for it or for a new default SC."

It will be easier if nil always means "I want the default and wait for it if it does not exist".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we're willing to rewrite this KEP to keep binding nil <-> "" if the current version is too breaking.

Copy link
Member

@thockin thockin Jun 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm always extra paranoid about changing really subtle semantics like this, even when they are weird. It's just so hard to know if it will impact anyone. Here's my thinking:

  • Almost everyone uses StoirageClass
  • Almost everyone uses dynamic provisioning
  • Almost everyone who is NOT using dynamic provisioning is using StorageClass
  • Result: the vast vast vast majority of clusters do not have and will never have PVs where class = ""
  • Inference: anyone who DOES have such PVs probably relies on all the weird semantics of them

So leaving this functionality (PVCs with class=nil can bind to preallocated class="" PVs) is unlikely to have negative impact on people who don't want that (because almost nobody has those), but changing this is likely to have negative impact on that small subset of users who DO depend on it.

It's yucky from a code and docs POV, but that's what compat is all about.

Does my argument hold water for you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your feedback @thockin. Those are all great points! We've changed the KEP to describe the behavior you're suggesting and moved the original proposal to the alternatives section. Please review the latest version and let us know if there's anything else we should add/change.

@ddebroy
Copy link
Member

ddebroy commented Jun 9, 2022

Thanks for the additional clarifications, @RomanBednar ! Looks great.

Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 2 things here:

  1. apply default SC to pending PVCs with a nil class (sounds good to me)
  2. do not match PVCs with nil class to PVS with "" class (sounds bad to me)

explain why 2 is important, please?

1. It’s hard to mark a different SC as the default one. Cluster admin can
choose between two bad solutions:

1. Cluster has two default SCs for a short time, i.e. admin marks the new
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should change this semantic - while there are two, pick one randomly or heuristically. In normal use it's the same thing as losing a race condition.

E.g., in real-time you have one user changing the default and another user creating a PVC at the same time. Depending on network, CPU scheduling, etc (totally non-deterministic) the API server might see either of those operations first. The PVC will either win the race and get the old default or it will lose the race and get the new default.

So whichever one you pick, as long as it is one of the two defaults, it is a correct choice!

SC and are Pending forever. Users must be smart enough to delete the PVC and
re-create it.

2. When users want to change the default SC parameters, they must delete the SC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we advise them to make a new SC and flip the default as I described above?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there may be quota on the original SC. Admins may want to re-create a default SC to keep the quota numbers. Re-create means there is no default SC for a short time.

@RomanBednar perhaps you could add a note about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are encouraging people to delete a class and recreate it with different params, we might as well allow them to mutate it. Do we ever refer back to the class after the volume is created?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are encouraging people to delete a class and recreate it with different params, we might as well allow them to mutate it

+1. Do we need a KEP for this?

if such a PV exists.

We want to change the existing `pvc.spec.storageClassName=nil` behavior
to always require a default SC. It would never bind to a PV with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced this is right. A PVC with nil class means "I don't care". A PV with class = "" should qualify, shouldn't it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing behavior is odd. nil means that "I want default, if it exists when PVC is created, or "" forever". "Don't care" would mean that nil can bind to PV with any storageClassName, which is IMO something else.

(I'm further expanding on this below in https://github.com/kubernetes/enhancements/pull/3337/files#r894338701)

is created. When a default SC is created, the PVC `pvc.spec.storageClassName`
will be updated with the new default SC name in the PV controller.

Any PVs with `pv.spec.storageClassName=""` will be able to bind only to a PVC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my comment above is wrong (that "" is a valid match), can you explain why this assertion you make here is important? I know there's a lot of history here :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PV with "" is a valid match for PVC with nil today, and yes, we want to change it. See https://github.com/kubernetes/enhancements/pull/3337/files#r894338701.

`pvc.spec.storageClassName=nil` forever. Such PVC can be bound only to PV with
`pvc.spec.storageClassName=""`.

With the new behavior, the PV controller will wait until a default SC exists.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to make this a behavior change - see my above comments. If we WANT this behavior change, I need you to explain why it is valuable (it is a breaking change, strictly speaking).

`pvc.spec.storageClassName=nil` forever. Such PVC can be bound only to PV with
`pvc.spec.storageClassName=""`.

With the new behavior, the PV controller will wait until a default SC exists.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that retroactively applying a default seems sane and reasonable. I just don't see why we have to include the breaking-change part.

@thockin thockin self-assigned this Jun 9, 2022
@@ -96,23 +96,27 @@ cases that can be problematic:
1. It’s hard to mark a different SC as the default one. Cluster admin can
choose between two bad solutions:

1. Cluster has two default SCs for a short time, i.e. admin marks the new
1. Option 1: Cluster has two default SCs for a short time, i.e. admin marks the new
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SC and are Pending forever. Users must be smart enough to delete the PVC and
re-create it.

2. When users want to change the default SC parameters, they must delete the SC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are encouraging people to delete a class and recreate it with different params, we might as well allow them to mutate it. Do we ever refer back to the class after the volume is created?

@thockin
Copy link
Member

thockin commented Jun 14, 2022

Thanks!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2022
@jsafrane
Copy link
Member

/approve
the storage parts
/assign @wojtek-t
for PRR (there were very little changes in PRR questionnaire, if any)

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2022
@RomanBednar
Copy link
Contributor Author

Changing KEP status to implementable.

@jsafrane
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2022
@wojtek-t
Copy link
Member

I have two comments about graduation criteria. But approving for PRR now and holding to give you a chance to apply those (without waiting for another pass from me).

/approve PRR
/hold

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. labels Jun 14, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jsafrane, RomanBednar, thockin, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 14, 2022
@xing-yang
Copy link
Contributor

/milestone v1.25

@k8s-ci-robot k8s-ci-robot added this to the v1.25 milestone Jun 14, 2022
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 15, 2022
@wojtek-t
Copy link
Member

/lgtm

/hold cancel

Thanks1

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Jun 15, 2022
@k8s-ci-robot k8s-ci-robot merged commit a3f5900 into kubernetes:master Jun 15, 2022
Currently, when a default SC is not available at PVC creation time, a PVC
requesting a default SC (`pvc.spec.storageClassName=nil`) will keep
`pvc.spec.storageClassName=nil` forever. Such PVC can be bound only to PV with
`pvc.spec.storageClassName=""`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RomanBednar I understand it should be changed to:
Such PVC can be bound only to PV with pv.spec.storageClassName="".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants