Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
design-proposal: Feature lifecycle #251
design-proposal: Feature lifecycle #251
Changes from all commits
29daa8e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not always easy to confirm whether a feature is actively being used. Furthermore, certain features, like recovery tools, might not see frequent use, but they can be crucial in critical situations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If features are added then they must be used to be evaluated.
Without evaluation we simply do not know (don't get challenged) if a feature is useful to the receipient/consumer.
After a feature graduated, then features such as backup related, might not be regularly used (although I hope this is the case with backup), however, somebody shoudl try a feature before it graduates. If nobody is trying it, then we should question ourselves if the feature is valueable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I understand this point. The reason I raised this is because, as mentioned here, "If a feature is not able to transition to the next stage in the defined period, it should be removed automatically."
I think that it might take a while until these important features are evaluated, and they might get removed, if I understand correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The feature owner needs to consider this timeline. There is a balance that needs to be kept between adding features, keeping them in the queue for examination and finally graduating.
The motivation is describing that exact problem.
If the owner cannot prove and convince the maintainers the feature is ready for GA, it means it needs to drop. Kubernetes had the same problem of features remaining in Beta with no limit, which causes a maintenance mess (one one hand you do not commit to your users with a final contract and on the other hand keep the code in and cause all other to consider it in any integration).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The statement in the design proposal about removing features if they don't progress within a set time frame might be too strict. IMO if we're going to enforce it, we should at least have answers to these questions:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is out of scope.
But plain reasoning should work fine.
This proposal defines it in a table below and exceptions are possible with enough maintainer votes.
Again, it is defined in this proposal.
I do not understand why you comment about this at the motivation section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that makes sense. I missed it.
Could you provide a concrete example to help clarify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example for plain reasoning:
or
or
tldr: Find data that indicates that somebody was using a feature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we refine the proposal to apply it first within kubevirt/kubevirt before extending it to other repositories. Some projects are still early in development, with limited usage and contributors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The focus is on v1 APIs
As an organization I think everyone should be aware of the expected process for a graduated API.
This proposal is attempting to give a direction, but it does not strongly enforces it on a specific project.
The wording has been intentionally made in a manner that leaves room for movement.
E.g.
If others will push in this direction, to reduce the scope, I will adjust.
@fabiand , any thoughts about this from your side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused because you said we treat the word "feature" like a black box, but it seems that now, it actually has a stable API version inside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this definition to define how graduation should work.
Then I'm expecting that this becomes more relevant for some repos (i.e. kubevirt/kuevirt) then for others i.e. AAQ.
But generally speaking we need this document in order to lay out a clear path of his probem can be tackled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we should start by applying this new approach in 'kubevirt/kubevirt' first. If it proves useful (which I am optimistic about), other repositories may also choose to adopt it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is how I understood this document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some features may need to keep their feature gates. The default, however, would need to be set to the relevant value. If you look at k8s, there are many features that are already GA'ed but continue to have a feature gate.
This is needed for the cluster admin to control which functionality is provided in the cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK, once K8S declares a feature as GA, the FG is hard-coded enabled, with no option to disable it.
The only reason the fields still exists, is to allow downgrades and keep the DB intact and not introduce in the future a new FG with the same name.
Ref: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-stages
A General Availability (GA) feature is also referred to as a stable feature. It means:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CPUManager can be disabled afaik - just one example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes... I missed the
The feature is always enabled; you cannot disable it.
Not sure why some of the features can still be disabled that way.I think the main issue for me is that, unlike k8s features, most of KubeVirt's features don't have a secondary config option to disable it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a blocker from my side but an area of concern. We should be able to provide a way for admins to disable a functionality. One of the well known requests is for the admin to disable all VFIO related functionality, which isn't a single feature per se.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the concerns raised here, but I feel we are in a deadlock.
The proposal is based on the need to stop queuing FGs indefinitely and with the understanding that the current observation of what a FG is in Kubevirt [1] is well known and understood.
What you are attempting to argue here is that the current FG definition needs to change, together with it API (that I honestly to not understand how you can change).
To me its sounds like another proposal.
It makes sense to defer things to a follow up discussion, but in this case, I do not understand how we can do it without breaking this whole thing.
The maintainers of the project have the power to use exceptions to keep FGs forever and redefining what it means in parallel. I may not agree with what you are trying to do, but at least when the FG is redefined, this document can be adjusted accordingly.
I will need help to see how this suggestion can be done and at the same time keep all the rest of this proposal relevant to the goals set.
[1] https://kubevirt.io/user-guide/operations/activating_feature_gates/#activating-feature-gates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lyarwood @vladikr IIUIC then the friction arises from the question around todays FGs, is this correct?
Assumgin Yes to my question above, then a few notes:
IOW
Decision diagram wise it would look like this:
This should all be decided before GA.
Configuration
andOpt-In
/Opt-Out
can however even be added post-GA.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup apologies by attempting to break the deadlock here I've just made it worse. /o\
My intent with the
Optional
suggestion was to defer making any concrete decisions on GA FGs until we also talk about reworking the current API but I see that's not helpful in moving this proposal forward.Given the already included warning about FGs not replacing cluster configurables for features admins might need to configure and/or disable I'm personally fine keeping the requirement to drop new feature FGs as they graduate to GA for now.
The behaviour and IMHO abuse (using FGs to control feature configuration and not just visibility in the cluster) of the current implementation yes.
The future feature lifecycle given the current FG implementation yes.
Yes that's my understanding.
They don't define a feature as stable or always available, quiet the opposite in fact given the current proposal.
Yup agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, bad wording on my side in that regards. You are right :)
Thus in general agremeent, that's good, thanks for the quick feedback.
Let's wait for Vlaidk, and others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up on @vladikr 's concerns.
The main concern (Vladik, kepe me honest) is that there is a technical redundancy to toggle a feature.
A FG is technically allowing an admin to enable or disable a feature.
A enable/disable config item is also allowing an admin to enable or disable a feature.
Thus the question is: If we have the FG, why should we introduce a config toogle for the same functionality?
On the technical side there is not much that I have to say, maybe besides the fact that there are possibly slightly different nuances between "not present" and "disabled".
However, semantically I'd take the compromise of this technical redundancy in favor of a clear separation between
a) signaling a feature is stable (enabled by default, no FG)
b) admin is permitted to disable/enable certain features using a configuration
Thus even given the concerns, I'm advocating to move forward and keep this separation between feature life-cycle and feature configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need to align this effort with an actual release cycle as we are trying to do with the new and slightly overlapping enhancement process?
IOW saying that new features landing in 1.4 will need to adhere to this lifecycle?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's revisit the timeline once we have agreement on the proposal.
After all to me 1.4 so far looks like a reasonable goal.