design-proposal: Feature lifecycle #251

Merged
merged 1 commit into from
May 10, 2024
327 changes: 327 additions & 0 deletions design-proposals/feature-lifecycle.md
@@ -0,0 +1,327 @@
# KubeVirt Feature Lifecycle

## Summary

KubeVirt requires a clear policy on how features are introduced,
evaluated and finally graduated or removed.

This proposal defines the steps and policies to follow in order
to manage a feature and its lifecycle in KubeVirt.

This proposal focuses on introducing features in
a stable API (CRD) version, e.g. `kubevirt.io/v1`.

## Overview

### Motivation
KubeVirt has grown into a mature virtualization management solution
with a large set of features.

New features are being proposed and added regularly to its portfolio.

With time, the challenge of supporting and maintaining such a large
set of features raised the need to re-examine their relevance.
It also raised the need to examine feature graduation with more care.

The KubeVirt community has tried to control the flow of features
informally through feature-gates, similar to Kubernetes.
However, as time passed, several challenges presented themselves:
- Evaluated features rarely got graduated to GA or removed, causing
feature consumption to be risky for users and a maintenance burden
for the project contributors.
> **Note**: As of this writing (pre v1.2), there are 37 FGs,
> of which 5 have reached GA and 2 are marked for deprecation.
> For more information, explore the
> [source](https://github.com/kubevirt/kubevirt/blob/5ff12ae931cefd81514ec96f97a189ab2c179ad7/pkg/virt-config/feature-gates.go).
- We do not have agreed-upon procedures on how and when features
graduate or are discontinued.
This causes each feature to take different approaches, possibly
surprising users.

We conclude that the current use of FGs is insufficient
due to the lack of well-defined processes and policy on how a feature
should progress in its lifecycle.

Eventually we would like to see features being evaluated carefully
before they are introduced, while they are experimented with and
proven to be actually in use (and useful) before graduating.
Comment on lines +46 to +47
Member

It's not always easy to confirm whether a feature is actively being used. Furthermore, certain features, like recovery tools, might not see frequent use, but they can be crucial in critical situations.

Member

If features are added then they must be used to be evaluated.
Without evaluation we simply do not know (don't get challenged) if a feature is useful to the recipient/consumer.

After a feature has graduated, features such as backup-related ones might not be regularly used (although I hope this is the case with backup); however, somebody should try a feature before it graduates. If nobody is trying it, then we should ask ourselves whether the feature is valuable.

Member

Okay, I understand this point. The reason I raised this is because, as mentioned here, "If a feature is not able to transition to the next stage in the defined period, it should be removed automatically."

I think that it might take a while until these important features are evaluated, and they might get removed, if I understand correctly.

Member Author

The feature owner needs to consider this timeline. There is a balance that needs to be kept between adding features, keeping them in the queue for examination and finally graduating.
The motivation is describing that exact problem.

If the owner cannot prove and convince the maintainers the feature is ready for GA, it means it needs to be dropped. Kubernetes had the same problem of features remaining in Beta with no limit, which causes a maintenance mess (on one hand you do not commit to your users with a final contract, and on the other hand you keep the code in and cause all others to consider it in any integration).

Member

The statement in the design proposal about removing features if they don't progress within a set time frame might be too strict. IMO if we're going to enforce it, we should at least have answers to these questions:

  1. How can the feature owner or someone else prove whether the feature is being used or not?
  2. What's a reasonable amount of time for a feature to stay in testing phases like Alpha or Beta?
  3. Should the time it takes to move from one phase to another be the same as the time given before removing unused features?

Member Author

  • How can the feature owner or someone else prove whether the feature is being used or not?

This is out of scope.
But plain reasoning should work fine.

  • What's a reasonable amount of time for a feature to stay in testing phases like Alpha or Beta?

This proposal defines it in a table below and exceptions are possible with enough maintainer votes.

  • Should the time it takes to move from one phase to another be the same as the time given before removing unused features?

Again, it is defined in this proposal.

I do not understand why you comment about this at the motivation section.

Member

This proposal defines it in a table below and exceptions are possible with enough maintainer votes.

Thanks, that makes sense. I missed it.

But plain reasoning should work fine.

Could you provide a concrete example to help clarify?

Member

Example for plain reasoning:

We introduced super fast Live Migration due to AI support. Over the past 2 releases we have fixed several issues related to AI hallucinations and bad mood, those were fixed for example in #42 #4242 and #424242. Is this mileage enough in order to now graduate this feature?

or

2 releases ago we have proposed in this mailing list thread https://secure.mailinglist.link.example.com/for/kubevirt/foobarbazbar and in the issue #4. @non-existsnat-user has shared that this feature is working well in their harry-potter-as-a-service backed by KubeVirt's AI invented appliance, reworded by chatgpt 4.5, with CVE descriptions provided by Bard and Foo. It's now seen strong adoption by @hogwarts_poetry_slam_club which is now seeking the GA of this appliance. Is this usage sufficient?

or

Users reported 24 issues on Live Migration Barriers and we've fixed all of them and think the feature has seen enough soak time over the past 2 releases. Can we graduate it?

tldr: Find data that indicates that somebody was using a feature


> **Note**: Once a feature graduates, it is included in a
> General Availability (GA) release with its functionality available
> to all users. GA features need to comply with [semver](https://semver.org/),
> which adds constraints on their ability to change (including deprecation).

### Goals
- Define the process a feature needs to pass in order to be
Generally Available.
- Define the process a feature needs to pass in order to be removed.
- Provide policies and rules on how to manage a feature during its
lifetime.

### Non Goals
- Implement enforcement tooling to keep features in sync with
the lifecycle rules.

### Definition Of Users
- Development contributors.
- Cluster operators.

### User Stories
- As a KubeVirt contributor, I would like to introduce a new useful
feature and follow it to graduation (GA).
- As a KubeVirt contributor, I would like to remove a feature
that has not yet reached its formal graduation.
- As a KubeVirt contributor, I would like to remove a feature
that has already graduated.
> **Note**: Removal of a GA feature is considered an exception.
> Strong arguments and wide agreement are required for such an action.
- As a KubeVirt cluster operator, I would like to experiment with
a newly-proposed ("Alpha") feature in a controlled environment, to see
if it makes sense to me.
- As a KubeVirt cluster operator, I would like to evaluate an
undergraduate ("Beta") feature in a real-life environment with actual users.
- As a KubeVirt cluster operator, I would like to keep using a feature
that got graduated after I used it during the "Beta" evaluation period.
- As a KubeVirt cluster operator, I would like to know that
an experimental ("Alpha") feature is planned to be removed.
- As a KubeVirt cluster operator, I would like to know that
an undergraduate ("Beta") feature is planned to be removed.
- As a KubeVirt cluster operator, I would like to stop using a feature
that got removed.

### Repos
This is a cross repo project policy under the
[kubevirt](https://github.com/kubevirt) organization.

Comment on lines +92 to +95
Member

I suggest we refine the proposal to apply it first within kubevirt/kubevirt before extending it to other repositories. Some projects are still early in development, with limited usage and contributors.

Member Author

The focus is on v1 APIs

The proposal is focusing on introducing features in
a stable API (CRD) version, e.g. kubevirt.io/v1.

As an organization I think everyone should be aware of the expected process for a graduated API.

This proposal is attempting to give a direction, but it does not strongly enforce it on a specific project.
The wording has been intentionally made in a manner that leaves room for movement.

E.g.

A feature is expected to pass in the following order through the following stages:

If others will push in this direction, to reduce the scope, I will adjust.
@fabiand , any thoughts about this from your side?

Member

@Barakmor1 Barakmor1 Apr 4, 2024

I'm a little confused because you said we treat the word "feature" like a black box, but it seems that now, it actually has a stable API version inside.

Member

We need this definition to define how graduation should work.
Then I'm expecting that this becomes more relevant for some repos (i.e. kubevirt/kubevirt) than for others, i.e. AAQ.
But generally speaking we need this document in order to lay out a clear path of how this problem can be tackled.

Member

I believe we should start by applying this new approach in 'kubevirt/kubevirt' first. If it proves useful (which I am optimistic about), other repositories may also choose to adopt it.

Member

That is how I understood this document.

## Proposal Design
The proposal on how to define a feature lifecycle is influenced by
processes and policies from the Kubernetes project.
These sources are scattered around, each focusing on different
aspects of a feature:
- [Feature Gates](https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/)
- [Graduation](https://kubernetes.io/blog/2020/08/21/moving-forward-from-beta/)
- [Changing the API](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md)
- [Deprecation](https://kubernetes.io/docs/reference/using-api/deprecation-policy/)

The proposal takes a top-down approach: it starts with the high-level
flow that a typical feature traverses, then continues with the actions
that need to be taken and timeline suggestions.

Both feature graduation and discontinuation flows are covered,
including the implications for users.

Depending on individual topics, follow-up proposals may extend the basic
points raised in this proposal.

### Terminology
Feature Gates and feature configuration are often used interchangeably or are not differentiated.
- Feature Gate: A flag that controls the presence or availability of a feature in the cluster.
- Feature configuration: Cluster or workload level configuration that allows an admin or user
(depending on the feature) to control aspects of a feature operation.
A common usage is to determine if features are opt-in or opt-out by default.

### Feature Stages
A feature is expected to pass in the following order through the following stages:
1. Enhancement proposal.
2. Implementation.
3. Release as Alpha (experimental).
4. Release as Beta (pre-release for evaluation).
5. Release as General Availability (graduation).
6. Removal.

Starting from the Alpha release, a feature can be removed, with restrictions that
depend on its release stage (Alpha, Beta, GA).

[Removal](#removal) of features is discussed in detail later
in this proposal.

#### Enhancement proposal
As the first step for introducing a new feature, a formal proposal is
expected to be shared for public review via the mailing list and
a [design proposal](https://github.com/kubevirt/community/tree/main/design-proposals).

This is the first opportunity to evaluate a new feature.
The proposal needs to include motivation, goals, implementation details
and phases. Review the [proposal template](https://github.com/kubevirt/community/blob/main/design-proposals/proposal-template.md)
for more information.

#### Implementation
The development work on the feature is expected to include coding,
testing, integration and documentation.

#### Releases
- **Alpha**:
An initial release of the feature for experimental purposes.
Recommended for non-production usages, evaluation or testing.

The API is considered unstable and may change significantly.
There are no backward compatibility considerations and it can
be removed at any time.

The period in which a feature can remain in Alpha is limited,
ensuring features do not pile up without control.
See [release stage transition table](#release-stage-transition-table)
for more information.

The feature presence is controlled using a Feature-Gate (FG) during
runtime. It must be specified for the feature to be active.

- **Beta**:
The first release that can be evaluated with care in production.
Acting as a pre-release, its main objective is to collect feedback
from users to assure its usefulness and readiness for graduation.
If there is no confidence of usage or usefulness, it may remain in
this stage for some time.

However, the period in which a feature can remain in Beta is limited,
ensuring features do not pile up without control.
See [release stage transition table](#release-stage-transition-table)
for more information.

The API is considered stable with care not to break backward compatibility
with previous beta releases.
This implies that fields may only be added during this stage,
not removed or renamed.

The feature presence is controlled using a Feature-Gate (FG) during
runtime. It must be specified for the feature to be active.

- **GA**:
The feature graduated to general-availability (GA) and is now part of
the core features.

The API is considered stable with care not to break backward compatibility
with the previous releases.

The feature functionality is no longer controlled by a FG.

> **Warning**: A Feature Gate flag is solely intended to control
> feature lifecycle. It should not be confused and used as a cluster
> configurable enablement of the functionality.
> In cases where the cluster admin should control a functionality,
> regardless of the feature stage, dedicated configuration field/s
> should be included.
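
The separation between a feature gate (lifecycle) and a feature configuration (admin control) can be sketched as follows. This is a minimal illustration in Python, not KubeVirt code: `Stage`, `Feature`, `is_active` and the `admin_enabled` field are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Stage(Enum):
    ALPHA = "alpha"
    BETA = "beta"
    GA = "ga"


@dataclass
class Feature:
    name: str
    stage: Stage
    # Hypothetical dedicated configuration field. None means the feature
    # exposes no enable/disable configuration at all.
    admin_enabled: Optional[bool] = None


def is_active(feature: Feature, feature_gates: set[str]) -> bool:
    """Return True if the feature should function in the cluster."""
    # Lifecycle: Alpha and Beta features require their feature gate to be set.
    # GA features are no longer controlled by a feature gate.
    gated = feature.stage in (Stage.ALPHA, Stage.BETA)
    if gated and feature.name not in feature_gates:
        return False
    # Configuration: independent of the stage, a dedicated field (if the
    # feature defines one) decides whether the functionality is enabled.
    if feature.admin_enabled is not None:
        return feature.admin_enabled
    return True
```

In this sketch the gate only answers "is the feature present at this stage?", while the optional configuration field answers "does the admin want it on?", so dropping the gate at GA does not take away admin control.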

#### Removal
If a feature is targeted for deprecation and retirement,
it needs to pass a deprecation process, depending on its current
release stage (Alpha, Beta, GA).

For more details, see [here](#deprecation-and-removal).

#### Release Stage Transition Table
The following table summarizes the different release stages with their
transition requirements and restrictions.

| Stage | Period range    | Feature Gate | Removal Availability       |
|-------|-----------------|--------------|----------------------------|
| Alpha | 1 to 2 releases | YES          | Between **minor** releases |
| Beta  | 1 to 3 releases | YES          | Between **minor** releases |
| GA    | -               | NO           | Between **major** releases |

Member

Some features may need to keep their feature gates. The default, however, would need to be set to the relevant value. If you look at k8s, there are many features that are already GA'ed but continue to have a feature gate.
This is needed for the cluster admin to control which functionality is provided in the cluster.

Member Author

AFAIK, once K8S declares a feature as GA, the FG is hard-coded enabled, with no option to disable it.
The only reason the field still exists is to allow downgrades, keep the DB intact, and avoid introducing a new FG with the same name in the future.

Ref: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-stages

A General Availability (GA) feature is also referred to as a stable feature. It means:

  • The feature is always enabled; you cannot disable it.
  • The corresponding feature gate is no longer needed.
  • Stable versions of features will appear in released software for many subsequent versions.

Member

The CPUManager can be disabled afaik - just one example.

Member

Yes... I missed the "The feature is always enabled; you cannot disable it." part. Not sure why some of the features can still be disabled that way.

I think the main issue for me is that, unlike k8s features, most of KubeVirt's features don't have a secondary config option to disable it.

Member

It's not a blocker from my side but an area of concern. We should be able to provide a way for admins to disable a functionality. One of the well known requests is for the admin to disable all VFIO related functionality, which isn't a single feature per se.

Member Author

@EdDev EdDev May 1, 2024

I understand the concerns raised here, but I feel we are in a deadlock.

The proposal is based on the need to stop queuing FGs indefinitely and with the understanding that the current observation of what a FG is in Kubevirt [1] is well known and understood.

What you are attempting to argue here is that the current FG definition needs to change, together with its API (which I honestly do not understand how you can change).
To me it sounds like another proposal.

It makes sense to defer things to a follow up discussion, but in this case, I do not understand how we can do it without breaking this whole thing.

The maintainers of the project have the power to use exceptions to keep FGs forever and redefining what it means in parallel. I may not agree with what you are trying to do, but at least when the FG is redefined, this document can be adjusted accordingly.

I will need help to see how this suggestion can be done and at the same time keep all the rest of this proposal relevant to the goals set.

[1] https://kubevirt.io/user-guide/operations/activating_feature_gates/#activating-feature-gates

Member

@fabiand fabiand May 2, 2024

Given all of the above I'd suggest replacing NO with Optional and moving this discussion into a new document/PR specific to FGs in KubeVirt to unblock this.

@lyarwood @vladikr IIUIC then the friction arises from the question around today's FGs, is this correct?

Assuming Yes to my question above, then a few notes:

  • This proposal is about our future desired state.
  • Existing FGs should be untouched by this proposal, but can be covered by follow up work. AKA we should not block on the current state
  • FGs are about declaring a feature as stable and always available.
  • NOT every feature requires a configuration. We can simply say that some feature (like LiveMigration) is turned on by default without configuration (enable/disable). Optionally we can say that we want to add configuration to enable/disable it, or configure some detail.

IOW

  • Removing - or having a FG irrevocably enabled - means that we as KubeVirt consider this feature to be usable for all.
  • If we think that the feature should not always be used, then we can add a configuration.

Decision diagram wise it would look like this:

flowchart TD
    Start --> QCFG{Requires configuration?}
    QCFG -- Yes --> AddConfig[Add configuration]
    QCFG -- No --> QDEFAULT
    AddConfig --> QDEFAULT

    QDEFAULT{Should be default?}
    QDEFAULT -- Yes --> MakeDefault[Make default] --> QOPTOUT
    QDEFAULT -- No --> QOPTIN

    QOPTIN{Requires Opt-In?} -- Yes --> AddOptIn[Add Opt-In] --> END
    QOPTOUT{Requires Opt-Out?} -- Yes --> AddOptOut[Add Opt-Out] --> END

    QOPTIN & QOPTOUT -- No --> END

    END

This should all be decided before GA. Configuration and Opt-In/Opt-Out can however even be added post-GA.

Member

I understand the concerns raised here, but I feel we are in a deadlock.

Yup apologies by attempting to break the deadlock here I've just made it worse. /o\

My intent with the Optional suggestion was to defer making any concrete decisions on GA FGs until we also talk about reworking the current API but I see that's not helpful in moving this proposal forward.

Given the already included warning about FGs not replacing cluster configurables for features admins might need to configure and/or disable I'm personally fine keeping the requirement to drop new feature FGs as they graduate to GA for now.

@lyarwood @vladikr IIUIC then the friction arises from the question around today's FGs, is this correct?

The behaviour and IMHO abuse (using FGs to control feature configuration and not just visibility in the cluster) of the current implementation yes.

Assuming Yes to my question above, then a few notes:

  • This proposal is about our future desired state.

The future feature lifecycle given the current FG implementation yes.

  • Existing FGs should be untouched by this proposal, but can be covered by follow up work. AKA we should not block on the current state

Yes that's my understanding.

  • FGs are about declaring a feature as stable and always available.

They don't define a feature as stable or always available, quite the opposite in fact given the current proposal.

  • NOT every feature requires a configuration. We can simply say that some feature (like LiveMigration) is turned on by default without configuration (enable/disable). Optionally we can say that we want to add configuration to enable/disable it, or configure some detail.

IOW

  • Removing - or having a FG irrevocably enabled - means that we as KubeVirt consider this feature to be usable for all.
  • If we think that the feature should not always be used, then we can add a configuration.

Yup agreed.

Member

They don't define a feature as stable or always available, quite the opposite in fact given the current proposal.

Yes, bad wording on my side in that regards. You are right :)

Thus we are in general agreement, that's good, thanks for the quick feedback.
Let's wait for Vladik and others.

Member

Following up on @vladikr 's concerns.
The main concern (Vladik, keep me honest) is that there is a technical redundancy to toggle a feature.
A FG is technically allowing an admin to enable or disable a feature.
An enable/disable config item is also allowing an admin to enable or disable a feature.

Thus the question is: If we have the FG, why should we introduce a config toggle for the same functionality?

On the technical side there is not much that I have to say, maybe besides the fact that there are possibly slightly different nuances between "not present" and "disabled".

However, semantically I'd take the compromise of this technical redundancy in favor of a clear separation between
a) signaling a feature is stable (enabled by default, no FG)
b) admin is permitted to disable/enable certain features using a configuration

Thus even given the concerns, I'm advocating to move forward and keep this separation between feature life-cycle and feature configuration.


Through Alpha and Beta feature releases, a FG must be set in order
for the feature to function.
By default, no FG is specified, therefore the feature is disabled.

If a feature is not able to transition to the next stage in the defined period,
it should be removed automatically.
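
The transition table and the removal rule above can be read as data plus a simple check. The following is a minimal sketch, assuming time is counted as the number of releases a feature has shipped in its current stage; `StagePolicy`, `POLICIES` and `is_overdue` are hypothetical names, and a maintainer-approved exception is modeled as a plain boolean.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StagePolicy:
    max_releases: Optional[int]  # None: no period limit (GA)
    feature_gate: bool           # is the feature behind a feature gate?
    removal_between: str         # "minor" or "major" releases


# Values copied from the release stage transition table.
POLICIES = {
    "alpha": StagePolicy(max_releases=2, feature_gate=True, removal_between="minor"),
    "beta": StagePolicy(max_releases=3, feature_gate=True, removal_between="minor"),
    "ga": StagePolicy(max_releases=None, feature_gate=False, removal_between="major"),
}


def is_overdue(stage: str, releases_in_stage: int, exception_granted: bool = False) -> bool:
    """Return True if the feature exceeded its period and should be removed."""
    policy = POLICIES[stage]
    # GA has no period limit; an approved exception prolongs the period.
    if policy.max_releases is None or exception_granted:
        return False
    return releases_in_stage > policy.max_releases
```

For example, under these assumptions `is_overdue("alpha", 3)` flags a feature that has shipped as Alpha in three releases, while `is_overdue("beta", 3)` does not flag a Beta feature yet.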

> **Note**: Exceptions to the period range may apply
> if 2/3 of active maintainers come to agreement to prolong
> a specific feature.

### Deprecation and Removal
One reason for features to go through the Alpha and Beta stages,
is the opportunity to examine their usefulness and adoption.
Same goes with major releases that intentionally allow breaking
backward compatibility (as specified by [semver](https://semver.org/)).

Therefore, it is only natural that some features will not graduate
between the stages, or will be found irrelevant after some time and be
removed when transitioning between major releases.

#### Major Releases
KubeVirt follows semver versioning, in which major versions may
break API compatibility. Therefore, discontinuation of features
is somewhat simpler when incrementing the major version.

However, this is not without a cost.
When a new major release is introduced, the previous one is still maintained
and supported, something that does not exist with minor releases.

#### The Deprecation Flow (for Minor releases)
Only Alpha and Beta features can be removed during a minor release.

These are the steps needed to deprecate & remove a feature:
- Proposal: Prepare a proposal to remove a feature with proper
reasoning, phases, exact timelines and functional alternatives (if any).
The proposal should be reviewed and approved.
- Notification: Notify the project community about the feature
discontinuation based on the approved proposal.
All details of the plan should be provided to allow users and possibly
down-stream projects to adjust.
Use all community media options to distribute this information
(e.g. mailing list, slack channel, community meetings).
- Deprecation warnings: Add deprecation warnings at runtime to warn users
  that the feature is planned to be removed.
  Warnings should be raised when:
  - Feature API fields are accessed.
  - Feature FG is activated.
  - Behavior related to the feature is detected (optional).
- Removal: Feature removal involves removing the core functionality
  of a feature and its exposed API.
  - The core implementation can be removed in two steps:
    - The FG is removed by ensuring it is never reported as set
      (i.e. even if the operator leaves it configured, it is ignored
      internally).
      At this stage, the core implementation will follow the FG conditions
      and therefore from the outside the feature is inactive.
    - In case there are no side effects, the core implementation code can
      be removed.
  - The API types are not to be removed, as this may have implications
    with the underlying storage which has already persisted them.
    Kubernetes has not removed fields; it has kept them around with a
    warning that they are deprecated and no longer available.
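
The deprecation warning on gate activation and the first removal step (a removed FG is never reported as set) can be sketched as follows. This is an illustration only, not the KubeVirt implementation: `DEPRECATED_GATES`, `REMOVED_GATES`, `effective_gates` and the gate names are hypothetical.

```python
import warnings

# Hypothetical gate names, used only for illustration.
DEPRECATED_GATES = {"OldFeature"}  # planned for removal, still functional
REMOVED_GATES = {"GoneFeature"}    # removed: must never be reported as set


def effective_gates(configured: set[str]) -> set[str]:
    """Return the gates the core implementation should treat as set."""
    for gate in sorted(configured & DEPRECATED_GATES):
        # Deprecation warning raised when the feature gate is activated.
        warnings.warn(
            f"feature gate {gate} is deprecated and planned for removal",
            DeprecationWarning,
            stacklevel=2,
        )
    # A removed gate is ignored even if it is left in the configuration,
    # so from the outside the feature is inactive.
    return configured - REMOVED_GATES
```

Because the rest of the code only consults the returned set, the core implementation keeps following the FG conditions and can be deleted later as a separate step.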

While keeping fields around for a period of a release or two makes
sense, beyond a limited period it adds the burden of dragging leftover
fields around forever.

> **Note**: The only reference seen on why fields should not be removed
> was mentioned [here](https://github.com/kubernetes/kubernetes/issues/52185).
> But it is unclear if this is relevant for Alpha stage features.
> Starting with a strict policy, similar to Kubernetes, is recommended,
> i.e. once fields are introduced, they should not be removed no matter
> the feature release stage.
> Per need, the topic can be revisited in follow-up adjustments.

### Exceptions
While the project strives to maintain a stable contract with its users,
there may be scenarios where the policy described here will not be a fit.

Therefore, it should be acceptable to have exceptions from time to time
given very good reasoning and an agreement from 2/3 of the project
maintainers (also known as "approvers").

## Implementation Phases
Member

Do we also need to align this effort with an actual release cycle as we are trying to do with the new and slightly overlapping enhancement process?

IOW saying that new features landing in 1.4 will need to adhere to this lifecycle?

Member

Let's revisit the timeline once we have agreement on the proposal.
After all to me 1.4 so far looks like a reasonable goal.

- Add a section in the
[design proposal template](https://github.com/kubevirt/community/blob/main/design-proposals/proposal-template.md)
that describes the planned timelines for the feature stages.
- Add a reference to the feature-lifecycle documentation to assure contributors
know the process and policy.
- Prepare a user-facing document that describes the usability implications of
this feature lifecycle.

## Miscellaneous

- New features are to be introduced to major and minor release versions only.
  For clarification, this implies that new features are **not** to be backported.
- CI:
  - Alpha stage features should not be gating on CI.
  - Beta and GA features should be gating on CI.
- API fields may be marked with the following information:
  - Description
  - Release stage (alpha/beta/ga) and release version.

It is left to a follow-up implementation proposal to define the exact format and
required/optional information.