Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal:rollout design #489

Merged
merged 8 commits into from
Dec 7, 2023

Conversation

LiZhenCheng9527
Copy link
Contributor

@LiZhenCheng9527 LiZhenCheng9527 commented Nov 27, 2023

What type of PR is this?
/kind documentation

What this PR does / why we need it:
a part of #435

Signed-off-by: LiZhenCheng9527 <lizhencheng6@huawei.com>
Copy link

netlify bot commented Nov 27, 2023

Deploy Preview for kurator-dev ready!

Name Link
🔨 Latest commit c60d515
🔍 Latest deploy log https://app.netlify.com/sites/kurator-dev/deploys/6571653e5026aa00083808c7
😎 Deploy Preview https://deploy-preview-489--kurator-dev.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@kurator-bot
Copy link
Collaborator

@LiZhenCheng9527: The label(s) kind/design cannot be applied, because the repository doesn't have them.

In response to this:

What type of PR is this?
/kind documentation
/kind design
/kind feature

What this PR does / why we need it:
a part of #435

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

// If set it to ture, user need to install the testload himself.
// If set it to false or leave it blank, Kurator will install the flagger's testload.
// +optional
TestLoad bool `json:"testLoad"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, omitempty

// Destination defines the destination for the artifact.
// If specified, it will override the destination specified at Application level.
// +optional
Destination *ApplicationDestination `json:"destination"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, Destination may be refactored in the future. See #439.


// Kind of the referent
// +optional
Kind string `json:"kind,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If 'kind' is optional, how can we specify or identify an object

Port int32 `json:"port"`

// Port name of the generated Kubernetes service.
// Defaults to http
Copy link
Contributor

@Xieql Xieql Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does "Defaults to http" means? the defaults PortName is "http"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. More information you can refer to canary-service

// +optional
Hosts []string `json:"hosts,omitempty"`

// If enabled, Flagger would generate Istio VirtualServices without hosts and gateway,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, Annotations should start with the parameter name, followed by its description. For example, 'Delegation [description of Delegation]'

// +optional
TrafficPolicy *istiov1alpha3.TrafficPolicy `json:"trafficPolicy,omitempty"`

// URI match conditions for the generated service.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// URI match conditions for the generated service.
// Match specifies URI match conditions for the generated service.

}
```

Kurator generates horizontalpodautoscaler.autoscaling resources based on `autoscale` configurations as stable and to-be-verified versions. For instance tuning during testing
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Kurator generates horizontalpodautoscaler.autoscaling resources based on `autoscale` configurations as stable and to-be-verified versions. For instance tuning during testing
Kurator generates horizontalpodautoscaler.autoscaling resources based on `autoscale` configurations as stable and to-be-verified versions. Make adjustments during testing

A good summary is probably at least a paragraph in length.
-->

Kurator, as an open-source distributed cloud-native platform, has been pivotal in aiding users to construct their distributed cloud-native infrastructure, thereby facilitating enterprise digital transformation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not repeat this, too general.


By integrating Flagger, we aim to provide users with reliable, fast and unified Continuous Delivery, enabling them to easily distribute code across multiple clusters.

Base on Flagger, Kurator also offers A/B, Blue/Green and Canary distribution options. Meet the needs of the test verification release.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test verification release is this a commonly used phrase?


Unified configuration distribution only requires the user to declare the desired API state in one place, and Kurator will automatically handle all subsequent operations.

Kurator deploys different daemons on different nodes in the cluster with different rules for configuration distribution based on the declarations in the Spec.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? can not get the intention here


- **unified Continuous Delivery**
- Supports unified configuration of releases for multiple clusters. Achieve the deployment configuration of the application to be distributed to the specified single or multiple clusters.
- Support A/B, Blue/Green, Canary three release validation rules configuration.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These three are all deploy strategies, donot mention validation? It is confusing to users

and make progress.
-->

- **Traffic distribution tools that Flagger can use other than istio are not supported** While Flagger is able to support a wide range of traffic distribution tools including istio, nginx for grey scale releases. However, Kuraotr currently only supports unified installation of istio in multiple clusters, and Kurator may implement unified installation of other traffic distribution tools in the future.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- **Traffic distribution tools that Flagger can use other than istio are not supported** While Flagger is able to support a wide range of traffic distribution tools including istio, nginx for grey scale releases. However, Kuraotr currently only supports unified installation of istio in multiple clusters, and Kurator may implement unified installation of other traffic distribution tools in the future.
- **Traffic distribution tools other than istio are not supported** While Flagger is able to support a wide range of traffic distribution tools including istio, nginx for grey scale releases. However, Kurator currently only supports unified management of istio across clusters. Kurator may implement other traffic distribution tools in the future.

-->
The purpose of this proposal is to introduce a unified Continuous Delivery for Kurator that supports A/B, Blue/Green, and Canary.The main objectives of this proposal are as follows:

Custom Resource Definitions (CRDs): Design CRDs to enable Uniform Continuous Delivery These CRDs will provide a structured approach to defining clusters and different configuration distribution rules to enable uniform configuration distribution.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API Design

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRD is just a present type


Custom Resource Definitions (CRDs): Design CRDs to enable Uniform Continuous Delivery These CRDs will provide a structured approach to defining clusters and different configuration distribution rules to enable uniform configuration distribution.

Fleet-Manager Implementation: The Cluster Manager component will be responsible for monitoring the CRDs and performing the defined functions. It will install Flagger on the clusters and handle potential errors or anomalies to ensure smooth operation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fleet-manager implementation. You are implementing CD, IIRC, you want to introduce a new component. So it is the implement of XXXX

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel Cluster Manager is not a good name. I do not have a good suggestion now, maybe RollOut Engine or Delivery Engine, i am not sure.

// Delivery defines the Continuous Delivery Configurations to be used.
// If specified, a uniform Continuous Delivery policy is configured for this installed object.
// +optional
Delivery *DeliveryConfig `json:"deliveryPolicy"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call Rollout?

Delivery *DeliveryConfig `json:"deliveryPolicy"`
}

type DeliveryConfig struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RollOutConfig

// +optional
Autoscale *AutoscaleRef `json:"autoScale,omitempty"`

// Whether or not the testload is installed by the user. The default is false.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Whether or not the testload is installed by the user. The default is false.
// Whether or not the testloader is installed by the user. The default is false.

Autoscale *AutoscaleRef `json:"autoScale,omitempty"`

// Whether or not the testload is installed by the user. The default is false.
// If set it to ture, user need to install the testload himself.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If set it to ture, user need to install the testload himself.
// If set to true, user needs to install the testloader himself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the semantics is negative.

We can make it: testLoader will be installed when this is true

// +optional
TestLoad bool `json:"testLoad"`

// DeliveryPolicy defines the Release Strategy of Object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

``

Suggested change
// DeliveryPolicy defines the Release Strategy of Object
// DeliveryPolicy defines the release strategy of XXXX

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not object, it is deployment

type DeliveryConfig struct {
// Autoscale defines minreplicas and maxreplicas of horizontal Pod Autoscaler.
// +optional
Autoscale *AutoscaleRef `json:"autoScale,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear, how this will influence the rolling out


// TargetObject specifies what object to deploy the test to.
// Objects of type deployment or daemonSet.
TargetObject *TargetObjectReference `json:"targetObject"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to the upper struct DeliveryConfig?

This must be specified

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/TargetObject/WorkloadRef ?

```console
// Note: refer to https://github.com/fluxcd/flagger/blob/main/pkg/apis/flagger/v1beta1/canary.go
type ServiceConfig struct {
// Name of the virtual Kubernetes generated by Flagger.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

virtual Kubernetes means what?


```console
// Note: refer to https://github.com/fluxcd/flagger/blob/main/pkg/apis/flagger/v1beta1/canary.go
type ServiceConfig struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this used for canary svc generation? If so why does not it use the original svc's spec.


// PortDiscovery adds all container ports to the generated Kubernetes service.
// Defaults to true
PortDiscovery bool `json:"portDiscovery"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does it mean

Comment on lines 394 to 398
Primary *CustomMetadata `json:"primary,omitempty"`

// Canary is the metadata to add to the canary service.
// +optional
Canary *CustomMetadata `json:"canary,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure i understand. Is this annotation?

Signed-off-by: LiZhenCheng9527 <lizhencheng6@huawei.com>
// If set true, Kurator will remove Testloader when delete
// Application.Spec.ApplicationSyncPolicy.Rollout.
// Defines to false.
RemoveTestloaderOnDelete *bool `json:"removeTestloaderOnDelete,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assume this can be set simultaneously with TestLoader.

How about adding a struct for testLoader

type TestLoader struct{
  Enable bool
  RemoveOnDelete bool
}

// the presence or absence of content in the rolloutAnalysisCheckTimes field.
// So can't configure rolloutAnalysisCheckTimes and canaryStrategy at the same time.
// +optional
RolloutAnalysisCheckTimes *int `json:"analysisCheckTimes,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/RolloutAnalysisCheckTimes/AnalysisTimes

We have the context in the higher level struct

@@ -300,8 +288,39 @@ type TrafficRoutingConfig struct {
// +optional
CorsPolicy *istiov1alpha3.CorsPolicy `json:"corsPolicy,omitempty"`

// RolloutAnalysisCheckTimes defines the number of traffic analysis checks to run for A/B Testing and Blue/Green
// If set "rolloutAnalysisCheckTimes: 10". It means Kurator will checks the preview service ten times.
// Note: Kurator determines whether Blue/Green and A/B or canary related processing is required based on
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer making use of CanaryStrategy to indicate if it is canary release

// the presence or absence of content in the rolloutAnalysisCheckTimes field.
// So can't configure rolloutAnalysisCheckTimes and canaryStrategy at the same time.
// +optional
RolloutAnalysisCheckTimes *int `json:"analysisCheckTimes,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure you want to make default value not equal to 0 before using pointer int

// +optional
RolloutAnalysisCheckTimes *int `json:"analysisCheckTimes,omitempty"`

// Match conditions of A/B HTTP header.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A/B testing

@@ -300,8 +288,39 @@ type TrafficRoutingConfig struct {
// +optional
CorsPolicy *istiov1alpha3.CorsPolicy `json:"corsPolicy,omitempty"`

// RolloutAnalysisCheckTimes defines the number of traffic analysis checks to run for A/B Testing and Blue/Green
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check all Blue/Green and A/B, usually it is with A/B testing ·Blue/Green Deployment

@@ -386,12 +398,57 @@ type Metric struct {
ThresholdRange *CanaryThresholdRange `json:"thresholdRange,omitempty"`

// TemplateRef references a metric template object
// The metric can be custom by users.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

customized

Comment on lines 415 to 424
Name string `json:"name,omitempty"`

// Namespace of MetricTemplate
// More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces
Namespace string `json:"namespace,omitempty"`

// MetricProvider defines the metric provider of metricTemplate.
// Kurator now only supports `Prometheus` as metricProvider.
// Default is prometheus
MetricProvider string `json:"metricProvider,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear how these fields work, i would suggest removing MetricTemplate ATM

// The traffic analysis can be extended with webhooks.
// Kurator will call each webhook URL and determine from the response status code (HTTP 2xx)
// if the test is failing or not.
// Kurator calls the testloader via webhook to generate a load of traffic accessing service.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a load of traffic accessing service >

Signed-off-by: LiZhenCheng9527 <lizhencheng6@huawei.com>
@@ -215,7 +224,7 @@ type RolloutPolicy struct {
// preview deployment to make progress before it is considered to be failed.
// Defaults to 600.
// +optional
RolloutTimeoutSeconds *int `json:"rolloutTimeoutSeconds,omitempty"`
RolloutTimeoutSeconds int `json:"rolloutTimeoutSeconds,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to make it default to 600s, this should be *int

Comment on lines 301 to 302
// Note: Kurator determines whether A/B Testing and Blue/Green Deployment or Canary Deployment
// related processing is required based on the presence or absence of content in the analysisTimes field.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please reword this sentence. can you try totranslate into Chinese using dictionary

@@ -387,68 +397,19 @@ type TrafficAnalysis struct {

type Metric struct {
// Name of the metric.
// User also can use name point to custom metric checks.
// The metrics now used in kurator are `request-success-rate` and `request-duration`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently supported metric are xxx and xxx

@@ -387,68 +397,19 @@ type TrafficAnalysis struct {

type Metric struct {
// Name of the metric.
// User also can use name point to custom metric checks.
// The metrics now used in kurator are `request-success-rate` and `request-duration`
// `request-success-rate` calculates the ratio of successful request to total request for a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request-success-rate indicates the successful request ratio during this checking interval

// The metrics now used in kurator are `request-success-rate` and `request-duration`
// `request-success-rate` calculates the ratio of successful request to total request for a
// checkIntervalSeconds time. It returns a value from 0 to 100.
// `request-duration` returns the elapsed time for the longest request within a CheckIntervalSeconds.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the longest request, i am sure this is not right

CheckIntervalSeconds is not consitent with below IntervalSeconds

Signed-off-by: LiZhenCheng9527 <lizhencheng6@huawei.com>
// Currently supported metric are `request-success-rate` and `request-duration`.
// `request-success-rate` indicates the successful request ratio during this checking intervalSeconds.
// It returns a value from 0 to 100.
// `request-duration` returns the P99 latency within intervalSeconds.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request-duration indicates P99 latency of the requests during the check interval

// Defaults to 60.
IntervalSeconds *int `json:"intervalSeconds,omitempty"`

// Range value accepted for this metric
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Range value accepted for this metric
// Valid range accepted for this metric

Signed-off-by: LiZhenCheng9527 <lizhencheng6@huawei.com>
Copy link
Member

@hzxuzhonghu hzxuzhonghu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@kurator-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kurator-bot kurator-bot merged commit 366a0cd into kurator-dev:main Dec 7, 2023
11 checks passed
@LiZhenCheng9527 LiZhenCheng9527 deleted the proposal-of-kuratorCD branch January 9, 2024 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants