Replica set conditions API #33905

0xmichalis · 2016-10-03T10:39:19Z

Partially addresses #32863

@kubernetes/sig-apps

0xmichalis · 2016-10-03T10:58:44Z

Still needs a test

0xmichalis · 2016-10-03T10:59:21Z

And #33092 would be a nice-to-have.

k8s-ci-robot · 2016-10-03T11:21:17Z

Jenkins Kubemark GCE e2e failed for commit 7a20df8c2c60f2088b5f7ce08255c29d76aae198. Full PR test history.

The magic incantation to run this job again is @k8s-bot kubemark e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

smarterclayton · 2016-10-03T17:23:42Z

pkg/api/types.go

@@ -1753,6 +1753,32 @@ type ReplicationControllerStatus struct {

 	// ObservedGeneration is the most recent generation observed by the controller.
 	ObservedGeneration int64 `json:"observedGeneration,omitempty"`
+


This API change LGTM - it is consistent with other objects that have conditions.

@kubernetes/api-review-team any disagreement?

See my other remark about LastProbeTime.

smarterclayton · 2016-10-03T17:24:11Z

I would recommend splitting the change to actually set conditions to a separate PR, and have this only be the API change to support conditions.

0xmichalis · 2016-10-04T12:06:14Z

> I would recommend splitting the change to actually set conditions to a separate PR, and have this only be the API change to support conditions.

Sure, do you want something similar for the perma-failed PR?

0xmichalis · 2016-10-04T14:36:12Z

Updated to include just the API changes

k8s-ci-robot · 2016-10-04T14:54:56Z

Jenkins unit/integration failed for commit 9b21223e8e0b00527af04003cc83132f13110323. Full PR test history.

The magic incantation to run this job again is @k8s-bot unit test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

k8s-ci-robot · 2016-10-04T14:58:26Z

Jenkins verification failed for commit 9b21223e8e0b00527af04003cc83132f13110323. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

smarterclayton · 2016-10-04T17:29:48Z

Thanks for splitting - I have no objections to adding this field to ReplicaSets, but it needs a sign-off from @kubernetes/api-review-team

soltysh · 2016-10-06T08:28:20Z

pkg/api/v1/types.go

+	// Status of the condition, one of True, False, Unknown.
+	Status ConditionStatus `json:"status"`
+	// The last time the condition transitioned from one status to another.
+	LastTransitionTime unversioned.Time `json:"lastTransitionTime,omitempty"`


Usually, there's also LastProbeTime unversioned.Time which is the last time a check was actually performed. See PodCondition or JobCondition.

But does LastProbeTime make sense for the types of conditions we have for replica sets? Both Ready and ReplicaFailure conditions are updated only once their status changes. See #19343 (comment) for more info on this.

That comment and the further discussion don't convince me to have just a single timestamp. Having both seems more flexible, imho.

Updated with both types of timestamps
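To make the two-timestamp discussion above concrete, here is a minimal Go sketch of how the fields could behave. It is an illustration only, not the code in this PR: `time.Time` stands in for `unversioned.Time`, and the `Observe` helper is hypothetical. It assumes `LastProbeTime` moves on every check while `LastTransitionTime` moves only when the status flips.

```go
package main

import (
	"fmt"
	"time"
)

// ConditionStatus mirrors the True/False/Unknown values named in the diff.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

// ReplicaSetConditionType mirrors the condition type from the diff.
type ReplicaSetConditionType string

const ReplicaSetReplicaFailure ReplicaSetConditionType = "ReplicaFailure"

// ReplicaSetCondition is a simplified stand-in for the API type.
// time.Time replaces unversioned.Time to keep the sketch self-contained.
type ReplicaSetCondition struct {
	Type               ReplicaSetConditionType
	Status             ConditionStatus
	LastProbeTime      time.Time
	LastTransitionTime time.Time
}

// Observe is a hypothetical helper: it records that a check happened at now,
// and bumps LastTransitionTime only if the status actually changed.
func (c *ReplicaSetCondition) Observe(status ConditionStatus, now time.Time) {
	c.LastProbeTime = now
	if c.Status != status {
		c.Status = status
		c.LastTransitionTime = now
	}
}

func main() {
	start := time.Date(2016, 10, 6, 8, 0, 0, 0, time.UTC)
	c := ReplicaSetCondition{Type: ReplicaSetReplicaFailure}

	c.Observe(ConditionFalse, start)                  // first observation: both timestamps set
	c.Observe(ConditionFalse, start.Add(time.Minute)) // same status: only the probe time moves
	fmt.Println(c.Status, c.LastProbeTime.Sub(c.LastTransitionTime))
}
```

If a condition is only ever written when its status changes, as argued above for replica sets, the two timestamps always coincide, which is the case against carrying both.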

soltysh · 2016-10-06T08:28:52Z

pkg/api/types.go

@@ -1753,6 +1753,32 @@ type ReplicationControllerStatus struct {

 	// ObservedGeneration is the most recent generation observed by the controller.
 	ObservedGeneration int64 `json:"observedGeneration,omitempty"`
+


See my other remark about LastProbeTime.

soltysh · 2016-10-06T08:29:38Z

pkg/api/v1/types.go

+const (
+	// ReplicationControllerReplicaFailure is added in a replication controller when one of its pods
+	// fails to be created or deleted.
+	ReplicationControllerReplicaFailure ReplicationControllerConditionType = "ReplicaFailure"


Just one condition? What about running or similar, at least?

I can add an additional Ready condition type.

I'm not convinced Ready is meaningful for an RC.

0xmichalis · 2016-10-06T12:00:37Z

@kubernetes/kube-api ptal

k8s-ci-robot · 2016-10-07T14:42:16Z

Jenkins GKE smoke e2e failed for commit 8ff78aa. Full PR test history.

The magic incantation to run this job again is @k8s-bot gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

soltysh

LGTM

thockin · 2016-10-07T16:27:38Z

pkg/api/types.go

+	// ReplicationControllerReplicaFailure is added in a replication controller when one of its pods
+	// fails to be created or deleted.
+	ReplicationControllerReplicaFailure ReplicationControllerConditionType = "ReplicaFailure"
+	// ReplicationControllerReady denotes whether all of the replicas for the controller are ready or not.


Is the intention that this flips if any replica is not yet ready? That can happen sort of arbitrarily - why is this useful?

Yes, this will flip. Actually you got me thinking that this information already exists in the resource printer of a replica set (READY pods). I wanted to make replica sets more usable but this condition doesn't add much. Removing...

Agree Ready doesn't belong here.

thockin · 2016-10-07T16:28:19Z

pkg/api/types.go

+// These are valid conditions of a replication controller.
+const (
+	// ReplicationControllerReplicaFailure is added in a replication controller when one of its pods
+	// fails to be created or deleted.


Can we explain how/why this might happen? E.g. is it just bugs?

#32863 (comment)

Eventually I want to surface ImagePull errors and crashlooping pods under this (or maybe a new one) Condition too.

thockin · 2016-10-07T16:33:13Z

pkg/api/v1/types.go

+const (
+	// ReplicationControllerReplicaFailure is added in a replication controller when one of its pods
+	// fails to be created or deleted.
+	ReplicationControllerReplicaFailure ReplicationControllerConditionType = "ReplicaFailure"


I'm not convinced Ready is meaningful for an RC.

thockin · 2016-10-07T16:33:38Z

pkg/apis/extensions/types.go

+	// fails to be created or deleted.
+	ReplicaSetReplicaFailure ReplicaSetConditionType = "ReplicaFailure"
+	// ReplicaSetReady denotes whether all of the replicas for the replica set are ready or not.
+	ReplicaSetReady ReplicaSetConditionType = "Ready"


Same questions as RC

krmayankk · 2016-10-09T06:34:37Z

api/openapi-spec/root_swagger.json

@@ -28176,6 +28176,39 @@
     }
    }
   },
+   "v1.ReplicationControllerCondition": {
+    "description": "ReplicationControllerCondition describes the state of a replication controller at a certain point.",
+    "required": [


I had been confused about the word Condition until I found this PR. I was thinking of something like a predicate or circumstance that led to something. Should we call it ReplicationControllerState instead?

Sorry, that ship has sailed. The convention is "condition".

noun 1. the state of something, especially with regard to its appearance, quality, or working order.

"State" is equally ambiguous.

krmayankk · 2016-10-09T06:35:22Z

api/openapi-spec/root_swagger.json

@@ -28232,6 +28265,13 @@
      "type": "integer",
      "format": "int32"
     },
+     "conditions": {


call this states

krmayankk · 2016-10-09T06:37:26Z

@kubernetes/sig-apps does anyone else feel confused about the use of the word condition? Should we call it ReplicaSetState and states instead?

smarterclayton · 2016-10-09T16:36:44Z

Condition is what we use in this context in the API. It's an array of conditions on the object. See Pod and Node.

thockin · 2016-10-10T05:01:22Z

"state" implies a state-machine, and this is designed to avoid a
state-machine. State-machines, as an API abstraction, are very rigid, and
have almost no room for movement. Conditions are orthogonal ideas.

We do not, however, seem to have a guideline for whether all conditions
should exist on all instances at all times, or whether they should only be
present if the logical condition is of a particular polarity (e.g. phrase
conditions as default-false and their presence implies true - "Unready" vs
"Ready".)

soltysh · 2016-10-10T10:11:08Z

@krmayankk there's #7856 as a background for your doubts.

0xmichalis · 2016-10-10T10:29:22Z

> We do not, however, seem to have a guideline for whether all conditions should exist on all instances at all times, or whether they should only be present if the logical condition is of a particular polarity (e.g. phrase conditions as default-false and their presence implies true - "Unready" vs "Ready".)

This is something I came across both here and in Deployment conditions with ReplicaFailure. Do we want it to show up by default as False and transition to True once we have a ReplicaFailure, or do we want it to show up only on failures? Should we update the API conventions to say what the right approach to authoring Conditions is?

thockin · 2016-10-10T15:33:32Z

Left to my own devices I would probably have made presence imply true, but
that seems not to be how it works. I don't have enough context on the existing
Conditions to say for sure - @bgrant0607

bgrant0607 · 2016-10-10T16:26:18Z

pkg/apis/extensions/v1beta1/types.go

+
+// These are valid conditions of a replica set.
+const (
+	// ReplicaSetReplicaFailure is added in a replication controller when one of its pods fails


s/replication controller/replica set/

bgrant0607 · 2016-10-10T16:33:05Z

@thockin @Kargakis

Initially we started with conditions like Ready, Available, Reachable, etc. -- true always meant "known good". False would be assumed by default, though we always populated the conditions.

However, most node conditions we wanted to report were problems, so "known bad" conditions were introduced. These conditions we didn't always populate. Even in this case, false could be assumed by default, however. ReplicaFailed appears to match this convention. I'm fine with it.

I assume that ReplicaFailed would be present when the controller was unable to make current state match desired state.
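The "false could be assumed by default" convention described above can be sketched as a lookup helper. This is an assumption-laden illustration, not code from this PR: the `Condition` type is trimmed down and `StatusOf` is a hypothetical name. The point it shows is that a client reading a "known bad" condition treats absence the same as False.

```go
package main

import "fmt"

// ConditionStatus mirrors the True/False/Unknown values used by API conditions.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

// ConditionType names a condition; ReplicaFailure is the one added in this PR.
type ConditionType string

const ReplicaFailure ConditionType = "ReplicaFailure"

// Condition is a trimmed-down stand-in for the API condition types.
type Condition struct {
	Type   ConditionType
	Status ConditionStatus
}

// StatusOf (hypothetical helper) returns the status of the requested
// condition, treating a missing condition as False.
func StatusOf(conds []Condition, t ConditionType) ConditionStatus {
	for _, c := range conds {
		if c.Type == t {
			return c.Status
		}
	}
	return ConditionFalse
}

func main() {
	healthy := []Condition{} // the condition was never populated
	failing := []Condition{{Type: ReplicaFailure, Status: ConditionTrue}}

	fmt.Println(StatusOf(healthy, ReplicaFailure), StatusOf(failing, ReplicaFailure))
}
```

With this reading, a controller may either omit ReplicaFailure or publish it as False on healthy objects; clients written this way see no difference.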

bgrant0607 · 2016-10-10T16:33:14Z

ok to test

bgrant0607 · 2016-10-10T16:34:24Z

@Kargakis Documenting the convention would be useful.

0xmichalis · 2016-10-10T16:41:12Z

> I assume that ReplicaFailed would be present when the controller was unable to make current state match desired state.

In the first pass, ReplicaFailure will be added only once a CREATE or DELETE on a pod fails. So the answer is yes. Long-term though I would like to surface conditions like ImagePullBackOff or CrashLoopBackOff which means that the current state (rs.status.replicas) will match the desired state (rs.spec.replicas) but we will still have pod failures. This will be very useful for infant mortality detection: #18568
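A rough sketch of the first-pass behaviour described above, where ReplicaFailure appears only when a pod CREATE or DELETE fails. Everything here is hypothetical (the `syncFailureCondition` helper, the `podError` type, and the `FailedCreate` reason string are placeholders); the actual controller changes are left to the follow-up PR.

```go
package main

import "fmt"

// Condition is a trimmed-down stand-in for the API condition type.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

const replicaFailure = "ReplicaFailure"

// podError is a placeholder error type for a failed pod create or delete.
type podError string

func (e podError) Error() string { return string(e) }

// syncFailureCondition (hypothetical) returns a copy of conds in which
// ReplicaFailure is present with status True only if podErr is non-nil.
// Other conditions are passed through unchanged.
func syncFailureCondition(conds []Condition, podErr error) []Condition {
	out := make([]Condition, 0, len(conds)+1)
	for _, c := range conds {
		if c.Type != replicaFailure {
			out = append(out, c)
		}
	}
	if podErr != nil {
		out = append(out, Condition{
			Type:    replicaFailure,
			Status:  "True",
			Reason:  "FailedCreate", // placeholder reason, not taken from the PR
			Message: podErr.Error(),
		})
	}
	return out
}

func main() {
	conds := syncFailureCondition(nil, podError("pods \"web-1\" is forbidden: exceeded quota"))
	fmt.Println(len(conds), conds[0].Type, conds[0].Status)

	// Once the create succeeds, the condition is dropped again.
	conds = syncFailureCondition(conds, nil)
	fmt.Println(len(conds))
}
```

Surfacing ImagePullBackOff or CrashLoopBackOff later would need a different trigger, since in those cases the create call itself succeeds.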

0xmichalis · 2016-10-11T09:25:02Z

@bgrant0607 @thockin can we move forward with this? I would like to have it in 1.5

bgrant0607 · 2016-10-11T19:06:05Z

@Kargakis I'm ok with the API change.

Did @soltysh review the implementation?

cc @kubernetes/deployment

bgrant0607 · 2016-10-11T19:07:04Z

cc @erictune

0xmichalis · 2016-10-11T19:19:37Z

> Did @soltysh review the implementation?

This PR contains only the API changes as per @smarterclayton's request. I will open a follow-up with the implementation as soon as this merges.

thockin · 2016-10-12T06:44:59Z

API LGTM

soltysh · 2016-10-12T10:35:54Z

> Did @soltysh review the implementation?

The current changes, yes; I haven't seen anything further yet.

k8s-github-robot · 2016-10-12T11:31:28Z

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

k8s-github-robot · 2016-10-12T12:10:45Z

Automatic merge from submit-queue

@smarterclayton

Automatic merge from submit-queue Replica set conditions controller changes Follow-up to #33905, partially addresses #32863. @smarterclayton @soltysh @bgrant0607 @mfojtik I just need to add e2e tests

googlebot added the cla: yes label Oct 3, 2016

k8s-github-robot assigned smarterclayton Oct 3, 2016

k8s-github-robot added the kind/api-change, size/XXL, and release-note-label-needed labels Oct 3, 2016

smarterclayton reviewed Oct 3, 2016

0xmichalis changed the title ~~Replica set conditions~~ Replica set conditions API Oct 4, 2016

soltysh requested changes Oct 6, 2016

soltysh approved these changes Oct 7, 2016

thockin reviewed Oct 7, 2016

smarterclayton mentioned this pull request Oct 8, 2016

Communicate replica set and deployment status via conditions kubernetes/enhancements#120

Closed

22 tasks

krmayankk reviewed Oct 9, 2016

bgrant0607 added the release-note label and removed the release-note-label-needed label Oct 10, 2016

bgrant0607 reviewed Oct 10, 2016

0xmichalis added 3 commits October 10, 2016 18:34

api: add Conditions in replication controllers

0701232

extensions: add Conditions in replica sets

63a6ce3

Generated code for RS/RC conditions

5589469

bgrant0607 assigned soltysh and unassigned smarterclayton Oct 11, 2016

0xmichalis added the lgtm label Oct 12, 2016

k8s-github-robot merged commit f9e8ee8 into kubernetes:master Oct 12, 2016

0xmichalis deleted the replica-set-conditions branch October 12, 2016 13:21

0xmichalis mentioned this pull request Oct 12, 2016

Replica set conditions controller changes #34645

Merged

smarterclayton mentioned this pull request Oct 25, 2016

hack/verify-generated-protobuf.sh is not working in the build guards #35486

Closed
