
Fix invalid helloworld example #4780

Merged (1 commit) on Sep 29, 2019

Conversation

@nak3 (Contributor) commented Jul 17, 2019

This patch makes a tiny fix that removes an invalid setting from the configuration example.

After #4731, periodSeconds needs to be set together with failureThreshold and timeoutSeconds. This patch simply removes periodSeconds from the config.

Current error:

$ ko apply -f test/test_images/helloworld/helloworld.yaml 
...
2019/07/17 17:58:13 Published gcr.io/gcp-compute-engine-223401/helloworld-edca531b677458dd5cb687926757a480@sha256:88311e84da104e258959a2417f067a81009065aad93bab2240bfc79969e94056
route.serving.knative.dev/route-example configured
Error from server (InternalError): error when creating "STDIN": Internal error occurred: admission webhook "webhook.serving.knative.dev" denied the request: mutation failed: expected 1 <= 0 <= 2147483647: spec.template.spec.containers[0].readinessProbe.failureThreshold, spec.template.spec.containers[0].readinessProbe.timeoutSeconds
2019/07/17 17:58:14 error executing 'kubectl apply': exit status 1

Proposed Changes

  • Remove periodSeconds from the config, as sketched below.
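
For illustration only (the exact probe in helloworld.yaml is not shown in this thread, so the "before" values are assumptions; the removal of periodSeconds is the actual fix), the shape of the change is roughly:

    # Before: periodSeconds is set, but failureThreshold and timeoutSeconds
    # are left at 0, which the webhook now rejects (see the error above).
    readinessProbe:
      httpGet:
        path: /          # assumed probe type/path for illustration
      periodSeconds: 1   # hypothetical value; its presence is what matters

    # After: with periodSeconds removed, the probe falls back to Knative's
    # aggressive defaults and passes webhook validation.
    readinessProbe:
      httpGet:
        path: /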

Release Note

NONE

@googlebot added the cla: yes label (the PR's author has signed the CLA) on Jul 17, 2019
@knative-prow-robot added the size/XS label (changes 0-9 lines, ignoring generated files) on Jul 17, 2019
@knative-prow-robot (Contributor)

Hi @nak3. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot added the needs-ok-to-test and area/test-and-release labels on Jul 17, 2019
@markusthoemmes (Contributor)

/ok-to-test

That sounds like a breaking change though! Should we revisit that logic and make sure we apply defaulting instead of potentially breaking users?

@knative-prow-robot added the ok-to-test label and removed the needs-ok-to-test label on Jul 17, 2019
@joshrider (Contributor)

I think it makes sense to apply k8s's probe defaults for TimeoutSeconds and FailureThreshold in the cases where PeriodSeconds != 0.

@adrcunha requested review from joshrider and removed the review request for adrcunha on Jul 17, 2019
@nak3 (Contributor, Author) commented Jul 17, 2019

Please allow me to confirm: it is not only the PeriodSeconds != 0 case; #4731 also potentially breaks users in other ways.
For example, the following configs no longer work:

            readinessProbe:
              httpGet:
                path: /
              timeoutSeconds: 1

            readinessProbe:
              httpGet:
                path: /
              failureThreshold: 1

            readinessProbe:
              httpGet:
                path: /
              failureThreshold: 1
              timeoutSeconds: 1

(The example below may not need defaulting, but I think there were many users who had port in their probe, and it got broken.)

            readinessProbe:
              httpGet:
                port: 80
                path: /

Should we apply defaulting to cover all of them?

@joshrider (Contributor) commented Jul 17, 2019

In the cases where failureThreshold or timeoutSeconds are greater than 0, but periodSeconds is unset, I could see assuming that the user wants a standard k8s-style probe and setting the normal k8s defaults for whichever of periodSeconds, failureThreshold, and timeoutSeconds are unset.

I guess we'd then reserve our aggressive probe for when periodSeconds, failureThreshold, and timeoutSeconds are all 0.
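
A sketch of how that might look in practice (illustrative only; the filled-in numbers are the standard K8s defaults listed later in this thread, and the mapping itself is the proposal, not current behavior):

    # What the user writes: only failureThreshold is set
    readinessProbe:
      httpGet:
        path: /
      failureThreshold: 1

    # What the proposed dynamic defaulting would produce: treat it as a
    # standard K8s-style probe and fill in the unset fields with the
    # usual K8s defaults
    readinessProbe:
      httpGet:
        path: /
      failureThreshold: 1
      periodSeconds: 10   # K8s default
      timeoutSeconds: 1   # K8s default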

Thoughts? @mattmoor @dgerd @markusthoemmes

Also, did the 3rd example work before? I thought we always disallowed port.

@mattmoor (Member)

I think my concern is that specifying these is a subtle way of degrading your cold start time.

I feel like erroring out and simply requiring a value for periodSeconds seemed reasonable. Happy to discuss more, but that's my $0.02.

@dgerd commented Jul 17, 2019

So K8s defaulting behavior is:

  • periodSeconds: Default 10, Min 1
  • timeoutSeconds: Default 1, Min 1
  • failureThreshold: Default 3, Min 1
  • successThreshold: Default 1, Min 1

In #4731 we allowed 0 for periodSeconds and changed the Knative periodSeconds default to 0. Given that the K8s defaults for timeoutSeconds and failureThreshold are not compatible with our new periodSeconds default, there is no way to get around this change being potentially breaking.

Given the stage our project is at, I would like to make sure that our top goal here is to select behavior that is unsurprising to customers specifying probes. Secondary to that goal, we should minimize impact to users today.

@nak3 (Contributor, Author) commented Jul 22, 2019

> In the cases where failureThreshold or timeoutSeconds are greater than 0, but periodSeconds is unset, I could see assuming that the user wants a standard k8s-style probe and setting the normal k8s defaults for whichever of periodSeconds, failureThreshold, and timeoutSeconds are unset.

I see. I have opened a PR for that here: #4864

> Also, did the 3rd example work before? I thought we always disallowed port.

Sorry, it was my mistake!

@dgerd commented Jul 24, 2019

To facilitate a better discussion today at the API WG, I am adding a few options here to this issue, with an attempt to lay out the pros/cons.

Option 1 - Dynamic defaults

This is the change in #4864. If fields are set that are incompatible with our default probe, then assume the user wants a regular probe and adjust.

Pros:

  • Enables the current behavior of providing the least amount of information necessary (i.e. only specify failureThreshold)
  • Provides most intelligence around interpreting intent
  • Most K8s probes get translated over correctly to Knative

Cons:

  • Changing one setting has a side-effect on other settings (can cascade if we do this behavior elsewhere)
  • More difficult to document (Cannot just document min, max, default)

Option 2 - Bimodal defaults (What we do in HEAD)

If periodSeconds == 0 then we have one set of defaults. If periodSeconds != 0 then we have another set of defaults. This is similar to Option 1, but tries to make fewer assumptions about user intent.
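
For illustration, the two modes might look roughly like this (a sketch: the K8s-style values come from the defaults listed above, while the aggressive-mode internals are Knative-defined and not spelled out in this thread):

    # periodSeconds != 0: K8s-style defaults apply to whatever is unset
    readinessProbe:
      httpGet:
        path: /
      periodSeconds: 10
      timeoutSeconds: 1    # K8s default
      failureThreshold: 3  # K8s default

    # periodSeconds == 0 (the Knative default): aggressive/sub-second
    # probing, with timeoutSeconds and failureThreshold left at 0
    readinessProbe:
      httpGet:
        path: /
      periodSeconds: 0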

Pros:

  • K8s style probes that specify periodSeconds work without breakage
  • Easy to document in a 2 column table

Cons:

  • Requires users to always set periodSeconds to use kubernetes style probes
  • Changing periodSeconds has a side-effect on other settings (can cascade if we do this behavior elsewhere)

Option 3 - Unspecified is special (not defaulted)

This option is to take a step back on our defaulting behavior. Instead of periodSeconds being defaulted to 0 and shown in the API, we interpret an unspecified probe to mean full defaults on readinessProbe. With this, users cannot customize successThreshold or other settings, as otherwise we would slip back into the two cases above. :(

Pros:

  • Keeps API very similar to previous releases

Cons:

  • Less visibility of probing in our API
  • Less customizable probing and undoes some of our work on improving probing here

Option 4 - Introduce new fields

See kubernetes/kubernetes#76951

I would not recommend this unless we can eventually get the behavior into upstream K8s, but we could add periodMilliseconds or similar here instead of extending the interpretation of existing fields.

Pros:

  • Keeps us synced with upstream K8s
  • Can eventually remove our queue-proxy probe

Cons:

  • Uncertainty of reception or feasibility
  • Will have to maintain fork until our client catches up to upstream K8s

@dgerd commented Jul 26, 2019

@joshrider @nak3 @shashwathi @markusthoemmes

Wanted to get your thoughts on taking a step back in functionality and proceeding with the following:

  1. Only an unspecified readinessProbe means "probe aggressively/sub-second" with system-defined defaults (i.e. our spec does not show '0' for periodSeconds, nor does it allow it); see the sketch after this list.
  2. User-defined probes continue to be passed through, as added in #4731 (Use user-defined readinessProbe in queue-proxy).
  3. If/when kubernetes/kubernetes#76951 (Allow probes to run on a more granular timer) lands, switch to using that instead of our own implementation.
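
A sketch of the two cases under this proposal (illustrative only; the image name is a placeholder and the aggressive probing is handled internally by the queue-proxy):

    # Case 1: readinessProbe omitted entirely; Knative probes
    # aggressively/sub-second with its system-defined defaults.
    containers:
      - image: example.com/my-app   # placeholder image

    # Case 2: readinessProbe specified; it is passed through as a
    # regular K8s-style probe, as in #4731.
    containers:
      - image: example.com/my-app   # placeholder image
        readinessProbe:
          httpGet:
            path: /
          periodSeconds: 10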

I believe that takes us a step back in:

  1. Sub-second probes are not "visible" in the API (We could potentially address this in other ways)
  2. successThreshold cannot be customized for sub-second probes. (How important do you all think this is?)

It moves us forward in that:

  1. We break none of the examples that @nak3 posted above
  2. Migration to a K8s defined sub-second probe will be easier

This keeps the value of:

  1. All probes get re-written through the queue-proxy
  2. Exec probe in queue-proxy still allows more granular probing of user container

Anything else missing here?

@joshrider (Contributor)

I think that covers most of it.

One thing I'd add to the "takes us a step back" pile is that we lose the option of using an httpGet instead of tcpSocket in the sub-second probe. I'm not sure how important being able to customise successThreshold on a sub-second probe will be, but maybe having the option to pick between TCP/HTTP is more relevant?

Overall, I don't have much of a lean either way.

With regard to rolling back some of the functionality, I can see value in not breaking anyone's existing stuff. If minimising surprises for users is a top priority, having a user's probe simulate K8S's semantics seems desirable. There didn't seem to be much enthusiasm for kubernetes/kubernetes#76951, so I'm unsure of how much value we get out of smoothing the way for an eventual adoption of a sub-second K8S probe.

On the side of leaving things mostly intact, I can appreciate wanting to trim down startup time wherever possible.

In summation, 🤷‍♂️

@dgerd commented Jul 30, 2019

/cc @mattmoor

@mattmoor (Member)

> Only unspecified readinessProbes means "probe aggressively/sub-second"

This means we can't support aggressively probing with httpGet, which feels problematic.

@markusthoemmes (Contributor)

@dgerd did we get any closure on the decision-making above? It might've fallen through the cracks, or I've just forgotten the resolution.

@markusthoemmes (Contributor)

I vaguely recall something like: well, it's a very small break, and we'd rather take this breaking change than try to work around it at this point? Seems like we should get closure on this pre-v1.

@markusthoemmes (Contributor)

/assign @dgerd
/assign @mattmoor

@markusthoemmes (Contributor)

@dgerd Pingeroo.

@dgerd commented Sep 23, 2019

We decided to take Option #2 with improved error messages.

That said, this change looks good.

/approve
/lgtm

@knative-prow-robot added the lgtm label on Sep 23, 2019
@knative-prow-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dgerd, nak3

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot added the approved label on Sep 23, 2019
@knative-test-reporter-robot

The following jobs failed:

Test name: pull-knative-serving-unit-tests
Triggers: pull-knative-serving-unit-tests
Retries: 1/3

Automatically retrying due to test flakiness...
/test pull-knative-serving-unit-tests

@nak3 (Contributor, Author) commented Sep 29, 2019

/test pull-knative-serving-unit-tests
