Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UPSTREAM: 33958: add global timeout flag #11104

Merged

Conversation

juanvallejo
Copy link
Contributor

@juanvallejo juanvallejo commented Sep 26, 2016

UPSTREAM: kubernetes/kubernetes#33958

Related Trello card:
https://trello.com/c/5YkspaNk/711-5-oc-global-timeout-ux-p4

Related Bugzilla RFE: https://bugzilla.redhat.com/show_bug.cgi?id=1358393

This patch adds a global timeout flag (viewable with oc options) with
a default value of 0s (meaning no timeout). It renames all local timeout flags
previously declared in specific commands, such as oc delete and oc scale, to avoid conflict with the global timeout flag.

TODO

To Resolve

  • Not sure what to do with commands that already implement a local --timeout flag with custom values / descriptions.

Example

$ oc whoami # no timeout flag set - default to 0s (which means no timeout)
system:admin

$ oc whoami --request-timeout=0 # zero means no timeout
system:admin

$ oc whoami --request-timeout=1ms
Unable to connect to the server: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

cc @openshift/cli-review @smarterclayton @jwforres @deads2k @enj

@@ -95,7 +94,6 @@ func NewCmdRollingUpdate(f *cmdutil.Factory, out io.Writer) *cobra.Command {
}
cmd.Flags().Duration("update-period", updatePeriod, `Time to wait between updating pods. Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".`)
cmd.Flags().Duration("poll-interval", pollInterval, `Time delay between polling for replication controller status after the update. Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".`)
cmd.Flags().Duration("timeout", timeout, `Max time to wait for a replication controller to update before giving up. Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".`)
Copy link
Contributor Author

@juanvallejo juanvallejo Sep 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fabianofranz I am unsure how we would handle cases where timeout has a specific description (or default value) for a command. For example, newtoken.go has a timeout of 30s.

@juanvallejo juanvallejo force-pushed the jvallejo/add-global-timeout-flag branch 3 times, most recently from 4f6c234 to c0b117b Compare September 29, 2016 14:27
@smarterclayton
Copy link
Contributor

Timeout 10s is too short for a default. I may be missing context, but we
explicitly don't have a timeout in rhc for this reason - it may not be
appropriate to set a default timeout.

On Thu, Sep 29, 2016 at 10:02 AM, Juan Vallejo notifications@github.com
wrote:

Related Trello card:
https://trello.com/c/5YkspaNk/711-5-oc-global-timeout-ux-p4

This patch adds a global timeout flag (viewable with oc options) with
a default value of 10s. It removes all local timeout flags
previously declared in specific commands, such as oc delete and oc
scale.

TODO

To Resolve

  • Not sure what to do with commands that already implement a local
    --timeout flag with custom values / descriptions.
    • These commands could overwrite the global --timeout flag by
      default with their current duration values.
    • Custom description for the --timeout flag could maybe be added
      somewhere in the usage output.

@openshift/cli-review https://github.com/orgs/openshift/teams/cli-review

@smarterclayton https://github.com/smarterclayton

You can view, comment on, or merge this pull request online at:

#11104
Commit Summary

  • UPSTREAM: 0000: add global timeout flag
  • update docs
  • UPSTREAM: 0000: add default timeout config override
  • add default timeout config override of 60s

File Changes

Patch Links:


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#11104, or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_p1e4uHm9cft5Uf6evlmCO2JkFBUbks5qu8TggaJpZM4KHCBS
.

@juanvallejo juanvallejo force-pushed the jvallejo/add-global-timeout-flag branch from c0b117b to 3014209 Compare September 29, 2016 17:21
@juanvallejo
Copy link
Contributor Author

@smarterclayton

Timeout 10s is too short for a default. I may be missing context, but we
explicitly don't have a timeout in rhc for this reason - it may not be
appropriate to set a default timeout.

Okay, this makes sense, I'll set the starting value at 0, which would imply an infinite amount of time before timing out.

I was hoping I could get your feedback on how to approach the actual implementation of timing out requests. I'm looking at k8s.io/kubernetes/pkg/client/restclient/request.go#L797 where the current implementation does not currently have a time limit per request, but rather a fixed number of max retries (10 at the moment) and a small timeout in between each, set by the Retry-After header. I was thinking of keeping this implementation exactly as it is, and then adding a "timer" that ends the current request (and ignores any remaining retries) after the global timeout time is exceeded. WDYT?

cc @fabianofranz

@liggitt
Copy link
Contributor

liggitt commented Sep 29, 2016

what does the timeout apply to? dialing? time to first byte? time without data being read? total time?

@juanvallejo
Copy link
Contributor Author

@liggitt

what does the timeout apply to? dialing? time to first byte? time without data being read? total time?

I was originally thinking of it applying to total time. Would it make more sense for it to apply to something else? Maybe the timeout applying to dialing would work better, then once it successfully connects to the server, the timeout "timer" would no longer be in effect.

@juanvallejo juanvallejo force-pushed the jvallejo/add-global-timeout-flag branch from 3014209 to 550e4f9 Compare September 29, 2016 21:12
r.backoffMgr.Sleep(time.Duration(seconds) * time.Second)
return false
// ignore timeout if timeout value is zero
if r.timeout == 0 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL at this implementation, it works as expected, however maybe there is a cleaner way to "time out" requests?

@fabianofranz
Copy link
Member

This timeout applies to total time, the same way the higher-level Timeout field of http.Client works, which is what users perceives as "timeout" as in the time it takes for me to get a response.

About defaulting, in case a timeout was not explicitly provided then you just don't set it (not even to 0, forever, etc) and let it use whatever the API defaults to, which is what happens today AFAIK. I believe the defaults are what's in DefaultTransport here, with different values specifically set for dialing, handshaking, etc.

@fabianofranz
Copy link
Member

Also make sure the timeouts work correctly when you are under a proxy with HTTP_PROXY and HTTPS_PROXY set. You may need to tweak the round tripper in pkg/util/httpstream/spdy. Same for commands that do streaming, http upgrade (kubectl exec for example).

@liggitt
Copy link
Contributor

liggitt commented Sep 30, 2016

I'm really unsure about this... many command line commands are made up of multiple API requests, but this is globally hooking the timeout to the client

@enj
Copy link
Contributor

enj commented Sep 30, 2016

Based on the implementation, the timeout is essentially from when the user hits Enter to when they get some output on the CLI (this is not technically not the case but it is close). Thus if a user sets it to 10 seconds, all requests (x10 for the hard-coded retry logic) together should take less than that timeout. Thus the timeout in http.Client will usually not be the one that timesout; it would be the external one in the select. The http.Client is basically being used to curry the timeout value through the code (this probably deserves a harder look).

@juanvallejo
Copy link
Contributor Author

Another idea would be to remove the select timeout, but still set the timeout in the http.Client. That way requests still timeout, should a user want them to, without ignoring the 10 request retries. Although it would no longer be a "total time" timeout in this case, it would still allow for a consistent maximum time when making requests.

@juanvallejo juanvallejo force-pushed the jvallejo/add-global-timeout-flag branch 3 times, most recently from a1ae8f0 to b9030fe Compare September 30, 2016 20:36
@juanvallejo
Copy link
Contributor Author

Current working example:

$ oc whoami # no timeout flag set - default to 0s (which means no timeout)
system:admin

$ oc whoami --timeout=0 # zero means no timeout
system:admin

$oc whoami --timeout=1ms
Unable to connect to the server: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

@@ -185,6 +189,9 @@ func RESTClientFor(config *Config) (*RESTClient, error) {
var httpClient *http.Client
if transport != http.DefaultTransport {
httpClient = &http.Client{Transport: transport}
if config.Timeout != 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you allow negatives?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this here: 4a7769f

@@ -89,7 +89,6 @@ func NewCmdScale(f *cmdutil.Factory, out io.Writer) *cobra.Command {
cmd.Flags().Int("current-replicas", -1, "Precondition for current size. Requires that the current size of the resource match this value in order to scale.")
cmd.Flags().Int("replicas", -1, "The new desired number of replicas. Required.")
cmd.MarkFlagRequired("replicas")
cmd.Flags().Duration("timeout", 0, "The length of time to wait before giving up on a scale operation, zero means don't wait. Any other values should contain a corresponding time unit (e.g. 1s, 2m, 3h).")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This timeout means something different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in faf8b4a

@@ -110,9 +110,9 @@ func NewCmdDelete(f *cmdutil.Factory, out io.Writer) *cobra.Command {
cmd.Flags().Bool("cascade", true, "If true, cascade the deletion of the resources managed by this resource (e.g. Pods created by a ReplicationController). Default true.")
cmd.Flags().Int("grace-period", -1, "Period of time in seconds given to the resource to terminate gracefully. Ignored if negative.")
cmd.Flags().Bool("now", false, "If true, resources are force terminated without graceful deletion (same as --grace-period=0).")
cmd.Flags().Duration("timeout", 0, "The length of time to wait before giving up on a delete, zero means determine a timeout from the size of the object")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This timeout means something different

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be okay to rename these timeouts to something like --delete-timeout and --scale-timeout, or would it make more sense to rename the global timeout to --global-timeout?

@juanvallejo juanvallejo changed the title [WIP] UPSTREAM: 0000: add global timeout flag UPSTREAM: 0000: add global timeout flag Oct 3, 2016
@deads2k
Copy link
Contributor

deads2k commented Oct 3, 2016

would it be okay to rename these timeouts to something like --delete-timeout and --scale-timeout, or would it make more sense to rename the global timeout to --global-timeout?

Renaming flags is a pretty big deal for scripts. That's why we have @openshift/api-review take a look at new flags. --request-timeout maybe?

@juanvallejo juanvallejo changed the title UPSTREAM: 0000: add global timeout flag UPSTREAM: 33958: add global timeout flag Oct 3, 2016
@juanvallejo
Copy link
Contributor Author

@deads2k

Renaming flags is a pretty big deal for scripts. That's why we have @openshift/api-review take a look at new flags. --request-timeout maybe?

Thanks for the feedback, I have updated the global flag's name to --request-timeout, I will do the same for the upstream PR.

@juanvallejo juanvallejo force-pushed the jvallejo/add-global-timeout-flag branch 2 times, most recently from 825b616 to 72c9b2a Compare October 6, 2016 21:48
@juanvallejo juanvallejo force-pushed the jvallejo/add-global-timeout-flag branch 3 times, most recently from f731e36 to 4d84495 Compare October 10, 2016 15:55
@juanvallejo
Copy link
Contributor Author

juanvallejo commented Oct 10, 2016

check, integration, conformance tests flaked on #11292 re[test]

@juanvallejo
Copy link
Contributor Author

check, integration, conformance tests flaked on #11292 re[test]

@juanvallejo
Copy link
Contributor Author

integration test flaked on #11240 re[test]

@juanvallejo
Copy link
Contributor Author

integration flaked on #11240 re[test]

@juanvallejo
Copy link
Contributor Author

#8571 re[test]

@juanvallejo juanvallejo force-pushed the jvallejo/add-global-timeout-flag branch 2 times, most recently from 1b779cf to 19b76ba Compare October 11, 2016 19:31
@juanvallejo
Copy link
Contributor Author

#8571 re[test]

@juanvallejo
Copy link
Contributor Author

integration flaked on #11292 re[test]

@juanvallejo
Copy link
Contributor Author

@fabianofranz upstream merged! kubernetes/kubernetes#33958 PTAL

@fabianofranz
Copy link
Member

LGTM. Hold on to your seats, [merge].

@openshift-bot
Copy link
Contributor

openshift-bot commented Oct 18, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/10202/) (Image: devenv-rhel7_5206)

Related Trello card:
https://trello.com/c/5YkspaNk/711-5-oc-global-timeout-ux-p4

This patch adds a global timeout flag (viewable with `oc options`) with
a default value of `10s`. It removes all local `timeout` flags
previously declared in specific commands, such as `oc delete` and `oc
scale`.
@fabianofranz
Copy link
Member

@juanvallejo failed on outdated completions, please update and I'll tag again.

@juanvallejo juanvallejo force-pushed the jvallejo/add-global-timeout-flag branch from 19b76ba to ec6ef5f Compare October 18, 2016 19:15
@juanvallejo
Copy link
Contributor Author

@fabianofranz Done! rebased with latest master and updated docs / completions

@fabianofranz
Copy link
Member

[merge]

@openshift-bot
Copy link
Contributor

Evaluated for origin merge up to ec6ef5f

@openshift-bot
Copy link
Contributor

Evaluated for origin test up to ec6ef5f

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/10202/) (Base Commit: a941850)

@openshift-bot openshift-bot merged commit dc4ba9f into openshift:master Oct 19, 2016
@juanvallejo juanvallejo deleted the jvallejo/add-global-timeout-flag branch October 19, 2016 18:02
@ncdc ncdc mentioned this pull request Nov 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants