-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry certain errors between server and gateway #944
Retry certain errors between server and gateway #944
Conversation
Codecov Report
@@ Coverage Diff @@
## main #944 +/- ##
==========================================
- Coverage 72.30% 72.14% -0.17%
==========================================
Files 65 65
Lines 8001 8041 +40
Branches 1336 1343 +7
==========================================
+ Hits 5785 5801 +16
- Misses 1810 1834 +24
Partials 406 406
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
You checked in the |
Sorry about that. It should be reverted now. What's the best way to avoid that file being updated? I didn't run |
I don't know of a good way, since it is already under source control. It gets updated whenever the npm scripts are run. It isn't a big deal if we accidentally merge it, but better to do it intentionally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As another Rancher Desktop user, this is working nicely for me! Log excerpt below
[I 2022-08-15 13:22:36.841 ServerApp] Attempting retry (1) against endpoint 'http://localhost:51928/api/kernels/2b5d0896-7bfc-422c-9eea-9ec825cb34f8'. Retried error: 'HTTP 599: Stream closed'
[D 2022-08-15 13:22:37.043 ServerApp] Kernel retrieved: {'id': '2b5d0896-7bfc-422c-9eea-9ec825cb34f8', 'name': 'python_kubernetes', 'last_activity': '2022-08-15T18:22:34.067261Z', 'execution_state': 'starting', 'connections': 0}
Thanks! Should this be backported? |
Thank you for the reviews.
I think this will be generally useful, so a back-port would be appreciated. Thank you @blink1073! |
@meeseeksdev please backport to 1.x |
…nd gateway) (#946) Co-authored-by: Kevin Bates <kbates4@gmail.com>
Recently, after switching from Docker Desktop to Rancher Desktop to deploy Enterprise Gateway in Kubernetes, I found a fair number of requests encountering HTTP errors with codes of 599 (timeout/stream closed) as these were not seen previously, nor when run against other K8s clusters. As such, I believe this to be some issue with the port-forwarding implemented in Rancher Desktop. In looking into this, a common practice is to retry a certain class of HTTP and general connection errors. Typically, these kinds of errors are resolved on the first retry.
While implementing various approaches, I found the
Retry
class fromurllib3.utils
which can be used with the requests library coupled with theirHTTP adapters
. Since it doesn't appear Tornado usesrequests
and because I didn't want to introduce another dependency, I decided to implement aRetryableHTTPClient
that contains anAsyncHTTPClient
instance. This class is configured to retry a certain set of HTTPErrors as well as subclasses ofConnectionError
with backoffs similar to that ofRetry
.When a request has failed and is deemed retryable, an informational message will be logged indicating the retry count, the error triggering the retry, and the endpoint on which the error occurred.
If the error is not retryable or the maximum number of retries has been exceeded, the exception is raised (as was the case prior to this change).
Since this class is particular to interaction with a Gateway server the parameters that determine retries are currently not configurable with the exception of the max retry count (which is only configurable using via env
JUPYTER_GATEWAY_MAX_REQUEST_RETRIES
) and defaults to2
with a max value of10
.