-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PULP-208] Disable download retry with on demand streaming #6087
[PULP-208] Disable download retry with on demand streaming #6087
Conversation
e14d8e5
to
ebcfb34
Compare
ebcfb34
to
76c5484
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm using extra_data
instead of creating a new keyword arg retry=True
because adding a new kwarg breaks some parts of the test. So using extra_data
is the less friction path.
Also, it looks like that's the purpose of extra_data
anyway.
pulpcore/download/http.py
Outdated
|
||
async with self.semaphore: | ||
|
||
@backoff_decorator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't there exceptions we would still like to retry with, like connection errors before we even started sending headers back to the client?
Also as a matter of style, I'd rather conditionally apply the decorator (func = decorator(func)
) instead of creating a noop-decorator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't there exceptions we would still like to retry with, like connection errors before we even started sending headers back to the client?
Yes, you are right.
So I guess the approach here could be passing what exceptions we want to retry (or exclude) from the default list. IHO exclude is more expressive for the probem.
Wdyt?
Our RemoteArtifact streaming uses chunked transfer, so once it has started a response, we can't fix anything anymore in case an error. Because of that, the retry logic (any of the timeout, digest, etc) on the http downloader is just delaying a response that can't be possible right anymore. Closes pulp#5937
ddd4e20
to
caad06c
Compare
@@ -240,6 +242,9 @@ async def run(self, extra_data=None): | |||
SizeValidationError, | |||
) | |||
|
|||
retryable_errors = tuple( | |||
[e for e in default_retryable_errors if e not in disable_retry_list] | |||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has to be a tuple or a single exception.
Passing anything else raises a not-helpful error.
@@ -1138,7 +1138,9 @@ async def finalize(): | |||
original_finalize = downloader.finalize | |||
downloader.finalize = finalize | |||
try: | |||
download_result = await downloader.run() | |||
download_result = await downloader.run( | |||
extra_data={"disable_retry_list": (DigestValidationError,)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about "code in the domain of the problem", I'm wondering if we should just say "This downloader is streaming" here, and let the downloader.run decide decide how that impacts retrying.
At least we would have all the determination of what is retryable and what is not in one place.
run
is an external interface, right? So we kind of need to do it right now...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like the downloader is supposed to be a more generic object.
For example, the downloader doesnt know about the streaming we are doing here so we need to explicit patch and add conditions to handle it as this context requires. If we want to shift approaches (make the downloader context-aware), we should move these to the downloader aswell and make it a bit more complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair point.
Actually we monkeypatch the downloader here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the downloader code is a mess already so this doesn't make it particularly worse. It would be nice to specialize the downloader object in such a way that we don't need any of this directly at the call site. But that's a more involved refactor, so I'd consider it out of scope for a bugfix. Maybe something for the future.
#5937
Testing
I dont know how to write an automated test for that, but I've asserted it through logs by running the
test_remote_content_changed_with_on_demand
, which causes a digest validation error in the on-demand streaming context.Before:
After: