
[PULP-208] Disable download retry with on demand streaming #6087

Merged
2 changes: 2 additions & 0 deletions CHANGES/5937.bugfix
@@ -0,0 +1,2 @@
Disable retry logic in the context of content-app on-demand streaming, as we can't recover
from any errors after starting the streaming process (chunked transfer).
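Why a retry can't recover here can be sketched with a toy chunked stream: once some chunks have been flushed to the client they cannot be taken back, so a naive retry that restarts the source duplicates the leading bytes. All names below are illustrative, not pulpcore code:

```python
def stream_with_naive_retry(chunks, fail_at):
    """Toy model of why retrying after chunked transfer has started corrupts
    the response body. `sent` stands for bytes already flushed to the client;
    they cannot be taken back."""
    sent = []
    for attempt in range(2):
        try:
            for i, chunk in enumerate(chunks):
                if attempt == 0 and i == fail_at:
                    raise IOError("upstream connection lost mid-stream")
                sent.append(chunk)
            return sent
        except IOError:
            continue  # naive retry: restart the source from the beginning
    return sent

# The client receives the first two chunks twice:
# [b"aa", b"bb", b"aa", b"bb", b"cc"]
received = stream_with_naive_retry([b"aa", b"bb", b"cc"], fail_at=2)
```

Since the already-sent prefix is unrecoverable, the only safe behavior is to abort the connection, which is what this change enables.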
4 changes: 3 additions & 1 deletion pulpcore/content/handler.py
@@ -1138,7 +1138,9 @@ async def finalize():
original_finalize = downloader.finalize
downloader.finalize = finalize
try:
download_result = await downloader.run()
download_result = await downloader.run(
extra_data={"disable_retry_list": (DigestValidationError,)}
Member
Thinking about "code in the domain of the problem", I'm wondering if we should just say "this downloader is streaming" here, and let downloader.run decide how that impacts retrying.
At least we would have all the determination of what is retryable and what is not in one place.
run is an external interface, right? So we kind of need to do it right now...

Member Author

It feels like the downloader is supposed to be a more generic object.
For example, the downloader doesn't know about the streaming we are doing here, so we need to explicitly patch it and add conditions to handle it as this context requires. If we want to shift approaches (make the downloader context-aware), we should move these to the downloader as well and make it a bit more complex.

Member

Fair point.
Actually, we monkeypatch the downloader here.

Contributor

Yes, the downloader code is a mess already, so this doesn't make it particularly worse. It would be nice to specialize the downloader object in such a way that we don't need any of this directly at the call site, but that's a more involved refactor, so I'd consider it out of scope for a bugfix. Maybe something for the future.
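The specialization floated in this thread might look like the following hypothetical sketch. The base class, the `retryable_errors` hook, and the error set are stand-ins for illustration, not pulpcore's actual API:

```python
class DigestValidationError(Exception):
    """Stand-in for pulpcore's checksum-mismatch error."""


class HttpDownloader:
    """Stand-in for pulpcore's HttpDownloader; only the retry-policy
    hook matters for this sketch."""

    DEFAULT_RETRYABLE = (ConnectionError, TimeoutError, DigestValidationError)

    def retryable_errors(self):
        return self.DEFAULT_RETRYABLE


class StreamingHttpDownloader(HttpDownloader):
    """A downloader that knows it feeds a chunked response, so validation
    failures discovered mid-stream must not be retried."""

    def retryable_errors(self):
        return tuple(
            e for e in super().retryable_errors() if e is not DigestValidationError
        )
```

This would keep the "what is retryable" decision inside the downloader hierarchy instead of at the content-app call site, at the cost of the larger refactor the comment mentions.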

)
except DigestValidationError:
await downloader.session.close()
close_tcp_connection(request.transport._sock)
11 changes: 8 additions & 3 deletions pulpcore/download/http.py
Member Author

I'm using extra_data instead of creating a new keyword arg retry=True because adding a new kwarg breaks some parts of the tests, so using extra_data is the lower-friction path.

Also, it looks like that's the purpose of extra_data anyway.

@@ -214,20 +214,22 @@ async def _handle_response(self, response):

async def run(self, extra_data=None):
"""
Run the downloader with concurrency restriction and retry logic.
Run the downloader with concurrency restriction and optional retry logic.

This method acquires `self.semaphore` before calling the actual download implementation
contained in `_run()`. This ensures that the semaphore stays acquired even as the `backoff`
wrapper around `_run()` handles backoff-and-retry logic.

Args:
extra_data (dict): Extra data passed to the downloader.
extra_data (dict): Extra data passed to the downloader:
disable_retry_list: List of exceptions which should not be retried.

Returns:
[pulpcore.plugin.download.DownloadResult][] from `_run()`.

"""
retryable_errors = (
disable_retry_list = [] if not extra_data else extra_data.get("disable_retry_list", [])
default_retryable_errors = (
aiohttp.ClientConnectorSSLError,
aiohttp.ClientConnectorError,
aiohttp.ClientOSError,
@@ -240,6 +242,9 @@ async def run(self, extra_data=None):
SizeValidationError,
)

retryable_errors = tuple(
[e for e in default_retryable_errors if e not in disable_retry_list]
)
Member Author

It has to be a tuple or a single exception class.
Passing anything else raises an unhelpful error.
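The tuple requirement comes from Python itself: an `except` clause, which `backoff.on_exception` ultimately relies on, accepts a single exception class or a tuple of classes, but not a list. A minimal sketch of the filtering with stand-in exception classes (the real defaults also include the aiohttp client errors):

```python
class DigestValidationError(Exception):
    pass


class SizeValidationError(Exception):
    pass


# Stand-ins for the defaults built in run().
default_retryable_errors = (ConnectionError, DigestValidationError, SizeValidationError)
disable_retry_list = (DigestValidationError,)

# Must stay a tuple: a list used in an `except` clause fails at catch time.
retryable_errors = tuple(
    e for e in default_retryable_errors if e not in disable_retry_list
)


def is_retryable(exc):
    return isinstance(exc, retryable_errors)


# Demonstrate the "not-helpful error" a list would produce: the except
# clause below raises TypeError instead of catching anything.
try:
    try:
        raise ValueError("boom")
    except [ValueError]:  # deliberately wrong: lists are not allowed here
        pass
except TypeError:
    list_rejected = True
```

Wrapping the comprehension in `tuple(...)`, as the diff does, is what keeps the filtered result usable by the `backoff` decorator.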

async with self.semaphore:

@backoff.on_exception(