
ref(download): Retry downloads to S3/GCS/HTTP/local FS sources #485

Merged · 4 commits into master · Jul 12, 2021

Conversation

@relaxolotl (Contributor) commented Jul 8, 2021

Rudimentary retry logic has been added to requests to download files from all supported sources. The existing retry implementation for Sentry-hosted sources was used as a template for these changes.

This branches off of the changes made in #483, and applies changes made in #395 to all sources. It may be useful to consider whether we want to deploy these together with circuit breaking as implemented in #397, but I'm not sure if I'm the right person to make that call.

Some concerns about the implementation:

  • If the stream encounters any issues and exits early, the current retry logic may be flawed in that it re-attempts both the initial request as well as the stream from the beginning. This would be particularly painful if a stream errors out just before it is complete.
  • The retry counter is shared between both the initial request to the source as well as the stream. It may not be a good idea to do so, and in the future it might be preferable to have different retry counters for those two requests.

I'm unsure whether these two issues are blockers for this change, though. I don't believe this change makes it any harder to fix either of them later, which would make them non-blockers, but I'm interested in what others think about these two problems.
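For context on the wrapping described above, here is a minimal sketch of the retry pattern this PR applies per source. The helper mirrors the `future_utils::retry(|| ...)` call shape visible in the diffs below, but the attempt count, backoff, and exact signature are assumptions rather than the actual symbolicator implementation:

```rust
// Hypothetical sketch of a retry helper in the shape used by this PR's
// `future_utils::retry(|| ...)` calls; the real implementation may differ.
use std::future::Future;
use std::time::Duration;

pub async fn retry<F, Fut, T, E>(mut make_attempt: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, E>>,
{
    // Assumed limits: 3 attempts with a short linear backoff.
    const MAX_ATTEMPTS: usize = 3;
    let mut attempt = 0;
    loop {
        match make_attempt().await {
            Ok(value) => return Ok(value),
            Err(_) if attempt + 1 < MAX_ATTEMPTS => {
                attempt += 1;
                tokio::time::sleep(Duration::from_millis(100 * attempt as u64)).await;
            }
            Err(err) => return Err(err),
        }
    }
}
```

Each source-specific downloader then wraps its single-attempt download in such a helper, cloning its arguments so the closure can produce a fresh future on every attempt, as the filesystem snippet quoted in the review below shows.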

@Swatinem (Member) left a comment

I have quite little insight into this, but no objections

@flub (Contributor) left a comment

AFAIK this now does exactly the same for all the sources. Would it make sense to move the retry logic out of the source-specific code and write it once in mod.rs?

@@ -63,6 +64,38 @@ impl FilesystemDownloader {
&self,
file_source: FilesystemRemoteDif,
dest: PathBuf,
) -> Result<DownloadStatus, DownloadError> {
let retries =
future_utils::retry(|| self.download_source_once(file_source.clone(), dest.clone()));
Contributor:

Are filesystem failures really worth retrying? The chances of them improving are very slim I'd say. Though it also doesn't do any harm I guess.

Contributor (Author):

Good point: I figured it'd be better to just make sure these are all consistent instead of making more exceptions for filesystem downloads. It does add some amount of complexity to their implementation, but I think the tradeoff is fine for now.

@flub (Contributor) commented Jul 9, 2021

  • If the stream encounters any issues and exits early, the current retry logic may be flawed in that it re-attempts both the initial request as well as the stream from the beginning. This would be particularly painful if a stream errors out just before it is complete.

I think this is fine; AFAIK there's nothing that can be done about that anyway? If the stream fails, you always need to make the GET request again from a network point of view, I think?

  • The retry counter is shared between both the initial request to the source as well as the stream. It may not be a good idea to do so, and in the future it might be preferable to have different retry counters for those two requests.

I think for retrying it's fine not to worry about why we are retrying. (Though let's see how long before I have to retract that statement 😉)

@flub (Contributor) commented Jul 9, 2021

AFAIK this now does exactly the same for all the sources. Would it make sense to move the retry logic out of the source-specific code and write it once in mod.rs?

If you disagree with this and would prefer to merge as-is, I'm not going to object either btw 😃

Base automatically changed from feat/src-dl-metrics to master July 9, 2021 13:59
@Swatinem (Member) left a comment

I will un-approve; let's rebase this and maybe unify the retry logic and even move it into DownloadService::download?

@relaxolotl (Contributor Author):

I'm going to rebase this off of #489 since both of these PRs touch the same places in the code.

@relaxolotl changed the base branch from master to feat/streaming-timeout on July 9, 2021 17:00
@relaxolotl (Contributor Author):

Rebased off of #489 and unified the retry logic, as suggested by @flub and @Swatinem 🎉 A final review from either of you would be great, especially since the diff has changed significantly in structure.

I'm going to pull in @loewenheim just to make sure I haven't mucked anything up for him: Could I bug you to take a quick look over the changes here?

RemoteDif::Filesystem(inner) => self.fs.download_source(inner, destination).await,
let result = match &source {
RemoteDif::Sentry(inner) => {
future_utils::retry(|| {
Member:

Move the retry call up and around the match. I see no reason why it should be duplicated in each match arm.

Contributor (Author):

The gotcha to this is that I'm unsure how to coerce the types so that I can wrap the match in the retry. Moving it up a level produces the following error:

`match` arms have incompatible types
while checking the return type of the `async fn`
    expected type `impl futures::Future` (opaque type at <crates/symbolicator/src/services/download/sentry.rs:245:10>)
found opaque type `impl futures::Future` (opaque type at <crates/symbolicator/src/services/download/http.rs:71:10>)
distinct uses of `impl Trait` result in different opaque types

The retry expects a closure that returns a Future, but the futures returned by each specific download_source implementation are all different. In other words, if we were to move the retry up to wrap the match, every branch of that match would return a different Future implementation.

I could be missing a trick here to make this work. There's future::Either that might work, but it only combines two Futures and I'd have to chain several of them for this particular use case. Thoughts?

Something to note is that the original implementation also awaits on the future inside of every match branch to narrow the type down to just Result<DownloadStatus, DownloadError>, presumably to side-step the same issue.
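One possible way around the mismatched opaque types (a sketch only, not what this PR merged) is to box each arm's future so the whole match evaluates to a single BoxFuture type, which the retry closure can then return. The self.sentry and self.http field names and the argument cloning are assumptions based on the snippets quoted above; only self.fs appears verbatim in this PR:

```rust
// Sketch: erase the per-source `impl Future` types with FutureExt::boxed()
// so every match arm has the same type and retry can wrap the whole match.
// Assumes the `futures` 0.3 crate is available.
use futures::future::FutureExt;

let result = future_utils::retry(|| {
    // The closure re-creates the source-specific future on each attempt.
    match &source {
        RemoteDif::Sentry(inner) => self
            .sentry
            .download_source(inner.clone(), destination.clone())
            .boxed(),
        RemoteDif::Http(inner) => self
            .http
            .download_source(inner.clone(), destination.clone())
            .boxed(),
        RemoteDif::Filesystem(inner) => self
            .fs
            .download_source(inner.clone(), destination.clone())
            .boxed(),
        // ...remaining sources elided...
    }
})
.await;
```

An alternative with the same effect is to keep the awaits inside the arms but move them into a single async block inside the retry closure; one async block is one opaque type regardless of which arm runs.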

Base automatically changed from feat/streaming-timeout to master July 12, 2021 11:22
@relaxolotl merged commit f8ac705 into master on Jul 12, 2021
@relaxolotl deleted the ref/retry-source-dls branch on July 12, 2021 15:29
relaxolotl added a commit that referenced this pull request Jul 19, 2021