
[Storage] fix retry on large block upload #17909

Merged: 1 commit into Azure:master on Apr 20, 2021

Conversation

xiafu-msft (Contributor)

No description provided.

@ghost added the Storage label (Storage Service: Queues, Blobs, Files) on Apr 8, 2021
@tasherif-msft (Contributor) commented Apr 15, 2021

Just to confirm, this fix is related to https://github.com/Azure/azure-sdk-for-python/pull/17078/files? I think it's important to give a bit of description to this PR for future reference.

As far as I understand, core introduced a change to retry when blob content downloading is interrupted. That retry code path never really worked, and it introduced a bug where incomplete downloads could occur without any exception being raised.

With the PR mentioned above, core has now removed the retries on their side and fixed the exception issue (so it now raises if a download is interrupted).
Your PR now allows us to do the retries on our end.

Could you please confirm that my description here is accurate?
Related PRs: https://github.com/Azure/azure-sdk-for-python/pull/17078/files and #16783
Related Issues: #16980
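
To make that concrete, here is a minimal sketch of the retry pattern described above; the URL, chunk size, and retry count are placeholders, not the SDK's actual code:

    import requests

    url = "https://example.blob.core.windows.net/container/blob"
    retry_active = True
    retry_total = 3
    while retry_active:
        try:
            response = requests.get(url, stream=True)
            for chunk in response.iter_content(chunk_size=4 * 1024 * 1024):
                pass  # hand each chunk to the downloader
            retry_active = False
        except (requests.exceptions.ChunkedEncodingError,
                requests.exceptions.ConnectionError):
            # transport-level interruption: the SDK now retries on its
            # side, since core no longer does
            retry_total -= 1
            if retry_total <= 0:
                raise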

@xiafu-msft (Contributor, Author)

@tasherif-msft good suggestion! Also, I will split this PR into two since the two commits are not related, and then I will quote your description there.

@@ -54,7 +54,6 @@ async def send(self, request):
             request.context.options.pop('raw_response_hook', self._response_callback)

         response = await self.next.send(request)
-        await response.http_response.load_body()
Contributor

What was the point of this line anyway? I assume the line above it already ensures the body is loaded?

Contributor Author

This is to load all the data in the stream, even though we've already specified stream=True.
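
For illustration, a requests-only sketch of what that line did (not azure-core's actual implementation): with stream=True the call returns before the body is read, so something must drain the stream explicitly.

    import requests

    # stream=True defers reading the body; joining iter_content loads it all
    response = requests.get("https://example.com/data", stream=True)
    body = b"".join(response.iter_content(chunk_size=4096))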

            self._encryption_options
        )
        retry_active = False
    except (requests.exceptions.ChunkedEncodingError, requests.exceptions.ConnectionError) as error:
Contributor

Is ContentDecodingError also a possibility here? I saw that it can be raised in:

        def generate():
            # Special case for urllib3.
            if hasattr(self.raw, 'stream'):
                try:
                    for chunk in self.raw.stream(chunk_size, decode_content=True):
                        yield chunk
                except ProtocolError as e:
                    raise ChunkedEncodingError(e)
                except DecodeError as e:
                    raise ContentDecodingError(e)
                except ReadTimeoutError as e:
                    raise ConnectionError(e)

Just want to make sure it's not applicable here.

@xiafu-msft (Contributor, Author) commented Apr 16, 2021

For example, if you get a partial gzip file and try to decode it, you will get ContentDecodingError; we want to raise immediately since retrying wouldn't help.
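
A small sketch of that distinction, using plain requests (hypothetical handler, not the PR's code):

    import requests

    response = requests.get("https://example.com/data.gz", stream=True)
    try:
        body = b"".join(response.iter_content(chunk_size=4096))
    except (requests.exceptions.ChunkedEncodingError,
            requests.exceptions.ConnectionError):
        body = None  # transient transport failure: safe to retry the read
    # ContentDecodingError is deliberately not caught: a corrupt (e.g.
    # truncated gzip) body fails the same way on every retry, so it
    # should surface immediately.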

# so when we retry we don't want to read from current position of wrapped stream,
# instead we should seek to where we want to read from.
if self._wrapped_stream.tell() != absolute_position:
    self._wrapped_stream.seek(absolute_position, SEEK_SET)
Contributor

note to self: we could also always seek, but that would be less efficient (as Emily explained offline)
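
For context, a minimal sketch of the substream idea the diff fixes; class and attribute names are illustrative, not the SDK's exact implementation:

    import io
    from io import SEEK_SET

    class SubStream:
        """Reads a fixed window of a shared stream; safe to re-read on retry."""

        def __init__(self, wrapped_stream, start, length):
            self._wrapped_stream = wrapped_stream
            self._start = start
            self._length = length
            self._position = 0  # offset within this substream

        def read(self, size=-1):
            remaining = self._length - self._position
            if remaining <= 0:
                return b""
            if size < 0 or size > remaining:
                size = remaining
            absolute_position = self._start + self._position
            # After a retry the shared stream may sit at another offset,
            # so seek explicitly instead of reading from wherever it is.
            if self._wrapped_stream.tell() != absolute_position:
                self._wrapped_stream.seek(absolute_position, SEEK_SET)
            data = self._wrapped_stream.read(size)
            self._position += len(data)
            return data

    # Two substreams over one buffer, read out of order, still correct:
    buf = io.BytesIO(b"0123456789")
    first, second = SubStream(buf, 0, 5), SubStream(buf, 5, 5)
    assert second.read() == b"56789"
    assert first.read() == b"01234"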

@tasherif-msft (Contributor) left a comment

LGTM :)

@xiafu-msft xiafu-msft merged commit 390e17c into Azure:master Apr 20, 2021
openapi-sdkautomation bot pushed a commit to AzureSDKAutomation/azure-sdk-for-python that referenced this pull request on Mar 2, 2022