Rewind filehandle request bodies before retrying requests #1444

Merged on May 23, 2024 (1 commit)

@jwodder jwodder commented May 22, 2024

Closes #1408.

Here's what was going on: Occasionally, a request to S3 to upload a Zarr entry fails with a 500 status due to an internal error on S3's end, causing dandi-cli to retry the request. When it retries, it calls session.request() again with the same arguments as before, which include an open filehandle for reading the Zarr entry from disk. However, that filehandle was already read to the end by the initial request (the one that resulted in a 500). Thus, although requests' super_len() obtains the correct value for the file's length, it then subtracts the filehandle's current position (the end of the file) from this length to get the number of bytes that would be produced by reading from the current position to the end of the file: zero. As previously established, when super_len() returns 0, requests falls back to "chunked" transfer encoding, which S3 responds to with a 501 error about "header implies functionality not implemented" (hereafter "HIFNI").
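
To make the arithmetic above concrete, here is a minimal standard-library sketch. It does not call requests itself: remaining_bytes() and body_headers() are hypothetical helpers that only mimic the behaviour described in this comment (file length minus current position, with a fallback to chunked encoding when the result is zero), and the temporary file stands in for a Zarr entry.

```python
import os
import tempfile


def remaining_bytes(fh):
    """Hypothetical stand-in for the length calculation described above:
    total file size minus the filehandle's current position."""
    total = os.fstat(fh.fileno()).st_size
    return max(0, total - fh.tell())


def body_headers(fh):
    """Simplified model of the header choice: a nonzero length yields
    Content-Length, a zero length falls back to chunked encoding."""
    length = remaining_bytes(fh)
    if length:
        return {"Content-Length": str(length)}
    return {"Transfer-Encoding": "chunked"}


# Write a small stand-in for a Zarr entry to a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"zarr chunk bytes")
    path = tmp.name

with open(path, "rb") as fh:
    first_attempt = body_headers(fh)  # filehandle is at position 0
    fh.read()                         # the first attempt consumes the body
    naive_retry = body_headers(fh)    # position == size, so length is 0
    fh.seek(0)                        # the fix: rewind before retrying
    fixed_retry = body_headers(fh)

os.unlink(path)

print(first_attempt)
print(naive_retry)
print(fixed_retry)
```

In this model the un-rewound retry is the only case that ends up with Transfer-Encoding: chunked, which matches the sequence of a 500 followed by a 501 described above.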

While this patch should eliminate most instances of the HIFNI problem, it is still conceivable that the originally hypothesized cause (NFS erroneously reporting file sizes as zero) could occur. @yarikoptic How much of the previously added infrastructure for dealing with this problem should we keep around after this?

@jwodder jwodder added the patch (Increment the patch version when merged), cmd-upload, and zarr labels May 22, 2024
@jwodder jwodder requested a review from yarikoptic May 22, 2024 23:25
@jwodder jwodder added the HIFNI (Zarr uploads failing with "A header you provided implies functionality that is not implemented") label May 22, 2024

codecov bot commented May 22, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 88.61%. Comparing base (722f1b6) to head (1910e8a).
Report is 54 commits behind head on master.

Files | Patch % | Lines
dandi/dandiapi.py | 50.00% | 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1444   +/-   ##
=======================================
  Coverage   88.61%   88.61%           
=======================================
  Files          77       77           
  Lines       10563    10565    +2     
=======================================
+ Hits         9360     9362    +2     
  Misses       1203     1203           
Flag | Coverage Δ
unittests | 88.61% <50.00%> (+<0.01%) ⬆️

@jwodder jwodder marked this pull request as ready for review May 23, 2024 12:39
@yarikoptic

Well, that was some serious digging! Thank you @jwodder!

I think we should keep the machinery available for now since IIRC there should be no performance hit, and it might still come in handy should we run into such a situation again for one reason or another.

Should we close #1408 altogether with this PR, or do you think there would be more?

@@ -233,6 +233,8 @@ def request(
     url,
     result.text,
 )
+if data is not None and hasattr(data, "seek"):
+    data.seek(0)
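
As a rough illustration of what the two added lines change, the sketch below runs a retry loop with and without the rewind. Everything here is hypothetical and simplified: FlakyStore is a stand-in for S3 that fails the first attempt with a 500, and upload_with_retries() is not dandi-cli's actual request() method, only a small loop that applies the same guard as the patch.

```python
import io


class FlakyStore:
    """Hypothetical stand-in for S3: the first upload gets a 500, later ones a 200."""

    def __init__(self):
        self.bodies = []  # what each attempt actually received

    def put(self, data):
        self.bodies.append(data.read())  # sending consumes the filehandle
        return 500 if len(self.bodies) == 1 else 200


def upload_with_retries(store, data, max_attempts=3, rewind=True):
    """Retry on a 500, optionally rewinding seekable bodies between attempts."""
    status = None
    for _ in range(max_attempts):
        status = store.put(data)
        if status != 500:
            break
        # Same guard as in the patch: only rewind objects that can seek.
        if rewind and data is not None and hasattr(data, "seek"):
            data.seek(0)
    return status


payload = b"zarr entry contents"

without_fix = FlakyStore()
upload_with_retries(without_fix, io.BytesIO(payload), rewind=False)

with_fix = FlakyStore()
upload_with_retries(with_fix, io.BytesIO(payload))

print(without_fix.bodies)  # the retry received an empty body
print(with_fix.bodies)     # the retry received the full payload again
```

The hasattr(data, "seek") check keeps the rewind harmless for bodies that are plain bytes or strings, which have no position to reset.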
Member

Could/should we maybe add a test that triggers such a case, e.g. by shimming self.session.request so that it always fails on the first try after reading the file?

Member Author

responses doesn't seem to support requests where data is a filehandle: getsentry/responses#719

Member

OK, if there is no easy way to test this at the moment, I am fine with proceeding so that it gets fixed and others can give it a shot. Adding the release label as well.
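
For reference, the kind of test discussed in this thread could be approximated without responses by shimming the session object with unittest.mock. The sketch below is only an assumption about how such a test might look: Client is a hypothetical, heavily simplified class (not dandi-cli's real client), and the URL is a placeholder.

```python
import io
from unittest import mock


class Client:
    """Hypothetical, heavily simplified client; not dandi-cli's actual class."""

    def __init__(self, session):
        self.session = session

    def request(self, method, url, data=None, attempts=3):
        result = None
        for _ in range(attempts):
            result = self.session.request(method, url, data=data)
            if result.status_code < 500:
                return result
            # Rewind seekable bodies before the next attempt.
            if data is not None and hasattr(data, "seek"):
                data.seek(0)
        return result


def make_shim():
    """Build a fake session.request that reads the body, fails once, then succeeds."""
    seen = []

    def fake_request(method, url, data=None):
        seen.append(data.read())  # consume the body like a real send would
        status = 500 if len(seen) == 1 else 200
        return mock.Mock(status_code=status)

    return mock.Mock(side_effect=fake_request), seen


def test_retry_resends_full_body():
    session = mock.Mock()
    session.request, seen = make_shim()
    client = Client(session)
    result = client.request(
        "PUT", "https://example.invalid/upload", data=io.BytesIO(b"abc")
    )
    assert result.status_code == 200
    assert seen == [b"abc", b"abc"]  # the retry saw the whole body again
    assert session.request.call_count == 2


test_retry_resends_full_body()
```

Because the shim reads the body itself, the test fails if the rewind is removed: the second recorded body would then be empty.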

jwodder commented May 23, 2024

@yarikoptic

> Should we close #1408 altogether with this PR or you think there would be more?

Unless you manage to get the error again with this patch (and the run on smaug is currently on its seventh upload without any problems), we can probably close it.

@jwodder jwodder marked this pull request as draft May 23, 2024 15:30
@yarikoptic yarikoptic added the release (Create a release when this pr is merged) label May 23, 2024
@yarikoptic

Should we just undraft and merge it?

@jwodder jwodder marked this pull request as ready for review May 23, 2024 20:12
jwodder commented May 23, 2024

@yarikoptic Undrafted.

@yarikoptic yarikoptic merged commit d9abf1d into master May 23, 2024
27 of 28 checks passed
@yarikoptic yarikoptic deleted the rewind branch May 23, 2024 22:22

🚀 PR was released in 0.62.1 🚀

Labels
cmd-upload
HIFNI: Zarr uploads failing with "A header you provided implies functionality that is not implemented"
patch: Increment the patch version when merged
release: Create a release when this pr is merged
released
zarr