Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fileresponse content-range header #2847

Merged
merged 22 commits into from
Mar 21, 2018
Merged

fileresponse content-range header #2847

merged 22 commits into from
Mar 21, 2018

Conversation

spcharc
Copy link
Contributor

@spcharc spcharc commented Mar 18, 2018

Feel free to close this request if it is not properly written. I'm still learning how to use github

What do these changes do?

  1. Now aiohttp provides Content-Range header for Range requests
  2. Accept-Ranges:bytes for files
  3. Out-of-range start-range requests now cause HTTP 416 Request Range Not Satisfiable according to rfc7233
    for instance, file.size() == 200
    but client gives this header request.header['Range'] == 'bytes=99999-'
  4. Out-of-range tail-range requests would not break FileResponse now
    for instance, file.size() == 200
    but client gives this header request.header['Range'] == 'bytes=-99999'
  5. Two reify(s): (in web_request.py)
    Request.if_unmodified_since
    Request.if_range
    
    Why not one function to handle them all (including Request.if_modified_since)?
    It seems that reify class uses function.__name__ as the key to store results ...
    Their code is just the same with if_modified_since actually
    Maybe I should create a _private_func (of course not a reify decorated one) and use that. Copy and paste the same piece of code is not interesting.
  6. Some more test cases. They send requests with If-Range or If-Unmodified-Since now
    Besides, some existing test functions get updated. They check Content-Range header now.

Are there changes in behavior for the user?

New headers returned
Out-of-range start-range requests result in HTTP 416
these are behavior changes, nothing more

Related issue number

#2844 However it won't be fully solved, as multipart/byteranges is still not supported.
multipart/byteranges support involves:

  1. changing the behavior of request.http_range function.
    It should be able to handle multiple ranges and return a tuple of slices
    This will break existing code.
  2. It seems that MultipartWriter in multipart.py need to be updated to handle multipart/byteranges

Checklist

  • [?] I think the code is well written

I think this question is for you. Everbody considers him/her-self good code writers
I added a few comments though. I think the comments are well written

  • [Y] Unit tests for the changes exist

Old test cases modified + new ones

  • [Y] Documentation reflects the changes

Request.if_range and Request.if_unmodified_since are now reflected in Server->reference page

  • [Y] If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.

I didn't know about this until everything is ready for a pull request. Went back and updated it

  • [Y] Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

OK. I went back and added 2844.feature
file content:"Provide Content-Range header for Range requests"
actually it's an improvement for an existing feature, not a new one

Question

In web_fileresponse.py, line 170
self._length_check = False
What is this for? I failed to figure out what it is.
_length_check is in StreamResponse(web_response.py). I commented out this line and everything still worked. That's strange
line 170 is returning an http error.
Returning other http errors in FileResponse do not involve setting _length_check to False.

Edit: Question solved. Actually I had made a grep -r _length_check search and found it in web_response.py before I made this PR. But I failed to understand that users could also set the Content-Length header at that time.

@codecov-io
Copy link

codecov-io commented Mar 18, 2018

Codecov Report

Merging #2847 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2847      +/-   ##
==========================================
+ Coverage   97.97%   97.97%   +<.01%     
==========================================
  Files          39       39              
  Lines        7444     7461      +17     
  Branches     1307     1308       +1     
==========================================
+ Hits         7293     7310      +17     
  Misses         48       48              
  Partials      103      103
Impacted Files Coverage Δ
aiohttp/web_request.py 99.69% <100%> (ø) ⬆️
aiohttp/web_fileresponse.py 98.01% <100%> (+0.14%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4f8ea94...be63235. Read the comment docs.

@spcharc spcharc changed the title Fileresp header fileresponse content-range header Mar 18, 2018
@spcharc
Copy link
Contributor Author

spcharc commented Mar 18, 2018

According to codecov, in web_fileresponse.py:
line 260:
if count:
this line is always true so the return line following it does not do anything ... oh no!
I should have noticed it
actually make cov failed to run on my computer. It kept telling me that it requires python version >= 3.5.3 to run but I'm using 3.6.3. It also raised ImportError 'No module async_generator found' but it is already installed. Weird.

@spcharc
Copy link
Contributor Author

spcharc commented Mar 18, 2018

Sorry about the commits
Now the coverage would be a little better

@asvetlov
Copy link
Member

@spcharc thank you for contribution.
I'll review in my first free time window.

@spcharc
Copy link
Contributor Author

spcharc commented Mar 19, 2018

Thank you!
Feel free to tell me if there's anything wrong. I'll try fixing it ASAP.

@spcharc
Copy link
Contributor Author

spcharc commented Mar 19, 2018

Now Content-Range is also provided when HTTP 416 is returned.
file.size() == 200
client request:headers['Range'] = 'bytes=500-'
server response:headers['Content-Range'] == 'bytes */200' <- tell client the actual size/allowed range
It seems that this is necessary as indicated in rfc7233
However I investegated several sites around the web, some belongs to large companies, and found out they usually do not provide Content-Range with HTTP 416. (Neither do many of them support multipart/byteranges. Some return 416 some even return 403 in this case. 403 Forbidden! what a simple solution. I like it.)

No test case change for this commit (577c81a).

@spcharc
Copy link
Contributor Author

spcharc commented Mar 20, 2018

No more changes of this PR beyond this point. ETag not supported yet (seems unnecessary)
Tested multiple kinds of stream video players / browsers / download managers, but none of them sends multi-range requests. multipart/byteranges support seems unnecessary too

@asvetlov
Copy link
Member

Why not one function to handle them all (including Request.if_modified_since)?
It seems that reify class uses function.name as the key to store results ...
Their code is just the same with if_modified_since actually
Maybe I should create a _private_func (of course not a reify decorated one) and use that. Copy and paste the same piece of code is not interesting.

It's up to you. I'm ok with _private_func for the case (we already have an exception for Content-Type handling.

@asvetlov
Copy link
Member

asvetlov commented Mar 20, 2018

I'm ok with not supporting multi-range requests.

Copy link
Member

@asvetlov asvetlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just small improvements needed.

@@ -166,7 +167,12 @@ def __init__(self, path, chunk_size=256*1024, *args, **kwargs):
modsince = request.if_modified_since
if modsince is not None and st.st_mtime <= modsince.timestamp():
self.set_status(HTTPNotModified.status_code)
self._length_check = False
# self._length_check = False <- why this line is needed?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://github.com/aio-libs/aiohttp/blob/master/aiohttp/web_response.py#L340
Looks like is is important for chunked encoding, I don't recall details honestly.
Maybe @fafhrd91 can shed a light?

Returns :class:`datetime.datetime` or ``None`` if
*If-Unmodified-Since* header is absent or is not a valid
HTTP date.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here should be ..versionadded:: 3.1 tag.

Returns :class:`datetime.datetime` or ``None`` if
*If-Range* header is absent or is not a valid
HTTP date.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. versionadded::

@spcharc
Copy link
Contributor Author

spcharc commented Mar 20, 2018

Thanks for review.
I'll adding version 3.1 tag right now
for content _length_check, if headers['Content-Length'] is set, the flag deletes it. because returning 304 should not contain any body
but content-length header is set after that line. (line 269 in web_fileresponse.py)

@spcharc
Copy link
Contributor Author

spcharc commented Mar 20, 2018

I suddenly realized that content-length could be set by a user!
If you do not use add_static('/static_path/', static_folder) but directly return web.FileResponse(path, headers={'Content-Length':'100'}), this would cause problem.
You are right. I'll modify it

@asvetlov asvetlov merged commit 037ccbc into aio-libs:master Mar 21, 2018
@asvetlov
Copy link
Member

@spcharc thank you, very high quality work!

@lock
Copy link

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel like there's important points made in this discussion, please include those exceprts into that [new issue].
[new issue]: https://github.com/aio-libs/aiohttp/issues/new

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Oct 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bot:chronographer:provided There is a change note present in this PR outdated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants