
Expose the response's content buffering limit for lines longer than 128KB #4456

Closed
wants to merge 1 commit

Conversation

@nolar commented Dec 19, 2019

What do these changes do?

Solve the problem with "Line is too long" for JSON-lines streaming responses (Kubernetes API and similar), as described in #4453.

Are there changes in behavior for the user?

An extra optional kwarg is exposed for the requests, which affects the response's content buffering:

        response = await session.get('http://xyz/path', limit=1*1024*1024)

The value of 0 means no limit — which can affect the memory footprint:

        response = await session.get('http://xyz/path', limit=0)

The default is 2**16 (64 KB), as before, which yields an effective buffer size limit of 128 KB (the high-watermark is 2x the limit itself).

PR notes

I have doubts on the following topics — it would be nice to clarify them:

  • Cython & .pyx files — I have no experience with these; I just changed the code following the surrounding style.
  • Should it be named limit= for the users? Or maybe response_buffer_limit=?
  • What's the purpose of the low-watermark, if the actual memory consumed can go up to 2x of that limit, rather than the limit itself as users would expect?
  • Monkey-patching of DEFAULT_LIMIT (if anyone used it) will now break, since the value is imported by-value into multiple modules. I usually use Google-style imports (from pkg import mod; mod.VAL), which cooperate well with monkey-patching, but I didn't find such cases in aiohttp, so I'm not sure this would match the code style.
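To illustrate the by-value vs. by-module import distinction mentioned in the last point, here is a self-contained sketch using a stand-in module (the names below are hypothetical, not aiohttp's):

```python
import types

# A tiny stand-in module, purely for illustration (not aiohttp code).
defaults = types.ModuleType("defaults")
defaults.DEFAULT_LIMIT = 2 ** 16

# By-value import: the constant is copied once, at "import" time,
# so later monkey-patching of the module attribute is invisible here.
DEFAULT_LIMIT = defaults.DEFAULT_LIMIT

def limit_by_value():
    return DEFAULT_LIMIT           # frozen copy; ignores later patches

# By-module ("Google-style") import: the attribute is looked up at
# call time, so monkey-patching the module is observed.
def limit_by_module():
    return defaults.DEFAULT_LIMIT  # live lookup; sees patches

# "Monkey-patch" the module attribute, as a user might do.
defaults.DEFAULT_LIMIT = 0
```

After the patch, `limit_by_value()` still returns the original 2**16 while `limit_by_module()` returns 0 — which is why by-value imports spread across multiple modules break this pattern.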

The code was developed and manually tested against v3.6.2. The master-based version is just a rebase, on the assumption that it works. I could not test the master/v4.0.0a1 code, as it breaks aresponses completely, and the example uses aresponses to run a minimal web server locally.

The tests (make test) on v3.6.2 & v4.0.0a1 fail with ValueError: option names {'--aiohttp-fast'} already added — some help would be needed to get them running.

Related issue number

#4453

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

@codecov-io commented Dec 19, 2019

Codecov Report

Merging #4456 into master will increase coverage by 0.03%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master    #4456      +/-   ##
==========================================
+ Coverage   97.48%   97.51%   +0.03%     
==========================================
  Files          43       43              
  Lines        8865     8866       +1     
  Branches     1390     1390              
==========================================
+ Hits         8642     8646       +4     
+ Misses         98       96       -2     
+ Partials      125      124       -1
Impacted Files Coverage Δ
aiohttp/streams.py 97.4% <100%> (ø) ⬆️
aiohttp/client_proto.py 96.66% <100%> (ø) ⬆️
aiohttp/http_parser.py 96.88% <100%> (ø) ⬆️
aiohttp/pytest_plugin.py 97.27% <0%> (+2.04%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@webknjaz (Member):
Thanks for the PR! Let's see what Andrew thinks.

@asvetlov (Member):

I think that for exceptional cases like yours you can always read the whole text and split it into lines explicitly if needed.
Passing the line size limit through the whole API is cumbersome; it is done for very little effect and helps too few people.

@nolar (Author) commented Jan 24, 2020

That's the point: I cannot read the whole text. This is a stream of JSON lines, one line per event, emitted as soon as the events happen. The stream never ends and the connection never disconnects (in theory; in practice, there are timeouts and unstable networks).

So I never know in advance how large the response will be, and I must handle the incoming lines immediately, while the connection can remain alive but contentless for hours afterwards.

Currently, I have implemented this with my own line reader (https://github.com/zalando-incubator/kopf/pull/276/files), which is fine for me, but it looks like the job of an HTTP client library: streaming JSON lines is not such a rare use-case, I presume. It is, however, for you to decide whether it should be part of the library.
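For reference, the buffering logic of such a line reader can be sketched as a pure chunk-reassembly loop. This is a simplified illustration, not the actual kopf code; in real use the chunks would come from an async iterator such as aiohttp's `resp.content.iter_chunked(...)`, so the loop would be `async for`:

```python
import json

def iter_jsonlines(chunks):
    """Reassemble newline-delimited JSON objects from arbitrary byte chunks.

    A chunk boundary may fall anywhere, including mid-line, so partial
    lines are buffered until their terminating newline arrives.
    """
    buffer = b''
    for chunk in chunks:
        buffer += chunk
        # Emit every complete line currently in the buffer.
        while b'\n' in buffer:
            line, buffer = buffer.split(b'\n', 1)
            if line.strip():
                yield json.loads(line)
    # A trailing line without a final newline (stream ended mid-object
    # or without a terminator) is emitted last.
    if buffer.strip():
        yield json.loads(buffer)
```

The buffer here grows without bound for a pathologically long line, which is exactly the trade-off the `limit=0` option above would expose at the library level.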

@asvetlov (Member):
Your solution is fine; it is only 20 lines of simple code.
Adding the feature to aiohttp makes the source much more complex and harder to understand, at least with the current design of streams.

@nolar (Author) commented Jan 31, 2020

Okay. Then, it makes sense to close this PR. Thank you for your attention.

@nolar nolar closed this Jan 31, 2020
@nolar nolar deleted the 4453-line-is-too-long branch January 31, 2020 12:05