
Change flow control for writers #2698

Open
asvetlov opened this issue Jan 30, 2018 · 20 comments

Comments

@asvetlov
Member

asvetlov commented Jan 30, 2018

Since writer.write() is a coroutine, there is no need for an internal writer buffer at all.
When the socket's internal send buffer overflows, the writer should be paused instead of filling an application-level transport buffer -- that makes no sense and wastes extra memory and CPU cycles.
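The proposed scheme can be sketched roughly as follows. This is a minimal illustration, not aiohttp's actual API: `PausableWriter` and `FakeTransport` are hypothetical names, and the pause/resume hooks stand in for the protocol's flow-control callbacks. The key point is that write() hands bytes straight to the transport and the coroutine itself supplies the backpressure; there is no application-level buffer.

```python
import asyncio

class PausableWriter:
    """Hypothetical bufferless writer: write() passes data straight to the
    transport and suspends while the transport reports a full buffer."""

    def __init__(self, transport):
        self._transport = transport
        self._resumed = asyncio.Event()
        self._resumed.set()

    # Wired to the protocol's pause_writing()/resume_writing() callbacks.
    def pause(self):
        self._resumed.clear()

    def resume(self):
        self._resumed.set()

    async def write(self, data):
        self._transport.write(data)  # goes to the socket buffer, nothing else
        await self._resumed.wait()   # backpressure: the coroutine waits here


class FakeTransport:
    """Test double standing in for an asyncio transport."""
    def __init__(self):
        self.sent = []

    def write(self, data):
        self.sent.append(data)


async def demo():
    transport = FakeTransport()
    writer = PausableWriter(transport)
    await writer.write(b"hello")
    return transport.sent

print(asyncio.run(demo()))  # [b'hello']
```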

@fafhrd91
Member

You should run some benchmarks before the 3.0 release.

@fafhrd91
Member

btw, the raw results for round 15 of the TechEmpower benchmarks are ready: https://tfb-status.techempower.com/

@fafhrd91
Member

fafhrd91 commented Jan 30, 2018

aiohttp consistently outperforms sanic :)

despite the fact that sanic is super fast

@asvetlov
Member Author

@fafhrd91 yes, sure.
As a side effect of enforcing write() to be a bufferless coroutine, I expect to get rid of reported aiohttp problems like #2669.

@asvetlov
Member Author

@fafhrd91 is there a page with human-readable results?
My brain does not parse JSON easily.

@fafhrd91
Member

fafhrd91 commented Jan 30, 2018

A write buffer can help with performance.

What I do in actix is write everything to a buffer and flush that buffer only once per event-loop iteration. So, for example, if you have multiple requests in the processing queue, you can write all the responses at once before pushing data to the socket. Only with this optimization can actix process 1M requests a second.
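The batching idea above can be sketched in asyncio terms. This is a hypothetical illustration, not actix or aiohttp code: writes accumulate in a list, and a single flush per event-loop iteration is scheduled with `call_soon`, so many small writes reach the transport as one call.

```python
import asyncio

class FakeTransport:
    """Test double standing in for an asyncio transport."""
    def __init__(self):
        self.sent = []

    def write(self, data):
        self.sent.append(data)


class CoalescingWriter:
    """Hypothetical writer that batches all writes made during one
    event-loop iteration into a single transport.write() call."""

    def __init__(self, transport):
        self._transport = transport
        self._chunks = []
        self._flush_scheduled = False

    def write(self, data):
        self._chunks.append(data)
        if not self._flush_scheduled:
            self._flush_scheduled = True
            # call_soon runs the flush after the callbacks already queued
            # in this loop iteration, batching everything written so far.
            asyncio.get_running_loop().call_soon(self._flush)

    def _flush(self):
        self._transport.write(b"".join(self._chunks))
        self._chunks.clear()
        self._flush_scheduled = False


async def demo():
    transport = FakeTransport()
    writer = CoalescingWriter(transport)
    writer.write(b"HTTP/1.1 200 OK\r\n\r\n")
    writer.write(b"hello")
    await asyncio.sleep(0)  # yield so the scheduled flush can run
    return transport.sent

print(asyncio.run(demo()))  # one merged write: [b'HTTP/1.1 200 OK\r\n\r\nhello']
```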

@fafhrd91
Member

Human-readable results are not ready yet. You can open the JSON file in Firefox; it renders JSON nicely, at least in Firefox Nightly.

@fafhrd91
Member

@asvetlov here is the script

import json

# Load one TechEmpower raw-results dump (filename from the benchmark run).
with open("results.2018-01-18-09-55-11-468.json", "r") as f:
    data = json.load(f)
raw_data = data['rawData']

res = {}
for key in ['fortune', 'plaintext', 'db', 'update', 'json', 'query']:
    results = []
    for name, val in raw_data[key].items():
        info = val[0]
        if 'totalRequests' not in info:
            continue

        s, e = info['startTime'], info['endTime']
        reqs = info['totalRequests']
        persec = reqs / (e - s)
        results.append((persec, name, reqs))

    # Highest requests-per-second first.
    results.sort(reverse=True)
    res[key] = [(name, persec, reqs) for (persec, name, reqs) in results]

print(json.dumps(res))

@pfreixes
Contributor

I would suggest sending the headers -- if they haven't been sent explicitly -- together with the first data chunk passed to write().
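The suggestion above can be sketched as follows. `HeaderCoalescingWriter` and `FakeTransport` are hypothetical names for illustration, not aiohttp's API: the serialized headers are held back until the first body write and then emitted in the same transport.write() call as the first chunk, one call instead of two.

```python
class FakeTransport:
    """Test double standing in for an asyncio transport."""
    def __init__(self):
        self.sent = []

    def write(self, data):
        self.sent.append(data)


class HeaderCoalescingWriter:
    """Hypothetical writer that delays the headers until the first body
    chunk, then sends both in a single transport.write() call."""

    def __init__(self, transport, header_bytes):
        self._transport = transport
        self._pending_headers = header_bytes

    def write(self, chunk):
        if self._pending_headers is not None:
            self._transport.write(self._pending_headers + chunk)
            self._pending_headers = None
        else:
            self._transport.write(chunk)


transport = FakeTransport()
writer = HeaderCoalescingWriter(transport, b"HTTP/1.1 200 OK\r\n\r\n")
writer.write(b"hello ")  # headers ride along with the first chunk
writer.write(b"world")   # later chunks go out on their own
print(transport.sent)    # [b'HTTP/1.1 200 OK\r\n\r\nhello ', b'world']
```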

@fafhrd91
Member

Another question: is there a reason aiohttp cannot compete with the frameworks at the top, no matter what you do?

@pfreixes
Contributor

BTW, just taking a look at your data:

{"fortune": [["aiohttp", 6759, 108157], ["flask", 4582, 68730]],
 "plaintext": [["apistar", 697060, 11152966], ["aiohttp", 190787, 2861808], ["flask", 112457, 1686858], ["sanic", 41, 618]],
 "db": [["aiohttp", 8733, 131001], ["flask", 6854, 102814]],
 "update": [["aiohttp", 17710, 265650], ["flask", 10218, 153276]],
 "json": [["apistar", 97053, 1455807], ["sanic", 52048, 780729], ["aiohttp", 33589, 503845], ["flask", 22467, 337011]],
 "query": [["aiohttp", 35650, 570412], ["flask", 14858, 222872]]}

Looks like apistar is doing a hard job; it's hard to believe. Also worth mentioning that there seem to be some inconsistencies, e.g. the sanic numbers for plaintext (41 requests/s) -- BTW, what does that mean?

@fafhrd91
Member

I doubt apistar uses much Python inside :). Same as japronto: it is fast, but Python is used only for loading the C extension.

I don't know about sanic; it should be fast in json and plaintext, but it still cannot process two sequential requests.

@fafhrd91
Member

A new use case for Python: a shell for .so files :)

If you need real speed, check out my actix-web framework.

@pfreixes
Contributor

You got a new star.

@asvetlov
Member Author

asvetlov commented Jan 30, 2018

Returning to buffering: currently aiohttp performs two syscalls if the kernel buffer is not full, switches to storing outgoing bytes in user-space memory on overflow, and pauses the writer if the internal buffer overflows too.
Sending the headers with the first data chunk is doable, but that is a separate task.
What I'm proposing is dropping the internal buffer from the pipeline.
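For reference, asyncio already exposes the transport-level flow control this proposal would lean on: the transport invokes the standard `pause_writing()`/`resume_writing()` protocol callbacks when its buffer crosses the high/low water marks. A minimal sketch (hypothetical class, not aiohttp code) where write() is a coroutine that parks on an Event instead of growing an application-level buffer:

```python
import asyncio

class FlowControlProtocol(asyncio.Protocol):
    """Sketch of a bufferless writer built on asyncio's standard
    flow-control callbacks (pause_writing/resume_writing)."""

    def connection_made(self, transport):
        self.transport = transport
        self._can_write = asyncio.Event()
        self._can_write.set()

    def pause_writing(self):
        # Called by the transport when its buffer passes the high-water mark.
        self._can_write.clear()

    def resume_writing(self):
        # Called by the transport when the buffer drains below the low mark.
        self._can_write.set()

    async def write(self, data):
        self.transport.write(data)    # straight to the transport
        await self._can_write.wait()  # pause here if the transport said so
```

The coroutine suspension replaces the user-space buffer: while the transport is paused, callers simply stay parked in write().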

@fafhrd91
Member

I think dropping the buffer is fine.

Merging the headers with the first chunk could be a specific optimization; it could be used for responses whose body is already set. Also, if we give the developer the ability to send the first chunk together with the headers, even for streaming responses, that would be enough to optimize the happy path; all other types of responses could be processed as separate transport calls.

@fafhrd91
Member

Also, if the handler does some I/O operation, a separate syscall for the headers is fine.

@hubo1016
Contributor

hubo1016 commented Feb 6, 2018

It depends on the usage pattern. If the user writes very small data pieces, it is usually better to merge them in a small buffer before sending them to the kernel. In that situation, I would personally prefer a small (4 KB) internal buffer: merge the pieces, send them at once, then drain the buffer before sending again.
But I think the asyncio stream writer is already doing this? I haven't read the code and am not familiar with the current implementation.
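The 4 KB merge buffer described above can be sketched like this. `SmallBufferWriter` and `FakeTransport` are hypothetical names for illustration: tiny writes are merged in a bytearray and handed to the transport as one block once the buffer fills, or on an explicit flush().

```python
class FakeTransport:
    """Test double standing in for an asyncio transport."""
    def __init__(self):
        self.sent = []

    def write(self, data):
        self.sent.append(data)


class SmallBufferWriter:
    """Hypothetical writer that merges small writes in a 4 KB buffer
    before handing them to the transport."""

    LIMIT = 4096

    def __init__(self, transport):
        self._transport = transport
        self._buf = bytearray()

    def write(self, data):
        self._buf += data
        if len(self._buf) >= self.LIMIT:
            self.flush()

    def flush(self):
        if self._buf:
            self._transport.write(bytes(self._buf))
            self._buf.clear()


transport = FakeTransport()
writer = SmallBufferWriter(transport)
for _ in range(10):
    writer.write(b"x" * 100)  # 1000 bytes total: still below the 4 KB limit
print(len(transport.sent))    # 0 -- nothing sent yet
writer.write(b"y" * 4000)     # pushes the buffer past 4096 bytes
print(len(transport.sent))    # 1 -- one merged write of 5000 bytes
```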

@asvetlov
Member Author

asvetlov commented Feb 6, 2018

To clarify: asyncio streams are relatively simple objects; they don't merge several write() calls into a single sock.send() syscall while the socket write buffer still has room.
In other words, streams most likely don't merge small data buffers, at least under moderate load. High load is a big separate question, but I doubt buffer merging is crucial for that case anyway.

@hubo1016
Contributor

hubo1016 commented Feb 6, 2018

Maybe some benchmarks are necessary.
