-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable TCP_NODELAY for all synchronous sockets. #651
Conversation
3d7fad4
to
635db1e
Compare
The widely documented poor interaction between the Nagle algorithm and TCP's delayed ACK feature, when making short successive writes, leads to unnecessary delays (around 50ms on Linux). This happens in httpcore whenever a POST request is made, since the headers and body are sent as two separate calls. The TCP_NODELAY option can be enabled to disable Nagle's algorithm, and therefore avoid this delay. The option is already enabled by default for all asyncio and Trio sockets. It is also enabled by CPython's http.client module (which urllib and requests use) and by many high-level HTTP libraries found in other languages, including libcurl, Java's Netty, Rust's reqwest and Go's standard library, to name a few.
635db1e
to
54f2506
Compare
Thanks, seems like this might be worth review. I wasn't aware that |
There was some discussion about this recently in Go; golang/go#57530 and https://withinboredom.info/blog/2022/12/29/golang-is-evil-on-shitty-networks/ |
This is the relevant bit in urllib3: https://github.com/urllib3/urllib3/blob/0b1188985b747ae50d7d4f9750f4e5ec46462517/src/urllib3/connection.py#L107 https://github.com/urllib3/urllib3/pull/283/files Similar for http.client (which I mistakenly claimed was used by requests): https://github.com/python/cpython/blame/94fc7706b7bc3d57cdd6d15bf8e8c4499ae53a69/Lib/http/client.py#L955 |
One option here would be to start by adding a https://www.encode.io/httpcore/connection-pools/#other-options I'm also curious if we'd be able to demonstrate a difference in performance with |
Here's a sample code to measure the performance: import httpx
import requests
import asyncio
from time import perf_counter
N = 100
URL = "http://127.0.0.1:8000"
def sync_version():
with httpx.Client() as client:
before = perf_counter()
for i in range(N):
client.post(URL, content=b"Hello")
after = perf_counter()
return (after-before) / N
async def async_version():
async with httpx.AsyncClient() as client:
before = perf_counter()
for i in range(N):
await client.post(URL, content=b"Hello")
after = perf_counter()
return (after-before) / N
def requests_version():
with requests.Session() as session:
before = perf_counter()
for i in range(N):
session.post(URL, data=b"Hello")
after = perf_counter()
return (after-before) / N
print("Sync:", sync_version())
print("Async:", asyncio.run(async_version()))
print("Requests:", requests_version()) Without this change, and running against a local server, I get average request durations on the order of 2-3ms for the async and requests version, and 43ms for the sync one. After applying the change from the PR, the request duration for the sync version is about 2ms. Here's the server code I used. I also get similar results with a Go server. Note that I've set TCP_NODELAY here, as otherwise it adds another 40ms across the board. from http.server import HTTPServer, BaseHTTPRequestHandler
import socket
PORT = 8000
class Handler(BaseHTTPRequestHandler):
protocol_version = "HTTP/1.1"
def do_POST(self):
length = int(self.headers['Content-Length'])
self.rfile.read(length)
self.send_response(200)
self.send_header('Content-Length', '14')
self.end_headers()
self.wfile.write(b"Hello, World!\n")
with HTTPServer(("", PORT), Handler) as httpd:
print("serving at port", PORT)
httpd.socket.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
httpd.serve_forever() |
The motivation in #651 (comment) is compelling here. Would we prefer to accept this PR as-is, or should we make the socket options configurable first? (#662, resolved via #668) rephrased: Is having configurability actually desirable, or do we really just want to make this one change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we oughta okay this, yup.
needs a note in the changelog then it should be good to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually can be firmer on this one.
Yes, urllib3
do this and we should too.
Thanks for nudging us on it, @plietar.
Link to asyncio...
In the 3.7 release notes. Link to trio...
See the docs, and the implementation. |
The widely documented poor interaction between the Nagle algorithm and TCP's delayed ACK feature, when making short successive writes, leads to unnecessary delays (around 50ms on Linux). This happens in httpcore whenever a POST request is made, since the headers and body are sent as two separate calls.
The TCP_NODELAY can be enabled to disable Nagle's algorithm, and therefore avoid this delay.
The option is already enabled by default for all asyncio and Trio sockets. It is also enabled by CPython's http.client module (which urllib and requests use) and by many high-level HTTP libraries found in other languages, including libcurl, Java's Netty, Rust's reqwest and Go's standard library, to name a few.