Detect EOF signaling remote server closed connection #143

yeraydiazdiaz · 2019-07-25T11:22:48Z

Fixes #96

Raise ConnectionClosedByRemote and handle on send

Raise ConnectionClosedByRemote and handle on `send`

httpx/dispatch/http11.py

yeraydiazdiaz · 2019-07-25T15:46:18Z

httpx/concurrency.py

@@ -108,6 +108,9 @@ def __init__(

        return data

+    def is_connection_dropped(self) -> bool:
+        return self.stream_reader.at_eof()


Probably should clarify this seemingly simplistic implementation. The StreamReaderProtocol.connection_lost callback calls stream_reader.feed_eof which in turn sets the EOF flag which at_eof returns.

yeraydiazdiaz · 2019-07-25T15:50:56Z

httpx/dispatch/http11.py

@@ -117,6 +117,8 @@ def __init__(
        Send a single `h11` event to the network, waiting for the data to
        drain before returning.
        """
+        if self.reader.is_connection_dropped():
+            raise NotConnected


Checking the connection status here also prevents us from sending the request event which, as @tomchristie mentioned, makes things ambiguous for non-idempotent requests.

yeraydiazdiaz · 2019-07-25T16:03:07Z

tests/dispatch/test_connection_pools.py

+        response = await http.request("GET", "http://127.0.0.1:8000/")
+        await server.shutdown()  # shutdown the server to close the connection before receiving the data
+
+        with pytest.raises(httpx.exceptions.NotConnected):


Is NotConnected the right exception here? Maybe we should wrap it into a more descriptive Exception and error message?

tomchristie · 2019-07-25T16:20:09Z

Okay - this is looking great! Nice work.

Now, rather than using NotConnected we should instead expose .is_connection_dropped() on the HTTP11Connection and HTTP2Connection classes, and the general HTTPConnection. Then the connection pool should check it explicitly on acquiring a keep-alive connection. (Rather than attempting to send and catching NotConnected if it occurs.)

yeraydiazdiaz · 2019-07-25T16:33:19Z

That sounds good for the two-request scenario, but wouldn't we still need an exception to cover to the last test case where the connection is lost before the response body is read?

tomchristie · 2019-07-25T16:47:42Z

Not exactly.

We’ll remove NotConnected completely, and treat any stream exception that occurs once we start sending as a “disconnected while sending” case.

We will want to make sure we’ve got well defined exception cases for those events, but that doesn’t need to be part of this PR since it’s a different issue from the “keep alive connection had expired” case.

njsmith · 2019-07-25T18:05:48Z

Yeah, the right logic is: only put idle connections in the pool immediately after a successful call to start_next_cycle. When taking an idle connection from the pool to reuse, first check if the underlying socket is readable (ignoring any TLS layer). Implementing this looks very different on different backends, but they can all do it one way or another. If the socket is readable, close it and discard it and try again; otherwise go ahead and use it.

It seems like y'all are still spending a lot of time reinventing stuff here :-/. Do you have an abstraction over backends yet, or is everything still using asyncio only?

sethmlarson · 2019-07-25T18:23:01Z

Btw @njsmith you meant "is NOT readable" for your close and disregard condition?

njsmith · 2019-07-25T19:27:41Z

No, I mean readable. Remember, we think the connection is supposed to be idle, so there shouldn't be any data to read. If it's marked readable, then that means the server is either sending some data in violation of the protocol, or is sending a 408 Request Timeout, or (most likely) has simply closed the connection. (select and friends mark a socket readable when the other side has closed it, which is a little weird, but does make sense – recv will return b"", but it won't block.)

But, we shouldn't actually try to read from the connection while doing this check, because if it's not readable, then we'll block forever, and you wouldn't expect taking an idle connection out of the pool to be a blocking operation that needs a timeout. So it's a weird special-case concept. Trio sockets actually have a special method just to handle this case: python-trio/trio#760

tomchristie · 2019-07-25T19:44:01Z

Let’s try not to railroad this PR with extraneous stuff.

There’s an isolated issue with the “is this connection still alive” on reaquiry of keep-alive connections, because I’d incorrectly assumed that “try writing to the socket” would be the robust way to check for that case. This PR resolves that, and now just needs a little extra rejigging to stop using an unneeded exception-based flow.

(Nothing to do with h11 state or anything else like that here)

Yup we have a concurrency backend interface, let’s discuss that against a separate issue.

njsmith · 2019-07-25T20:31:30Z

There’s an isolated issue with the “is this connection still alive” on reaquiry of keep-alive connections, because I’d incorrectly assumed that “try writing to the socket” would be the robust way to check for that case.

Right, the problem with using writing to check for connection liveness is that you can't tell whether the server started processing the request or not. So if a write fails, you have to give up and report that back to the user, and let them decide what they want to do about it. But if you can tell that the server closed the connection before you started writing, then you know it's safe to go ahead and open a new connection, so the client library can handle it transparently without getting the user involved.

I believe that the right way to check for readability on different backends is:

On asyncio, using the asyncio Stream API: well, you actually can't do it using public APIs. There's StreamReader.at_eof, but that will fail if the HTTP server is polite and sends a 408 Request Timeout before it closes the connection. I think the best you can do is:
```
def asyncio_stream_is_readable(stream):
    return stream._buffer or stream._eof
```
or else giving up on the Stream API and using protocols/transports directly.

On trio, using the trio Stream API: this isn't exposed in Trio's stream abstraction, but we know that we don't have just any stream, but rather a SocketStream, possibly with one or more layers of SSLStream wrapped around it. So you can do:

def trio_stream_is_readable(stream):
    # Peek through any SSLStream wrappers to get the underlying SocketStream
    while hasattr(stream, "transport_stream"):
        stream = stream.transport_stream
    assert isinstance(stream, trio.SocketStream)
    return stream.socket.is_readable()

(This requires the next release of trio, which will be out soon.)

On raw sockets: If you only care about recent Python 3, then you can do something like: https://github.com/python-trio/trio/blob/a9bb9343487c2c64234267d66e04beeccafe2974/trio/_socket.py#L482

If you need to be fully portable, then things are more complicated: https://github.com/urllib3/urllib3/blob/master/src/urllib3/util/wait.py

(But you need most of that anyway to do I/O on raw sockets anyway.)

(Does asyncio provide any way to get at the raw socket underneath a StreamReader? If it does then that might let you do something that doesn't use private APIs, though it'll be pretty convoluted.)

tomchristie · 2019-07-25T21:09:30Z

Closing in favor of #145 which adds a last little bit of clean up on top of @yeraydiazdiaz's great work here.

Yeray Diaz Diaz added 2 commits July 25, 2019 12:20

Detect EOF signaling remote server closed connection

9bdb8fd

Raise ConnectionClosedByRemote and handle on `send`

Fix linting

84ad457

tomchristie reviewed Jul 25, 2019

View reviewed changes

httpx/dispatch/http11.py Outdated Show resolved Hide resolved

Use existing NotConnected exception

171eeee

tomchristie mentioned this pull request Jul 25, 2019

h11._util.RemoteProtocolError: can't handle event type ConnectionClosed when role=SERVER and state=SEND_RESPONSE #96

Closed

sethmlarson suggested changes Jul 25, 2019

View reviewed changes

httpx/dispatch/http11.py Outdated Show resolved Hide resolved

Yeray Diaz Diaz added 2 commits July 25, 2019 16:13

Add Reader.is_connection_dropped method

0e2b5ce

Check connection before sending h11 events as well

71430ed

yeraydiazdiaz commented Jul 25, 2019

View reviewed changes

Add test covering connection lost before reading response content

5ec83c2

yeraydiazdiaz commented Jul 25, 2019

View reviewed changes

sethmlarson self-requested a review July 25, 2019 16:23

Check for connection closed on acquiring it from the pool

d26177f

tomchristie mentioned this pull request Jul 25, 2019

Check disconnections on connection reacquiry #145

Merged

tomchristie closed this Jul 25, 2019

tomchristie mentioned this pull request Jul 25, 2019

Handle 408-response connection closes. #146

Closed

yeraydiazdiaz deleted the handle-remote-keep-alive-connection-close branch July 26, 2019 08:01

njsmith mentioned this pull request Sep 8, 2019

Add trio concurrency backend #276

Merged

florimondmanca mentioned this pull request Sep 8, 2019

HTTP/1.1 keep-alive connections not reestablished if remote has gone away #327

Closed

florimondmanca mentioned this pull request Sep 15, 2019

Does detection of lost connections implicitly depend on asyncio? #346

Closed

florimondmanca mentioned this pull request Oct 9, 2020

Tweak dropped connection detection encode/httpcore#185

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Detect EOF signaling remote server closed connection #143

Detect EOF signaling remote server closed connection #143

yeraydiazdiaz commented Jul 25, 2019

yeraydiazdiaz Jul 25, 2019

yeraydiazdiaz Jul 25, 2019

yeraydiazdiaz Jul 25, 2019

tomchristie commented Jul 25, 2019

yeraydiazdiaz commented Jul 25, 2019

tomchristie commented Jul 25, 2019

njsmith commented Jul 25, 2019

sethmlarson commented Jul 25, 2019

njsmith commented Jul 25, 2019

tomchristie commented Jul 25, 2019

njsmith commented Jul 25, 2019

tomchristie commented Jul 25, 2019

Detect EOF signaling remote server closed connection #143

Detect EOF signaling remote server closed connection #143

Conversation

yeraydiazdiaz commented Jul 25, 2019

yeraydiazdiaz Jul 25, 2019

Choose a reason for hiding this comment

yeraydiazdiaz Jul 25, 2019

Choose a reason for hiding this comment

yeraydiazdiaz Jul 25, 2019

Choose a reason for hiding this comment

tomchristie commented Jul 25, 2019

yeraydiazdiaz commented Jul 25, 2019

tomchristie commented Jul 25, 2019

njsmith commented Jul 25, 2019

sethmlarson commented Jul 25, 2019

njsmith commented Jul 25, 2019

tomchristie commented Jul 25, 2019

njsmith commented Jul 25, 2019

tomchristie commented Jul 25, 2019