transports/quic: `STOP_SENDING` fails `poll_close` #3144
Comments
@MarcoPolo this reminds me of a conversation we had around libp2p/specs#473 and stream close behavior related to the ping protocol. You might have an opinion here.
Can we replicate this problem with a failing test in the stream muxer test harness?
Good idea. I will take a look.
This might not work when using rust-libp2p/misc/multistream-select/src/dialer_select.rs Lines 145 to 150 in 8678dc7
Thanks for the detailed report and proposing solutions @mxinden! I was able to reproduce this bug with this test: https://github.com/elenaf9/rust-libp2p/blob/826f3f6ae3bf6ad0a4df16776e110b23fa12fced/transports/quic/tests/smoke.rs#L215-L230 (not using the stream muxer harness right now but happy to add it there in a different PR).
But isn't the problem that we drop the stream before the remote closes the stream?
Note that we don't have "flushing" on QUIC substreams. Stream data is sent asap. Close then only waits for the ACKs for sent data. I am currently looking into this, trying to think of alternative solutions. @mxinden did you witness this happening in the real world or only in local tests?
Thank you!
Agreed. This is orthogonal. (In my case the two were racing each other.)
I connected to kademlia-exporter.max-inden.de/ (QUIC server) via my laptop using https://github.com/mxinden/libp2p-lookup (QUIC client). I would consider this a real world scenario and would expect others to see similar behavior when using
Ah, good point. Never mind then. (I think still relevant for non-QUIC streams.)
No we don't. The problem today on flushable stream muxers is that the multistream-select message might never be sent (i.e. it stays stuck in a buffer) in case the local node never writes application data on the stream. But as you noted above, not relevant for the
Thanks and yes please :)
Are you still exploring potential fixes, or do you already have a fix in mind?
I think I have it fixed; writing tests right now to confirm. But your proposed solution 2. Don't send a
- Only send `STOP_SENDING` on a stream when dropping it if the remote did not finish the stream yet.
- Only call `quinn_proto::SendStream::finish` once. (A second call to it will always fail. Though I don't think this was the issue in #3144.)
- Add tests for reading and writing to streams after the remote dropped. Also adds a smoke test for backpressure.
Fixes #3144.
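For readers skimming the thread, the first two bullets can be sketched with a small stand-alone model. This is only an illustration under assumptions: `Substream`, `remote_finished`, `finish_called`, `close_write`, `on_drop` and the recorded frame names are all hypothetical and are not the actual `libp2p-quic` or `quinn_proto` types.

```rust
/// Hypothetical, simplified model of one QUIC substream.
/// It is NOT the libp2p-quic `Substream` type.
#[derive(Debug, Default)]
struct Substream {
    /// True once the remote has finished its write side.
    remote_finished: bool,
    /// True once we asked the transport to finish our write side.
    finish_called: bool,
    /// Frames this endpoint would put on the wire, recorded for inspection.
    sent: Vec<&'static str>,
}

impl Substream {
    /// Close the write side. A repeated call is a no-op rather than a
    /// second `finish`, which the PR description says would always fail.
    fn close_write(&mut self) {
        if !self.finish_called {
            self.finish_called = true;
            self.sent.push("FIN");
        }
    }

    /// What the model does when the stream handle is dropped.
    fn on_drop(&mut self) {
        // Only tell the remote to stop if it may still be sending.
        if !self.remote_finished {
            self.sent.push("STOP_SENDING");
        }
    }
}

fn main() {
    // Remote already finished: dropping sends no STOP_SENDING.
    let mut done = Substream { remote_finished: true, ..Default::default() };
    done.close_write();
    done.close_write(); // second call does nothing
    done.on_drop();
    println!("{:?}", done.sent);

    // Remote still writing: dropping asks it to stop.
    let mut early = Substream::default();
    early.on_drop();
    println!("{:?}", early.sent);
}
```

In the identify scenario from the issue, B has not yet seen A finish when it drops, so a check like this alone would not remove the race in every ordering; it narrows the cases in which `STOP_SENDING` goes out at all.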
Summary
Receiving a `STOP_SENDING` makes `poll_close` fail. This is problematic in `libp2p-identify` where a node expects to be able to close its write side before reading.
Expected behaviour
`libp2p-identify` should successfully exchange identify information.
Actual behaviour
Say that nodes A and B are connected. A opens a `libp2p-identify` stream to B. A expects B to send its identify information.
After having negotiated the `libp2p-identify` protocol via `multistream-select`:
1. B sends its identify information and closes its write side. (rust-libp2p/protocols/identify/src/protocol.rs, lines 202 to 203 in 0c85839)
2. B then drops the stream. Dropping the stream results in `libp2p-quic` sending a `STOP_SENDING`. (rust-libp2p/transports/quic/src/connection/substream.rs, lines 203 to 207 in 0c85839)
3. In parallel A closes its write side. (rust-libp2p/protocols/identify/src/protocol.rs, line 212 in 0c85839) This results in `Substream::poll_close` being called in `libp2p-quic`. (rust-libp2p/transports/quic/src/connection/substream.rs, lines 177 to 192 in 0c85839)
4. In case the `STOP_SENDING` from B already arrived, `poll_close` fails and thus the identify handshake fails.
Possible Solution
1. […] `STOP_SENDING` on `Drop`.
2. Don't send `STOP_SENDING` on `Drop` IFF the remote already closed the write side. […] `quinn` does. https://github.com/quinn-rs/quinn/blob/1d390af2facdf424c7feb470909fffc29a1dca6c/quinn/src/recv_stream.rs#L396-L400
3. […] `STOP_SENDING`.
4. […] `STOP_SENDING` error in `poll_close`.
5. […] `libp2p-identify` before reading the remote's identify information. […] `libp2p-identify` to be an edge-case here, i.e. I would expect other protocols not to run into this issue. E.g. in a standard request response style protocol A would send a request, then close, then B would read, then B writes, then B closes and only then B drops.
I am tending towards (2) AND (4).
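To make the `poll_close` side of this concrete, here is a rough sketch that reads option (4) as "a `STOP_SENDING` from the remote does not fail the close". Everything in it is an assumption for illustration: `WriteSide`, `FinishError`, `close_strict` and `close_lenient` are hypothetical names, only loosely modelled on the situation described above and not taken from `quinn_proto` or `libp2p-quic`.

```rust
/// Hypothetical reasons why finishing the write side can fail.
#[derive(Debug, PartialEq)]
enum FinishError {
    /// The remote sent STOP_SENDING for this stream.
    Stopped,
    /// The write side was already finished or the stream is gone.
    UnknownStream,
}

/// Hypothetical view of the local write side of a stream.
struct WriteSide {
    stop_sending_received: bool,
    already_finished: bool,
}

impl WriteSide {
    /// Stand-in for asking the transport to finish the stream.
    fn finish(&self) -> Result<(), FinishError> {
        if self.already_finished {
            Err(FinishError::UnknownStream)
        } else if self.stop_sending_received {
            Err(FinishError::Stopped)
        } else {
            Ok(())
        }
    }

    /// Behaviour described in the issue: any finish error fails the close.
    fn close_strict(&self) -> Result<(), FinishError> {
        self.finish()
    }

    /// Sketch of option (4): the remote no longer wants our data, so a
    /// `Stopped` error is treated as a completed close. Other errors
    /// still propagate.
    fn close_lenient(&self) -> Result<(), FinishError> {
        match self.finish() {
            Err(FinishError::Stopped) => Ok(()),
            other => other,
        }
    }
}

fn main() {
    // A closes its write side after B's STOP_SENDING already arrived.
    let raced = WriteSide { stop_sending_received: true, already_finished: false };
    println!("strict:  {:?}", raced.close_strict());
    println!("lenient: {:?}", raced.close_lenient());
}
```

With the lenient variant, A's close in the identify exchange would complete even when B's `STOP_SENDING` wins the race, which is the outcome the expected behaviour asks for.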
Version
I am using 055636e on #2712 but any commit on `master` should work. E.g. 0c85839.
Would you like to work on fixing this bug?
Maybe. Let's find consensus on a good solution first.
//CC @elenaf9