Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(identify): don't close stream in
protocol::recv
#3344fix(identify): don't close stream in
protocol::recv
#3344Changes from 2 commits
3ffe5a8
f8efcb3
cf21edb
71c48d2
792ec11
956f7d6
c3ecbd3
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is flushing needed here? Would removing the whole line be enough?
See #3298 (comment) for more details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it work if we ignore the error on the close? Would that be a viable hot-fix too?
Plus, I'd like us to have a comment here linking to an issue on GitHub tracking the overall problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preference for removing the close as it is not needed in the first place. Ignoring an error might hinder debugging in the future.
Yes please. Good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elenaf9 friendly ping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking into this a bit more I think closing a stream here is / will remain problematic, independently of #3343.
The reason is that we also use
recv
in the inbound upgrade onPush
. If this is combined with using upgradeVersion::V1Lazy
I think the following can happen in theory:V1Lazy
it directly flushes the identify payload together with the protocol-idIn go-libp2p this issue is solved by simply ignoring the error in both, the multi-stream-select upgrade1 and when reading the identify2 response.
I am not too familiar with multi-stream select in rust-libp2p so it could be that in the above case the dialer would not drop the stream until it received the echoed-back protocol-id. That said, even if it's implemented like that in rust-libp2p, I assume from the linked go-libp2p code that this is how they are doing it. So I think we also have to handle that case. // CC @MarcoPolo since we talked about this out-of-band - did I understand it correctly?
Am I missing something?
Footnotes
https://github.com/multiformats/go-multistream/blob/5555a5c43be95c669ce5637723ccf690a6a30e23/multistream.go#L203-L205 ↩
https://github.com/libp2p/go-libp2p/blob/6b9c11680e0bb412ef7515f266531c23245015dc/p2p/protocol/identify/id.go#L435 ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand the last step here. Note that closing a stream is a local operation only in TCP + Yamux. It only flushes the stream buffers to the local connection buffers. It does not ensure that the closing of the stream is communicated to the remote. Communicating the closing to the remote is asynchronous to calling
close
. The connection task will make sure the closing of the stream is communicated to the remote.https://github.com/libp2p/rust-yamux/blob/72ccd5734f71971fc1f02ac4a5f43b8eb4dff660/yamux/src/connection/stream.rs#L365-L387
It should await receiving the protocol-id-echo.
rust-libp2p/misc/multistream-select/src/negotiated.rs
Lines 308 to 310 in ab59af4
Never the less, I am advocating for removing the
close
inrecv
. Any objections @elenaf9?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I was probably not clear with what I meant; I was referring to @thomaseizinger suggestion to add a comment here that links to #3343. I was arguing that because in QUIC (contrary to TCP+Yamux) the close is not local only but instead fails if not all data gets acknowledged, even with #3343 resolved
close
could cause issues in the described scenario.I was mostly wondering if my understanding was correct. If it is, I would add a comment that describes that.
No objections; I will remove it.
Edit: What do you think of cf21edb? (sorry for the broken commit message - forgot to escape the backticks in the message)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Friendly ping @mxinden @thomaseizinger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am okay with the comment. At the end of the day, this is a temporary situation until we fix
quinn
(+quinn-proto
), right?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure. That's what I meant above: even with the fix, it could cause errors in case of identify
Push
+V1Lazy
if the remote doesn't read the echo-ed back protocol-id. Not an issue if both peers are rust-libp2p, but I am not sure how go-libp2p implements it.