Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: move Identify I/O from NetworkBehaviour to ConnectionHandler #3208

Merged
merged 18 commits into from
Dec 13, 2022

Conversation

jxs
Copy link
Member

@jxs jxs commented Dec 7, 2022

Description

Addresses #2885

Notes

Following the discussion on #2885 this was the simplest way I found to move the I/O to Handler.
Per commit review is suggested.

Links to any relevant issues

Open Questions

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

@jxs jxs force-pushed the identify-remove-io-behaviour branch from 0ad547f to 4cf8814 Compare December 7, 2022 15:14
jxs added 2 commits December 7, 2022 15:30
instead of responding pending replies from NetworkBehaviour, send them
back to ConnectionHandler.

ConnectionHandler for now just receives them, it's implementation of the
responding will come next.
reply to the Identification requests from the ConnectionHandler.
@jxs jxs force-pushed the identify-remove-io-behaviour branch from 4cf8814 to 3db1ec1 Compare December 7, 2022 15:31
Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

It would be nice if we changed a few more things:

  • Initialise the ConnectionHandler with PeerId and observed address directly
  • Don't send the substream back and forth
  • Upon receiving a new substream, send an event to the behaviour asking for more information like protocols, upon answer, grab the next substream in line and send it back

I think this would be an overall easier design because it clearly shows, what information is needed from the behaviour that can't be accessed by the handler.

We can even pass stuff like the public key and the agent version directly to the handler upon initialisation.

io: Pin<Box<dyn Future<Output = Result<(), UpgradeError>> + Send>>,
},
struct Request {
peer: PeerId,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The handler could learn this via IntoConnectionHandler. It will never change across its lifespan.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated!

},
struct Request {
peer: PeerId,
io: ReplySubstream<NegotiatedSubstream>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to not pass this back and forth.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah agreed thanks, updated!

struct Request {
peer: PeerId,
io: ReplySubstream<NegotiatedSubstream>,
observed: Multiaddr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, a Handler can learn this via IntoConnectionHandler and it won't change1 across its lifetime.

Footnotes

  1. AddressChange is not emitted at the moment but once it will, it is directly given to the handler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we only get it on connection_established which seems to be called after new_handler.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to implement IntoConnectionHandler additionally on a struct that you return from new_handler, said struct can then capture the ConnectedPoint and initialise the ConnectionHandler with it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks Thomas, updated!

jxs added 2 commits December 8, 2022 18:00
to the behaviour, instead keep track of the info as it doesn't change.
Copy link
Member Author

@jxs jxs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fast review and guidance Thomas!

I think this would be an overall easier design because it clearly shows, what information is needed from the behaviour that can't be accessed by the handler.

We can even pass stuff like the public key and the agent version directly to the handler upon initialisation.

the downside of that is that information comes from multiple different places instead of coming only from a single Info struct. I updated to code to no longer send back and forth the substream, if you still feel that we should pass first the PeerId on constructor and then all the other details when requested, to prepare for the changes discussed on #2885 no worries I'll update the PR :)

@@ -255,6 +302,46 @@ impl ConnectionHandler for Handler {
}
}

// Check for pending replies to send.
if let Some(ref info) = self.info {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: I thought let_chains had landed already been stabilized, but it was reverted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really? Did you try with Rust 1.65?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, compiler points me to rust-lang/rust#53667, see rust-lang/rust#53667 (comment)

},
struct Request {
peer: PeerId,
io: ReplySubstream<NegotiatedSubstream>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah agreed thanks, updated!

struct Request {
peer: PeerId,
io: ReplySubstream<NegotiatedSubstream>,
observed: Multiaddr,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we only get it on connection_established which seems to be called after new_handler.

@thomaseizinger
Copy link
Contributor

Thanks for the fast review and guidance Thomas!

I think this would be an overall easier design because it clearly shows, what information is needed from the behaviour that can't be accessed by the handler.

We can even pass stuff like the public key and the agent version directly to the handler upon initialisation.

the downside of that is that information comes from multiple different places instead of coming only from a single Info struct.

Information already comes from different places! I think it is cleaner to not hide this fact but explicitly show, what information we request from the behaviour.

all the other details when requested

Everything that doesn't change over the lifetime of the handler should come in via the constructor in my opinion. This accurately models in the type system, what data is known when and how it flows through the system! In other words, only the supported protocols and external addresses should come in via the event.

jxs added 2 commits December 9, 2022 20:34
provided on new_handler constructor Everything that doesn't change over
the lifetime.
Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! A few more ideas!

/// Pending replies to send.
pending_replies: VecDeque<Reply>,
/// Information requests from the handlers to be fullfiled.
requests: VecDeque<PeerId>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you'll need to track the connection ID here so you can respond to the correct handler!

};
return Poll::Ready(NetworkBehaviourAction::NotifyHandler {
peer_id: peer,
handler: NotifyHandler::Any,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the curlpit! This will be buggy with > 1 connection per peer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Thomas, addressed.

Comment on lines 148 to 149
/// Information provided by the `Behaviour` upon requesting.
behaviour_info: Option<BehaviourInfo>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of tracking this here, I think it would be cleaner to:

  1. Store pending substreams in a queue
  2. Request a new BehaviourInfo every time a new substream comes in
  3. Construct a Sending every time you receive a BehaviourInfo from the behaviour by popping the oldest substream from the queue in (1)

That will work off substreams in a FIFO fashion and feed them the most up-to-date information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could initialise this in the constructor as well. I think we can assume that new_handler in the behaviour only gets called after it got polled at least once!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of tracking this here, I think it would be cleaner to:

  1. Store pending substreams in a queue
  2. Request a new BehaviourInfo every time a new substream comes in
  3. Construct a Sending every time you receive a BehaviourInfo from the behaviour by popping the oldest substream from the queue in (1)

That will work off substreams in a FIFO fashion and feed them the most up-to-date information.

Thanks, makes sense to me. But used two queues instead of one with different states for a less confusing management of them ptal Thomas.


/// Information provided by the `Behaviour` upon requesting.
#[derive(Debug)]
pub struct BehaviourInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the best name but should do for now I guess :)

instead of caching the behaviour info.
@jxs jxs force-pushed the identify-remove-io-behaviour branch from 02c1783 to 0db0c52 Compare December 11, 2022 19:09
Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good to me now, one more idea but happy to merge anytime!

protocols/identify/src/handler.rs Outdated Show resolved Hide resolved
Comment on lines 176 to 179
/// Identifying information of the local node that is pushed to a remote.
Push(Info),
/// Identifying information requested from this node.
Identify(BehaviourInfo),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How different are these two really? Isn't exactly the same, just with different semantics?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for recalling this again

large size difference between variants
   --> protocols/identify/src/handler.rs:174:1
    |
174 | / pub enum InEvent {
175 | |     /// Identifying information of the local node that is pushed to a remote.
176 | |     Push(Info),
    | |     ---------- the largest variant contains at least 304 bytes
177 | |     /// Identifying information requested from this node.
178 | |     Identify(BehaviourInfo),
    | |     ----------------------- the second-largest variant contains at least 48 bytes
179 | | }
    | |_^ the entire enum is at least 304 bytes
    |
    = note: `-D clippy::large-enum-variant` implied by `-D warnings`
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#large_enum_variant
help: consider boxing the large fields to reduce the total size of the enum
    |
176 |     Push(Box<Info>),
    |          ~~~~~~~~~

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't talking about the size difference actually.

Semantically, even for an identify push, all the handler needs to know are the updated protocols and listen addresses, which is represented as the BehaviourInfo struct. Can we model it as the same type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah right! So I followed along and merge Behaviour's pending_push into requests for better balance when fullfiling them. Previously Push requests would always have priority over Identify ptal Thomas.

@@ -82,6 +114,12 @@ pub struct Handler {
>; 4],
>,

/// Streams awaiting `BehaviourInfo` to then send identify requests.
reply_streams: VecDeque<ReplySubstream<NegotiatedSubstream>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the abstraction of a ReplySubstream useful here? Would it be easier to just buffer the streams themselves?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry Thomas, not following. What is the alternative? But the send method is implemented for ReplySubstream.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative is inlining send to wherever it is called which I'd guess is only one place?

Copy link
Contributor

@thomaseizinger thomaseizinger Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really do this wrapping of a substream anywhere else in the codebase so it would be nice to align it here.

Constructing a Framed is what we typically do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, right! Thanks! I was able to remove ReplySubstream but send is still being used in more places.
We can go further and deprecate Info and just use the protobuf Identify, but probably do that on a subsequent PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, right! Thanks! I was able to remove ReplySubstream but send is still being used in more places.
We can go further and deprecate Info and just use the protobuf Identify, but probably do that on a subsequent PR?

Nah, I think that is fine. We usually don't use the protobuf struct directly which is good.

@mergify
Copy link
Contributor

mergify bot commented Dec 12, 2022

This pull request has merge conflicts. Could you please resolve them @jxs? 🙏

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, this actually went beyond what I thought could be simplified :)

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great simplifications along the main goal of the pull request. Thanks.

Needs a changelog entry. Otherwise good to go from my end.

@jxs
Copy link
Member Author

jxs commented Dec 13, 2022

Updated with CHANGELOG.md entry

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

@mergify mergify bot merged commit f80c714 into libp2p:master Dec 13, 2022
jxs added a commit to jxs/rust-libp2p that referenced this pull request Dec 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants