Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: Remove ListenerId #4

Closed
wants to merge 13 commits into from
Closed

Conversation

elenaf9
Copy link
Owner

@elenaf9 elenaf9 commented Jun 5, 2022

Description

As discussed in libp2p#2652 (review), this PR removes any notion of listeners from the Transport API, in particular the ListenerId. As a result now the only way to identify an internal listener is via Multiaddresss. (cc @mxinden @thomaseizinger).
For almost all transports this does not change much, the only exception is WsConfig, which should be reviewed carefully.
For WsConfig we have to map back events from the inner transport to a previously called listen_on. Because we have to handle the case of wildcard addresses, this is a bit more complicated. My current solution relies on the multiaddress being a tcp one (even though our WsConfig is generic, websockets are always based on tcp, so I think its an acceptable assumption to make), which is not ideal. If there would be a similar use-case that is completely generic over the inner transport, this could be even more difficult.
I am still in favor of this change, but I wonder if there is a way we can make this mapping easier for the user.

The PR is split into separate commits for each layer in which the ListenerId is removed, so it is best reviewed commit-by-commit.

Open Questions

Should we include in TransportEvent::NewAddress the address that the user initially provided in listen_on to allow mapping the addresses?

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

Remove ListenerClosed. It was used for the following cases:
- listen_on failed -> Added new event ListenFailure to cover this case
- remove_listener -> Now just report AddressExpired
- fatal error on active listener (not used in any of our transports)
  -> Transports should report AddressExpired for associated addresses
  -> Transports can report Transport::Error for debugging purposed
Don't inform the behaviour of non-fatal listener/ transport errors.
They are only for debug purposes and being `dyn Error` they can not be
handled by the Behaviour anyway.
Rename SwarmEvent::ListenerError to SwarmEvent::TransportError
@elenaf9 elenaf9 changed the title Remove listener *: Remove ListenerId Jun 5, 2022
@thomaseizinger
Copy link

My current solution relies on the multiaddress being a tcp one (even though our WsConfig is generic, websockets are always based on tcp, so I think its an acceptable assumption to make), which is not ideal. If there would be a similar use-case that is completely generic over the inner transport, this could be even more difficult.

Here is an idea but we may need a bit more refinement:

We can define a generic function on multiaddress like:

impl Multiaddress {
    pub fn matches_wildcard_mask(&self, other: Multiaddress) -> bool {}
}

0.0.0.0 is a wildcard mask (there are others too). For example, an address such as /ip4/192.168.0.1/tcp/8080 matches the wildcard address mask /ip4/0.0.0.0/tcp/8080.

Perhaps we could use this so we don't have to tear apart the address completely in WsConfig? Implementing such a matching function in full is obviously much more work but we could get started with it as a pub(crate) function, define it for the internal usecase now and expand it later to a more fully-featured implementation :)

listener_protos: HashMap<ListenerId, Protocol<'static>>,
/// Ws / Wss protocols extension for inner listening addresses, mapped to
/// the port of the listening address.
listener_protos: HashMap<u16, Protocol<'static>>,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could collide on IPv6 and IPv4, right? E.g. say I am listening via wss on /ip6/xxx/tcp/42 and ws on /ip4/xxx/tcp/42.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

if self.tls_config.server.is_some() {
p
} else {
if self.tls_config.server.is_none() {
debug!("/wss address but TLS server support is not configured");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this already checked within self.is_tcp_addr?


fn insert_proto(&mut self, inner_addr: Multiaddr, proto: Protocol<'static>) {
let mut addr_iter = inner_addr.iter();
let ip_proto = addr_iter.next().expect("Address is valid.");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These expect calls are safe due to the previous call to is_tcp_addr, correct? Given that the two methods are called in sequence, how about combining them into one method, e.g. validate_addr_and_insert_proto? That would remove the need for an expect call, right?

I would like to be a bit careful, given that a remote could potentially give us a malformed address and thus panic the main event loop in case miss an edge case here.

@mxinden
Copy link

mxinden commented Jun 8, 2022

Bummer that libp2p-websocket makes this a bit difficult.

impl Multiaddress {
    pub fn matches_wildcard_mask(&self, other: Multiaddress) -> bool {}
}

@thomaseizinger this looks worth exploring.

we could get started with it as a pub(crate) function

Note that multiaddr is a separate crate and thus we won't be able to use the method inside libp2p-websocket. Maybe start with a standalone function in libp2p-websocket?

@mxinden
Copy link

mxinden commented Jun 8, 2022

Will take another look at the remaining changes. Thanks for looking into this @elenaf9!

@thomaseizinger
Copy link

impl Multiaddress {
    pub fn matches_wildcard_mask(&self, other: Multiaddress) -> bool {}
}

@thomaseizinger this looks worth exploring.

we could get started with it as a pub(crate) function

Note that multiaddr is a separate crate and thus we won't be able to use the method inside libp2p-websocket. Maybe start with a standalone function in libp2p-websocket?

Works too. As long as the limitations are properly documented, we could also introduce it on Multiaddress right away.

I am a bit worried that a private, standalone function will bitrot and not actually be expanded. But at the same time, it may be premature optimisation.

Copy link

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments. Overall, apart from the libp2p-websocket changes, this looks good to me. Thanks for splitting it into the many commits. Made reviewing very easy.

swarm/src/lib.rs Outdated Show resolved Hide resolved
swarm/src/lib.rs Show resolved Hide resolved
core/src/transport.rs Outdated Show resolved Hide resolved
@elenaf9
Copy link
Owner Author

elenaf9 commented Jun 10, 2022

Perhaps we could use this so we don't have to tear apart the address completely in WsConfig? Implementing such a matching function in full is obviously much more work but we could get started with it as a pub(crate) function, define it for the internal usecase now and expand it later to a more fully-featured implementation :)

👍 I like the idea.

Note that multiaddr is a separate crate and thus we won't be able to use the method inside libp2p-websocket. Maybe start with a standalone function in libp2p-websocket?

@thomaseizinger @mxinden we already have a couple of methods for Multiaddr that are not part of the multiaddr crate but instead either part of a protocol (e.g. autonat::is_global_ip(), which is currently private in AutoNAT) or a standalone function like address_translation.
What do you think about collecting them into one trait libp2p_core::MultiaddrExt?:

trait MultiaddrExt {
    fn address_translation(&self, observed: &Multiaddr) -> Option<Multiaddr>;
    fn is_global_ip(&self) -> bool;
    fn get_ip_addr(addr: &Multiaddr) -> Option<IpAddr>;
    fn matches_wildcard_mask(&self, other: &Multiaddr) -> bool;
}

impl MultiaddrExt for Multiaddr { .. }

The only maybe unwanted implication of this would be that the trait would have to be public. But I don't see a downside of this, as long as we document the maturity of the methods appropriately.

@thomaseizinger
Copy link

thomaseizinger commented Jun 10, 2022

What do you think about collecting them into one trait libp2p_core::MultiaddrExt?:

I am a bit hesitant to create an extension trait that is meant to be used by other crates. That gives me "public utility methods" vibes 😅

If they are generally useful, I think they should be upstreamed and otherwise, they should stay private.

@mxinden
Copy link

mxinden commented Jun 10, 2022

No opinion in regards to where this logic should live. Gut feeling is that it should be private as long as there is no second user. Though again, fine either way.

@elenaf9
Copy link
Owner Author

elenaf9 commented Jun 11, 2022

I refactored the address mapping in WsConfig.
We still need to "tear apart" the multiaddr, because the inner listening address can not only be /ip/0.0.0.0/tcp/0, but also just have only the IP unspecified, or just the port unspecified. How it now works is as follows:

  • We track in a list tuples of inner tcp-address + ws | wss protocol
  • The list is sorted so that non-wildcard addresses are listed before wild-card adresses
  • On inner transport events, we map the address by iterating through that list for the first address where matches_address returns true. matches_address follows the idea that @thomaseizinger proposed; it returns true if:
    • the port matches (since we have multiple addresses for the different interfaces for one port)
    • the port is unspecified and either the IP matches or the IP is unspecified as well
  • If an address was a wildcard address but now mapped to an actual listening address, we update it in the list and resort the list.

This results in the same logic as before, but implemented a bit simpler. Also now it considers if an wildcard address is ipv4 or ipv6.
I am still not 100% happy with this implementation. I fear that it is quite error-prone and not really intuitive, but I wasn't able to come up with a better solution.
Interested in hearing what you think @mxinden @thomaseizinger.

@mxinden
Copy link

mxinden commented Jun 13, 2022

This results in the same logic as before, but implemented a bit simpler. Also now it considers if an wildcard address is ipv4 or ipv6.
I am still not 100% happy with this implementation. I fear that it is quite error-prone and not really intuitive, but I wasn't able to come up with a better solution.
Interested in hearing what you think @mxinden @thomaseizinger.

Another idea that came to my mind: What if a Websocket transport instance either supports /wss or /ws but never both? Folks that want to support both would need to create two Websocket transport instances, combined with an OrTransport.

That would resolve the whole need for /wss /ws tracking and matching, right?

@elenaf9
Copy link
Owner Author

elenaf9 commented Jun 13, 2022

This results in the same logic as before, but implemented a bit simpler. Also now it considers if an wildcard address is ipv4 or ipv6.
I am still not 100% happy with this implementation. I fear that it is quite error-prone and not really intuitive, but I wasn't able to come up with a better solution.
Interested in hearing what you think @mxinden @thomaseizinger.

Another idea that came to my mind: What if a Websocket transport instance either supports /wss or /ws but never both? Folks that want to support both would need to create two Websocket transport instances, combined with an OrTransport.

That would resolve the whole need for /wss /ws tracking and matching, right?

Yes that would work! I think this should be done in a separate PR on master, but that makes only sense if we all agree that ListenerId should be removed and thus this change is really needed.

@thomaseizinger
Copy link

Another idea that came to my mind: What if a Websocket transport instance either supports /wss or /ws but never both? Folks that want to support both would need to create two Websocket transport instances, combined with an OrTransport.

That would resolve the whole need for /wss /ws tracking and matching, right?

Taking this further, we could have two separate transport implementations and thus completely get rid of the conditional logic.

@thomaseizinger
Copy link

Should we include in TransportEvent::NewAddress the address that the user initially provided in listen_on to allow mapping the addresses?

Am I correct that doing this would allow us to unambiguously do the mapping without the "taking apart" etc? If yes, then I am in favor of this change on the libp2p-core level and we can remember a mapping of emitted addresses to their corresponding "original address" in the Swarm to provide an easier API to libp2p-swarm users.

Copy link

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments :)

Comment on lines +240 to +243
NewAddress(Multiaddr),
/// An address is no longer being listened on.
AddressExpired {
/// The listener that is no longer listening on the address.
listener_id: ListenerId,
/// The new address that is being listened on.
listen_addr: Multiaddr,
},
/// A connection is incoming on one of the listeners.
AddressExpired(Multiaddr),
/// A connection is incomings.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StartedListening and StoppedListening would be closer aligned with the names of the trait functions.

@@ -115,15 +114,15 @@ pub trait Transport {
///
/// Returning an error from the stream is considered fatal. The listener can also report
/// non-fatal errors by producing a [`TransportEvent::Error`].
fn listen_on(&mut self, addr: Multiaddr) -> Result<ListenerId, TransportError<Self::Error>>
fn listen_on(&mut self, addr: Multiaddr) -> Result<(), TransportError<Self::Error>>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be start_listen_on then?

/// The new address that is being listened on.
listen_addr: Multiaddr,
},
NewAddress(Multiaddr),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not moving from a struct to a tuple variant would reduce the diff a bit I think.

Comment on lines +111 to +112
transport::TransportEvent::NewAddress(addr) => {
addr_sender.take().unwrap().send(addr).unwrap();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff for example could entirely be avoided :)

Comment on lines +287 to +290
let ws_proto1 = Protocol::Ws(std::borrow::Cow::Borrowed("1"));
let addr1 = Multiaddr::from(Ipv4Addr::UNSPECIFIED)
.with(Protocol::Tcp(0))
.with(ws_proto1.clone());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these would be more readable if parsed from a string.

@elenaf9
Copy link
Owner Author

elenaf9 commented Jun 20, 2022

I thought about this for a while and decided to stick with ListenerIds for now and thus close this PR. The reason for this is mainly the websocket implementation. The current implementation of this PR is far from ideal, and I still feel a bit uneasy about just dropping the path for ws listeners if we following the above idea of not tracking the protocol for inner listeners. From libp2p#2708 it seems like the path is not used for listening and only for dialing, but I still would like to gather more feedback. Also there may be other use-cases where we need to map listening addresses to a listener that I am not aware of right now.
I don't want to block libp2p#2652 because of this, and we can still add this change later.

@elenaf9 elenaf9 closed this Jun 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants