Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Discv5Event::SessionEstablished to surface ENRs #123

Merged
merged 3 commits into from
Jun 3, 2022

Conversation

jacobkaufmann
Copy link
Contributor

Closes #96.

Add a new Discv5Event variant to surface ENRs from established sessions (completed handshakes). This change essentially propagates the HandlerOut::Established event upward so that the application layer can receive the event. The propagated ENR does not go through the "contactable" check performed in inject_session_established. Therefore, the application has no guarantee that the ENR from this event is contactable.

The discussion in the associated issue suggested this path to resolution. If SessionEstablished does not represent desired semantics, then please suggest a more appropriate name.

@jacobkaufmann jacobkaufmann changed the title Add 'Discv5Event::SessionEstablished' to surface ENRs Add Discv5Event::SessionEstablished to surface ENRs May 25, 2022
src/service.rs Outdated
Comment on lines 345 to 346
self.send_event(Discv5Event::SessionEstablished(enr.clone()));
self.inject_session_established(enr, direction);
Copy link
Collaborator

@divagant-martian divagant-martian May 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of a way to get a session established for a non contactable enr, but just to be sure, I think it would be better to propagate up the event after that check, inside inject_session_established

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that a session could be established for non-contactable ENR if that ENR's socket address is None.

verify_enr returns true if the Enr and NodeAddress have the same NodeId and the socket address is None.

The check get_contactable_addr returns None if the socket address is None.

Thus, it is possible for some node to pass the verify_enr check but fail the get_contactable_addr check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you are right. I keep forgetting this. I think it's still preferable to emit only contactable ENRs. If all communication in the overlay network you are managing is done via discv5, sending talk requests for non contactable ENRs will fail anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. After thinking about this some more, I am not sure whether this change will be sufficient to achieve our ends.

The complication is the uTP protocol built on top of Discovery v5. For this protocol, all packets are transmitted over TALKREQ messages. A TALKRESP message may be sent so that the peer is not evicted from the routing table, but the contents of that message are ignored. In this construction, the recipient of the message would be unable to respond to a TALKREQ for the uTP protocol if the sender's ENR is non-contactable, because we can only send a TALKREQ through Discv5 with a contactable ENR (as you mentioned).

The concern is for nodes behind NATs who may be unaware of their own IP. I don't understand the dynamics of the IP voting mechanism deeply enough to know whether some node behind a NAT would eventually establish a session with a contactable ENR, at which point the application layer can initiate a TALKREQ to that node.

We could make uTP inaccessible to nodes whose ENRs are non-contactable. The other implementations (Status and ChainSafe) have implemented changes that allow the application layer to send a TALKREQ with only a node ID and socket address. That's a more invasive change though, and I wouldn't want to request that unless it's necessary for us.

I think this is beyond the scope of the original issue, but I want to make sure that this change is forward compatible with whatever we might end up doing subsequently.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to answer some of your statements in an attempt to check I'm understanding correctly:

If NodeA has a non contactable Enr and sends a talk request to NodeB, who does have a contactable Enr (that A used to contact it in the first place) then B can still send a talk response to A. It cannot, as things work right now, send a talk request to A. I assume responding to A's talk request is something that you are already doing via Discv5Event::TalkRequest which contains a channel

So to my understanding, what you would like to be able to do is make NodeB be able to send requests to NodeA because there is a session in place? If this is the case, I think it would be quite doable: Send a request to a node given only it's NodeAddress (SocketAddress and NodeId). If there is no session in place, the request would fail, since there is no way to establish a handshake (no public key)

Would this work for your needs?

Also, about voting: If updating the Enr is enabled, a node behind a Nat should eventually find it's contactable Enr and advertise it as such

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your suggestion is essentially the OptimisticNodeContact idea that Age proposed in the associated issue. Age expressed some concerns there, so I was hesitant to push for that unless we are confident that it is necessary.

For all non-uTP communications at the application layer, yes we are using the channel to send responses. For uTP, the TALKRESP is not used, except to ensure that the node is not marked unresponsive and evicted from the routing table. We would like to send a TALKREQ from NodeB to NodeA after NodeA sends a TALKREQ to NodeB because, for the uTP protocol, that TALKREQ from NodeB to NodeA is the response to the initial TALKREQ.

We could make uTP off-limits to nodes that we do not have a contactable ENR for. And if we can expect that a node will eventually update its local ENR (via IP voting) so that it is contactable, then this is not such a huge problem IMO. However, this is not the approach that the other clients have taken. I would like to sync with the other portal client teams on this.

That said, I think it only makes sense to possibly emit non-contactable ENRs if there is a way to send a TALKREQ to something like an OptimisticNodeContact.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion with the other portal client teams, we have decided that we would like to move forward with the OptimisticNodeContact approach. We would like the ability to attempt to send a TALKREQ to a NodeAddress.

I am happy to take up that PR, but I'm not sure if the maintainers would prefer to take that up themselves.

We would still require the Discv5Event added in this PR, but we would not require that the ENR wrapped in the event be contactable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll check with Age, but given the fact that all needed for a successful request is a NodeContact we could emit SessionEstablished for this type. This would ensure you receive the node_id, enr, and socket_addr, guaranteeing everything needed for a next request is present

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @jacobkaufmann.
We've made some changes to NodeContact since we last spoke. It's been changed from an enum into a struct (which has made the code much nicer). So adding the specific OptimisticNodeContact variant no longer really applies, however the solution is much nicer now (you don't have to handle a whole new enum variant).

I think what @divagant-martian is proposing should solve all our issues.

To clarify, we should emit this SessionEstablished event as you've done in this PR and it can contain a NodeContact as @divagant-martian suggests. Then to reply to the node, you can just send that NodeContact and if the node is contactable (i.e will receive messages on the same IP/PORT as the src of the packet we received), then the request will be successful, otherwise it will time out.

The new form of the NodeContact is:

pub struct NodeContact {
    /// Key to use for communications with this node.
    public_key: CombinedPublicKey,
    /// Address to use to contact the node.
    socket_addr: SocketAddr,
    /// The ENR of the node if known.
    enr: Option<Enr>,
}

@divagant-martian is suggesting that when a node connects to us, and we perform a handshake there are a few cases we should consider:

  1. The ENR has specified a socket and it matches the src of the packet it sent us: In this case the socket_addr = the enr address = src socket
  2. The ENR has specified a socket but it doesn't match the src of the packet: We drop these packets and don't respond. So these nodes wont show up in our event
  3. The ENR has not specified a socket: We respond to these nodes and we set the socket_addr to the src of the packet we recieved and emit the SessionEstablished event. You can then try and send messages back to these nodes via the socket they sent us messages.

I'm not sure if you want to distinguish between the 1st and the 3rd in your events. It might be more unlikely the nodes of the 3rd kind don't respond (depending on their NAT'ing etc).

From my understanding this should work for your needs right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AgeManning yes I think this should work. We should be able to distinguish between (1) and (3) based on the fields of the NodeContact. The variant of Option<Enr> and the conditional equality between the SocketAddr values would distinguish the two cases.

I'm not sure how you prefer to update the public API of Discv5. We can defer to a subsequent PR if preferred. That would keep the scope of this PR small.

The simplest backward-compatible thing is to add a new public talk_req method that takes a NodeContact argument.

I made an attempt to update the existing talk_req to something like

pub fn talk_req(node_contact: impl TryInto<NodeContact, Error = NonContactable>, ..)

but this would not accept an Enr because an IpMode is required for conversion.

@@ -193,6 +193,7 @@ async fn main() {
Discv5Event::Discovered(enr) => info!("Enr discovered {}", enr),
Discv5Event::EnrAdded { enr, replaced: _ } => info!("Enr added {}", enr),
Discv5Event::NodeInserted { node_id, replaced: _ } => info!("Node inserted {}", node_id),
Discv5Event::SessionEstablished(enr) => info!("Session established {}", enr),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you update this to better reflect the new contents?

src/node_info.rs Outdated
@@ -66,6 +66,17 @@ impl NodeContact {
)
}

pub fn from_non_contactable_and_socket_addr(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the NonContactable type is mainly used as an expressive error, and if we are to receive an instance of this type in a public api we would need to make sure the Enr does not declare a socket address under the current ip mode, or otherwise we break the guarantee that the given socket_addr and enr are consistent (Either the same or None in the enr)

So in the meantime I suggest that this function is private. We can later improve this

Suggested change
pub fn from_non_contactable_and_socket_addr(
fn from_non_contactable_and_socket_addr(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I removed the entire function. My feeling is that new talk request API will suggest whether a function like this is beneficial.

@divagant-martian
Copy link
Collaborator

divagant-martian commented Jun 2, 2022

just reviewed this, but I realized it's enough to emit the ENR and the socket_addr. The first one contains the public key and the node id, and the second one guarantees a socket_addr exists. It would be easier on the application level to not deal with a possible None value in the enr field that will never occur, since a session is only considered established when an ENR is validated.

On the other side, if we return a NodeContact, which can't the created without validation, for the talk request we could simply add a new method that receives this type. Both approaches have their pros and cons

What do you think @jacobkaufmann ?

@jacobkaufmann
Copy link
Contributor Author

I agree that it would be better to emit (Enr, SocketAddr) so I made that change.

Based on my understanding of discv5, the new talk request API could take a NodeContact, and we could add a new function

fn from_public_key_and_socket_addr(..) -> NodeContact {
  ..
}

to construct a NodeContact whose ENR field is None.

For the application, we would initially try to call NodeContact::try_from_enr. If this returns a NonContactable, then we can call the function above with the ENR public key and socket address from the established session event.

Of course this is just a suggestion. I found it a bit tricky to design for some arbitrary (Enr, SocketAddr) pair, because outside of discv5, we do not have the guarantee that we have for the same pair in a SessionEstablished event.

@divagant-martian
Copy link
Collaborator

I think we can deal with the next step in another PR. The current state seems to improve towards your goals so lets get this in

@divagant-martian divagant-martian merged commit dbf6879 into sigp:master Jun 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Request ENR for NodeAddress
3 participants