Support client mode in kademlia #2184

Closed
wants to merge 39 commits into from

Conversation

@whereistejas whereistejas commented Aug 7, 2021

Fix #2032

Feature description:

To do this, we use libp2p's AutoNAT, which acts as a distributed session traversal utility for NAT (STUN) layer, informing peers of their observed addresses and whether or not they appear to be publicly dialable. Only when peers detect that they are publicly dialable do they switch from client mode (where they can query the DHT but not respond to queries) to server mode (where they can both query and respond to queries). Similarly, if a server discovers that it is no longer publicly dialable, it will switch back into client mode.

From https://docs.ipfs.io/concepts/dht/#undialable-peers
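
The mode switch described in the quoted passage can be sketched as a small state transition. This is a minimal illustration, not code from this pull request: `Reachability` and `next_mode` are hypothetical names, and keeping the current mode when nothing is known is an assumption.

```rust
/// Operating mode of a DHT node, following the quoted description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    /// Can query the DHT but does not respond to queries.
    Client,
    /// Can both query the DHT and respond to queries.
    Server,
}

/// Hypothetical summary of what AutoNAT-style probing reported.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Reachability {
    /// Other peers could dial us on an observed address.
    Public,
    /// Other peers could not dial us.
    Private,
    /// No conclusive observation yet.
    Unknown,
}

/// Decide the next mode from the current mode and the latest observation.
pub fn next_mode(current: Mode, observed: Reachability) -> Mode {
    match observed {
        // Publicly dialable peers act as servers.
        Reachability::Public => Mode::Server,
        // Peers that are not dialable fall back to client mode.
        Reachability::Private => Mode::Client,
        // Assumption: without new evidence, keep the current mode.
        Reachability::Unknown => current,
    }
}
```

For example, `next_mode(Mode::Client, Reachability::Public)` yields `Mode::Server`, and a server that observes `Reachability::Private` drops back to `Mode::Client`.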

Member

@mxinden mxinden left a comment

There is still a lot missing here. Let me know if you need more details. Also see my comment #2032 (comment).

protocols/kad/src/behaviour.rs Outdated
protocols/kad/src/behaviour.rs Outdated
protocols/kad/src/behaviour.rs Outdated
protocols/kad/src/dht.proto Outdated
@whereistejas
Author

Hi @mxinden,
I'm trying to write an integration test to test this enhancement. It's taking longer than anticipated.

@whereistejas
Author

whereistejas commented Aug 15, 2021

Hi @mxinden,
I have added an example file (sort of like an integration test) for our implementation. The idea is that you run multiple instances of the example in different terminals, one of them with the --client flag. Then, we use the list peer command in the other (non-client) instances to check that the client peer is absent from their routing tables.

Could you give me some pointers about how to proceed from this point?

Member

@mxinden mxinden left a comment

Great steps forward @whereistejas!

In case you are interested in working on specifications as well: It would be great to have the Client Mode feature described in the Kademlia specification.

@@ -0,0 +1,157 @@
#![allow(dead_code, unused_variables)]
Member

Instead of an example, could you add a test to kad/src/behaviour/tests.rs?

Author

@whereistejas whereistejas Aug 19, 2021

Writing this test in the kademlia module will be kind of difficult, as I'm using a lot of other things as well for building the swarm and transport. Is it okay if I put this in a test directory?

I can further enhance kademlia-example to create four peers and cross-check that the peer in client mode is actually absent from the others' routing tables.

Member

Writing this test in the kademlia module will be kind of difficult, as I'm using a lot of other things as well for building the swarm and transport. Is it okay if I put this in a test directory?

No need for most of it. You can take the manual_bucket_insert and bootstrap test as an example.

I can further enhance kademlia-example to create four peers and cross-check that the peer in client mode is actually absent from the others' routing tables.

I don't think there is any need for an example. The "cross-check" sounds great, but should really be done in a test instead of an example.

protocols/kad/src/behaviour.rs Outdated
protocols/kad/src/behaviour.rs Outdated
protocols/kad/src/behaviour.rs Outdated
protocols/kad/src/handler.rs Outdated
Comment on lines 131 to 133

// If Mode::Client, node will act in `client` mode in the Kademlia network.
pub client: Mode,
Member

The Mode logic you are introducing in this pull request replaces the KademliaHandlerConfig::allow_listening logic. The latter can thus be replaced in favor of the former 🎉
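
To illustrate the replacement suggested here, the sketch below models a handler configuration that carries a `Mode` instead of a boolean such as `allow_listening`. `HandlerConfig` and `inbound_protocol` are simplified, hypothetical stand-ins and not the actual `KademliaHandlerConfig` API; only the protocol name is taken from the diff context in this review.

```rust
/// Protocol name as it appears in the diff context of this review.
pub const DEFAULT_PROTO_NAME: &[u8] = b"/ipfs/kad/1.0.0";

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Client,
    Server,
}

/// Hypothetical, simplified stand-in for the handler configuration.
#[derive(Debug, Clone)]
pub struct HandlerConfig {
    pub protocol_name: Vec<u8>,
    /// Takes the place of a boolean flag such as `allow_listening`.
    pub mode: Mode,
}

impl HandlerConfig {
    /// Protocol to accept on inbound substreams, if any.
    pub fn inbound_protocol(&self) -> Option<&[u8]> {
        match self.mode {
            // A server accepts inbound Kademlia substreams.
            Mode::Server => Some(self.protocol_name.as_slice()),
            // A client accepts none, so remotes cannot query it.
            Mode::Client => None,
        }
    }
}
```

With this shape, a single field drives the decision that the boolean previously made, which is why the two settings do not need to coexist.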

Comment on lines 540 to 549
// TODO:
// Just because another peer is sending us Kademlia requests, doesn't necessarily
// mean that it will answer the Kademlia requests that we send to it.
// Thus, Commenting this code out for now.
// if let ProtocolStatus::Unconfirmed = self.protocol_status {
// // Upon the first successfully negotiated substream, we know that the
// // remote is configured with the same protocol name and we want
// // the behaviour to add this peer to the routing table, if possible.
// self.protocol_status = ProtocolStatus::Confirmed;
// }
Member

Correct. I am in favor of documenting this, though obviously both the TODO and the commented-out code need to be removed.
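
One way to read the reasoning in the TODO is that only a substream we opened ourselves shows that the remote answers Kademlia requests. The sketch below assumes that reading. `Direction`, `Handler`, and `on_substream_negotiated` are hypothetical names for illustration, while `ProtocolStatus` mirrors the enum named in the quoted code.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProtocolStatus {
    Unconfirmed,
    Confirmed,
}

/// Hypothetical marker for who opened the negotiated substream.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Direction {
    /// The remote opened the substream to us.
    Inbound,
    /// We opened the substream to the remote.
    Outbound,
}

/// Hypothetical, stripped-down handler that only tracks the status.
pub struct Handler {
    protocol_status: ProtocolStatus,
}

impl Handler {
    pub fn new() -> Self {
        Handler { protocol_status: ProtocolStatus::Unconfirmed }
    }

    /// Record a successfully negotiated substream.
    pub fn on_substream_negotiated(&mut self, direction: Direction) {
        // An inbound request only shows that the remote can query us.
        // Assumption: an outbound negotiation shows that it also answers.
        if direction == Direction::Outbound {
            self.protocol_status = ProtocolStatus::Confirmed;
        }
    }

    pub fn status(&self) -> ProtocolStatus {
        self.protocol_status
    }
}
```

Under this sketch a peer in client mode, which never accepts our outbound Kademlia substream, stays `Unconfirmed`.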

@@ -46,6 +46,18 @@ pub const DEFAULT_PROTO_NAME: &[u8] = b"/ipfs/kad/1.0.0";
/// The default maximum size for a varint length-delimited packet.
pub const DEFAULT_MAX_PACKET_SIZE: usize = 16 * 1024;

#[derive(Debug, Clone)]
Member

Suggested change
#[derive(Debug, Clone)]
/// See [`crate::KademliaConfig::set_mode`].
#[derive(Debug, Clone)]

@whereistejas
Author

whereistejas commented Aug 19, 2021

In case you are interested in working on specifications as well: It would be great to have the Client Mode feature described in the Kademlia specification.

Sure! I can do this.

The current code changes in the kademlia submodule are not producing the desired results. I will take a look at it over the weekend. Do I need to make any other changes? I remember us discussing another edge case in the multistream module (I may be completely wrong here).

@mxinden
Member

mxinden commented Aug 19, 2021

The current code changes in the kademlia submodule are not producing the desired results. I will take a look at it over the weekend. Do I need to make any other changes? I remember us discussing another edge case in the multistream module (I may be completely wrong here).

Nothing comes to my mind. What exactly is not working? I don't think this is related to multistream-select.

@whereistejas
Author

[Screenshot: libp2p kademlia client mode testing]
Hi @mxinden, this is what I meant by not working. As you can see in the screenshot, the "other" peers can still see the "client" node.

@whereistejas
Author

whereistejas commented Aug 24, 2021

Okay, so something really weird is happening. I put some println statements in listen_protocol and inject_fully_negotiated_inbound to check if these methods are getting called when two peers connect to one another and I found that they are not getting called. I am using the code I wrote in examples/kademlia-example.rs to test this. Am I doing something wrong?

Update: I think this is happening because of the following code:

MdnsEvent::Discovered(nodes) => {
    for (peer_id, multiaddr) in nodes {
        swarm
            .behaviour_mut()
            .kademlia
            .add_address(&peer_id, multiaddr);
    }
}

Since I'm directly adding addresses, the peers are not performing a protocol negotiation.

@whereistejas whereistejas requested a review from mxinden August 29, 2021 20:26
@whereistejas
Author

whereistejas commented Aug 30, 2021

This screenshot is from my machine.

[Screenshot from 2021-08-30 23-29-25]

😢

@whereistejas
Author

So, I guess the CI failure for commit 54331e5 was just a random event.

if swarm.local_peer_id().clone() == peers[0] {
    is_server_present = peer == peers[1];
}
return Poll::Ready(());
Member

I suggest reading the async book: https://rust-lang.github.io/async-book/

With the early return we might not check whether the server stores the client in its routing table.
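
The concern about the early return can be shown with a toy poll function that reports `Poll::Ready` only after every required check has run. This is a simplified sketch using only the standard library. `Observation`, `Checks`, and `poll_checks` are hypothetical names, and a real `Future` would also need to arrange a wake-up before returning `Poll::Pending`.

```rust
use std::collections::VecDeque;
use std::task::Poll;

/// Hypothetical events a test might see while driving two swarms.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Observation {
    /// The client's routing table was checked for the server.
    ClientChecked,
    /// The server's routing table was checked for the client.
    ServerChecked,
}

/// Which of the two required checks have run so far.
#[derive(Debug, Default)]
pub struct Checks {
    pub client_done: bool,
    pub server_done: bool,
}

/// Drain queued observations; report `Ready` only once both checks ran.
pub fn poll_checks(checks: &mut Checks, queue: &mut VecDeque<Observation>) -> Poll<()> {
    while let Some(observation) = queue.pop_front() {
        match observation {
            Observation::ClientChecked => checks.client_done = true,
            Observation::ServerChecked => checks.server_done = true,
        }
        // No `return` here: finishing on the first observation
        // would skip the remaining check.
    }
    if checks.client_done && checks.server_done {
        Poll::Ready(())
    } else {
        Poll::Pending
    }
}
```

Calling `poll_checks` after only one observation leaves the result pending, so the second check cannot be skipped silently.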

.collect();

swarms[0].1.dial_addr(addrs[1].clone()).unwrap();
swarms[1].1.dial_addr(addrs[0].clone()).unwrap();
Member

Why does the server dial the client?

Author

@whereistejas whereistejas Sep 6, 2021

It's there to guarantee that the server tried to reach out to the client before we check whether the client is absent from the server's routing table.

Poll::Ready(_) => {
    // Check if the client peer is NOT present in the server peer's routing table.
    if swarm.local_peer_id().clone() == peers[1] {
        is_client_absent = swarm.behaviour_mut().kbucket(peers[0]).is_none();
Member

It is kind of hard to reliably check for absence of something, especially in an async setting.

How about:

  1. The server stores a value via Kademlia::put_record, waiting for a KademliaEvent::OutboundQueryCompleted.
  2. The client connects to the server.
  3. The client does a Kademlia::get_record, waiting for a KademliaEvent::OutboundQueryCompleted.
  4. Check that the client has the server in its routing table and check that the server does not have the client in its routing table.
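
The four steps above can be walked through with a toy in-memory model. This is a sketch of the intended outcome only: `Node` and its methods are hypothetical and do not use the libp2p `Kademlia` API, and the rule that a peer is added to a routing table only when it runs in server mode is the behaviour this pull request aims for, assumed here rather than verified.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Client,
    Server,
}

/// Toy in-memory node; not the libp2p `Kademlia` behaviour.
#[derive(Debug)]
pub struct Node {
    pub id: u32,
    pub mode: Mode,
    pub routing_table: HashSet<u32>,
    pub records: HashMap<String, String>,
}

impl Node {
    pub fn new(id: u32, mode: Mode) -> Self {
        Node { id, mode, routing_table: HashSet::new(), records: HashMap::new() }
    }

    /// Step 1: store a record locally.
    pub fn put_record(&mut self, key: &str, value: &str) {
        self.records.insert(key.to_string(), value.to_string());
    }

    /// Step 2: connect to `remote`. Each side adds the other to its
    /// routing table only if the other side runs in server mode.
    pub fn connect(&mut self, remote: &mut Node) {
        if remote.mode == Mode::Server {
            self.routing_table.insert(remote.id);
        }
        if self.mode == Mode::Server {
            remote.routing_table.insert(self.id);
        }
    }

    /// Step 3: fetch a record from `remote`, which answers only in server mode.
    pub fn get_record(&self, remote: &Node, key: &str) -> Option<String> {
        if remote.mode != Mode::Server {
            return None;
        }
        remote.records.get(key).cloned()
    }
}
```

Step 4 then amounts to two assertions after the lookup: the client's `routing_table` contains the server's id, and the server's `routing_table` does not contain the client's id.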

Author

@whereistejas whereistejas Sep 6, 2021

Sure, I will implement this. I will try to finish by this time tomorrow. Please keep an eye out for notifications from this PR. I want to get this PR closed as early as possible (of course, without sacrificing code quality).

Are we finished with the other parts of this PR except the test?

@whereistejas
Author

Hi @mxinden, the CI tests are passing on my local machine. Could you take a look at the code, please?

@whereistejas
Author

whereistejas commented Sep 20, 2021

Hi @mxinden, this PR is ready for review. Just a gentle reminder. 😄

Member

@mxinden mxinden left a comment

Good progress here.

Just to make sure, does your test fail without the patch?

.gitignore Outdated
cfg.set_mode(Mode::Client);
let mut client = build_node_with_config(cfg);

// Fitler out peer Ids.
Member

Suggested change
// Fitler out peer Ids.
// Filter out peer Ids.

Comment on lines 1371 to 1381
Ok(PutRecordOk { .. }) => {
    // Check if the server peer is not connected to the client peer.
    assert!(
        swarm
            .behaviour_mut()
            .connected_peers
            .iter()
            .all(|p| *p != peers[2]),
        "The server peer is connected to the client peer."
    );
}
Member

How about calling swarm[2].behaviour_mut().get_record(..) once the first server is done with the PutRecord query? That would prevent any races between the PUT and the GET.

Author

@whereistejas whereistejas Oct 3, 2021

Hi, I have a question about the present test:

It is kind of hard to reliably check for absence of something, especially in an async setting.

  1. The server stores a value via Kademlia::put_record, waiting for a KademliaEvent::OutboundQueryCompleted.
  2. The client connects to the server.
  3. The client does a Kademlia::get_record, waiting for a KademliaEvent::OutboundQueryCompleted.
  4. Check that the client has the server in its routing table and check that the server does not have the client in its routing table.

Are we sure that using the GetRecord operation will update the routing table of the server peer? I'm trying to find the code that does the updating, but I can't find it. Could you please point me to the appropriate code?

Update: (Please correct me if I'm wrong) I don't think Kademlia is bi-directional. For example, if peer A adds peer B to its routing table, then peer B will not add peer A to its routing table.

Member

Update: (Please correct me if I'm wrong) I don't think Kademlia is bi-directional. For example, if peer A adds peer B to its routing table, then peer B will not add peer A to its routing table.

Before this pull request, both peers would likely add each other to their respective routing tables. After this pull request, whether the two peers add each other to their routing tables depends on the Mode the other peer is running in.

protocols/kad/src/handler.rs Outdated
@whereistejas
Author

whereistejas commented Sep 30, 2021

The test is not failing without the patch, which is worrying me. I will try to debug and find out why.


// Create a client peer.
let mut cfg = KademliaConfig::default();
cfg.set_mode(Mode::Server);
Member

I am a bit confused. Why is the client initialized with Mode::Server?

Author

Oops, that's just a typo on my side. Anyway, this test should fail when the client is in Server mode, but it does not. I have some questions about the way Kademlia works, which I have expressed here. Could you please help me out with those?

@MarcoPolo MarcoPolo mentioned this pull request Feb 16, 2022
@mxinden
Member

mxinden commented Jun 27, 2022

Closing here in favor of #2521.

@mxinden mxinden closed this Jun 27, 2022

Successfully merging this pull request may close these issues.

protocols/kad: Support client mode
2 participants