DHT findprovs weirdness #6742

markg85 · 2019-11-02T15:57:11Z

Hi,

I'm running IPFS v0.4.22 on ArchLinux.

While I was playing with some IPFS features, I ended up typing:
ipfs dht findprovs QmUovtAgkriTmV48sD7SALaBsqv9zZvcDX7urzHkE88AVQ -v

You can look that hash up; it's a folder with a few thousand wallpapers. I was trying to find out how many nodes have it pinned (I still don't know how to do that, but that's being asked here: https://discuss.ipfs.io/t/get-number-of-nodes-with-cid-pinned/986).

The -v argument gave me a very interesting result!
I see:

15:27:53.492: * 12D3KooWDXjTosL7yFfFrVReVuW2pFjitqs4vvMeNR7aGZqFuQLV says use QmfQvXMG863ogfxNAaZmx7G6ZDCHafg8uHHYMQLVC9PYPa QmUb2SBjA3ahg5baCGd5zXkiYJZ8Cbj2sBCxMS893e5iym QmdgbCajis6mstsPRmPyBx8ZmL1zJTiCBKJGuhhGmG5X7P QmYhnmrUYX9cU8cBrnfvxezEy7zpdG6JGr75TSv9AkzHw8 QmQPtxFy1tq1hyY2K3Nwf3koWSRgzJYvjdrAVSKm1VA41i Qme3RQdfaDjU5RgUEVfKHrYfoZWQSvq6hf6LdT5Zq4kbSd QmfVxnbcYWej147Ti4tfdGapor39JR9GjDpsZcgvaDVNMT QmbQ1jUJquca964trBcL5tGw6ZDhcKeg3x4iaMQHXD8nd8 QmSkr7ZBWvc5m15VeetAVQuRwYtb6XkMyMpN6yyGTZ8Sv3 QmcYUAdgArhj83N2X9xe5sreauDh5rQAxGAZJbdZzpr4Er QmSWfzcHW3STkFGPceNYfRwaga8vrkHoaRLSMGNU1Wvi4x QmPDYDzHcYz3XQk6Dz8rxwPC5DtAERNJmCoE4BgfUfiBEt QmYU4jRU1FZNWEbAsc7U25yzk2tApEi7FsTYH7gujXJ3Uf QmTakNK8o5ywJ6JpZCq3ZYtdkvcKFvtmBUzJ81G59VesdD QmWsL7XYcFojkk7i6voyszW3JMz1m3NC5MCw6GzcEeiNQC

I doubt the sending node verifies that the nodes in this list are reachable (through ping?) before sending it, which would prevent it from handing out a list where some entries are stale. For that specific list the log didn't show any failures for those hashes, but it does show failures for others:
15:27:54.142: error: failed to dial QmRP8ZNpnQ4VFMpyzwvVUsXfXHBj6nSNH9Y57D9KXMUXwY: no good addresses.

Would it be wise for the sending node to first verify that the nodes are reachable before sending that list to the requester? The same mechanism could also keep your own node list alive and healthy, since you would know when each node was last active. I'm guessing this would likely speed up findprovs too, as it would spend most of its time querying nodes that are actually alive instead of a mixture of dead and alive ones.
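A minimal sketch of the idea above, assuming a node records when it last heard from each peer and only hands out peers seen recently. `PeerTable`, `mark_seen`, `fresh_peers`, the placeholder peer names, and the 60-second cutoff are all hypothetical and chosen for illustration; this is not how go-ipfs or go-libp2p-kad-dht is implemented.

```python
import time


class PeerTable:
    """Hypothetical stand-in for a routing table that tracks peer liveness."""

    def __init__(self, max_age_seconds):
        # Peers not heard from within this window are treated as stale.
        self.max_age = max_age_seconds
        self.last_seen = {}

    def mark_seen(self, peer_id, now=None):
        # Record a successful contact (e.g. a ping reply) with a peer.
        self.last_seen[peer_id] = time.time() if now is None else now

    def fresh_peers(self, now=None):
        # Return only peers heard from within the last max_age seconds,
        # so a requester is not handed entries that are likely dead.
        now = time.time() if now is None else now
        return [p for p, t in self.last_seen.items() if now - t <= self.max_age]


table = PeerTable(max_age_seconds=60)
table.mark_seen("peerA", now=0)    # last heard from long ago
table.mark_seen("peerB", now=100)  # heard from recently
print(table.fresh_peers(now=120))  # ['peerB']
```

The trade-off in such a scheme is that the sender pays for the liveness checks, while every requester saves the dial attempts to dead peers.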

I have quite a few of those errors in my logs.

But upon inspecting the log some more, I see some very weird behavior.
I see tons and tons of network requests to local IPs! To clarify, I'm on a local network with a couple of other PCs turned on, but only one machine on this network runs IPFS.
Yet I see loads of:

15:28:02.877: error: failed to dial : all dials failed
  * [/ip6/2001:978:2305:43::8e/tcp/4001] dial tcp6 [2001:978:2305:43::8e]:4001: connect: network is unreachable
  * [/ip6/fdda:d0d0:cafe:1302::1004/tcp/4001] dial tcp6 [fdda:d0d0:cafe:1302::1004]:4001: connect: network is unreachable
  * [/ip4/10.16.0.6/tcp/4001] dial tcp4 0.0.0.0:4001->10.16.0.6:4001: i/o timeout
  * [/ip4/192.168.86.25/tcp/4001] dial tcp4 0.0.0.0:4001->192.168.86.25:4001: i/o timeout

Those are all local addresses! None of them is in my local range.
That can't possibly be right? Right?

Is this a quirk in my local environment? Or is this something that everyone has but that I happen to notice?

Cheers,
Mark


ivan386 · 2019-11-02T17:57:54Z

This is because DHT nodes publish their addresses themselves, as opposed to the BitTorrent DHT, which uses the address the request came from.

aschmahmann · 2019-11-02T23:18:36Z

There are a number of issues that seem intertwined here that are being discussed in libp2p repos. IIUC the two big categories are:

- Peers that are not publicly dialable (e.g. behind NATs) shouldn't be part of the global DHT network (although they should be allowed to query the network). See "DHT Mode Switching", libp2p/go-libp2p-kad-dht#349.
- When a DHT returns the addresses of peers from a query it should remove addresses that are obviously undialable (e.g. localhost). See "findpeer queries return peers with no workable addresses", libp2p/go-libp2p-kad-dht#357.

There are a number of fix proposals underway that are not yet in master, including libp2p/go-libp2p#657, libp2p/go-libp2p-kad-dht#360, and libp2p/go-libp2p-kad-dht#363.

Seems like these issues might be the best place to continue this conversation.

Before we head off to the other issues though...

Note that there are two types of peers that we care about with DHT provider records:

- Peers that are providing the data we care about
- DHT peers that are more likely to know peers that are providing the data

I think there's an argument to be made for the peers providing data to be allowed to have local addresses. What if there's a node behind a network that is both behind a NAT and has MDNS disabled (e.g. university, hospital, etc.) that wants to advertise data to peers within that network? We may want to allow them to use the public DHT to advertise.

The peers that are just hops along the DHT should definitely be pruned to have dialable addresses though since they are all supposed to be publicly dialable.
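The distinction drawn above can be sketched as an address filter: addresses of DHT hops are pruned to globally routable ones, while provider addresses are passed through unchanged. This is only an illustration under stated assumptions. `host_of`, `is_publicly_dialable`, `prune`, `addrs_to_return`, and the role names are hypothetical; the parser handles only simple `/ip4/...` and `/ip6/...` multiaddrs (not DNS or relay addresses); and "globally routable" is used as a rough proxy for "publicly dialable". The first four addresses come from the log earlier in the thread; the loopback address is added here as an example of the "localhost" case.

```python
import ipaddress


def host_of(multiaddr):
    """Return the IP of a simple /ip4/... or /ip6/... multiaddr, else None."""
    parts = multiaddr.split("/")
    if len(parts) >= 3 and parts[1] in ("ip4", "ip6"):
        try:
            return ipaddress.ip_address(parts[2])
        except ValueError:
            return None
    return None


def is_publicly_dialable(multiaddr):
    # Rough proxy: keep only globally routable IPs. This drops loopback,
    # RFC 1918 ranges such as 10.0.0.0/8 and 192.168.0.0/16, and IPv6
    # unique local addresses (fc00::/7).
    ip = host_of(multiaddr)
    return ip is not None and ip.is_global


def prune(addrs):
    return [a for a in addrs if is_publicly_dialable(a)]


def addrs_to_return(addrs, role):
    # Hypothetical policy: providers may keep local addresses so that peers
    # on the same network can reach them; DHT hops are pruned.
    return list(addrs) if role == "provider" else prune(addrs)


addrs = [
    "/ip6/2001:978:2305:43::8e/tcp/4001",
    "/ip6/fdda:d0d0:cafe:1302::1004/tcp/4001",
    "/ip4/10.16.0.6/tcp/4001",
    "/ip4/192.168.86.25/tcp/4001",
    "/ip4/127.0.0.1/tcp/4001",  # added example of a localhost address
]
print(addrs_to_return(addrs, "dht_hop"))
print(addrs_to_return(addrs, "provider"))
```

Under this check the `2001:978:...` address counts as globally routable and is kept, while the other four are dropped for DHT hops.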

markg85 · 2019-11-03T13:00:54Z

> There are a number of issues that seem intertwined here that are being discussed in libp2p repos. IIUC the two big categories are:
>
> Peers that are not publicly dialable (e.g. behind NATs) shouldn't be part of the global DHT network (although they should be allowed to query the network). libp2p/go-libp2p-kad-dht#349
>
> When a DHT returns the addresses of peers from a query it should remove addresses that are obviously undialable (e.g. localhost) libp2p/go-libp2p-kad-dht#357
>
> There are a number of fix proposals underway that are not yet in master including:
>
> libp2p/go-libp2p#657
>
> libp2p/go-libp2p-kad-dht#360
>
> libp2p/go-libp2p-kad-dht#363.
>
> Seems like these issues might be the best place to continue this conversation.
>
> Before we head off to the other issues though...
>
> Note that there are two types of peers that we care about with DHT provider records:
>
> Peers that are providing the data we care about
>
> DHT peers that are more likely to know peers that are providing the data
>
> I think there's an argument to be made for the peers providing data to be allowed to have local addresses. What if there's a node behind a network that is both behind a NAT and has MDNS disabled (e.g. university, hospital, etc.) that wants to advertise data to peers within that network? We may want to allow them to use the public DHT to advertise.
>
> The peers that are just hops along the DHT should definitely be pruned to have dialable addresses though since they are all supposed to be publicly dialable.

Thank you very much for your description! That really helps a lot in understanding the rationale for this "weirdness" (which now doesn't sound so weird anymore).

I understand the need to support nodes in a network behind a NAT and without MDNS. Note that this probably describes most home networks too :)
Yeah, that's tricky to get right, as you don't want to lock anyone out either. But you also don't want to penalize everyone by keeping it as is. The penalty in this case is that IPFS nodes around the world are right now (collectively) probably making billions of failed network requests per second, just because nodes send their local addresses too.

However, the discussion of this seems to be taking place in the issues you've mentioned.
So as not to fragment it any further, I'm closing this one.

markg85 added the kind/enhancement label (A net-new feature or improvement to an existing feature) Nov 2, 2019

markg85 closed this as completed Nov 3, 2019

markg85 mentioned this issue Nov 3, 2019

identify: Update addr advertise logic to exclude localhost addrs selectively libp2p/go-libp2p#657

Merged
