Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http-routing: add "get closest peers" operation #476

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

achingbrain
Copy link
Member

Adds a new HTTP endpoint that can be used to request records for the closest peers to a given key that the routing implementation knows about.

The use-case for this is browser nodes performing random walks to find peers that they can make a circuit relay reservation on, without having to be DHT clients to perform the walk which can be undesirable given all the connection/processing overhead that entails.

I've tried to avoid defining what "closest" means to leave it up to the routing implementation, only that the responses should be "closer" than the optional closerThan parameter which should mean the caller gets useful results for other scenarios.

@achingbrain achingbrain force-pushed the feat/add-get-closest-peers-to-http-routing branch from a6bf57c to 7e4489e Compare June 6, 2024 07:42
Adds a new HTTP endpoint that can be used to request records for the
closest peers to a given key that the routing implementation knows
about.

The use-case for this is browser nodes performing random walks to
find peers that they can make a circuit relay reservation on, without
having to be DHT clients to perform the walk which can be undesirable
given all the connection/processing overhead that entails.
@achingbrain achingbrain force-pushed the feat/add-get-closest-peers-to-http-routing branch from 7e4489e to 0a3d8e3 Compare June 6, 2024 07:43
@achingbrain achingbrain requested a review from lidel June 6, 2024 07:48
@@ -106,6 +106,60 @@ Each object in the `Providers` list is a record conforming to a schema, usually

## Peer Routing API

### `GET /routing/v1/closest-peers/{peer-id}?[closerThan]&[count]`
Copy link
Member

@lidel lidel Jun 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achingbrain I think we could prototype this in https://github.com/ipfs-shipyard/someguy and deploy to https://delegated-ipfs.dev.

Priority-wise, and for the purpose of IPIP we will write for this at some point, does the utility here stops at finding Circuit Relay servers viable in browsers without being a DHT client?

Would it be useful to have API where the entire hash space could be queried for closest value, not just PeerIDs?
I wonder if this could be generalized into delegated /routing/v1/closest/{cid}?[than]&[count] where behind the scenes we prefix and turn peerids and cids to amindht hashes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the utility here stops at finding Circuit Relay servers viable in browsers without being a DHT client?

No, there are other uses - you may wish to use this to populate your routing tables, or to find peers that can host provider records. It would be useful to perform these operations from environments that have limited resources or cannot open many concurrent connections, for example.

Would it be useful to have API where the entire hash space could be queried for closest value, not just PeerIDs?

Yes, plus here the PeerIDs are represented as CIDs, so the above is something that can be performed.

To make this clearer it might be better specced as {cid} rather than {peer-id} I guess.

Copy link
Contributor

@aschmahmann aschmahmann Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use-case for this is browser nodes performing random walks to find peers that they can make a circuit relay reservation on, without having to be DHT clients

does the utility here stops at finding Circuit Relay servers

Note: In many cases you'd want relay servers you'd also want peer routing (i.e. finding your address based on your peerID) which in the case of Amino means having recently pinged the closest peers to your peerID in XOR space. Ideally one of those (20) nodes is also acting as your relay which means the number of connections and queries you'd need are even smaller. So we don't really want a random walk we want to talk to specific peers when possible and only broaden the search if we have to.

src/routing/http-routing-v1.md Outdated Show resolved Hide resolved
src/routing/http-routing-v1.md Outdated Show resolved Hide resolved

- `closerThan` is an optional [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md) represented as a CIDv1 encoded with `libp2p-key` codec.
- Returned peer records must be closer to `peer-id` than `closerThan`.
- If omitted the routing implementation should use it's own [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the query is expected to return the absolute closest peers to the target CID, the only use case I can see for closerThan is to find peers that are under eclipse attacks, but there are no current plans to implement it. Otherwise it will just limit the number of returned peers.

If there are other reasons to keep this parameter, I would recommend against using the receiver's (routing implementation) own peer id, when no value is provided. The default value should be no filter at all, return all the count closest peers. Otherwise, if a peer is looking for the 20 closest peers to some CID, and it reaches the 5th closest peer, this one will respond only with a list of 4 peers (because all the other are further away compare to the 5th closest peer). So in order to get the 20 closest peers, the requester would have to come up with a closerThan parameter that is far enough so that at least 20 peers are closer to the target CID.

- Returned peer records must be closer to `peer-id` than `closerThan`.
- If omitted the routing implementation should use it's own [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md).
- `count` is an optional number that specifies how many peer records the requester desires.
- Minimum 1, maximum 100, default 20.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the closerThan parameter is set, and count is omitted, the default value should be the maximum, otherwise some peers closer than closerThan may be filtered out. Both parameters could however be combined if both are set.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my assumption here is that the peers would be sorted by closeness value before the limit was applied.

Would this not be the case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my assumption here is that the peers would be sorted by closeness value before the limit was applied.

Yes that is correct. My point is that both closerThan and count can limit the count of peers that are returned, the parameters may conflict with each other.

E.g peer-id=key0&closerThan=key1 should return all the peers between key0 and key1, but if there are more than default=20 peers between key0 and key1 this query will only return the default closest, and not all of them up to the maximum.

If none of count or closerThan parameters is set, the default count should be default=20. However, if closerThan is set and not count, the default count should be set to the maximum=100. This allows to return as many peers as possible between key0 and key1, and if the caller wants to set a limit, they should combine closerThan and count.

Also, when setting count and not closerThan using the node's peer id as default closerThan parameter can be problematic. E.g I want to provide a CID to the 20 closest nodes, and I am the 12th closest peer to this CID. I will receive only 11 closer nodes, and will be unable to find the 20 closest nodes. Hence, if count is set and not closerThan the closerThan default should be the opposite of the target peer-id (peer-id XOR 111...11). Maybe the opposite of the target peer-id is a better default for closerThan than the requester's peer id, allowing the default response to contain a constant number of peers.

@@ -106,6 +106,60 @@ Each object in the `Providers` list is a record conforming to a schema, usually

## Peer Routing API

### `GET /routing/v1/closest-peers/{peer-id}?[closerThan]&[count]`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this makes sense at the /routing/v1 namespace since it's fairly specific to systems like the Amino DHT.

Having an HTTP API for this kind of thing makes sense though, perhaps as /amino or /kad or something. It could be nested under /routing/v1/<amino, etc.> as well if that makes folks happier.

  • Note: Given that this is quite close to being the HTTP equivalent of the FIND_NODE RPC something like this could potentially end up being worked into that protocol itself (e.g. to enable http-based request/response).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My hope was that we could have it be routing-agnostic but maybe that's not possible, since the actual results would very much depend on what was being queried and how.

I'm happy with:

GET /routing/v1/amino/closest-peers/{peer-id}?[closerThan]&[count]

though if it's all amino, perhaps we should just have done with it and use the RPC names?

  • GET /routing/v1/amino/find-node/{multibase}

...then maybe later:

  • GET /routing/v1/amino/get-value/{multibase}
  • GET /routing/v1/amino/get-providers/{cid}

though the second is redundant given the existing HTTP endpoints and the first is only partially redundant since we could use it to search for more than just IPNS records.

...and even later PUT_VALUE and ADD_PROVIDER?

@@ -106,6 +106,60 @@ Each object in the `Providers` list is a record conforming to a schema, usually

## Peer Routing API

### `GET /routing/v1/closest-peers/{peer-id}?[closerThan]&[count]`
Copy link
Contributor

@aschmahmann aschmahmann Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use-case for this is browser nodes performing random walks to find peers that they can make a circuit relay reservation on, without having to be DHT clients

does the utility here stops at finding Circuit Relay servers

Note: In many cases you'd want relay servers you'd also want peer routing (i.e. finding your address based on your peerID) which in the case of Amino means having recently pinged the closest peers to your peerID in XOR space. Ideally one of those (20) nodes is also acting as your relay which means the number of connections and queries you'd need are even smaller. So we don't really want a random walk we want to talk to specific peers when possible and only broaden the search if we have to.

Co-authored-by: Guillaume Michel <guillaumemichel@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants