Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TODOs before someguy's General Availability #24

Open
8 of 11 tasks
Tracked by #12
lidel opened this issue Nov 30, 2023 · 13 comments
Open
8 of 11 tasks
Tracked by #12

TODOs before someguy's General Availability #24

lidel opened this issue Nov 30, 2023 · 13 comments

Comments

@lidel
Copy link
Member

lidel commented Nov 30, 2023

TODOs for delegated-ipfs.dev/routing/v1

Minimum we need to switch delegated-ipfs.dev/routing/v1 from Kubo to someguy:

  • ask both cid.contact and Amino DHT
  • allow adjusting config via env variables instead of CLI params (prior art in )
  • Docker image build & publish the same way as https://github.com/ipfs/rainbow/

cc @ns4plabs @cewood lmk if there are any additional asks from bifrost infra side that would make deployment/configuration easier – my idea is to follow similar conventions as rainbow / bifrost-gateway

TODOs for General Availability

Things to do before we start recommending someguy to our users for use in their dev/prod:

cc @2color for visibility, feel free to flag any dx/ux pain points we should fix before GA

@aschmahmann
Copy link
Contributor

defaults: in addition to DHT, should we ask both delegated-ipfs.dev and cid.contact, or only delegated-ipfs.dev?

Looks like this is being done in #50 where both are asked, but trying to understand why a little more. Is this about future proofing?

At the moment there are two widely deployed routing "systems" used by IPFS implementations:

  • Amino DHT
  • IPNI (albeit until federation is landed along with expectations around sync-times it's mostly considered the cid.contact service)

delgated-ipfs.dev currently just proxies those two systems. So by default hitting it in addition to the others just means the same query is happening twice, just with maybe some caching happening on delegated-ipfs.dev. What's the goal here?


Setting the default for someguy ask to delegated-ipfs.dev, or (re)building a tool like ipget to default to delegated-ipfs.dev seems reasonable though.

@lidel
Copy link
Member Author

lidel commented Mar 7, 2024

@aschmahmann I raised similar questions in #50 (comment) before reading your comment.

In my mind the goals of including delegated-ipfs.dev are:

  • include an endpoint that follows latest specs and is actively maintained by the same team, allowing us to dogfood more and give people example of system where there is more than a single endpoint.
  • cid.contact and IPNI as it is today does not support /routing/v1/peers or /routing/v1/ipns routing endpoints. delegated-ipfs.dev supports all routing types.
  • cid.contact does not support streaming (Accept: application/x-ndjson), delegated-ipfs.dev does.

My main question is do we even keep cid.contact , if delegated-ipfs.dev proxies to it already.

We could say we trade the additional request for resiliency of having different hostnames with separate HTTP caches, which I am fine enough as a reason enough to keep it. Thoughts?

@willscott
Copy link

cid.contact does not support streaming (Accept: application/x-ndjson)

It does support streaming afaik? - is there a subtle difference in the spec vs what we support that we should update?

@masih
Copy link
Member

masih commented Sep 3, 2024

cid.contact does not support streaming (Accept: application/x-ndjson)

This is inaccurate. cid.contact does support streaming result.

@aschmahmann
Copy link
Contributor

cid.contact does not support streaming (Accept: application/x-ndjson)

This is inaccurate. cid.contact does support streaming result.

@masih are you sure?

❯ curl -H "accept: application/json" https://cid.contact/routing/v1/providers/bafybeif6f27eonqanzvltpfhaf2fgmwz6n5e7j6fksuc6jrs5payvufyha
{"Providers":[{"Addrs":["/ip4/76.219.232.45/tcp/24001"],"ID":"12D3KooWPNbkEgjdBNeaCGpsgCrPRETe4uBZf1ShFXStobdN18ys","Protocols":["transport-graphsync-filecoinv1"],"Schema":"peer","transport-graphsync-filecoinv1":"kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgmiTyU31IQA81ZyvKUkNR5cW/orcsg79MUwG2wmtwthVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="},{"Addrs":["/ip4/76.219.232.45/tcp/24888"],"ID":"12D3KooWSoSgVaUvoguDQZu1doytze9RgnnANwJoiLw7KUcAXq8i","Protocols":["transport-bitswap"],"Schema":"peer","transport-bitswap":"gBI="},{"Addrs":["/dns/cesginc.com/tcp/443/https"],"ID":"12D3KooWPNbkEgjdBNeaCGpsgCrPRETe4uBZf1ShFXStobdN18ys","Protocols":["transport-ipfs-gateway-http"],"Schema":"peer","transport-ipfs-gateway-http":"oBIA"}]}
❯ curl -H "accept: application/x-ndjson" https://cid.contact/routing/v1/providers/bafybeif6f27eonqanzvltpfhaf2fgmwz6n5e7j6fksuc6jrs5payvufyha
{"Providers":[{"Addrs":["/ip4/76.219.232.45/tcp/24001"],"ID":"12D3KooWPNbkEgjdBNeaCGpsgCrPRETe4uBZf1ShFXStobdN18ys","Protocols":["transport-graphsync-filecoinv1"],"Schema":"peer","transport-graphsync-filecoinv1":"kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgmiTyU31IQA81ZyvKUkNR5cW/orcsg79MUwG2wmtwthVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="},{"Addrs":["/ip4/76.219.232.45/tcp/24888"],"ID":"12D3KooWSoSgVaUvoguDQZu1doytze9RgnnANwJoiLw7KUcAXq8i","Protocols":["transport-bitswap"],"Schema":"peer","transport-bitswap":"gBI="},{"Addrs":["/dns/cesginc.com/tcp/443/https"],"ID":"12D3KooWPNbkEgjdBNeaCGpsgCrPRETe4uBZf1ShFXStobdN18ys","Protocols":["transport-ipfs-gateway-http"],"Schema":"peer","transport-ipfs-gateway-http":"oBIA"}]}
❯ curl -H "accept: application/json" https://delegated-ipfs.dev/routing/v1/providers/bafybeif6f27eonqanzvltpfhaf2fgmwz6n5e7j6fksuc6jrs5payvufyha
{"Providers":[{"Addrs":["/ip4/76.219.232.45/tcp/24001"],"ID":"12D3KooWPNbkEgjdBNeaCGpsgCrPRETe4uBZf1ShFXStobdN18ys","Protocols":["transport-graphsync-filecoinv1"],"Schema":"peer","transport-graphsync-filecoinv1":"kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgmiTyU31IQA81ZyvKUkNR5cW/orcsg79MUwG2wmtwthVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="},{"Addrs":["/ip4/76.219.232.45/tcp/24888"],"ID":"12D3KooWSoSgVaUvoguDQZu1doytze9RgnnANwJoiLw7KUcAXq8i","Protocols":["transport-bitswap"],"Schema":"peer","transport-bitswap":"gBI="},{"Addrs":["/dns/cesginc.com/tcp/443/https"],"ID":"12D3KooWPNbkEgjdBNeaCGpsgCrPRETe4uBZf1ShFXStobdN18ys","Protocols":["transport-ipfs-gateway-http"],"Schema":"peer","transport-ipfs-gateway-http":"oBIA"}]}
❯ curl -H "accept: application/x-ndjson" https://delegated-ipfs.dev/routing/v1/providers/bafybeif6f27eonqanzvltpfhaf2fgmwz6n5e7j6fksuc6jrs5payvufyha
{"Addrs":["/ip4/76.219.232.45/tcp/24001"],"ID":"12D3KooWPNbkEgjdBNeaCGpsgCrPRETe4uBZf1ShFXStobdN18ys","Protocols":["transport-graphsync-filecoinv1"],"Schema":"peer","transport-graphsync-filecoinv1":"kBKjaFBpZWNlQ0lE2CpYKAABgeIDkiAgmiTyU31IQA81ZyvKUkNR5cW/orcsg79MUwG2wmtwthVsVmVyaWZpZWREZWFs9W1GYXN0UmV0cmlldmFs9Q=="}
{"Addrs":["/ip4/76.219.232.45/tcp/24888"],"ID":"12D3KooWSoSgVaUvoguDQZu1doytze9RgnnANwJoiLw7KUcAXq8i","Protocols":["transport-bitswap"],"Schema":"peer","transport-bitswap":"gBI="}
{"Addrs":["/dns/cesginc.com/tcp/443/https"],"ID":"12D3KooWPNbkEgjdBNeaCGpsgCrPRETe4uBZf1ShFXStobdN18ys","Protocols":["transport-ipfs-gateway-http"],"Schema":"peer","transport-ipfs-gateway-http":"oBIA"}

It does support streaming afaik? - is there a subtle difference in the spec vs what we support that we should update?

It's doing json rather than ndjson, i.e. it's not streaming and relying on this line from the spec https://specs.ipfs.tech/routing/http-routing-v1/#streaming

Legacy server MAY respond with non-streaming application/json response even if the client requested streaming. It is up to the client to inspect the Content-Type header before parsing the response.

Showing this explicitly:

❯ curl -I -H "accept: application/x-ndjson" https://cid.contact/routing/v1/providers/bafybeif6f27eonqanzvltpfhaf2fgmwz6n5e7j6fksuc6jrs5payvufyha
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Length: 715
Connection: keep-alive
Date: Tue, 03 Sep 2024 11:12:49 GMT
Access-Control-Allow-Methods: GET, PUT, POST, DELETE, PATCH, OPTIONS
Access-Control-Allow-Origin: *
Strict-Transport-Security: max-age=15724800; includeSubDomains
Access-Control-Allow-Credentials: true
Access-Control-Allow-Headers: DNT,X-CustomHeader,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization
Access-Control-Max-Age: 1728000
X-Cache: Hit from cloudfront
Via: 1.1 521101b4b5baafcfa7548a73a3442cea.cloudfront.net (CloudFront)
X-Amz-Cf-Pop: BOS50-P1
X-Amz-Cf-Id: jtfTprleZVwJJ9CQ0LeoOJfXHyBCElEXZuZgTjtLBWZLUqVlgg2Ctw==
Age: 681

@willscott
Copy link

cid.contact will stream https://github.com/ipni/indexstar/blob/main/find_ndjson.go#L280-L282

but i think it maybe doesn't do it on the '/routing/v1/providers' endpoint, but rather on the '/mh/' find endpoint that lassie / saturn was using

@masih
Copy link
Member

masih commented Sep 3, 2024

Thanks for pointing that out @aschmahmann; you are right. Captured this issue to get it sorted.

@willscott
Copy link

Note on spec:
https://specs.ipfs.tech/routing/http-routing-v1/#streaming

  • the types of the lines of streaming aren't defined
  • the main response is an 'IPNI response' record, but the lines are publisher records and that semantic that we're getting rid of the top level object could probably get spelled out a bit better.

There's a PR to add support to indexer delegated routing endpoint for streaming here ipni/indexstar#200

@willscott
Copy link

Also, am I right that delegated routing drops the support for encrypted / double-hashed query/responses? is that something we want to add back in?

@aschmahmann
Copy link
Contributor

the types of the lines of streaming aren't defined

Are you asking to be more explicit by rewriting the same syntax as the responses normally are but without the wrapping instead of the shorthand that's defined in the section you linked? If so that seems fine. Would you be able to put up a PR to https://github.com/ipfs/specs with how you think it should look based on your understanding? If you're having trouble I can put one up and you can review / give feedback there.

the main response is an 'IPNI response' record, but the lines are publisher records and that semantic that we're getting rid of the top level object could probably get spelled out a bit better.

Well they're not IPNI responses they're routing responses from whatever the backend is (Amino DHT, IPNI, a custom server someone is running, ...), but yes it we're stripping the wrapper object because we're streaming individually parseable responses

Also, am I right that delegated routing drops the support for encrypted / double-hashed query/responses?

It's never supported them, but yes if you access cid.contact via the delegated routing endpoint rather than the IPNI endpoint you lose the ability to do the double-hashed queries.

is that something we want to add back in?

Open to feedback here, but TLDR is no since it's currently only useful to IPNI and people can use the IPNI API if they need that functionality.

The way I see the delegated routing API is as an abstraction across routing systems that's easy for consumers to work with. The double-hashing is specific to IPNI which means a few things:

  1. It doesn't seem like a good idea to build an abstraction around a single implementation, so waiting until there are two routing systems that both need this before putting something together seems reasonable
  2. If consumers are trying to gather data from multiple routing systems via a single endpoint then there's not really a sane version of double-hashing that makes sense for the client to use. The server could choose to use the IPNI double-hashing API behind the scenes if it was valuable. I don't know if it really is valuable for something like delegated.ipfs.dev given it represents many different peers. It might provide some level of privacy (e.g. prevent cid.contact from knowing which provider records are popular and when) but that doesn't seem super valuable at least to me.
    • If we land something like IPIP-0388: Routing HTTP API Support for Querying Multiple Routers specs#388 then clients could choose to just get the Amino DHT records from delegated.ipfs.dev and hit an IPNI provider separately via double hashing, they'd have to decide if that made sense to them given they're still sending their routing requests in cleartext to someone anyway. Maybe it would if they trust delegated-ipfs.dev more than the IPNI provider or they don't send to delegated-ipfs.dev until they're done checking with the IPNI provider, it seems like the client's call though

Note: Having specific APIs for routing systems that are fulfilled by a single service provider seems reasonable to me. For example, I suggested in ipfs/specs#476 (comment) that we could have Amino DHT or kad-dht specific endpoints that don't apply to others, the same could of course be done for IPNI as well.

@willscott
Copy link

TLDR is no since it's currently only useful to IPNI and people can use the IPNI API if they need that functionality.

I think the thing i was wondering is that a bunch of people spent effort on thinking through how this would work in a DHT context as well, so it seemed like it did get a while as a life beyond just IPNI. It also is a nice protection to be able to have when making queries that seems generally beneficial without much complexity on the client.

@aschmahmann
Copy link
Contributor

I'm not disagreeing with the idea of delegated double hashed queries that having. I'm just saying that right now I'm having trouble finding a reason anyone would benefit from having a shared API for it when it's really only useful for IPNI. If the work ever lands for Amino, or someone wants to spin up a separate routing system for their network that supports double hashing then that's another story but until then I'm having trouble seeing the value.

that seems generally beneficial without much complexity on the client.

Yes / No. If in practice you need to query both IPNI and the Amino DHT for content, but you're going to delegate your DHT queries (e.g. because you're in a browser or low resource environment) then you have some complexity in the choices that you'd need to figure out for example:

  1. Delegate the Amino DHT and IPNI lookups to the same endpoint
    a. Decided to ditch double hashing because you'll need cleartext for the Amino DHT queries anyway -> no need for double hashing in routing-v1
    b. Decide on a threat model and ordering that has you first check IPNI and then Amino through the same delegated routing endpoint with whatever delays, etc. seem appropriate -> still no need for double hashing in routing-v1 since for the double-hashed query could've just gone to an IPNI node directly
  2. Delegate Amino DHT and IPNI lookups to different endpoints -> decide on threat model and if you're ok querying in parallel or not, but either way no need to delegate IPNI alone to a separate endpoint because you could just go to an IPNI node directly and query them

That being said, my 2c are:

  1. If having a subsection "officially" mounted under /routing/v1/ipni would be helpful I'm happy to support that, the spec work would likely be easy since you could probably largely copy over the relevant IPNI spec. For example, I could see it potentially being helpful if you're building out federation and you want to be able to delegate the work of interacting with the federation while still supporting double-hashing
  2. If there are other users who want to build/augment a routing system compatible with IPNI's double hashing and spec + code it out, it seems reasonable to me to support that in someguy as well

@gammazero
Copy link

Please move discussion to IPFS Specs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants