libp2p + HTTP #477
Thanks for getting thoughts down here @MarcoPolo. I did a quick read.
I also wanted to make sure you had seen this document from @aschmahmann .
Yup! But this isn't public, so I didn't link.
Thanks for the link ping @BigLep and thanks @MarcoPolo for pushing this forward. As anyone who looks at that document will see, it describes some of the options and alternatives and tradeoffs here that are probably worth fleshing out. Some of the notable ones are:
@MarcoPolo try here. Notion has a crazy system where even if data is already public it needs to have a special link.
I don't think this is a libp2p problem. This is a general problem with HTTP support in browsers. I would consider this out of scope for this spec. If this is really needed maybe the API should allow users to specify
I've thought a lot about this but purposely didn't include it in this spec to keep this one small and add extensions later. Now I'm thinking I should add a discussion section about this as future work. What I'm thinking is:
There's definitely more to expand on above, but I hope that's a good overview of what this spec enables, other use-cases we could enable with future specs, and why this is the first step.
Thanks for the write up. I am excited for us to explore this space. I will take a more in depth look.
Co-authored-by: Max Inden <mail@max-inden.de>
I'm a bit confused about this doc. I think it may make sense for people that have discussed in person but I'm missing something.
What loses me is the part about automatically choosing whether to send your HTTP request normally or to tunnel it via an existing stream. I can't make sense of that. If the server is an http server, why would a client ever bother tunneling? If the server is a libp2p server, why would it bother exposing over http? If the server needs to serve both regular http clients and libp2p-enabled ones, why would it need to involve libp2p for the plain-http part? Client+server certificates to identify peers over http and noise handshakes? How do load-balancing and caching and all the HTTP goodies work then?
So I'm pretty lost. I think it might be good to have some sort of http transport for libp2p (and before that, request/response semantics I guess), but this is somewhat unrelated to what go-libp2p-http does already. And I'm not even 100% sure it makes sense, given websockets is http already etc. At least I would like to hear ideas beyond the
@hsanjuan thanks for taking a look and for the great questions. I'll incorporate more details in the doc, but let me give a couple of high-level answers: I think part of the confusion is that when folks say "HTTP" they usually mean the HTTP protocol on top of a TCP (+ TLS) connection. In this spec, when I say "HTTP" I mean only the HTTP request/response protocol. This protocol can run either on top of a libp2p stream (what go-libp2p-http does) or on top of a plain old TCP+TLS connection (standard https traffic).
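To make the distinction concrete, here is a minimal sketch (not taken from the spec or from go-libp2p-http) that runs Go's standard HTTP server and client over an in-memory byte stream instead of a TCP socket. The `streamListener` type, its `dial` method, and the `peer.invalid` host name are illustrative assumptions; a libp2p stream would take the place of the `net.Pipe` connection.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"sync"
)

// streamListener (hypothetical) hands pre-established byte streams to
// http.Server. Here they are in-memory pipes; in the setting discussed
// above they could be libp2p streams or TCP+TLS connections.
type streamListener struct {
	conns chan net.Conn
	done  chan struct{}
	once  sync.Once
}

type streamAddr struct{}

func (streamAddr) Network() string { return "stream" }
func (streamAddr) String() string  { return "in-memory" }

func newStreamListener() *streamListener {
	return &streamListener{conns: make(chan net.Conn), done: make(chan struct{})}
}

func (l *streamListener) Accept() (net.Conn, error) {
	select {
	case c := <-l.conns:
		return c, nil
	case <-l.done:
		return nil, net.ErrClosed
	}
}

func (l *streamListener) Close() error {
	l.once.Do(func() { close(l.done) })
	return nil
}

func (l *streamListener) Addr() net.Addr { return streamAddr{} }

// dial creates a new in-memory stream and passes its server end to Accept.
// The network and address arguments are ignored.
func (l *streamListener) dial(ctx context.Context, _, _ string) (net.Conn, error) {
	client, server := net.Pipe()
	select {
	case l.conns <- server:
		return client, nil
	case <-l.done:
		return nil, net.ErrClosed
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}

// fetch sends one GET through the stream; no TCP socket is involved.
func fetch(path string) (int, string, error) {
	l := newStreamListener()
	mux := http.NewServeMux()
	mux.HandleFunc("/echo", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "%s %s", r.Method, r.URL.Path)
	})
	srv := &http.Server{Handler: mux}
	go srv.Serve(l)
	defer srv.Close()

	client := &http.Client{Transport: &http.Transport{DialContext: l.dial}}
	// The host name is a placeholder: dial never resolves it.
	resp, err := client.Get("http://peer.invalid" + path)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

func main() {
	fmt.Println(fetch("/echo"))
}
```

The handler and the client code are unchanged from what they would be over TCP; only the listener and the dial function know which kind of stream carries the bytes.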
This is one of the main points of this spec. Instead of coming up with a request/response protocol that we can then easily put on top of HTTP, we should just use HTTP. There's no benefit to reinventing the wheel here.
Yes. I think this is good, and this spec is to standardize what go-libp2p-http does.
It wouldn't. If you connect to a node with only a multiaddr of
It doesn't have to expose an https server (HTTP + plain TCP+TLS). It could only support HTTP on top of libp2p streams (just like go-libp2p-http). The reason to expose HTTP on top of libp2p streams is because we would be using HTTP as our request/response protocol.
It doesn't have to, but it may be convenient since the logic is already there. Here's an example: Assume I have a simple protocol called
Maybe later, I want to stop serving data blobs over libp2p streams because I can't cache these as well. So I change my
Now why did I even bother with the libp2p part in the first place? Wouldn't it have been better to just use an HTTP server and skip libp2p entirely? You could do this, but you lose out on a couple of benefits:
This spec is partly to define the request/response semantics part (just use HTTP). The HTTP transport (which will only work for request/response protocols) then comes for free-ish.
This doesn't conflict with websockets. Sometimes you want a request/response protocol rather than a stream-based one. Here are some interesting use cases for libp2p + HTTP (longer term):
This spec is the first part of this long-term goal. Right now we don't have a standardized request/response abstraction in libp2p. Folks have been using
The next step is to figure out how to support servers that can't have a custom TLS certificate with the libp2p x509 extension. That's where the noise handshake could come in. This is off-topic for the core of this spec, but I'm happy to discuss.
Phew! That was more than I intended to write. I hope that helps clarify things.
Thanks for taking the time to explain! go-libp2p-http has been using URLs like : Should a client instead use
I think the case of having an nginx middleware needs to be polished. What type of SSL termination can we do (sounds like not full termination?), and what identity does nginx have if the client needs to verify identity and nginx is load-balancing multiple servers? In order to take care of caching etc., nginx should be doing layer 7, but that may mean nginx needs to proxy between two encrypted sides (one to the client, one to the server/s), which will result in a perf penalty. Also, instead of nginx in particular, which is very configurable, we should think about AWS Application Load Balancer. How can we make an ALB work with this?
Finally, I would like to bring up proxy-protocol support, libp2p/go-libp2p#1065. I was reminded of it because this idea overlaps in its intention to help operators and to enable options for load-balancing and libp2p cross-compatibility with standard tooling, but it is not really what we asked for from the infra side long ago. To be honest, I'm not quite sure that http-transport is a good answer for what we need, even though I like it and see its merits. In that sense it is a bit sad that our proxy-protocol support request is effectively abandoned. Should we be lobbying for it more? (We know proxy-protocol is horrible etc., but still.)
Thank you @MarcoPolo for writing up this doc! I like that we don't create a separate libp2p request-response protocol. Reusing HTTP semantics here makes a lot of sense.
Authentication
I'm wondering how much we get out of client authentication here.
One important use case of libp2p + HTTP will be issuing requests from browsers. The browser UI for selecting a client cert makes it practically unusable. Furthermore, browsers typically don't have a libp2p peer identity; if anything, we generate an ephemeral libp2p ID that doesn't have any long-term meaning. Maybe we need to start embracing the notion that not all clients have IDs?
One way to do so would be to introduce a dedicated client auth endpoint, which clients could call to authenticate themselves (e.g. after receiving a 401), and that would provide a bearer token.
Similarly, browsers don't allow accessing the certificate presented by the server. Even if the server's certificate contains the libp2p extension, we wouldn't be able to use it.
Redirects
I'm wondering how we should handle redirects (3xx). Arguably, those are one of the most important features of HTTP, and could prove important once we start using the DHT over HTTP.
Problems:
- While we can get from HTTP/libp2p to plain HTTP, the opposite direction seems to be more challenging.
- We'd need to figure out if it's possible to pass multiple redirect targets to the client, and have the client pick one (or multiple of them).
1. NAT traversal: You can make an HTTP request to a peer that's behind a NAT.
1. Fewer connections: If you already have a libp2p connection, we can use that
to create a stream for the HTTP request. The HTTP request will be faster since
you don't have to pay the two round trips to establish the connection.
That's only true for HTTP/1.1 and HTTP/2. A fresh HTTP/3 connection is established within one RTT and allows 0-RTT resumption for subsequent connections. The performance benefit of HTTP/libp2p is less clear in this case.
Client:
1. Open a new stream to the target peer.
1. Negotiate the `/libp2p-http` protocol.
1. Use this stream for HTTP. (i.e. start sending the request)
Is this HTTP 1.1?
https://datatracker.ietf.org/doc/rfc9292/ could be useful as the wire format.
We can define HTTP Handlers using standard types:
```
h1.SetHTTPHandler("/echo", func(peer peer.ID, w http.ResponseWriter, r *http.Request) {
```
It would be really nice if we could reuse the `http.ServeMux`. We could make the peer ID available as a header field.
I like this idea!
## libp2p over plain HTTPS

This is nothing more than a thin wrapper over standard HTTP. The only thing
This might be slightly OT but has the `Upgrade` header been considered for running libp2p on top of HTTP?
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Upgrade
GET /index.html HTTP/1.1
Host: www.example.com
Connection: upgrade
Upgrade: multistream-select/1.0.0
Once confirmed, we can then negotiate any other protocol on top, i.e. yamux, noise, etc.
This could be a neat way for establishing a libp2p connection from the browser, assuming the peer we want to reach exposes an HTTP endpoint we can use to trigger the upgrade.
Assuming we define a corresponding "libp2p over http" multiaddress protocol, we can build a `Transport` that makes a `GET` request with the above upgrade, waits for `101 Switching Protocols`, and then uses the resulting stream for libp2p.
Maybe: `/http(s)/dns4/example.com/tcp/80`
Note that `http` at the front means we run all of the following protocols on top of it, i.e. the very bottom transport is HTTP.
This sounds like reinventing WebSocket.
WebSockets have a framing overhead though, whereas, unless I am missing something, using `Upgrade` would just hand us the stream.
Closing this in favor of focusing on #508
We've discussed this a lot, and I think it's a good idea for the points listed in the spec. I've written the first draft here along with a PoC implementation in go-libp2p: libp2p/go-libp2p#1874.
This unlocks a lot of cool use cases. For example, with this you can put a dumb HTTP cache in front of your libp2p node and easily scale the number of requests you can handle. You can integrate directly with existing CDN infrastructure (what if the provider record said something like `/dns4/my-bucket.r2.dev/tcp/443/tls/http` and you seamlessly fetched it with your existing libp2p request/response protocol).
Note: This came from discussions that happened around IPFS camp. This work is currently not on the roadmap, so this will be prioritized below unless we (as a community) decide otherwise.
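To illustrate how a client might act on such a provider record, here is a small sketch that turns a multiaddr string of this shape into a URL an ordinary HTTP client can fetch. `multiaddrToURL` is a hypothetical helper that parses the string by hand and accepts only the handful of components listed in the code; it does not use or mirror the go-multiaddr API.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// multiaddrToURL converts a multiaddr such as
// /dns4/<host>/tcp/<port>/tls/http into an http(s) URL.
// Only the components handled below are accepted (an assumption of this sketch).
func multiaddrToURL(ma string) (string, error) {
	parts := strings.Split(strings.Trim(ma, "/"), "/")
	var host, port string
	var useTLS, isHTTP bool
	for i := 0; i < len(parts); i++ {
		switch parts[i] {
		case "dns", "dns4", "dns6", "ip4", "ip6", "tcp":
			if i+1 >= len(parts) {
				return "", fmt.Errorf("missing value for %q", parts[i])
			}
			if parts[i] == "tcp" {
				port = parts[i+1]
			} else {
				host = parts[i+1]
			}
			i++ // skip the value we just consumed
		case "tls":
			useTLS = true
		case "http":
			isHTTP = true
		default:
			return "", fmt.Errorf("unsupported component %q", parts[i])
		}
	}
	if host == "" || port == "" || !isHTTP {
		return "", fmt.Errorf("%q is not an HTTP multiaddr", ma)
	}
	if _, err := strconv.ParseUint(port, 10, 16); err != nil {
		return "", fmt.Errorf("invalid port %q", port)
	}
	scheme := "http"
	if useTLS {
		scheme = "https"
	}
	// JoinHostPort adds brackets around IPv6 literals.
	return scheme + "://" + net.JoinHostPort(host, port), nil
}

func main() {
	fmt.Println(multiaddrToURL("/dns4/my-bucket.r2.dev/tcp/443/tls/http"))
}
```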