Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: websocket transport should send host header #1829

Closed
wants to merge 1 commit into from
Closed

fix: websocket transport should send host header #1829

wants to merge 1 commit into from

Conversation

thibmeu
Copy link
Contributor

@thibmeu thibmeu commented Oct 12, 2022

TLDR: kubo 0.16 cannot connect to /wss behind a reverse proxy

websocket transport resolve should make sure HTTP Host is preserved, even from dns resolution performed by swarm DNS resolver. This commit updates the resolve strategy to add an SNI for both WS and WSS, if a host is present. This SNI is used as the WS URL Host, while the resolved host is used to perform the underlying TCP dial.

This also moves toMultiAddr() method of parsedWebsocketMultiaddr in the file the object is declared, in an attempt to make reading simpler.

The issue this addresses is described in more length in this comment.

websocket transport resolve should make sure HTTP Host is preserved, even from dns resolution performed by swarm DNS resolver.
This commit updates the resolve strategy to add an SNI for both WS and WSS, if a host is present.
This SNI is used as the WS URL Host, while the resolved host is used to perform the underlying TCP dial.
@thibmeu
Copy link
Contributor Author

thibmeu commented Oct 12, 2022

Copy link
Collaborator

@MarcoPolo MarcoPolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of, thank you for this PR!

Second, I think we need a new component for the host header rather than trying to reuse the SNI component.

The problem with using the SNI component is that it's meant to be a parameter to TLS. Without a tls component what gets the SNI parameter? The tcp component? the ip component? Neither makes sense. We can't propagate the "sni" component to the right, because multiaddrs are interpreted right to left. Take a look at this section: https://github.com/multiformats/multiaddr/#interpreting-multiaddrs.

The other problem is that SNI has a specific meaning, which is the server name indication for TLS. It's defined in https://www.rfc-editor.org/rfc/rfc6066.html#section-3.

I think your other idea is correct:

add a dnsname component in the swarm resolved multiaddr.

However I would call this component host-header since this is really just about the Host header in the http framing. (really I would love to see a headers component and then a map of headers, but that's a bit more involved and shouldn't block this).

The multiaddr would look something like this:

/ip4/1.2.3.4/tcp/443/ws/host-header/example.com

This is obviously a bigger change than using the SNI, but I think this is the correct change. It's consistent with the multiaddr interpretation, it opens the door for other kinds of parameters to transports in general, and will make other changes (like passing in a specific http path) much easier!

Here's a rough outline of the next steps:

I hope that makes sense, and please let me know if I can explain anything better. Also if you want to pair program on this we can make that happen :)

Thanks again!

@thibmeu
Copy link
Contributor Author

thibmeu commented Oct 13, 2022

@MarcoPolo thanks for the quick turnaround and comments

I agree having a dedicated component would be better, as the SNI is a hack to make things work. I needed a fix because the resolver change broke my kubo setup with wss. As I experienced it, this was a regression introduced by go-libp2p.

I have opened PRs against multiaddr, multicodec, and go-multiaddr repository. Once these are reviewed and merged, I'll update this PR to use the new host-header component instead of sni.

@lidel lidel changed the title Fix websocket transport should send host header fix: websocket transport should send host header Oct 14, 2022
@BigLep
Copy link
Contributor

BigLep commented Oct 14, 2022

@MarcoPolo and go-libp2p maintainers: I'm not close to the details here, but it seems like we've had various websocket issues for multiple months. What do we need to do so we don't regress in this area in the future? Is there enough test coverage in this PR or do we need something more? As I understand it, there are different areas at play for whether WS or WSS are used. Are we covering both branches appropriately? Apologies if this isn't the right place to have this discussion - feel free to redirect. (Maybe this comment just needs to be redirected as a followup to libp2p/go-libp2p-relay-daemon#18 (comment) ? I'm suspecting we need test coverage in both IPFS and libp2p.). I just want to make sure we harden here for the future. Thanks!

@marten-seemann
Copy link
Contributor

I'm wondering if we should deprecate (insecure) WebSocket support:

  • With the web having moved to ubiquitous TLS encryption, there's simply no websites left for which the browser would allow an unencrypted WebSocket connection. When a website is loaded via HTTPS, the browser will block all unencrypted WebSocket requests originating from that site.
  • Standalone nodes could in principle still use WebSocket, as they're not bound by the browser's security model. However, there's no point in dialing a WebSocket connection to begin with: It's much faster (in terms of handshake latency) to dial a libp2p TCP connection.

If I understand the issue that prompted this PR correctly, this problem wouldn't exist if we just used the SNI we use in the TLS handshake for the Host header in the WebSocket CONNECT HTTP request. We don't set the Host header currently, but that would be a much smaller fix than adding a new multiaddr component.
The values for TLS SNI and the HTTP Host Header are always identical, unless you're doing some form of domain fronting. I'd be ok with leaving that fringe use case unsupported for now.

@thibmeu @MarcoPolo Is my understanding correct?

@MarcoPolo
Copy link
Collaborator

As I experienced it, this was a regression introduced by go-libp2p.

@thibmeu can you expand on this? go-libp2p never sent host headers, so I don't follow how this used to work.

@MarcoPolo
Copy link
Collaborator

MarcoPolo commented Oct 14, 2022

@BigLep

@MarcoPolo and go-libp2p maintainers: I'm not close to the details here, but it seems like we've had various websocket issues for multiple months. What do we need to do so we don't regress in this area in the future?

I don't think this is a regression (but I've asked for clarification from OP). The issue here is that if the websocket listener is behind a reverse proxy, the reverse proxy relies on the HTTP host header being set. As far as I know we've never touched HTTP headers, so I don't see how this would have ever worked.

Edit: This is a regression, see my comment below for details. Gist is that this worked by accident.

Is there enough test coverage in this PR or do we need something more?

We're likely changing the PR quite a bit, but we'll make sure we have some testing before merging. We need to test that a websocket server sees the Host header.

As I understand it, there are different areas at play for whether WS or WSS are used.

If we do the host-header multiaddr component strategies, then it should be orthogonal to WS vs WSS. But yes, we'll test both branches anyways.

Feel free to redirect. (Maybe this comment just needs to be redirected as a followup to libp2p/go-libp2p-relay-daemon#18 (comment) ?

Will do.

I'm suspecting we need test coverage in both IPFS and libp2p.). I just want to make sure we harden here for the future. Thanks!

On a meta point, leave to Ceasar Kubo the things that are theirs. I appreciate all the extra testing that Kubo gives go-libp2p, but unless Kubo is doing something special this test coverage should probably live solely in go-libp2p.

@MarcoPolo
Copy link
Collaborator

I'm wondering if we should deprecate (insecure) WebSocket support:

Good points! My fear is that we don't understand the use cases where folks would use just non-encrypted WS. Also we should have a clear deprecation period where we slowly phase out non-encrypted WS.

If I understand the issue that prompted this PR correctly, this problem wouldn't exist if we just used the SNI we use in the TLS handshake for the Host header in the WebSocket CONNECT HTTP request. We don't set the Host header currently, but that would be a much smaller fix than adding a new multiaddr component.

@thibmeu if this fits your use case (i.e. you're always using secure websockets) I think this is a fine change as well.

@MarcoPolo
Copy link
Collaborator

MarcoPolo commented Oct 14, 2022

Some brainstorming with @marten-seemann on what could have caused this regression:

Older versions of go-libp2p (v0.22, v0.21, but not v0.20) worked with this use case. They used to send this Host header somewhat by accident. They would end up dialing resolved and unresolved address (an unresolved address has a DNS name and needs to still be resolved to an IP address). So we would eventually try dialing both the resolved (the ip address version) and the unresolved (the DNS address version). The websocket transport would fail with the resolved address (like we currently do), but then succeed with the unresolved version.



With v0.23 (specifically this change) we changed it so DNS resolution is done by the core swarm component of go-libp2p, but transports can still be asked if they want to do their own resolution. And then we only ask the transport to dial resolved addresses. This is better because we don't end up resolving an address twice. However in this fix, we lost the old behavior of sending the host header. While I was explicit about making sure we passed the SNI to the TLS layer, I forgot about the host header that proxies rely on.







So yes, this is a regression. I'm sorry we broke your setup!











Going forward, we should be explicit about supporting use cases and testing them for regressions. We should support the reverse proxy/load balancer use case (and I think we've even mentioned supporting it) so that support should at bare minimum include tests.

@thibmeu
Copy link
Contributor Author

thibmeu commented Oct 17, 2022

@marten-seemann @MarcoPolo I'm good with fixing the issue only for secure websocket, as I think it's going to ease and speed up a fix release. Using the TLS SNI on secure websocket connection is actually the change I've made to my setup, before crafting the present PR.

I missed pointing the commit introducing the regression in my previous comments, so thanks for linking them.

As a go-libp2p user (via kubo), I feel like having divergent behaviour for secure and insecure websocket could be tricky. As part of the deprecation announcement, clearly highlighting that insecure websocket cannot be hosted behind a reverse proxy with multiple host per IP is going to be important.

@thibmeu
Copy link
Contributor Author

thibmeu commented Oct 19, 2022

@MarcoPolo I've created the commit which would fix WSS only at the following address. It is much shorter.

If breaking WS and WSS compatibility is fine, I can update the present PR to the above commit.
I would like to see this change released to unblock the use of websocket peers, and can iterate quickly if need be.

@MarcoPolo
Copy link
Collaborator

@thibmeu yes that looks good. Although do we need to set the dialer.NetDial?

I'd love a test to make sure we don't break this. Something simple like making sure the server sees the host header should be fine.

Feel free to add the test, but I'm also happy to add the test later today. Thanks again, I appreciate the help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants