-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support ED25519 at subdomain gw with TLS #7441
Conversation
This comment has been minimized.
This comment has been minimized.
cebf519
to
842ff8e
Compare
842ff8e
to
4081ba1
Compare
core/corehttp/hostname.go
Outdated
@@ -334,11 +390,16 @@ func toSubdomainURL(hostname, path string, r *http.Request) (redirURL string, ok | |||
// if object turns out to be a valid CID, | |||
// ensure text representation used in subdomain is CIDv1 in Base32 | |||
// https://github.com/ipfs/in-web-browsers/issues/89 | |||
rootID, err = cid.NewCidV1(multicodec, rootCid.Hash()).StringOfBase(mbase.Base32) | |||
rootCID = cid.NewCidV1(multicodec, rootCID.Hash()) | |||
rootID, err = rootCID.StringOfBase(mbase.Base32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: we still default to Base32 here..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea that even if we changed the default libp2p-key output to be base36 that people who put base32 RSA libp2p-keys into the browser domain or path wouldn't end up with the domain (and therefore origin) changed?
Similarly, is this solution too flexible? Do we need a canonical base encoding for the domains so that data stored in browser local storage isn't lost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the idea here is to PR a surgical change to use Base36 only where it is really needed and continue with Base32 as default for subdomains (so already existing subdomain Origins don't change, and we don't need to worry about any data lost).
I may miss your point(?), but I don't believe we need to add any additional normalization.
This comment was marked as duplicate.
This comment was marked as duplicate.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question was basically about what we want to happen if I use base36 where I could've used base32.
Leaving as is may lead to me having two different views (with different local data) for the same website.
Making them canonical is both a bit of a pain and uses an arbitrary rule that we likely end up stuck with (like the Cidv0-1 size cutoff).
Wanted to double check we are ok with the pros and cons of our solution now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formal 👍 from me on flipping as much as possible to b36
. I fully recognize the pain it will cause. That pain is utterly dwarfed by the cost of doing another switch further down the road.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to be as consistent as possible. Given that IPNS isn't widely used for obvious reasons anyways, I'd suggest at least using base36 for IPNS. I'm fine not canonicalizing CIDs.
The same HTML can be loaded from different hostnames on the web, and each gets distinct Origin.
That's fine. The problem is if we decide to start canonicalizing CIDs to base36 for consistency in the future. If we do that, we'll be changing the origin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I tend to think that we might as well do the b36 encoding for IPNS. IPNS users tend to be pretty happy when fixes show up and there'll probably be at least a few fixes in 0.7 or 0.8 that make it worth their while.
As an aside I've been thinking about this a bit and I suspect the confusion here stems from us not really telling people what the "correct" way is for users to describe an IPFS/IPNS URL. Do we even have a single answer here, or is it context dependent? For example, we'd like people to ideally use ipfs://b32encodedID
, but since ipfs://
isn't yet widely supported we probably like dweb.link/ipfs/b32encodedID
. Not having an canonical answer means we're stuck guessing about how people "could" be using URLs and so makes reasoning about upgrade paths difficult.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! I've updated this PR to use b36 for ALL libp2p-keys in subdomains (79d571f)
@aschmahmann I believe that was the last thing blocking this PR – mind taking a final look and merging if it is ok?
ps. regarding canonical address, for the time being it is:
https://dweb.link/ipfs/{cid}
https://dweb.link/ipns/{libp2p-key}
Works everywhere, can be shared and loads fine, and is future proof.
We need to work with gui/docs/ecosystem to make that more evident in our apps, docs and comms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks will take a look. Is there any base encoding recommended for the dweb.link paths, or is that left unspecified? 💯 on documenting the path based recommended addressing though since it will at least allow us to point people at something if they ever run into issues copy-pasting a subdomain-based URL.
Finished tests (go&sharness), and updated PR description to reflect current version. cc @autonome @Gozala @MichaelMure for visibility as this impacts behavior of subdomain gateways in the browser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overall approach looks good to me. Deferring code review to @aschmahmann
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good. Left a few questions to clarify what's going on. Thanks 😀
core/corehttp/hostname.go
Outdated
@@ -334,11 +390,16 @@ func toSubdomainURL(hostname, path string, r *http.Request) (redirURL string, ok | |||
// if object turns out to be a valid CID, | |||
// ensure text representation used in subdomain is CIDv1 in Base32 | |||
// https://github.com/ipfs/in-web-browsers/issues/89 | |||
rootID, err = cid.NewCidV1(multicodec, rootCid.Hash()).StringOfBase(mbase.Base32) | |||
rootCID = cid.NewCidV1(multicodec, rootCID.Hash()) | |||
rootID, err = rootCID.StringOfBase(mbase.Base32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea that even if we changed the default libp2p-key output to be base36 that people who put base32 RSA libp2p-keys into the browser domain or path wouldn't end up with the domain (and therefore origin) changed?
Similarly, is this solution too flexible? Do we need a canonical base encoding for the domains so that data stored in browser local storage isn't lost?
#7441 (review) License: MIT Signed-off-by: Marcin Rataj <lidel@lidel.org>
Consensus reached in #7441 (comment) #7441 (comment) #7441 (comment) License: MIT Signed-off-by: Marcin Rataj <lidel@lidel.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beautiful! Thank you so much for shepherding this through
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for getting this shipped 🚢
@lidel. I merged the other PR you tagged me in first which led to this having a small merge conflict. Do you mind rebasing, pushing, and tagging me for merge? |
This: - adds subdomain gateway support for ED25519 CIDs in a way that fits in a single DNS label to enable TLS for every IPNS website. - cleans up subdomain redirect logic and adds more explicit error handling. TL;DR on router logic: When CID is longer than 63 characters, router at /ipfs/* and /ipns/* converts to Base36, and if that does not help, returns a human readable 400 Bad Request error. Addressing code review: #7441 (review) refactor: use b36 for all libp2p-keys in subdomains Consensus reached in #7441 (comment) #7441 (comment) #7441 (comment)
357155c
to
b0af543
Compare
@aschmahmann rebased, merge at will 👍 |
#7441 (review) License: MIT Signed-off-by: Marcin Rataj <lidel@lidel.org>
This adds Base36 version of CIDv1 if it fits inside of a single DNS label. Follows convention introduced in: ipfs/kubo#7441
This adds Base36 version of CIDv1 if it fits inside a single DNS label. Follows convention introduced in: ipfs/kubo#7441
This: - adds subdomain gateway support for ED25519 CIDs in a way that fits in a single DNS label to enable TLS for every IPNS website. - cleans up subdomain redirect logic and adds more explicit error handling. TL;DR on router logic: When CID is longer than 63 characters, router at /ipfs/* and /ipns/* converts to Base36, and if that does not help, returns a human readable 400 Bad Request error. Addressing code review: ipfs/kubo#7441 (review) refactor: use b36 for all libp2p-keys in subdomains Consensus reached in ipfs/kubo#7441 (comment) ipfs/kubo#7441 (comment) ipfs/kubo#7441 (comment) This commit was moved from ipfs/kubo@231fab8
Rationale
If we split original CID into multiple sub labels (#7358) DNS resolution will work, but no public gateway will be able to get TLS certificate that supports double wildcards like
*.*.ipfs.example.com
.When splitting is used, visitors using web browsers will see a TLS Error page, which effectively kills UX and defeats the purpose of subdomain gateway (one step forward by increasing security by providing Origin isolation, two steps backward by removing ability to do TLS validation when loading website backed by a long CID).
Given this reality, I believe a better trade off could be to try shortening CID by converting it to Base36, and if that fails return an error before redirect to subdomain occurs.
This PR
TL;DR on router logic:
When CID is longer than 63 characters, router at /ipfs/* and /ipns/* converts to Base36, and if that does not help, returns a human readable 400 Bad Request error.
Examples
Given a subdomain gateway at
example.com
:ED25519 fits perfectly when converted to Base36, so it gets loaded just fine:
CID produced by
ipfs add --cid-version 1 --hash sha2-512
is too long, so an error is returned instead:Note that returning this error overrides #6982 which means tools like
curl
won't be able to load content from long CIDs via the subdomain gateway. I don't think that is a bug, pretty sure it is how we want it, but mentioning here for completenes.Why we dropped the previous iteration
We looked into automatically replacing root node which has shorter CID, but as noted in #7441 (comment) its not worth it.
Click here to expand notes from that older version
TL;DR
The idea evaluated here is to redirect
example.com/ipfs/$CID
to$dnsCID.ipfs.example.com
where$dnsCID
is a new CID created with a hash function known to fit in a single DNS label (<=63chars).This way public gateways can enable HTTPS without TLS errors for content roots backed by longer hash functions.
Rationale
If we split original CID into multiple sub labels (#7358) DNS resolution will work, but no public gateway will be able to get TLS certificate that supports double wildcards like
*.*.ipfs.example.com
. This means errors like this.This means when splitting is used, visitors using web browsers will see a TLS Error page, which effectively kills UX and defeats the purpose of subdomain gateway (one step forward by increasing security by providing Origin isolation, two steps backward by removing ability to do TLS validation when loading website backed by a long CID).
Given this reality, I believe a better trade off could be to create a new root CID that is compatible with DNS and make websites work in existing user agents that way.
Thoughts?
Open questions