Signed HTTP Exchanges #171

Open
sideshowbarker opened this issue Mar 7, 2019 · 8 comments

@sideshowbarker
Contributor

allow people to bundle together the resources that make up a website, so they can be shared offline, either with or without a proof that they came from the original website

An HTTP exchange consists of an HTTP request and its response. A publisher (like https://theestablishment.co/) writes (or has an author write) some content and owns the domain where it's published. A client (like Firefox) downloads content and uses it. An intermediate (like Fastly, the AMP cache, or old HTTP proxies) downloads content from its author (or another intermediate) and forwards it to a client (or another intermediate). When an HTTP exchange is encoded into a resource, the resource can be fetched from a distributing URL that is different from the publishing URL of the encoded exchange.

Use cases: https://wicg.github.io/webpackage/draft-yasskin-webpackage-use-cases.html
Explainer: https://github.com/WICG/webpackage/blob/master/explainer.md

As noted at https://www.chromestatus.com/feature/5745285984681984, representatives from both the Firefox and Safari dev teams have expressed unwillingness to implement this.

The gist of the feedback is that the associated security considerations are serious enough that the Signed HTTP Exchanges feature is considered harmful:

Using signed HTTP exchanges to enhance the security of accessing a resource or verifying its authenticity seems like a good thing; but it seems positively harmful to use signed HTTP exchanges as a replacement for the longstanding web security model

See also #96

@codedokode

codedokode commented Apr 17, 2019

I don't like that this spec allows changing URL bar contents. If we look at Google's documentation on AMP Viewer, there is a screenshot where the URL bar displays "www.amp.dev" while in fact the content is fetched from Google's servers. The user might think that they are connecting to amp.dev but in fact they are connected to Google and Google is collecting their data including IP address according to their policy.

I think the URL bar should show the real URL from which the content was loaded.

The author of the spec mentions this:

Two search engines have built systems to do this with today’s technology: Google’s AMP and Baidu’s MIP formats and caches allow them to prefetch search results while preserving privacy, at the cost of showing the wrong URLs for the results once the user has clicked. A good solution to this problem would show the right URLs but still avoid a request to the publishing origin until after the user clicks.

But I don't understand why they call the URL of the cache "wrong". It is the actual URL. The user connects to Google's servers over a TLS connection authenticated with Google's certificate, so the address bar should display google.com in this case.

@sideshowbarker
Contributor Author

Mozilla’s Position on Web Packaging
https://docs.google.com/document/d/1ha00dSGKmjoEh2mRiG8FIA5sJ1KihTuZe-AXX1r8P-8/edit

From a technical standpoint, the changes are thorough and well-considered. There are some technical costs around security, operations, and complexity, but the specifications take steps to limit most of these costs.
Many of the technical concerns are relatively minor. There are security problems, but most are well managed. There are operational concerns, but those can be overcome.
…we don’t understand enough to say definitively that this is damaging to the system

In making an assessment about value, we have to see what benefits are realized, by whom.
The main concern is web packaging might be employed to alter power dynamics between aggregators and publishers.
…until more information is available on the effect on the web ecosystem, Mozilla concludes that it would not be good for the web to deploy web packaging

@wseltzer
Member

Discussion on the IETF wpack list and at IETF 105 suggests a BOF at IETF 106.

@ylafon
Member

ylafon commented Sep 11, 2019

@sideshowbarker
Contributor Author

Content-Based Origins for the Web
https://martinthomson.github.io/wpack-content/draft-thomson-wpack-content-origin.html

Content-based origins are proposed as an alternative to signed exchanges.

https://martinthomson.github.io/wpack-content/draft-thomson-wpack-content-origin.html#name-content-based-origin-defini

A content-based origin ascribes an identity to content based on the content itself. For instance, a web bundle [BUNDLE] is assigned a URI based on its content alone.

The sequence of bytes that comprises the content or bundled content is hashed using a hash function that is strongly resistant to collision and pre-image attack, such as SHA-256 [SHA-2]. The resulting hash is encoded using the Base 64 encoding with an URL and filename safe alphabet [BASE64].

This can be formed into the ASCII or Unicode serialization of an origin based on the Named Information URI scheme [NI]. This URI is formed from the string "ni:///", the identifier for the hash algorithm (see Section 9.4 of [NI]); a semi-colon (";"), and the base64url encoding of the hash function output. Though this uses the ni URL form, the authority and query strings are omitted from this serialization.

For instance, the origin of content comprising the single ASCII character 'a' is represented as ni:///sha-256;ypeBEsobvcr6wjGzmiPcTaeG7_gUfE5yuYB3ha_uSLs.
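
The derivation quoted above can be reproduced with a few lines of Python. This is only a sketch of the steps the draft describes (SHA-256, then base64url, then the "ni:///" prefix); the function name `content_based_origin` is made up for illustration, and stripping the trailing "=" padding is an assumption inferred from the 43-character example value.

```python
import base64
import hashlib


def content_based_origin(content: bytes) -> str:
    """Serialize a content-based origin for `content` (illustrative helper)."""
    # Hash the raw bytes of the content (or bundle) with SHA-256.
    digest = hashlib.sha256(content).digest()
    # Encode with the URL- and filename-safe Base64 alphabet.
    # Assumption: padding is dropped, as in the draft's example value.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    # Authority and query are omitted, so the prefix is just "ni:///".
    return f"ni:///sha-256;{encoded}"


# The single ASCII character 'a', as in the example quoted above.
print(content_based_origin(b"a"))
```

Any change to the content, even a single byte, yields a different origin, which is why such an origin cannot be guessed without knowing the content.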

https://martinthomson.github.io/wpack-content/draft-thomson-wpack-content-origin.html#section-3.3

Signed exchanges … in effect, they add an object-based security model to the existing channel-based model used on the web. Signatures over bundles (or parts thereof) are used by an origin to attest to the contents of a bundle.

Having two security models operate in the same space potentially creates an exposure to the worst properties of each model.

In comparison, content-based origins do not require signatures. Questions of validity only apply at the point that a state transfer is attempted.

This avoids the complexity inherent to merging two different security models, but the process of state transfer could be quite complicated in practice… content-based origins aren't prevented from interacting with HTTP origins, which could lead to surprising outcomes if existing code is poorly unprepared for this possibility

https://martinthomson.github.io/wpack-content/draft-thomson-wpack-content-origin.html#name-communication-between-origi

Without knowledge of the content of a resource, or bundle of resources, a content-based origin will be impossible to guess. This means that communication is only possible if the frame in which the content is loaded by the origin attempting communication, or the content is known to that origin.

@iherman
Member

iherman commented Mar 24, 2020

There is also this, which looks very close to the first option listed there:

https://tools.ietf.org/html/draft-sporny-hashlink-04

When using a hyperlink to fetch a resource from the Internet, it is
often useful to know if the resource has changed since the data was
published. Cryptographic hashes, such as SHA-256, are often used to
determine if published data has changed in unexpected ways. Due to
the nature of most hyperlinks, the cryptographic hash is often
published separately from the link itself. This specification
describes a data model and serialization formats for expressing
cryptographically protected hyperlinks. The mechanisms described in
the document enables a system to publish a hyperlink in a way that
empowers a consuming application to determine if the resource
associated with the hyperlink has changed in unexpected ways.

Cc: @msporny

@msporny
Member

msporny commented Mar 24, 2020

https://tools.ietf.org/html/draft-sporny-hashlink-04

Yes, that could be a partial solution: tacking ?hl=XYZ onto the URL is backwards compatible and doesn't break the existing Web model, though it's a bit of a hack.

The other option, which is being picked up by the IETF's HTTP WG, is:
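
To make the ?hl=XYZ idea concrete, here is a rough sketch. It is not the hashlink draft's actual encoding (the draft defines its own serialization); for simplicity it assumes the `hl` value is a plain hex SHA-256 digest, and the helper names `add_hash_param` and `content_matches_link` are hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit


def add_hash_param(url: str, content: bytes) -> str:
    """Append an hl=<digest> query parameter to `url` (assumed hex SHA-256)."""
    digest = hashlib.sha256(content).hexdigest()
    # Reuse an existing query string if the URL already has one.
    separator = "&" if urlsplit(url).query else "?"
    return f"{url}{separator}{urlencode({'hl': digest})}"


def content_matches_link(url: str, content: bytes) -> bool:
    """Return True if `content` hashes to the digest carried in the link."""
    expected = parse_qs(urlsplit(url).query).get("hl", [""])[0]
    actual = hashlib.sha256(content).hexdigest()
    # Constant-time comparison; a missing hl parameter never matches.
    return hmac.compare_digest(expected, actual)


link = add_hash_param("https://example.com/doc.html", b"published content")
print(link)
print(content_matches_link(link, b"published content"))  # unchanged resource
print(content_matches_link(link, b"modified content"))   # changed resource
```

A server that ignores unknown query parameters would keep serving the same resource, which is what makes the approach backwards compatible.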

https://github.com/richanna/request-signing/blob/revise/draft-richanna-http-message-signatures-00.txt

... which is capable of digitally signing HTTP headers from the server. Again, that doesn't break the Web's security model, but it gives you the option of doing a HEAD request, getting the content hash of what you should be receiving along with who signed that hash (the original domain), and then choosing where you get the content from. This is also backwards compatible with the security model of the Web.
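
A rough sketch of the "get the hash first, then fetch from anywhere" flow, under stated assumptions: the expected digest is taken as already obtained from the origin and its signature as already verified (signature checking is out of scope here), mirrors are simulated as in-memory byte strings instead of real HTTP requests, and the helper names are made up for illustration.

```python
import base64
import hashlib
import hmac
from typing import Iterable, Optional


def sha256_b64(body: bytes) -> str:
    """Base64-encoded SHA-256 digest of a response body."""
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


def first_matching_copy(expected_digest: str,
                        copies: Iterable[bytes]) -> Optional[bytes]:
    """Return the first copy whose digest matches the trusted one, else None."""
    for body in copies:
        if hmac.compare_digest(sha256_b64(body), expected_digest):
            return body
    return None


# Assumption: this digest came from a signed header on the original domain.
original = b"<html>article</html>"
trusted_digest = sha256_b64(original)

# Simulated mirrors: the first serves altered bytes, the second is faithful.
mirrors = [b"<html>article + injected ad</html>", original]
print(first_matching_copy(trusted_digest, mirrors) == original)
```

The client still learns the expected hash from the original domain, so where the bytes come from stops mattering for integrity.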

PS: Not taking a position on the Signed HTTP Exchanges discussion as I'm sure it would take me a week to catch up on the current status of that discussion. :)

@sambacha

Mozilla’s Position on Web Packaging
docs.google.com/document/d/1ha00dSGKmjoEh2mRiG8FIA5sJ1KihTuZe-AXX1r8P-8/edit

From a technical standpoint, the changes are thorough and well-considered. There are some technical costs around security, operations, and complexity, but the specifications take steps to limit most of these costs.
Many of the technical concerns are relatively minor. There are security problems, but most are well managed. There are operational concerns, but those can be overcome.
…we don’t understand enough to say definitively that this is damaging to the system
In making an assessment about value, we have to see what benefits are realized, by whom.
The main concern is web packaging might be employed to alter power dynamics between aggregators and publishers.
…until more information is available on the effect on the web ecosystem, Mozilla concludes that it would not be good for the web to deploy web packaging

Here is an assessment of value:

Per industry standards, certificates that include the Signed HTTP Exchange extension have a 90-day maximum validity limit. source

DigiCert charges $198 for a certificate with the extension enabled (this is on top of the regular certificate you must already have).

At four 90-day renewals a year, that's $792 extra a year (4 × $198), just to use your own domain name, because of AMP.
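
The $792 figure can be checked with a quick calculation. It assumes four renewals a year at the quoted $198 price; the second number is a separate assumption, not from the comment, showing what continuous coverage of a 365-day year would cost if each certificate lasted exactly 90 days.

```python
import math

CERT_PRICE_USD = 198      # price quoted in the comment
MAX_VALIDITY_DAYS = 90    # validity limit quoted in the comment


def yearly_cost(renewals: int, price: int = CERT_PRICE_USD) -> int:
    """Total yearly spend for a given number of certificate purchases."""
    return renewals * price


# Four quarterly renewals, which is what the $792 figure implies.
print(yearly_cost(4))

# Assumption: strict coverage of 365 days needs ceil(365 / 90) certificates.
print(yearly_cost(math.ceil(365 / MAX_VALIDITY_DAYS)))
```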

Why even bother using domain names at all? Why not just let Google host everything?
