Signed HTTP Exchanges #171

sideshowbarker · 2019-03-07T12:05:17Z

Allow people to bundle together the resources that make up a website, so they can be shared offline, either with or without a proof that they came from the original website.

An HTTP exchange consists of an HTTP request and its response. A publisher (like https://theestablishment.co/) writes (or has an author write) some content and owns the domain where it's published. A client (like Firefox) downloads content and uses it. An intermediate (like Fastly, the AMP cache, or old HTTP proxies) downloads content from its author (or another intermediate) and forwards it to a client (or another intermediate). When an HTTP exchange is encoded into a resource, the resource can be fetched from a distributing URL that is different from the publishing URL of the encoded exchange.

Use cases: https://wicg.github.io/webpackage/draft-yasskin-webpackage-use-cases.html
Explainer: https://github.com/WICG/webpackage/blob/master/explainer.md

Loading Signed Exchanges (W3C WICG draft): https://wicg.github.io/webpackage/loading.html
Signed HTTP Exchanges (IETF draft): https://wicg.github.io/webpackage/draft-yasskin-http-origin-signed-responses.html
Bundled HTTP Exchanges (IETF draft): https://wicg.github.io/webpackage/draft-yasskin-wpack-bundled-exchanges.html
TAG design review: Signed Exchanges w3ctag/design-reviews#235

As noted at https://www.chromestatus.com/feature/5745285984681984, representatives from both the Firefox dev team and the Safari dev team have expressed unwillingness to implement:

The gist of the feedback is that the associated security considerations are serious to the degree that the Signed HTTP Exchanges feature is harmful:

Using signed HTTP exchanges to enhance the security of accessing a resource or verifying its authenticity seems like a good thing; but it seems positively harmful to use signed HTTP exchanges as a replacement for the longstanding web security model

See also #96

codedokode · 2019-04-17T15:44:37Z

I don't like that this spec allows changing URL bar contents. If we look at Google's documentation on AMP Viewer, there is a screenshot where the URL bar displays "www.amp.dev" while in fact the content is fetched from Google's servers. The user might think that they are connecting to amp.dev but in fact they are connected to Google and Google is collecting their data including IP address according to their policy.

I think the URL bar should show the real URL from which the content was loaded.

The author of the spec mentions this:

Two search engines have built systems to do this with today’s technology: Google’s AMP and Baidu’s MIP formats and caches allow them to prefetch search results while preserving privacy, at the cost of showing the wrong URLs for the results once the user has clicked. A good solution to this problem would show the right URLs but still avoid a request to the publishing origin until after the user clicks.

But I don't understand why they call the URL of the cache "wrong". It is the actual URL, not wrong. The user connects to Google's servers via a TLS connection signed by Google's key. The address bar should display google.com in this case.

sideshowbarker · 2019-05-25T00:28:00Z

Mozilla’s Position on Web Packaging
https://docs.google.com/document/d/1ha00dSGKmjoEh2mRiG8FIA5sJ1KihTuZe-AXX1r8P-8/edit

From a technical standpoint, the changes are thorough and well-considered. There are some technical costs around security, operations, and complexity, but the specifications take steps to limit most of these costs.
Many of the technical concerns are relatively minor. There are security problems, but most are well managed. There are operational concerns, but those can be overcome.
…we don’t understand enough to say definitively that this is damaging to the system

In making an assessment about value, we have to see what benefits are realized, by whom.
The main concern is web packaging might be employed to alter power dynamics between aggregators and publishers.
…until more information is available on the effect on the web ecosystem, Mozilla concludes that it would not be good for the web to deploy web packaging

wseltzer · 2019-07-23T21:52:07Z

Discussion on IETF wpack list and at IETF 105 suggests a BOF at 106.

ylafon · 2019-09-11T00:53:36Z

See https://datatracker.ietf.org/doc/html/draft-thomson-escape-report

sideshowbarker · 2020-03-24T05:46:59Z

Content-Based Origins for the Web
https://martinthomson.github.io/wpack-content/draft-thomson-wpack-content-origin.html

Content-based origins are proposed as an alternative to signed exchanges.

https://martinthomson.github.io/wpack-content/draft-thomson-wpack-content-origin.html#name-content-based-origin-defini

A content-based origin ascribes an identity to content based on the content itself. For instance, a web bundle [BUNDLE] is assigned a URI based on its content alone.

The sequence of bytes that comprises the content or bundled content is hashed using a hash function that is strongly resistant to collision and pre-image attack, such as SHA-256 [SHA-2]. The resulting hash is encoded using the Base 64 encoding with an URL and filename safe alphabet [BASE64].

This can be formed into the ASCII or Unicode serialization of an origin based on the Named Information URI scheme [NI]. This URI is formed from the string "ni:///", the identifier for the hash algorithm (see Section 9.4 of [NI]); a semi-colon (";"), and the base64url encoding of the hash function output. Though this uses the ni URL form, the authority and query strings are omitted from this serialization.

For instance, the origin of content comprising the single ASCII character 'a' is represented as ni:///sha-256;ypeBEsobvcr6wjGzmiPcTaeG7_gUfE5yuYB3ha_uSLs.
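For anyone who wants to check that example, here is a minimal sketch of the computation as I read the quoted text: SHA-256 over the content bytes, base64url encoding, then the "ni:///" prefix and algorithm identifier. The function name `content_based_origin` is made up for illustration, and stripping the trailing "=" padding is an assumption on my part, chosen because it matches the example string in the draft.

```python
import base64
import hashlib


def content_based_origin(content: bytes) -> str:
    """Build an ni:/// origin string from raw content bytes (illustrative only)."""
    # Hash the content with SHA-256, the hash function named in the quoted draft text.
    digest = hashlib.sha256(content).digest()
    # Encode with the URL- and filename-safe Base64 alphabet.
    # Assumption: trailing "=" padding is dropped, as in the draft's example.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    # "ni:///" + hash algorithm identifier + ";" + encoded hash; no authority or query.
    return f"ni:///sha-256;{encoded}"


print(content_based_origin(b"a"))
```

Run against the single ASCII character 'a', this should print the same ni:/// string as the example in the draft.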

https://martinthomson.github.io/wpack-content/draft-thomson-wpack-content-origin.html#section-3.3

Signed exchanges … in effect, they add an object-based security model to the existing channel-based model used on the web. Signatures over bundles (or parts thereof) are used by an origin to attest to the contents of a bundle.

Having two security models operate in the same space potentially creates an exposure to the worst properties of each model.

In comparison, content-based origins do not require signatures. Questions of validity only apply at the point that a state transfer is attempted.

This avoids the complexity inherent to merging two different security models, but the process of state transfer could be quite complicated in practice… content-based origins aren't prevented from interacting with HTTP origins, which could lead to surprising outcomes if existing code is poorly prepared for this possibility

https://martinthomson.github.io/wpack-content/draft-thomson-wpack-content-origin.html#name-communication-between-origi

Without knowledge of the content of a resource, or bundle of resources, a content-based origin will be impossible to guess. This means that communication is only possible if the frame in which the content is loaded by the origin attempting communication, or the content is known to that origin.

iherman · 2020-03-24T07:39:02Z

There is also this, which looks very close to the first option listed there:

https://tools.ietf.org/html/draft-sporny-hashlink-04

When using a hyperlink to fetch a resource from the Internet, it is
often useful to know if the resource has changed since the data was
published. Cryptographic hashes, such as SHA-256, are often used to
determine if published data has changed in unexpected ways. Due to
the nature of most hyperlinks, the cryptographic hash is often
published separately from the link itself. This specification
describes a data model and serialization formats for expressing
cryptographically protected hyperlinks. The mechanisms described in
the document enables a system to publish a hyperlink in a way that
empowers a consuming application to determine if the resource
associated with the hyperlink has changed in unexpected ways.
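
The underlying check is simple enough to sketch. This is only a rough illustration of "compare the fetched bytes against a separately published hash", not the hashlink serialization itself (the draft defines its own encoding, which is not reproduced here). The function name and the choice of a hex-encoded SHA-256 digest are assumptions for the example.

```python
import hashlib
import hmac


def matches_published_hash(resource: bytes, expected_sha256_hex: str) -> bool:
    """Return True if the resource bytes hash to the published SHA-256 value."""
    actual = hashlib.sha256(resource).hexdigest()
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


# The publisher computes the hash once and publishes it alongside the link.
published = hashlib.sha256(b"hello").hexdigest()

print(matches_published_hash(b"hello", published))   # True: content unchanged
print(matches_published_hash(b"hello!", published))  # False: content changed
```

A consuming application would run this kind of check on whatever bytes it received, regardless of which server delivered them.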

Cc: @msporny

msporny · 2020-03-24T14:44:19Z

https://tools.ietf.org/html/draft-sporny-hashlink-04

Yes, that could be a partial solution: tacking ?hl=XYZ onto the URL is backwards compatible and doesn't break the existing Web model, though it's a bit of a hack.

The other option, which is being picked up by the IETF's HTTP WG, is:

https://github.com/richanna/request-signing/blob/revise/draft-richanna-http-message-signatures-00.txt

... which is capable of digitally signing HTTP headers from the server. Again, this doesn't break the Web's security model, but it gives you the option of doing a HEAD request, getting the content hash of what you should be receiving as well as who signed that hash (the original domain), and then choosing where you get the content from. This is also backwards compatible with the security model of the Web.

PS: Not taking a position on the Signed HTTP Exchanges discussion as I'm sure it would take me a week to catch up on the current status of that discussion. :)

sambacha · 2020-07-18T07:33:06Z

Mozilla’s Position on Web Packaging
https://docs.google.com/document/d/1ha00dSGKmjoEh2mRiG8FIA5sJ1KihTuZe-AXX1r8P-8/edit

From a technical standpoint, the changes are thorough and well-considered. There are some technical costs around security, operations, and complexity, but the specifications take steps to limit most of these costs.
Many of the technical concerns are relatively minor. There are security problems, but most are well managed. There are operational concerns, but those can be overcome.
…we don’t understand enough to say definitively that this is damaging to the system
In making an assessment about value, we have to see what benefits are realized, by whom.
The main concern is web packaging might be employed to alter power dynamics between aggregators and publishers.
…until more information is available on the effect on the web ecosystem, Mozilla concludes that it would not be good for the web to deploy web packaging

Here is an assessment of value:

Per industry standards, certificates that include the Signed HTTP Exchange extension have a 90-day maximum validity limit. source

DigiCert charges $198 for a certificate with the extension enabled (this does not include the existing certificate you must already have).

That's $792 extra a year (four 90-day certificates at $198 each).
To use your own domain name, because of AMP.

Why even bother using domain names at all? Why not just let Google host everything?

sideshowbarker added Core Security Publishing labels Mar 7, 2019

sideshowbarker self-assigned this Mar 7, 2019

rektide mentioned this issue Jul 22, 2019

Audiobooks w3ctag/design-reviews#345

Closed

5 tasks
