Extend link HTTP header to support subresource signed exchange loading #347

Open

horo-t opened this issue Dec 7, 2018 · 44 comments

@horo-t
Collaborator

horo-t commented Dec 7, 2018

I want to introduce two new fields in the application/signed-exchange format:

  • An Alternative-Signed-Exchange-Subresources map in the unsigned field.
  • An Allowed-Alternative-Signed-Exchange-Subresources list in the signed field.

Problem

Currently, content publishers can sign their HTML content with their own private keys. User Agents (UAs) can trust the signed content as if it were served from the publisher’s origin even when it is actually served from a distributor’s origin, so the signed content can be served from any distributor’s origin. But if the publisher wants subresources such as scripts and images to be served from the distributor’s origin as well, the publisher has to rewrite the subresource URLs in the HTML to point to each distributor’s URLs and sign a separate copy for each distributor. The two proposed fields solve this problem.

Alternative-Signed-Exchange-Subresources map:
A map from the original subresource requests to SXG URLs. This field is not signed, so a distributor can rewrite it to point to its own URLs.

Allowed-Alternative-Signed-Exchange-Subresources list:
The list of subresource URLs that may be served as SXGs instead of being fetched from their original URLs. This field is signed by the publisher, so a distributor can’t change it.

Example

Publisher: https://publisher.example/article_1.html

  <script src="framework.js"></script>
  <img src="article_1.jpg">

SXG in Publisher: https://publisher.example/article_1.html.sxg

[
  // URL
  'https://publisher.example/article_1.html',
  // Signature
  'sig1: sig=*...; integrity="digest/mi-sha256";cert-url="https://publisher.example/cert"',
  // [New field] Alternative-Signed-Exchange-Subresources
  // The key of the mapping may need Accept headers info in order to enable content
  // negotiation (e.g. for WebP).
  [
    [{':url': 'https://publisher.example/framework.js', 'accept': '*/*'},
     'https://publisher.example/framework.js.sxg'],
    [{':url': 'https://publisher.example/article_1.jpg', 'accept': '*/*'},
     'https://publisher.example/article_1.jpg.sxg']
  ],
  // Signed headers
  [
    { ':method': 'GET', 'accept': '*/*' },
    {
      ':status': '200',
      // [New field]
      ':allowed-alternative-signed-exchange-subresources':
          '"https://publisher.example/framework.js",'
          '"https://publisher.example/article_1.jpg"',
      'content-encoding': 'mi-sha256-03',
      'content-type': 'text/html; charset=utf-8',
      'digest': 'mi-sha256-03=....'
    },
  ],
  // Payload body
  '<html><body>...'
]

SXG in Distributor: https://distributor.example/article_1.html.sxg

[
  // URL
  'https://publisher.example/article_1.html',
  // Signature
  'sig1: sig=*...; integrity="digest/mi-sha256";cert-url="https://distributor.example/publisher.example/cert"',
  // [New field] Alternative-Signed-Exchange-Subresources
  [
    [{':url': 'https://publisher.example/framework.js', 'accept': '*/*'},
     'https://distributor.example/publisher.example/framework.js.sxg'],
    [{':url': 'https://publisher.example/article_1.jpg', 'accept': '*/*'},
     'https://distributor.example/publisher.example/article_1.jpg.sxg']
  ],
  // Signed headers (Same as the SXG in Publisher)
  [
    { ':method': 'GET', 'accept': '*/*' },
    {
      ':status': '200',
      // [New field]
      ':allowed-alternative-signed-exchange-subresources':
          '"https://publisher.example/framework.js",'
          '"https://publisher.example/article_1.jpg"',
      'content-encoding': 'mi-sha256-03',
      'content-type': 'text/html; charset=utf-8',
      'digest': 'mi-sha256-03=....'
    },
  ],
  // Payload body (Same as the SXG in Publisher)
  '<html><body>...'
]

How UAs should work

  • When the user opens the SXG on the distributor, the UA must check the signature using the certificate at https://distributor.example/publisher.example/cert. (This is the existing behavior.)
  • The UA processes the script tag and the img tag and decides to fetch "framework.js" and the image "article_1.jpg".
  • Instead of fetching the original URLs on publisher.example, the UA should fetch the SXG files on distributor.example after checking the Alternative-Signed-Exchange-Subresources field and the Allowed-Alternative-Signed-Exchange-Subresources field (see the sketch after this list).
  • If the original URL is not in the Allowed-Alternative-Signed-Exchange-Subresources field, the UA must fetch the original URL. This is intended to avoid the subresource monitoring attack described below.
  • If the original URL’s origin is not the same as the signed origin of the main SXG (publisher.example), the UA must fetch the original URL. This restriction is intended to avoid providing a way of tracking.
  • UAs should handle the preload link header in the signed response header in the same way. (e.g. link: <https://example.com/framework.js>;rel="preload";as="script")
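
Below is a minimal sketch, in JavaScript, of the substitution check that the bullets above describe. The data structures and function name (altMap, allowedSet, chooseSubresourceUrl) are hypothetical illustrations of the two proposed fields, not a specified algorithm:

// altMap: the unsigned Alternative-Signed-Exchange-Subresources map
//   (original subresource URL -> alternative SXG URL, distributor-controlled).
// allowedSet: the signed Allowed-Alternative-Signed-Exchange-Subresources list
//   (original subresource URLs the publisher permits to be substituted).
// mainSxgOrigin: the signed (logical) origin of the main SXG, e.g. "https://publisher.example".
function chooseSubresourceUrl(originalUrl, altMap, allowedSet, mainSxgOrigin) {
  // Only substitute URLs the publisher explicitly allowed (signed field).
  if (!allowedSet.has(originalUrl)) return originalUrl;
  // Only substitute same-origin subresources, to close the tracking channel described below.
  if (new URL(originalUrl).origin !== mainSxgOrigin) return originalUrl;
  // Otherwise use the distributor-provided alternative SXG, if any.
  return altMap.get(originalUrl) || originalUrl;
}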

Subresource monitoring attack

We need the signed Allowed-Alternative-Signed-Exchange-Subresources list to prevent a subresource monitoring attack like this:

  • A publisher generates an SXG of an HTML page which shows the user's icon using JS: icon.src = USER_ID + '.png';
  • An attacker sets the mapping info like this: { 'example.com/a.png': 'attacker.com/a.png.sxg', 'example.com/b.png': 'attacker.com/b.png.sxg', ....}
  • If the UA fetches the PNG's SXG when the image tag is added, the attacker learns the USER_ID. Even if the UA only uses the prefetched SXGs, an attacking distributor can intentionally delay returning the SXGs one by one and, by monitoring the onload event, see when the load actually finishes, and therefore still learn the USER_ID.

Tracking using subresource SXG

We need to prohibit SXG loading for cross-origin subresources to prevent user tracking like this:

  • A publisher puts one subresource in the Allowed-Alternative-Signed-Exchange-Subresources field (https://tracking.example/id.js).
  • The distributor server can let the publisher’s site learn the user’s ID (ABCD1234) by changing the Alternative-Signed-Exchange-Subresources field:
  • tracking.example/id.js points to tracking.example/ABCD1234.sxg (whose body is `const id='ABCD1234';`)

Tracking is still possible even if we prohibit cross-origin subresources, using the following scheme, but it is more difficult (see the sketch after this list):

  • A publisher puts 30 subresources in the Allowed-Alternative-Signed-Exchange-Subresources field (https://publisher.example/00, 01, ... 29)
  • The publisher prepares 60 files: 00_0.sxg (body is 0), 00_1.sxg (body is 1), 01_0.sxg (body is 0), 01_1.sxg (body is 1)...
  • The distributor server can let the publisher’s site learn the user’s ID, one binary digit per subresource, by changing the Alternative-Signed-Exchange-Subresources field.
    • publisher.example/00 points to 00_0.sxg or 00_1.sxg
    • publisher.example/01 points to 01_0.sxg or 01_1.sxg
    • ....
  • This scheme can distinguish 2^30 users.
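
A minimal sketch of how a distributor could encode a user ID this way. The function name and URL patterns are hypothetical, chosen only to illustrate the one-bit-per-subresource channel:

// Encode a 30-bit user ID into the (unsigned) alternative-subresource map by
// picking the *_0.sxg or *_1.sxg variant for each allowed subresource URL.
function buildAltMapForUser(userId) {
  const altMap = new Map();
  for (let bit = 0; bit < 30; bit++) {
    const index = String(bit).padStart(2, '0');  // "00" .. "29"
    const value = (userId >> bit) & 1;           // 0 or 1
    altMap.set(`https://publisher.example/${index}`,
               `https://distributor.example/${index}_${value}.sxg`);
  }
  return altMap;
}
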
@sleevi

sleevi commented Dec 7, 2018

The security implications of this are non-obvious, and may benefit from being fleshed out more. In particular, I always get uncomfortable when seeing new unsigned fields - they make it dangerous for parsers and give attackers a lot of opportunities, some non-obvious.

My first thought was about substitution attacks - what prevents an attacker from modifying the URL for framework.js.sxg to point to other-different-payload.js.sxg. I'm assuming here that the answer is the SXG is strongly-bound to the request URL, and thus changing it to other-different-payload.js.sxg will cause it to fail when it attempts to match the request to the SXG - is that correct?

The next question is what the implications are of a version substitution of the SXG subresource. In this case, framework.js.sxg (v1) is served instead of the intended framework.js.sxg (v2). The best I can tell is the intent is to address that through the SXG signature expiration (that is, stop signing v1). This would be a 'new' problem, in as much as SXG-subresource-fetches and SXG-caching-inner-resources are not well-defined (or implemented) enough to be a thing that folks would rely on, even though they would also introduce these problems.

The final question is what the implications are of creating a bidirectional communication path. As you note, by virtue of the Allowed-Alternative-Signed-Exchange-Subresources, this potentially creates a communication channel between the distributor and the publisher when loading the SXG. The privacy implications are profound (as noted), but so are the security implications of what happens if sites begin to rely on this communication channel, given that it is, by nature, unauthenticated (i.e. publisher.example can't tell whether it was distributor.example setting those bits or whether it was evil-distributor.example setting those bits).

@horo-t
Collaborator Author

horo-t commented Dec 11, 2018

My first thought was about substitution attacks - what prevents an attacker from modifying the URL for framework.js.sxg to point to other-different-payload.js.sxg. I'm assuming here that the answer is the SXG is strongly-bound to the request URL, and thus changing it to other-different-payload.js.sxg will cause it to fail when it attempts to match the request to the SXG - is that correct?

Yes. UAs must check the subresource SXG's inner URL.

The next question is what the implications are of a version substitution of the SXG subresource. In this case, framework.js.sxg (v1) is served instead of the intended framework.js.sxg (v2). The best I can tell is the intent is to address that through the SXG signature expiration (that is, stop signing v1). This would be a 'new' problem, in as much as SXG-subresource-fetches and SXG-caching-inner-resources are not well-defined (or implemented) enough to be a thing that folks would rely on, even though they would also introduce these problems.

Publishers must be careful when adding subresources to the Allowed-Alternative-Signed-Exchange-Subresources field.
If framework.js.sxg (v1) has a security bug and the SXG's signature is still valid, the publisher must change the URL of framework.js.

The final question is what the implications are of creating a bidirectional communication path. As you note, by virtue of the Allowed-Alternative-Signed-Exchange-Subresources, this potentially creates a communication channel between the distributor and the publisher when loading the SXG. The privacy implications are profound (as noted), but so are the security implications of what happens if sites begin to rely on this communication channel, given that it is, by nature, unauthenticated (i.e. publisher.example can't tell whether it was distributor.example setting those bits or whether it was evil-distributor.example setting those bits).

Introducing a new CSP directive sxg-src could be a solution for that.
For example, if the signed response header has Content-Security-Policy: sxg-src https://distributor.example, the sxg on evil-distributor.example should be blocked.

@sleevi

sleevi commented Dec 11, 2018

Introducing a new CSP directive sxg-src could be a solution for that.
For example, if the signed response header has Content-Security-Policy: sxg-src https://distributor.example, the sxg on evil-distributor.example should be blocked.

I don't think this addresses the substance of the concern. I was not attempting to say "We should grant control to page authors", I was trying to frame it as "We need to carefully reason about the security implications of allowing this". The goal of framing it like that is to understand if we're proposing a mechanism that is default-insecure, whether that's desirable in and of itself, and what solutions might exist.

Even more concretely: I'm not convinced we should be displaying the publisher origin if we allow for arbitrary content injection by the distributor, which having such a channel would imply. I think we need to carefully reason about that. This statement is based on seeing the harm come from code-signing systems that allow for (limited) content injection/manipulation - such as Authenticode or macOS Bundles. It's certainly true that such unauthenticated injection allows for things developers perceive as interesting and useful use cases - for example, injecting whether or not a user has opted-in to metrics collection services on the download page for an executable (e.g. Chrome) - but it's also true that such methods have caused a substantial number of security vulnerabilities (e.g. https://docs.microsoft.com/en-us/security-updates/securityadvisories/2014/2915720 )

@horo-t
Collaborator Author

horo-t commented Dec 12, 2018

How about having the signature of each subresource in Allowed-Alternative-Signed-Exchange-Subresources to prevent distributors from injecting arbitrary content?

[
  // URL
  'https://publisher.example/article_1.html',
  // Signature
  'sig1: sig=*...; integrity="digest/mi-sha256";cert-url="https://distributor.example/publisher.example/cert"',
  // [New field] Alternative-Signed-Exchange-Subresources
  [
    [{':url': 'https://publisher.example/framework.js', 'accept': '*/*'},
     'https://distributor.example/publisher.example/framework.js.sxg'],
    [{':url': 'https://publisher.example/article_1.jpg', 'accept': '*/*'},
     'https://distributor.example/publisher.example/article_1.jpg.sxg']
  ],
  // Signed headers (Same as the SXG in Publisher)
  [
    { ':method': 'GET', 'accept': '*/*' },
    {
      ':status': '200',
      // [New field]
      ':allowed-alternative-signed-exchange-subresources':
          '"https://publisher.example/framework.js" '
            '*MEUCIQDX...=* '  // The first signature of framework.js.sxg
            '*MEQCIGjZ...=*,'  // The second signature of framework.js.sxg
          '"https://publisher.example/article_1.jpg" '
            '*lGZVaJJM...=* '  // The first signature of article_1.jpg.sxg
            '*MEYCIQCN...=*',  // The second signature of article_1.jpg.sxg
      'content-encoding': 'mi-sha256-03',
      'content-type': 'text/html; charset=utf-8',
      'digest': 'mi-sha256-03=....'
    },
  ],
  // Payload body (Same as the SXG in Publisher)
  '<html><body>...'
]

@sleevi

sleevi commented Dec 12, 2018

As I mentioned previously, it's probably more useful to analyze the problem before we try to step forward to solve the problem. The latest proposed approach has, for example, the same deficiency w/r/t user tracking - if you treat framework.js as 'bit 0', article_1.jpg as 'bit 1', etc, then the existence of the two signatures lets you smuggle a bit at a time from the distributor by allowing the distributor to select which signature to use, which allows altering the content (e.g. framework.js having 0 bytes vs 1 byte).

This is why it's helpful to first make sure we've analyzed the problem, clearly stated it, and made sure it's not, in fact, a pre-existing problem, so then we can look at solution spaces or make informed tradeoff decisions.

@jyasskin
Member

jyasskin commented Dec 12, 2018

IIUC, the core goal here is that if publisher.example has signed all of article1.html, framework.v2.js, article1.400x300.jpg, and article1.1600x1200.jpg, we'd like searchengine.example to be able to prefetch the appropriate subset of those for their user to be able to view the article without any new fetches to publisher.example. (See Privacy-Preserving Prefetch.)

Bundles solve this, but they require searchengine.example to subset the bundle to omit whichever of article1.400x300.jpg or article1.1600x1200.jpg is the wrong size for the user. This ability to subset gives, I think, the same communication abilities @sleevi's worried about here, but Bundles do give us a straightforward way to prevent version skew. The Bundles implementation is also farther off than the point at which @horo-t et al. think they could implement this extension to the SXG format.

We've talked at times about adding a way to specify external dependencies for bundles. The Allowed-Alternative-Signed-Exchange-Subresources list is similar to what we'd need for that.

The Alternative-Signed-Exchange-Subresources map, fundamentally, tells the browser "for this dependency that the SXG or bundle said you need, you can fetch it from this URL." I don't particularly like the idea of requiring the distributor to modify the SXG file itself in order to communicate that. Would a response header work? e.g.

Content-Location: https://distributor.example/article1.html.sxg
Link: <https://distributor.example/framework.v2.js.sxg>; anchor="https://publisher.example/framework.v2.js"; rel=alternate_tbd

@sleevi

sleevi commented Dec 12, 2018

@jyasskin Your mention of bundles made me realize that there may be another implication of this - cache probing. That is, if the distributor can modify the URL used to fetch subresources, can it infer or learn what sub-resources the user may already have (cached or loaded) by seeing which requests are not made?

That is, if I fetch article1.html.sxg from distributor.example, and it refers to publisher.example/resources/a.jpg, distributor.example does not learn whether or not the resource was cached or loaded - because the user contacts publisher.example to fetch that resource. In the Bundles case, if the user issues a range request (for the bundle), then distributor.example can learn which resources are needed. The same would apply with this sort of modification - whether or not distributor.example/sxgs/publisher.example/resources/a.jpg.sxg was fetched reveals whether or not the user needed publisher.example/resources/a.jpg. This is similar to the privacy implications mentioned by @horo-t with regards to user IDs, but isn't mitigated by the Allowed-Alternative... solution.

@jyasskin
Member

In what ways is the recommended value for Allowed-Alternative-Signed-Exchange-Subresources different from the value we'd recommend for a Link: <>; rel=preload header in the signed response? Could we just have the browser use the signed preloads for this purpose?

@jyasskin
Member

jyasskin commented Dec 12, 2018

@sleevi Cute. I agree cache probing is a risk, but I think the UA can solve it the same way we solve other cache-based tracking attempts: we fetch the resource redundantly if the server we're fetching it from shouldn't know whether it's already cached. Edit: And searchengine.example needs to take that cost into account when deciding whether to offer a SXG for any particular resource.

@sleevi

sleevi commented Dec 12, 2018

@jyasskin Sure, I didn't explicitly come out and say 'double-keyed caching', but I think that's the assumption. But that's something unique, in this case, because it's not double-keying based on the resource's logical origin (publisher.example) but instead based on the physical origin (distributor.example). Introducing that sort of split - where some of the security properties use physical and some logical - would benefit from that sort of analysis about the implications.

@sleevi

sleevi commented Dec 12, 2018

In what ways is the recommended value for Allowed-Alternative-Signed-Exchange-Subresources different from the value we'd recommend for a Link: <>; rel=preload header in the signed response? Could we just have the browser use the signed preloads for this purpose?

Wouldn't it be both signed and unsigned preloads?

That is, https://distributor.example/publisher.example/article_1.html.sxg would Link: <https://distributor.example/framework.v2.js.sxg>; rel=preload when serving the SXG (i.e. the unsigned part), but then within the SXG, you'd do

  // Signed headers (Same as the SXG in Publisher)
  [
    { ':method': 'GET', 'accept': '*/*' },
    {
      ':status': '200',
      ':link': '<https://publisher.example/framework.v2.js>; rel=preload',
      ...
    }
   ...
  ]

Or did I misunderstand the question?

@jyasskin
Member

My Link: <>; rel=preload question was more for @horo-t than @sleevi. 😄 We have to look at the signed one for the same reasons Allowed-... has to be signed in the original post.

I think it's more complex than "double-keying" or even "physical" vs "logical". We need to find a way to describe which entities (origins or organizations) may know which other entities have asked the profile to download a URL. I think each entry in the cache winds up annotated with a list of entities that are allowed to know it's cached, and if you request it but aren't in that list, it gets refetched from the network. ... But I haven't thought that all the way through.

@horo-t
Collaborator Author

horo-t commented Dec 15, 2018

My Link: <>; rel=preload question was more for @horo-t than @sleevi. 😄 We have to look at the signed one for the same reasons Allowed-... has to be signed in the original post.

I think introducing a new Link header instead of the Alternative-Signed-Exchange-Subresources field is an alternative solution. But if we have the Alternative-Signed-Exchange-Subresources field in the SXG, we can easily host the SXG files on plain HTTP servers; the distributor doesn't need to implement logic to set the HTTP header for each SXG file. So I want to introduce the new field in the SXG format.

I think it's more complex than "double-keying" or even "physical" vs "logical". We need to find a way to describe which entities (origins or organizations) know that which other entities have asked the profile to download a URL. I think each entry in the cache winds up annotated with a list of entities that are allowed to know it's cached, and if you request it but aren't in that list, it gets refetched from the network. ... But I haven't thought that all the way through.

To avoid letting the distributor learn whether the publisher's content is in the user's HTTPCache, the UA must fetch https://distributor.example/publisher.example/framework.js.sxg even if https://publisher.example/framework.js is already in the HTTPCache. But if https://distributor.example/publisher.example/framework.js.sxg itself is in the HTTPCache, the UA doesn't need to fetch it again.
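
A minimal sketch of that decision, in JavaScript. The cache-lookup helper (isInHttpCache) is hypothetical and stands in for whatever (possibly partitioned) HTTP cache lookup the UA uses:

// Decide what, if anything, to fetch for a subresource that has an alternative
// SXG on the distributor. Returns the URL to request, or null if the cached
// alternative SXG can be reused.
function subresourceFetchPlan(altSxgUrl, isInHttpCache) {
  // Deliberately never consult the cache entry for the publisher's original URL
  // here: skipping the distributor fetch because the publisher's resource is
  // cached would reveal that cache state to the distributor.
  if (isInHttpCache(altSxgUrl)) {
    return null;       // the distributor's SXG is already cached; reuse it
  }
  return altSxgUrl;    // otherwise fetch the alternative SXG from the distributor
}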

@horo-t
Collaborator Author

horo-t commented Dec 15, 2018

This is why it's helpful to first make sure we've analyzed the problem, clearly stated it, and made sure it's not, in fact, a pre-existing problem, so then we can look at solution spaces or make informed tradeoff decisions.

I think the problem with this idea is that the user can't notice the channel that the distributor can use to send arbitrary information to the publisher. Is my understanding correct?

@sleevi

sleevi commented Dec 15, 2018 via email

@horo-t
Collaborator Author

horo-t commented Jan 7, 2019

Thank you for the detailed framing of the problem.

I think if the allowed-alternative-signed-exchange-subresources field must include the subresource SXGs' signatures (#347 (comment)), we can solve the security issue.

One possible solution for the privacy issue is this: privacy-conscious browsers can delay subresource SXG loading until all of the subresource SXGs are successfully verified. If one of the SXGs has an error, the browser must fetch the original publisher's URLs. So the distributors can't smuggle bits via the SXGs.

@sleevi

sleevi commented Jan 7, 2019

@horo-t I may be misunderstanding the proposal a bit, so I thought I'd try to write it out and check if it's what you're proposing:

  • Only declaratively-specified subresources would have this mapping applied, and only for first-order SXGs. That is, those SXGs loaded by JS (e.g. mutating a .src attribute) or those referenced within SXGs (for example, loading a CSS file that then loads dependent resources) won't go through this transformation. This is, AIUI, more restrictive than the generic preload scanner.
  • (Naive algorithm) After the page has fully loaded, and it's determined all URLs that this transformation would apply to, it then attempts to fetch all SXGs. After it has fully downloaded and verified the SXG (the entire resource), it may then either use all of those resources in lieu of the original URLs, or may otherwise restart and begin fetching those other URLs (throwing out all of the SXGs it downloaded)

Is that roughly the proposal? I see lots of edge cases, so I wasn't sure if I was missing something fundamental.

@horo-t
Collaborator Author

horo-t commented Jan 9, 2019

Ah, I forgot to mention the link headers.

My proposal for the privacy issue is (a sketch in code follows the list):

  • Privacy-conscious browsers can use subresource SXGs only when the subresources are listed in the link (rel=preload) header in the signed response headers.
  • While loading the main resource SXG, the browser checks the link header.
  • If there are corresponding SXGs in the Alternative-Signed-Exchange-Subresources map, the browser fetches those SXGs.
  • After all of the SXGs are verified, the browser can load the subresources from them. If there is any error, the browser must fetch the original URLs for all of the subresources.
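
A minimal sketch of that all-or-nothing behavior, in JavaScript. The helpers (fetchAndVerifySxg, and the shapes of signedPreloadUrls and altSxgMap) are hypothetical stand-ins for the steps above:

// Prefetch every preload listed in the signed response headers via its
// alternative SXG; fall back to the original URLs for *all* of them if
// anything fails, so the distributor can't learn which ones would have been used.
async function prefetchSubresourceSxgs(signedPreloadUrls, altSxgMap) {
  const substitutions = signedPreloadUrls
    .filter(url => altSxgMap.has(url))
    .map(url => ({ originalUrl: url, sxgUrl: altSxgMap.get(url) }));
  try {
    const responses = await Promise.all(
      substitutions.map(s => fetchAndVerifySxg(s.sxgUrl, s.originalUrl)));
    return new Map(substitutions.map((s, i) => [s.originalUrl, responses[i]]));
  } catch (e) {
    return new Map();  // any failure: use the original publisher URLs for everything
  }
}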

@horo-t
Collaborator Author

horo-t commented Jan 18, 2019

We (@jyasskin, @sleevi, @kinu, @horo-t) discussed this issue yesterday. Here is a summary.

  • Goal:

    • Only while prefetching a main SXG, before the content document starts to be processed, allow subresource SXGs to be preloaded using existing prefetch+preload mechanisms (e.g. link headers).
    • Allow these subresource preloads to also be served by SXGs from the SXG distributor/physical origin, allowing more efficient loading and without requiring a connection to the inner resource’s logical origin.
  • To limit the complexity:

    • We might restrict the usage of SXG subresources to prefetches only. There is no plan to support SXG subresources which were NOT prefetched.
    • We might require the main SXG and the subresource SXGs to be served from the same host.

Possible attacks:

  1. The Source (who has a link to the SXG) or Distributor sends a tracking ID to the Publisher
    1.1. In query parameters or fragment
    1.2. In the set of prefetched resources
    1.3. In the content of prefetched resources
    1.4. In the user history (referrer)
  2. If the subresource request from the SXG is observable by the Distributor:
    2.1. The Publisher can send arbitrary information to the Distributor
    2.2. Accidental information leak may occur.
  3. Version skew attack. An evil Distributor can serve an old version of the JS which contains a bug.
  • Attack 1.1 is already possible without SXG.
  • If SXG subresources must be declared as prefetchable and all must be prefetched for any of the prefetches to apply:
    • The Distributor can send only 1 bit (succeeded or failed) to the Publisher using the set of prefetched resources. (attack 1.2)
    • This requirement also prevents the attack 2.1 and 2.2.
  • If the main SXG must have the subresource SXGs' signatures in the signed field:
    • Distributors can’t send a tracking ID in the content of prefetched resources. (attack 1.3)
    • This also prevents version skew attack (attack 3).
    • This requirement may make the packaging tool complex.
    • We might need to think about WebFonts case.
  • Publishers can know the source page URL which has a link to the SXG using document.referrer. (attack 1.4)
    • The source page can send a tracking ID using the page URL.
    • This is the status quo.
    • document.referrer in SXG is not supported yet in Chromium (https://crbug.com/920905).

@yoavweiss
Collaborator

If attack 1 is already possible with or without SXG (using either 1.1 or 1.4), why is it important to block 1.2?

Requiring subresource signatures makes sense from a security perspective (to prevent content injection). Blocking 1.3 is a nice side-effect of that.

It's also not immediately clear to me how limiting this to prefetches reduces the complexity or increases privacy. Can you elaborate on that?

@jyasskin
Member

jyasskin commented Jan 31, 2019

Limiting it to prefetches prevents attack #2. Unless you're thinking of a third kind of fetch besides prefetches and post-load fetches?

@RByers, do you have a feeling for which attacks we can exclude from the threat model because they're possible today?

@yoavweiss
Collaborator

Isn't attack #2 readily available to any page with network access? e.g. can't they send a 1x1 pixel image request to distributor.com/tracking with request parameters to leak whatever information they so choose?

@sleevi

sleevi commented Feb 3, 2019

@yoavweiss No.

When prefetching, the author has to declaratively commit to what to disclose, rather than being able to leak traffic from the current origin. Note the caveats on #347 (comment) as well.

I think an important gap in that comparison is that this is not loading distributor.com/tracking, but allowing any intermediary to insert and/or observe traffic in the session. While it's true that it's "with the consent" of the origin (by virtue of saying how they can collaborate), it's functionally indistinguishable from mixed content. That is, despite the HTTPS page 'wanting' to load the HTTP page, it's not in the user's security or privacy interests to do so. Similarly, unlike an explicitly keyed load of distributor.com/example (which you can do prior to signing the SXG, if you explicitly indicate to load that SXG), this would allow attackers full mutability of where that content is loaded from. That is, they're not just causing it to load from distributor.com/tracking but {insert distributor here}, which as the analysis above explains, turns into a full primitive for insecure side-channels and injection unless limited to prefetch.

Hopefully, that explains why it's not at all comparable.

@horo-t horo-t changed the title Two new fields in SXG format to support subresource loading Extend link HTTP header to support subresource signed exchange loading Feb 5, 2019
@horo-t
Collaborator Author

horo-t commented Feb 5, 2019

Instead of adding two new dedicated fields (Allowed-Alternative-Signed-Exchange-Subresources, Alternative-Signed-Exchange-Subresources) to the application/signed-exchange format, extending the link header sounds reasonable.

For example:

In unsigned HTTP response from distributor.example:

content-type: application/signed-exchange
link: <https://distributor.example/publisher.example/script.js.sxg>;rel="alternate";type="application/signed-exchange";anchor="https://publisher.example/script.js";

In signed response header of SXG:

link: <https://publisher.example/script.js>;rel="allowed-alt-sxg";sig="MEUCIA..."
link: <https://publisher.example/script.js>;rel="preload";as="script"

(Sorry for contradicting my previous comment)
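
A minimal sketch, in JavaScript, of how a UA could combine these three link relations to decide which subresources to substitute. The parsed-header shapes and the function name are hypothetical:

// outerAlternates: rel="alternate" SXG links from the unsigned response,
//   each { href, anchor } (anchor = original subresource URL).
// signedAllowed:   rel="allowed-alt-sxg" URLs from the signed response headers.
// signedPreloads:  rel="preload" URLs from the signed response headers.
function buildSubstitutionMap(outerAlternates, signedAllowed, signedPreloads) {
  const allowed = new Set(signedAllowed);
  const map = new Map();
  for (const url of signedPreloads) {
    if (!allowed.has(url)) continue;                  // publisher didn't allow substitution
    const alt = outerAlternates.find(l => l.anchor === url);
    if (alt) map.set(url, alt.href);                  // preload via the distributor's SXG
  }
  return map;
}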

@jyasskin
Member

jyasskin commented Feb 7, 2019

I looked through http://microformats.org/wiki/existing-rel-values and https://www.iana.org/assignments/link-relations/link-relations.xhtml but didn't see anything that seems to serve the purpose of rel="allowed-alt-sxg", so we're free to invent our own name.

We should think about whether to include the format version number in the type="application/signed-exchange" bit. I suspect we should require that version number since the distributor has already received an Accept header saying which version(s) the client supports.

Should we make allowed-alt-sxg a separate Link or an extra parameter to the preload Link? We have lots of precedent for adding parameters to preload, but this one's a bit weird because we don't want it to affect preloads retrieved directly from the publisher.

@yoavweiss
Collaborator

OK, that makes that clearer. Thanks!

@horo-t
Collaborator Author

horo-t commented Feb 25, 2019

We should think about whether to include the format version number in the type="application/signed-exchange" bit. I suspect we should require that version number since the distributor has already received an Accept header saying which version(s) the client supports.

Having the format version number in the type="application/signed-exchange" bit sounds good to me.

Should we make allowed-alt-sxg a separate Link or an extra parameter to the preload Link? We have lots of precedent for adding parameters to preload, but this one's a bit weird because we don't want it to affect preloads retrieved directly from the publisher.

I think we should have a separate allowed-alt-sxg Link. If we put the signature param in the preload Link, it will be complicated to selectively preload images using imagesrcset and imagesizes.

Example:
In unsigned HTTP response from distributor.example:

content-type: application/signed-exchange
link: <https://distributor.example/publisher.example/wide.jpg.sxg>;rel="alternate";type="application/signed-exchange;v=XX";anchor="https://publisher.example/wide.jpg";
link: <https://distributor.example/publisher.example/narrow.jpg.sxg>;rel="alternate";type="application/signed-exchange;v=XX";anchor="https://publisher.example/narrow.jpg";

In signed response header of SXG:

link: <https://publisher.example/wide.jpg>;rel="allowed-alt-sxg";sig="MEUCIB..."
link: <https://publisher.example/narrow.jpg>;rel="allowed-alt-sxg";sig="MEUCIC..."
link: <https://publisher.example/wide.jpg>;rel=preload; as=image;imagesrcset="https://publisher.example/wide.jpg 640w, https://publisher.example/narrow.jpg 320w";imagesizes="(min-width: 400px) 50vw, 100vw"

@horo-t
Collaborator Author

horo-t commented Feb 25, 2019

Filed a crbug: https://crbug.com/935267

@horo-t
Collaborator Author

horo-t commented Feb 27, 2019

I wrote that 'allowed-alternative-signed-exchange-subresources' ('allowed-alt-sxg' in the current idea) should have the signatures of subresources.
But the signature is valid only for 7 days.

Instead of the signature, I want to use the SHA-256 hash of the headerBytes byte sequence, which includes the digest header, the content-type header, and other arbitrary headers.
Note that we can't use the digest header alone for the integrity check. If we did, subresource SXGs could be used for user tracking by adding arbitrary headers or changing the content-type to cause an image load failure.

The signed response header of main SXG will be like this:

link: <https://publisher.example/wide.jpg>;rel="allowed-alt-sxg";header-integrity="sha256-h0KP..."
link: <https://publisher.example/narrow.jpg>;rel="allowed-alt-sxg";header-integrity="sha256-AmOC..."
link: <https://publisher.example/wide.jpg>;rel="preload";as=image;imagesrcset="https://publisher.example/wide.jpg 640w, https://publisher.example/narrow.jpg 320w";imagesizes="(min-width: 400px) 50vw, 100vw"
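
A minimal sketch of how such a header-integrity value could be computed, assuming (per the description above) it is the base64-encoded SHA-256 hash of the subresource SXG's headerBytes. This is a Node.js illustration, not part of the proposal:

const crypto = require('crypto');

// headerBytes: Buffer containing the subresource SXG's response-header byte
// sequence (including the digest, content-type, and any other headers).
function headerIntegrity(headerBytes) {
  const hash = crypto.createHash('sha256').update(headerBytes).digest('base64');
  return `sha256-${hash}`;  // e.g. "sha256-h0KP..."
}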

@mfalken

mfalken commented Mar 14, 2019

Instead of fetching the original URL in publisher.example, UA should fetch the SXG files in distributor.example after checking the

How does this interact with service workers? Would the request go to publisher.example's service worker first?

@horo-t
Collaborator Author

horo-t commented Mar 14, 2019

How about introducing a new method "getPreloadedResponses()" on FetchEvent?
Service workers could then get the prefetched subresources which were preloaded while prefetching the main resource in the previous page.

interface FetchEvent : ExtendableEvent {
  ...
  Promise<FrozenArray<Response>> getPreloadedResponses();
};
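
A minimal sketch of how a publisher's service worker might use such a method. getPreloadedResponses() is the hypothetical API proposed above, not something that exists today:

// In the publisher's service worker.
self.addEventListener('fetch', event => {
  event.respondWith((async () => {
    // Hypothetical: responses that were preloaded (via subresource SXGs)
    // while the main resource was being prefetched on the previous page.
    const preloaded = await event.getPreloadedResponses();
    const match = preloaded.find(r => r.url === event.request.url);
    return match || fetch(event.request);
  })());
});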

@kinu
Collaborator

kinu commented Mar 14, 2019

It looks like we should probably clarify where the URL replacement happens in the Fetch process model. Is the assumption that the replacement layer sits between the page and the SW?

@mfalken

mfalken commented Mar 14, 2019

Yes, this is my point of confusion... it would be useful to see the sequence of which service workers get consulted when, and when the replacement happens, for the main resource and the subresources.

@jyasskin
Member

@mattto To preserve privacy during the prefetch, the publisher's SW MUST NOT get an event saying which subresources are getting prefetched, even transitively.

We do need to specify that ... but doing so will be difficult until there's a specification of how <link rel="prefetch"> interacts with Service Workers at all.

My guess is that whatever bit of the browser is scanning a prefetched resource for preloads to prefetch recursively (whew) needs to maintain the mapping of available alternate SXGs, and replace the URLs before it invokes Fetch. @horo-t / @kinu, does that make sense?

@horo-t
Collaborator Author

horo-t commented Mar 18, 2019

My current idea of Service Worker and subresource SXG prefetching integration is like this:

  1. The user opens "https://aggregator.example/index.html".
  2. When the UA processes <link rel="prefetch" href="https://distributor.example/publisher/article.sxg">:
    • Invoke the FetchEvent of aggregator's SW with "article.sxg" request.
    • If the SW didn't call respondWith(), perform a HTTP-network-or-cache fetch. The SXG is stored to the HTTPCache.
  3. The response has the following headers:
    • In unsigned outer HTTP response:
      • content-type: application/signed-exchange
      • link: <https://distributor.example/publisher/script.js.sxg>;rel="alternate";type="application/signed-exchange[;v=...]";anchor="https://publisher.example/script.js";
    • In signed inner response header:
      • link: <https://publisher.example/script.js>;rel="allowed-alt-sxg";header-integrity="sha256-MEUCIA..."
      • link: <https://publisher.example/script.js>;rel="preload";as="script"
  4. The UA processes the headers, and starts prefetching "https://distributor.example/publisher/script.js.sxg".
    • Invoke the FetchEvent of aggregator's SW with "script.js.sxg" request.
    • If the SW didn't call respondWith(), perform a HTTP-network-or-cache fetch. The SXG is stored to the HTTPCache.
  5. The user clicks the link of https://distributor.example/publisher/article.sxg
    • Invoke the FetchEvent of distributor's SW with "article.sxg" request.
    • If the SW didn't call respondWith(), perform a HTTP-network-or-cache fetch. The SXG is served from the HTTPCache.
  6. The UA processes the SXG response as if it is a 303 redirect to "https://publisher.example/article.html" and sets the request’s "stashed exchange" to the parsedExchange.
  7. The UA processes the link header.
    • Invoke the FetchEvent of publisher's SW with the "script.js" request. The SW can return the response which has been retrieved using getPreloadedResponses() at step 6.
    • If the SW didn't call respondWith(), check for the prefetched "script.js.sxg" in the HTTPCache and return its inner response if it exists. Otherwise perform an HTTP-network-or-cache fetch.

@jyasskin
Member

I like that overall sketch.

"The SXG is stored to the HTTPCache." is ambiguous here, since we're designing for a multi-key'ed HTTP cache. We'll wind up using terminology from w3c/resource-hints#82, but I think the goal is to put:

  1. https://distributor.example/publisher/article.sxg in the new prefetch cache and
  2. The inner resource from https://distributor.example/publisher/script.js.sxg in a cache that's promoted to the https://publisher.example origin's partition of the HTTP cache only if the navigation is to https://publisher.example/article.html.

This promotion to the HTTP cache reminds me of things @sleevi has been nervous of, and I don't understand his concerns well enough to know if they're assuaged by this happening only on navigation to the controlling top-level document.


Separately, I haven't thought through whether we need the FetchEvent.getPreloadedResponses() method, and I suspect you should propose it separately from this SXG proposal. It probably makes sense, or doesn't, for all recursive prefetches, so should go to https://github.com/w3c/resource-hints?

@kinu
Collaborator

kinu commented Mar 19, 2019

@jyasskin @sleevi Reg: the inner resource and the HTTP cache - if we're feeling ready to talk about this, I'd prefer we discuss the generic case first, possibly in a separate issue, before talking about this specific case. Could we?

@horo-t Reg: FetchEvent.getPreloadResponses(): why don't we just let FetchEvent.preloadResponse expose the preload-to-prefetched resource for the particular fetch (e.g. for "script.js")? It looks like the UA needs to track the relationship until step 7 anyway; I wasn't sure why returning an array in the navigation request is better. Either way I agree with @jyasskin that proposing this separately might be good; I think a similar idea has been discussed somewhere else before (e.g. exposing a prefetched response as FetchEvent.preloadResponse).

But I also wondered: if we do start to store the innerResponse in the HTTP cache when navigation happens (something like 2. in #issuecomment-474138609), then getPreloadResponses() might not really be needed?

@mfalken

mfalken commented Mar 19, 2019

Thanks for sketching that out, that's very clear. I'll note that this adds more cases where respondWith(fetch(event.request)) differs from not calling respondWith(). Historically we've tried to keep those equivalent, but maybe we've already lost that guarantee, and it aligns with the main resource SXG. Was there a discussion about the SW interaction described in https://wicg.github.io/webpackage/loading.html#overview?

@horo-t
Collaborator Author

horo-t commented Mar 19, 2019

@kinu
Using FetchEvent.preloadResponse for the preload-to-prefetched resources sounds good to me.
FetchEvent.getPreloadResponses() may be useful when we want to store the prefetched subresources in CacheStorage. But I don't think this is super important.

I commented about using FetchEvent.preloadResponse in SW for prefetched resources at w3c/resource-hints#78 (comment). Let's discuss it there.

@horo-t
Collaborator Author

horo-t commented Mar 19, 2019

@mattto
The SW integration with signed exchange was added to the spec at #281 (comment). If we want calling respondWith(fetch(event.request)) and not calling respondWith() to keep the same behavior, we need to change the spec. I think we should discuss it in a separate issue.

@mfalken

mfalken commented Mar 19, 2019

Thanks, filed #409

@sleevi

sleevi commented Mar 20, 2019

  1. The UA processes the headers, and starts prefetching "https://distributor.example/publisher/script.js.sxg".

    • Invoke the FetchEvent of aggregator's SW with "script.js.sxg" request.
    • If the SW didn't call respondWith(), perform a HTTP-network-or-cache fetch. The SXG is stored to the HTTPCache.

I find this part uncomfortable and hard to reason about. In a 'normal' TLS loading case, my understanding is that aggregator.example would have no knowledge of distributor.example preloading here, so this feels like a new information disclosure vector.

If I understand correctly, but wanting to confirm, we're reasoning that this isn't particularly new information, because aggregator.example can see the headers of the inner SXG, and thus know about the link: ...;rel="preload" content, and thus know what the user will load anyways. Does that sound roughly correct?

I think one area that would need more specificity here is what happens if aggregator.example does trigger a respondWith() call for the fetch to distributor.example/publisher/script.js.sxg.

  1. What if they glue it to a fetch event of 'otherdistributor.example/publisher/script.js.sxg'?
  2. What if they glue it to a fetch event of publisher.example/script.js?
  3. What if they glue it to a synthetic response (e.g. a blob) which has the same header-integrity value as expressed in the rel="allowed-alt-sxg" (which AIUI refers to the hash of the inner content, not the outer content)?

Separate from these concerns, as @jyasskin highlighted, we need to figure out what it means to store in the HTTPCache / serve from the HTTPCache, and how those requests are inserted and matched. If I understood @kinu's comment correctly, it sounds like we're good to defer that?

@horo-t
Collaborator Author

horo-t commented Mar 23, 2019

Humm...
Now I think we should skip service workers for prefetching requests (steps 2 and 4 of #347 (comment)), at least for the MVP (minimum viable product).

Introducing the new prefetch cache sounds good to me. If we can put the prefetched resources (https://distributor.example/publisher/article.sxg and https://distributor.example/publisher/script.js.sxg) and the certificate URL of each SXG into the new prefetch cache, and use the cached resources when navigating from https://aggregator.example/index.html, this mechanism works even when double-keyed caching is enabled. (I'm trying to find a good way to implement this in Chromium.)

I still don't know whether it is ok or not to store the inner resources (https://publisher.example/article.html and https://publisher.example/script.js) into the prefetch cache.
It is good for performance because we can skip the verification process.
@sleevi Do you have any concern about it?

@horo-t
Collaborator Author

horo-t commented Jul 11, 2019

I have written two explainer documents.

  • Signed Exchange subresource substitution
    • This introduces rel="allowed-alt-sxg" link header.
    • By using this header, content publishers can declare that the UA can load the specific subresources from cached signed exchanges which were prefetched in the referrer page.
  • Signed Exchange alternate link
    • This extends the usage of the existing rel="alternate" link header.
    • By using this header, UAs can recursively prefetch appropriate subresource signed exchanges while prefetching the main resource signed exchange.

@WICG WICG deleted a comment from almshibin Sep 27, 2019
horo-t added a commit to horo-t/webpackage that referenced this issue Dec 4, 2019
I uploaded explainer documents of subresource signed exchanges to my
repository (https://github.com/horo-t/subresource-signed-exchange).
But they should be in this webpackage repository.
So this patch copies them from "horo-t/subresource-signed-exchange"
repository.

Spec issue: WICG#347
TAG review: w3ctag/design-reviews#352
horo-t added a commit that referenced this issue Jan 6, 2020
I uploaded explainer documents of subresource signed exchanges to my
repository (https://github.com/horo-t/subresource-signed-exchange).
But they should be in this webpackage repository.
So this patch copies them from "horo-t/subresource-signed-exchange"
repository.

Spec issue: #347
TAG review: w3ctag/design-reviews#352