Extend link HTTP header to support subresource signed exchange loading #347
The security implications of this are non-obvious, and may benefit from being fleshed out more. In particular, I always get uncomfortable when seeing new unsigned fields - they make things dangerous for parsers and give attackers a lot of opportunities, some non-obvious. My first thought was about substitution attacks - what prevents an attacker from modifying the URL for a given subresource?

The next question is what the implications are of a version substitution of the SXG subresource.

The final question is what the implications are of creating a bidirectional communication path. As you note, by virtue of the Allowed-Alternative-Signed-Exchange-Subresources field, this potentially creates a communication channel between the distributor and the publisher when loading the SXG. The privacy implications are profound (as noted), but so are the security implications of what happens if sites begin to rely on this communication channel even though it is, by nature, unauthenticated.
Yes. UAs must check the subresource SXG's inner URL. Publishers must be careful about which subresources they add to the Allowed-Alternative-Signed-Exchange-Subresources list. Introducing a new CSP directive could give page authors control over this.
I don't think this addresses the substance of the concern. I was not attempting to say "We should grant control to page authors"; I was trying to frame it as "We need to carefully reason about the security implications of allowing this". The goal of framing it like that is to understand whether we're proposing a mechanism that is default-insecure, whether that's desirable in and of itself, and what solutions might exist.

Even more concretely: I'm not convinced we should be displaying the publisher origin if we allow for arbitrary content injection by the distributor, which having such a channel would imply. I think we need to carefully reason about that. This statement is based on seeing harm come from code-signing systems that allow for (limited) content injection/manipulation - such as Authenticode or macOS Bundles. It's certainly true that such unauthenticated injection allows for things developers perceive as interesting and useful use cases - for example, injecting whether or not a user has opted in to metrics collection services on the download page for an executable (e.g. Chrome) - but it's also true that such methods have caused a substantial number of security vulnerabilities (e.g. https://docs.microsoft.com/en-us/security-updates/securityadvisories/2014/2915720).
How about having the signatures of each subresource in the Allowed-Alternative-Signed-Exchange-Subresources list?
As I mentioned previously, it's probably more useful to analyze the problem before we try to step forward to solve the problem. The latest proposed approach has, for example, the same deficiency w/r/t user tracking.

This is why it's helpful to first make sure we've analyzed the problem, clearly stated it, and made sure it's not, in fact, a pre-existing problem, so that we can then look at solution spaces or make informed tradeoff decisions.
IIUC, the core goal here is to let the distributor serve a page's subresources as well as the page itself. Bundles solve this, but they require all of the content to be delivered together. We've talked at times about adding a way to specify external dependencies for bundles. For example:

```
Content-Location: https://distributor.example/article1.html.sxg
Link: <https://distributor.example/framework.v2.js.sxg>; anchor="https://publisher.example/framework.v2.js"; rel=alternate_tbd
```
@jyasskin Your mention of bundles made me realize that there may be another implication of this - cache probing. That is, if the distributor can modify the URL used to fetch subresources, can it infer or learn what subresources the user may already have (cached or loaded) by seeing which requests are not made?
In what ways is the recommended value for
@sleevi Cute. I agree cache probing is a risk, but I think the UA can solve it the same way we solve other cache-based tracking attempts: we fetch the resource redundantly if the server we're fetching it from shouldn't know whether it's already cached. Edit: And searchengine.example needs to take that cost into account when deciding whether to offer an SXG for any particular resource.
@jyasskin Sure, I didn't explicitly come out and say 'double-keyed caching', but I think that's the assumption. But that's something unique, in this case, because it's not double-keying based on the resource's logical origin (the publisher's) but on its physical origin (the distributor's).
Wouldn't it be both signed and unsigned preloads?
Or did I misunderstand the question?
I think it's more complex than "double-keying" or even "physical" vs "logical". We need to find a way to describe which entities (origins or organizations) know that which other entities have asked the profile to download a URL. I think each entry in the cache winds up annotated with a list of entities that are allowed to know it's cached, and if you request it but aren't in that list, it gets refetched from the network. ... But I haven't thought that all the way through.
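The annotated-cache idea in the previous comment could be sketched like this. This is a toy model with invented names (`AnnotatedCache`, `store`, `lookup`), not anything from a spec; it only illustrates the "refetch if the requester isn't allowed to know" rule:

```python
# Sketch of the "annotated cache" idea: each cached entry records which
# entities are allowed to know it is cached. A requester outside that set
# triggers a redundant network fetch, so it learns nothing from cache state.
# All names here are illustrative, not from any spec.

class AnnotatedCache:
    def __init__(self):
        self._entries = {}  # url -> set of entities allowed to know

    def store(self, url, known_by):
        self._entries[url] = set(known_by)

    def lookup(self, url, requester):
        """Return 'cache' only if the requester may know the entry exists."""
        allowed = self._entries.get(url)
        if allowed is not None and requester in allowed:
            return "cache"
        # Refetch from the network, then widen the annotation: having just
        # fetched it, the requester is now allowed to know about this copy.
        if allowed is not None:
            allowed.add(requester)
        else:
            self._entries[url] = {requester}
        return "network"
```

Note the second lookup by the same "outside" requester hits the cache, which matches the redundant-fetch idea in the comment above: the cost is paid once per entity that shouldn't know.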
I think introducing a new Link header instead of
To avoid letting the distributor know about the existence of the publisher's content in the user's HTTPCache, the UA must fetch the subresource SXG even when the original resource is already cached.
I think the problem of this idea is that the user can't notice the channel which can be used by the distributor to send arbitrary information to the publisher. Is my understanding correct?
> I think the problem of this idea is that the user can't notice the channel
> which can be used by the distributor to send arbitrary information to the
> publisher. Is my understanding correct?
We can frame the problem like this:
In the existing Web platform, the document always explicitly requests any
dependent information - if it isn’t inline in the document itself, it’s in
the headers or the subresources it loads. In all cases, the displayed page
explicitly makes the requests to get extra information.
This explicitness is good for security. It helps make sure that all the
information in the page can be audited, and you can be sure where and how
information makes its way in. A page author can then use things like CSP
to restrict the inflow and outflow of information even further.
This explicitness is also good for privacy. By making all information flow
outgoing and explicit, users and extensions can inspect, audit, or alter
the outflow of information. This is used heavily by privacy preserving
extensions and users.
The proposed CSP directive is trying to address the security aspect, by
giving page authors a means to control the content. It is unsafe by default
- the SXG packager that didn’t use the CSP could find unexpected content
injected, which is functionally indistinguishable from a MITM. We might say
that requires cleverness by the distributor, but if we’re worried about
security as browser and spec authors, we need to worry about clever
distributors.
The problem with this approach is that it doesn’t address the privacy angle. This
is when both the distributor and publisher are clever and collaborating.
For example, one clever attack around third-party cookie blockers would be
to have the publisher publish a “tracking” SXG that can be hosted by
different distributors. The distributors could inject information in using
this channel - smuggling bits into the SXG. If the SXG has access to
storage or persistence APIs - for example, it can use IndexedDB or service
workers - then it can create a persistent record, associated with the
“publisher’s” origin, that the user visited the “distributor”. This would all be
invisible to the privacy conscious user, who would only see resource loads
from the distributor - no third-party loads or cookies.
The only way a privacy conscious user could regain those privacy properties
would be to either block these subresource loads and substitutions, or
block all SXGs from being loaded by distributors. Both seem like they would
be a significant setback for the utility of this functionality and the
utility of SXG in general, since privacy conscious browsers would likely do
one or both of these things.
I think those two properties - that the page explicitly requests the data
that gets loaded in to itself (the security property) and that users,
extensions, and privacy conscious browsers can then inspect, audit, alter,
or block this data loading (the privacy properties) - are what the existing
system has, and which this might undermine.
If that’s a good framing of the problem, at least, then we may be able to
identify or come up with solutions that can meet both sets of needs.
Thank you for the detailed framing of the problem. I think if the allowed-alternative-signed-exchange-subresources field must have the signatures of the subresource SXGs (#347 (comment)), we can solve the security issue. One possible solution for the privacy issue is like this: privacy conscious browsers can delay the subresource SXG loading until all of the subresource SXGs are successfully verified. If one of the SXGs has an error, the browsers must fetch the original publisher's URLs. So the distributors can't smuggle bits via the SXGs.
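The delayed-verification rule proposed above could be sketched as follows. This is a toy model under stated assumptions: `verify_fn` stands in for real SXG signature verification, and the all-or-nothing fallback is my reading of the proposal:

```python
# Sketch of the proposed rule: delay use of subresource SXGs until every one
# of them verifies; if any fails, fall back to the original publisher URLs
# for all of them, so a distributor cannot smuggle bits via selective errors.

def resolve_subresources(sxg_map, verify_fn):
    """sxg_map: original URL -> distributor SXG URL.
    Returns original URL -> URL the UA should actually fetch."""
    if all(verify_fn(sxg_url) for sxg_url in sxg_map.values()):
        return dict(sxg_map)                 # every SXG verified: use them all
    return {orig: orig for orig in sxg_map}  # any failure: all originals, no mix
```

The point of the all-or-nothing rule is that a distributor cannot encode information by making some SXGs fail and others succeed: the observable outcome collapses to just two states.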
@horo-t I may be misunderstanding the proposal a bit, so I thought I'd try to write it out and check if it's what you're proposing:
Is that roughly the proposal? I see lots of edge cases, so I wasn't sure if I was missing something fundamental.
Ah, I forgot to mention the link headers. My proposal for the privacy issue is:
We (@jyasskin, @sleevi, @kinu, @horo-t) discussed this issue yesterday. This is the summary.
Possible attacks:
If attack 1 is already possible with or without SXG (using either 1.1 or 1.4), why is it important to block 1.2? Requiring subresource signatures makes sense from a security perspective (to prevent content injection), and blocking 1.3 is a nice side-effect of that. It's also not immediately clear to me how limiting this to prefetches reduces the complexity or increases privacy. Can you elaborate on that?
Limiting it to prefetches prevents attack #2. Unless you're thinking of a third kind of fetch besides prefetches and post-load fetches? @RByers, do you have a feeling for which attacks we can exclude from the threat model because they're possible today?
Isn't attack #2 readily available to any page with network access? e.g. can't they send a request for a 1x1 pixel image with request parameters?
@yoavweiss No. When prefetching, the author has to declaratively commit to what to disclose, rather than being able to leak traffic from the current origin. Note the caveats on #347 (comment) as well. I think an important gap in that comparison is that this is not loading content from the page's own origin. Hopefully, that explains why it's not at all comparable.
Instead of adding two new dedicated fields (Allowed-Alternative-Signed-Exchange-Subresources, Alternative-Signed-Exchange-Subresources) in the application/signed-exchange format, extending the link header sounds reasonable. For example, in the unsigned HTTP response from distributor.example, and in the signed response header of the SXG:
(Sorry for contradicting my previous comment)
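The header examples in the comment above were lost in extraction. A hedged reconstruction: the `rel="alternate"` / `anchor=` pattern follows the earlier example in this thread, and `allowed-alt-sxg` is the shorthand used in later comments; the exact syntax here is an assumption, not spec text. First the unsigned response from distributor.example, then the signed SXG response header:

```
Link: <https://distributor.example/script.js.sxg>; rel="alternate"; anchor="https://publisher.example/script.js"

Link: <https://publisher.example/script.js>; rel="allowed-alt-sxg"
```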
I looked through http://microformats.org/wiki/existing-rel-values and https://www.iana.org/assignments/link-relations/link-relations.xhtml but didn't see anything that seems to serve this purpose. We should think about whether to include the format version number. Should we make it optional?
OK, that makes that clearer. Thanks!
Having the format version number in the
I think we should have separate headers for this. Example, in the signed response header of the SXG:
Filed a crbug: https://crbug.com/935267 |
I wrote that 'allowed-alternative-signed-exchange-subresources' ('allowed-alt-sxg' in the current idea) should have the signatures of subresources. Instead of the signature, I want to use a SHA-256 hash. The signed response header of the main SXG will be like this:
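The example following that comment was stripped during extraction. A hedged guess at what it showed, based on the comment's own description (a SHA-256 hash on the `allowed-alt-sxg` entry): the parameter name `header-integrity` and the placeholder hash value are my assumptions, not confirmed by this thread:

```
Link: <https://publisher.example/script.js>; rel="allowed-alt-sxg"; header-integrity="sha256-<base64-of-hash>"
```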
How does this interact with service workers? Would the request go to publisher.example's service worker first? |
How about introducing a new method "getPreloadedResponses()" in FetchEvent?
It looks like we should probably clarify where the URL replacement happens in the Fetch process model. Is the assumption that the replacement layer sits between the page and the SW?
Yes, this is my point of confusion... it would be useful to see the sequence of which service workers get consulted when, and when the replacement happens, for the main resource and the subresources.
@mattto To preserve privacy during the prefetch, the publisher's SW MUST NOT get an event saying which subresources are getting prefetched, even transitively. We do need to specify that ... but doing so will be difficult until there's a specification of how prefetch works.

My guess is that whatever bit of the browser is scanning a prefetched resource for preloads to prefetch recursively (whew) needs to maintain the mapping of available alternate SXGs, and replace the URLs before it invokes Fetch. @horo-t / @kinu, does that make sense?
My current idea of Service Worker and subresource SXG prefetching integration is like this:
I like that overall sketch. "The SXG is stored to the HTTPCache." is ambiguous here, since we're designing for a multi-key'ed HTTP cache. We'll wind up using terminology from w3c/resource-hints#82, but I think the goal is to put:
This promotion to the HTTP cache reminds me of things @sleevi has been nervous about, and I don't understand his concerns well enough to know if they're assuaged by this happening only on navigation to the controlling top-level document. Separately, I haven't thought through whether we need the getPreloadedResponses() method at all.
@jyasskin @sleevi Reg: inner resource and HTTP cache, if we're feeling ready to talk about this, I'd prefer we discuss the generic case first, possibly in a separate issue, before talking about this specific case - could we? @horo-t Reg: FetchEvent.getPreloadedResponses(): why don't we just let FetchEvent.preloadResponse expose the prefetched resource for the particular fetch (e.g. for "script.js")? It looks like the UA needs to track the relationship until step 7 anyway; I wasn't sure why returning an array in the navigation request is better. Either way I agree with @jyasskin that proposing this separately might be good; I think a similar idea has been discussed somewhere else before (e.g. exposing the prefetched response as FetchEvent.preloadResponse). But I also wondered: if we do start to store the innerResponse in the HTTP cache when navigation happens, something like 2. in #issuecomment-474138609, then getPreloadedResponses() might not really be needed?
Thanks for sketching that out, that's very clear. I'll note that this adds more cases to consider.
@kinu I commented about using FetchEvent.preloadResponse in SW for prefetched resources at w3c/resource-hints#78 (comment). Let's discuss it there.
@mattto
Thanks, filed #409 |
I find this part uncomfortable and hard to reason about. In a 'normal' TLS loading case, my understanding is that the cached resource reflects the server it was actually fetched from.

If I understand correctly, but wanting to confirm, we're reasoning that this isn't particularly new information, because the distributor already knows which SXGs it served.

I think one area that would need more specificity here is what happens if validation of one of these SXGs fails.
Separate from these concerns, as @jyasskin highlighted, we need to figure out what it means to store in / serve from the HTTPCache, and how those requests are inserted and matched. If I understood @kinu's comment, it sounds like we're good to defer that?
Hmm... Introducing the new prefetch cache sounds good to me. If we can put the prefetched resources (https://distributor.example/publisher/article.sxg and https://distributor.example/publisher/script.js.sxg) and the certificate URL of each SXG into the new prefetch cache, and use the cached resources when navigating from https://aggregator.example/index.html, this mechanism works even when double-key caching is enabled. (I'm trying to find a good way to implement this in Chromium.) I still don't know whether it is OK to store the inner resources (https://publisher.example/article.html and https://publisher.example/script.js) in the prefetch cache.
I have written two explainer documents.
I uploaded explainer documents of subresource signed exchanges to my repository (https://github.com/horo-t/subresource-signed-exchange). But they should be in this webpackage repository. So this patch copies them from "horo-t/subresource-signed-exchange" repository. Spec issue: WICG#347 TAG review: w3ctag/design-reviews#352
I want to introduce two new fields in the application/signed-exchange format.
Problem
Currently, content publishers can sign their HTML content using their own private keys. User Agents (UAs) can trust the signed content as if it were served from the publisher's origin, even when it is served from a distributor's origin. The signed content can be served from any distributor's origin. But if the publisher wants to serve subresources such as scripts and images from the distributor's origin, the publisher needs to change the subresource URLs in the HTML to point to each distributor's URLs, and needs to sign the content for each distributor. The proposed two fields solve this problem.
Alternative-Signed-Exchange-Subresources map:
A map from the original subresource requests to the SXG URLs. This field is not signed, so the distributor can change it to point to their own URLs.
Allowed-Alternative-Signed-Exchange-Subresources list:
The list of subresource URLs that can be served using SXG instead of fetching the original URL. This field is signed by the publisher, so the distributor can't change it.
Example
Publisher: https://publisher.example/article_1.html
SXG in Publisher: https://publisher.example/article_1.html.sxg
SXG in Distributor: https://distributor.example/article_1.html.sxg
How UAs should work
- While loading the main SXG, the UA reads the Alternative-Signed-Exchange-Subresources field and the Allowed-Alternative-Signed-Exchange-Subresources field.
- If a subresource is not listed in the Allowed-Alternative-Signed-Exchange-Subresources field, the UA must fetch the original URL. This is intended to avoid the subresource monitoring attack.
- This applies to subresources declared with preload link headers (link: <https://example.com/framework.js>;rel="preload";as="script").
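The allow-list check described above can be sketched as follows. This is an illustrative model of my reading of the rule, not spec text; the function name and data shapes are invented:

```python
# Sketch of the UA-side substitution rule: use the distributor's SXG URL for
# a preloaded subresource only when the unsigned Alternative map entry is
# also covered by the publisher-signed Allowed list; otherwise fall back to
# fetching the original URL from the publisher.

def pick_fetch_url(original_url, alternative_map, allowed_list):
    sxg_url = alternative_map.get(original_url)
    if sxg_url is not None and original_url in allowed_list:
        return sxg_url       # allowed: fetch the signed-exchange version
    return original_url      # not allowed or no mapping: fetch the original
```

Because the Allowed list is signed, the distributor cannot widen the set of substitutable URLs; it can only fill in (or omit) mappings for URLs the publisher already committed to.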
Subresource monitoring attack
We need the signed Allowed-Alternative-Signed-Exchange-Subresources field to avoid a subresource monitoring attack like this:
- The page loads a per-user image: icon.src = USER_ID + '.png';
- A malicious distributor declares an unsigned map covering every candidate URL: { 'example.com/a.png': 'attacker.com/a.png.sxg', 'example.com/b.png': 'attacker.com/b.png.sxg', ....}
- By observing which SXG is requested, the distributor learns USER_ID.
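A toy illustration of the monitoring attack, under the assumption that the map is unsigned. URLs and function names are invented for illustration:

```python
# Toy model of the monitoring attack: the page loads USER_ID + '.png', and a
# malicious distributor supplies an unsigned Alternative map covering every
# candidate ID. Watching which SXG URL is requested reveals the user's ID.

def attacker_map(candidate_ids):
    """Unsigned map the distributor would declare for all candidate IDs."""
    return {f"https://example.com/{uid}.png":
            f"https://attacker.com/{uid}.png.sxg" for uid in candidate_ids}

def observed_id(requested_sxg_url, candidate_ids):
    """What the distributor learns from seeing one SXG request."""
    for original, sxg in attacker_map(candidate_ids).items():
        if sxg == requested_sxg_url:
            return original.rsplit("/", 1)[1].removesuffix(".png")
    return None
```

A publisher-signed Allowed list defeats this: the distributor can no longer enumerate arbitrary candidate URLs, only the ones the publisher explicitly committed to.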
Tracking using subresource SXG
We need to prohibit SXG loading for cross-origin subresources to avoid user tracking like this:
- The publisher lists a cross-origin subresource in the Allowed-Alternative-Signed-Exchange-Subresources field (https://tracking.example/id.js).
- The distributor maps it to a per-user SXG in the Alternative-Signed-Exchange-Subresources field.

Tracking is still possible even if we prohibit cross-origin subresources, using the following logic, but this is more difficult:
- The publisher lists many same-origin subresources in the Allowed-Alternative-Signed-Exchange-Subresources field (https://publisher.example/00, 01, ... 29).
- The distributor encodes an identifier by choosing which of them to map in the Alternative-Signed-Exchange-Subresources field.
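The same-origin tracking variant can be sketched as a bit-smuggling channel. This is illustrative only; the URLs follow the /00 ... /29 example from the section above, and the encode/decode helpers are invented:

```python
# Toy model of the harder same-origin tracking variant: the publisher signs
# 30 allowed subresources (/00 .. /29), and the distributor encodes up to 30
# bits of tracking data by choosing WHICH of them get an unsigned
# Alternative map entry. The publisher's page can read the bits back from
# which subresources arrived as SXGs.

URLS = [f"https://publisher.example/{i:02d}" for i in range(30)]

def encode_bits(bits):
    """Distributor side: map URL i to an SXG iff bit i is set."""
    return {URLS[i]: URLS[i] + ".sxg" for i, b in enumerate(bits) if b}

def decode_bits(alternative_map):
    """Publisher side: recover the bits from which subresources were SXGs."""
    return [1 if url in alternative_map else 0 for url in URLS]
```

This is exactly the "smuggling bits into the SXG" channel described earlier in the thread: no cross-origin load or cookie ever appears, yet up to 30 bits flow from distributor to publisher per page load.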