Add prefetch processing model, including double-key caching privacy protections #4115
source
Outdated
<var>as</var>.</p></li>
<li>If the browser is using both the <var>request</var>'s <span data-x="concept-request-url">URL</span> and the <span data-x="top-level-browsing-context">top-level browsing context</span>'s <span data-x="document">document</span>'s <var data-x="dom-document-origin">origin</span> as cache keys for <var>request</var>, then:
<ol>
<li><p>If <var>request</var>'s <span data-x="concept-request-credentials-mode">credentials mode</span> is "include", then return.</p></li>
I think it should be same-origin, depending on how we define the browsing context of this fetch request.
Can you elaborate on that?
source
Outdated
<ol>
<li><p>Set <var>request</var>'s <span data-x="concept-request-initiator">initiator</span>
to "prefetch".</p></li>
<li><p>Set <var>request</var>'s <span data-x="concept-request-keepalive-flag">keep-alive</span> flag |
I was also wondering whether we should use keep alive or not.
My understanding so far is that keep alive has one context (the initial one) and when it goes away, it has no context.
This implies putting some restrictions on the number of keep alive requests. It also means we do not care about the response when context goes away.
For prefetch, the initial context is the same, but once it gets destroyed through navigation, we might either actually use the prefetch for the navigation task (hence using the response) or cancel it (navigating to some other URL).
This might be a different model with different restrictions.
If we add text like 'the UA may abort the fetch if navigation happens to a different URL', could that work? Reusing the response could happen at the cache level from an implementation point of view, but that part is not really specified, so it might be a bit tricky.
I added a speculative flag at whatwg/fetch#881
Can you take a look?
source
Outdated
attribute.</p></li>
<li><p>Set <var>request</var>'s <span data-x="concept-request-destination">destination</span>
to the result of <span data-x="concept-potential-destination-translate">translating</span>
<var>as</var>.</p></li>
Are we sure 'as' will always be a valid destination? If not valid, are we ending up with the destination equal to the empty string?
I think since 'as' is an enumerated attribute, it can be in a conforming state and a non-conforming state, so we'll probably want a guard around it to make sure we only attempt translations on conforming states; see https://html.spec.whatwg.org/multipage/links.html#link-type-modulepreload:attr-link-as-2
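To make the suggested guard concrete, here is a minimal sketch, not spec text. It assumes that Fetch's "translate" step maps "fetch" to the empty string and passes other destinations through. The `VALID_DESTINATIONS` set and the `translate_as` name are hypothetical, and the set is an illustrative subset rather than the full list.

```python
from typing import Optional

# Assumption: an illustrative subset of Fetch destinations, not the full list.
VALID_DESTINATIONS = {
    "audio", "document", "font", "image",
    "script", "style", "track", "video", "worker",
}


def translate_as(as_value: str) -> Optional[str]:
    """Translate an 'as' value, or return None when it is not a known state.

    Enumerated attributes are matched ASCII case-insensitively, so the
    value is lowercased before the lookup.
    """
    value = as_value.lower()
    if value == "fetch":
        # "fetch" translates to the empty-string destination.
        return ""
    if value in VALID_DESTINATIONS:
        return value
    # Non-conforming value: signal that no translation should be attempted.
    return None
```

With a guard of this shape, an unknown 'as' value never reaches the translation step, so it cannot silently end up as the empty-string destination.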
source
Outdated
<li><p>Set <var>request</var>'s <span data-x="concept-request-destination">destination</span>
to the result of <span data-x="concept-potential-destination-translate">translating</span>
<var>as</var>.</p></li>
<li>If the browser is using both the <var>request</var>'s <span data-x="concept-request-url">URL</span> and the <span data-x="top-level-browsing-context">top-level browsing context</span>'s <span data-x="document">document</span>'s <var data-x="dom-document-origin">origin</span> as cache keys for <var>request</var>, then: |
I believe this is the first introduction of that concept in web specs.
Should it be in fetch spec or somewhere else?
As I understand it, 'cache keys' for a request refers to what Fetch calls the 'HTTP cache', so maybe HTTP should be made more explicit there?
I'm fine with whatever works to define this. I agree that it would be better if "cache key" here referred to something in HTTP or Fetch.
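For readers following along, a toy model of what "using both the URL and the top-level origin as cache keys" could mean. This is only a sketch of the idea under discussion. The `CacheKey` type, the `cache_key` helper and the example origins are all hypothetical, and no browser's HTTP cache is claimed to work exactly this way.

```python
from typing import Dict, NamedTuple, Optional


class CacheKey(NamedTuple):
    # None models a single-keyed cache that ignores the top-level origin.
    top_level_origin: Optional[str]
    url: str


def cache_key(url: str, top_level_origin: str, double_keyed: bool) -> CacheKey:
    """Build the key under which a response would be stored or looked up."""
    return CacheKey(top_level_origin if double_keyed else None, url)


cache: Dict[CacheKey, str] = {}

# A page on a.example prefetches a resource from c.example (double-keyed).
cache[cache_key("https://c.example/x.js", "https://a.example", True)] = "body"

# A later lookup from the same top-level origin hits the entry...
hit_same_site = cache_key("https://c.example/x.js", "https://a.example", True) in cache
# ...but a lookup from a different top-level origin does not.
hit_other_site = cache_key("https://c.example/x.js", "https://b.example", True) in cache
```

In the single-keyed case both lookups would produce the same key, which is the cross-site sharing that double keying is meant to prevent.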
source
Outdated
<li><p>If <var>request</var>'s <span data-x="concept-request-destination">destination</span> is not "document", then return.</p></li>
<li><p>Set <var>request</var>'s <span data-x="concept-request-redirect-mode">redirect mode</span> to "manual".</p></li>
<li><p>Set <var>request</var>'s <span data-x="concept-request-redirect-mode">redirect mode</span> to "manual".</p></li>
<li><p>Set <var>request</var>'s <span data-x="concept-request-service-workers-mode">service workers mode</span> to "none".</p></li>
I wonder if we should just apply this regardless of the cache keys (though that can be discussed separately).
From the discussion on w3c/resource-hints#78 and at TPAC, it certainly seems like we'd need to either move the as=="document" check outside of the double-key case or skip SW for all prefetches (or both!).
source
Outdated
data-x="concept-document-origin">origin</span> as cache keys for <var>request</var>, then:</p>
<ol>
<li><p>If <var>request</var>'s <span
data-x="concept-request-credentials-mode">credentials mode</span> is "include", then return.
Just to clarify, this means that all no-cors prefetches will just return and be ignored (if my understanding is correct).
Yes, that was my original intention, but it's true that aborting it is probably better.
So I think this PR effectively kills all prefetches:
- Both same- and cross-origin in the wild today without a crossorigin attribute, and...
- Both same- and cross-origin in the wild today with crossorigin=use-credentials
Also, it seems to enforce usage of CORS, because the only way to fetch these resources with a non-"include" credentials mode is with the "cors" request mode. I'm wondering if we could get away with only ignoring all prefetches with crossorigin=use-credentials, but just changing the default prefetch credentials mode to "same-origin". The default request mode would still be "no-cors".
When a developer prefetches a cross-origin resource, the uncredentialed response will be in the cache. When the user navigates to the resource, if no cookies accompany the request, it will match. If cookies were sent and the Vary: Cookie header is properly set on the prefetched response, the request will not match. The case where this breaks is when the prefetched response does vary with cookies, but is missing the Vary header.
/cc @yutakahirano
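The Vary: Cookie matching described in the comment above can be sketched roughly as follows. This is a simplified model of HTTP cache matching on Vary, not any browser's implementation. The `vary_matches` helper is hypothetical, and headers are modelled as plain dicts with lowercase names.

```python
from typing import Dict


def vary_matches(
    stored_request: Dict[str, str],
    stored_response: Dict[str, str],
    new_request: Dict[str, str],
) -> bool:
    """Return True if a cached response may be reused for new_request.

    Every header named in the stored response's Vary field must have the
    same value (or be absent) in both the original and the new request.
    """
    names = [
        name.strip().lower()
        for name in stored_response.get("vary", "").split(",")
        if name.strip()
    ]
    if "*" in names:
        # "Vary: *" never matches a later request.
        return False
    return all(stored_request.get(n) == new_request.get(n) for n in names)


prefetch_request: Dict[str, str] = {}  # uncredentialed prefetch: no Cookie header
navigation_with_cookie = {"cookie": "sid=1"}

# Vary: Cookie is set: the credentialed navigation does not match.
correct = vary_matches(prefetch_request, {"vary": "Cookie"}, navigation_with_cookie)
# Vary is missing: the navigation matches even though cookies were sent.
broken = vary_matches(prefetch_request, {}, navigation_with_cookie)
```

The second lookup is the breaking case from the comment: a response that really does vary with cookies but omits the Vary header gets reused for a credentialed navigation.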
21f1cb9 to e4193e2
Looks like the spec moved from underneath this PR. I'll rebase it.
…g double-key caching privacy protections
e4193e2 to 2f70c31
Sorry for the delay here. I think we do want to be a bit more specific (and perhaps also more vague at the same time, since the type of keying you're talking about can differ on a per URL basis).
<p>If the browser is using both the <var>request</var>'s <span
data-x="concept-request-url">URL</span> and the
<span>top-level browsing context</span>'s <span>active document</span>'s <span
data-x="concept-document-origin">origin</span> as cache keys for <var>request</var>, then:</p>
This is a little too vague. Which cache are we talking about?
Also, from your comment it seems this is talking about requestStorageAccess() type of isolation. I suspect we want to land some infrastructure for that first and agree on how it should work generally.
If that were in place, it's not clear to me why we'd modify redirect mode and such. It seems that might affect any cache in place in weird ways.
It seems this would need to be removed, right? Is the idea with prefetch still that subresources are also fetched (i.e., it's "prenavigate")? Wouldn't we have to create some kind of fake browsing context in that case? There are also a number of XS-Leaks implications with this feature. Some of that seems to be already under consideration from the discussion I read, but it might be good to spell it out more clearly in a note or some such.
I think prefetching is not meant to be prenavigate. However, I am concerned with its potential as a cross-site tracking tool.
If prefetch loads are done with credentials, they create a cross-site tracking vector pretty directly. If all you get is "load" and "error" events, then you get one bit of information per prefetched resource, so N prefetches could be used to create an N-bit unique user ID. However, the old Resource Hints spec suggests that prefetch can be used in CORS mode with credentials. If that allows the prefetching page to read back the prefetched resource, it creates a direct tracking vector with only one prefetch.
Just by busting cache partitioning, they can also be used to provide a hidden way to transfer state from one page to the next that's not as visible to the UA (unlike data in the URL or the Referer header) by loading a custom per-user resource that the next page can read back.
Note: these comments are based on the Resource Hints draft; I have not read the new PR yet.
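The "one bit per prefetched resource" arithmetic above can be illustrated with a toy model. It only shows how N load/error outcomes add up to an N-bit identifier and does not interact with any browser. The `encode_id` and `decode_id` helpers are hypothetical names.

```python
from typing import List


def encode_id(user_id: int, n_bits: int) -> List[bool]:
    """Server side: decide per resource whether it should load (True) or error."""
    return [bool((user_id >> i) & 1) for i in range(n_bits)]


def decode_id(outcomes: List[bool]) -> int:
    """Page side: rebuild the identifier from observed load/error events."""
    return sum(1 << i for i, loaded in enumerate(outcomes) if loaded)


# Four prefetches carry a 4-bit identifier (here 0b1011).
outcomes = encode_id(0b1011, 4)
recovered = decode_id(outcomes)
```

Each additional prefetch doubles the number of distinguishable users, which is why even a load/error-only signal is a concern when credentials are attached.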
I don't have labeling abilities in this repo but this should probably get some sort of privacy/tracking/fingerprinting related label.
security/privacy
Do you think we should un-combine that into Do you think fingerprinting merits having its own separate label? What about tracking?
@sideshowbarker Thanks. I did not mean to suggest creating more specific labels. I just didn't know offhand what labels existed. If I did have opinions on labels, would the https://github.com/whatwg/meta/ repo be the right place?
Yup
I've noted my concerns with the proposed model at w3c/resource-hints#82 (comment).
This closes w3c/resource-hints#82 in order to:
@annevk @domenic - I handwavily talk about cache keys here. Let me know if that works, and if you want me to add a note/issue about better defining that (and double keying) in the future.
I think that the second part of w3c/resource-hints#82 would be to define what a "speculative fetch" is (as a Fetch primitive), and say that browsers can choose to never fetch them, and should keep them in a non-partitioned, time-limited cache.
💥 Error: Wattsi server error 💥
PR Preview failed to build. (Last tried on Jan 15, 2021, 7:58 AM UTC).