Add section for BFCache eviction in cache clearing #77
Sorry for not mentioning this earlier, but I would feel better if we maintained the abstraction boundary, instead of having this spec poke into HTML's internals.
That is, we should ideally add some operation to HTML like "destroy all bfcached documents for an origin |origin|", that is publicly exported, and this spec can call that.
However, I guess that's only worth doing if this change has cross-browser consensus. Which seems unlikely since it doesn't account for storage partitioning.
index.src.html
Outdated
1. For each |entry| in the |traversable|'s [=session history entries=]:

1. Let |state| be |entry|'s <a>document state</a> whose `origin` attribute is identical
to |host|.
This sentence doesn't make sense to me.
I think you want to look at |entry|'s [=session history entry/document state=]'s [=document state/origin=].
Then, you need to compare it to |origin| (not |host|, I don't think??). But you can't just say identical to; you need to use [=same origin=].
And then the idea is you want to [=continue=] the loop if there's no match. You can't just assume there's always a match.
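To make the suggested shape of the loop concrete, here is a minimal Python sketch, not spec text. `Origin`, `DocumentState`, `SessionHistoryEntry`, `origin_of`, and `evict_bfcached_documents` are hypothetical names used only for illustration. The origin model is simplified to a (scheme, host, port) tuple with default ports filled in, opaque origins and the active entry are ignored, and a document is represented by a plain string. It shows why comparing hosts is not enough: two entries can share a host and still fail the same-origin check, in which case the entry is skipped.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlsplit

# Assumption: only http(s) URLs are modelled in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


@dataclass(frozen=True)
class Origin:
    """Simplified tuple origin: (scheme, host, port)."""
    scheme: str
    host: str
    port: int


def origin_of(url: str) -> Origin:
    """Derive the simplified origin of an http(s) URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return Origin(parts.scheme, parts.hostname, port)


def same_origin(a: Origin, b: Origin) -> bool:
    """Same origin here means scheme, host, and port all match."""
    return a == b


@dataclass
class DocumentState:
    origin: Origin
    document: Optional[str]  # stand-in for a Document; None once destroyed


@dataclass
class SessionHistoryEntry:
    document_state: DocumentState


def evict_bfcached_documents(entries: List[SessionHistoryEntry],
                             origin: Origin) -> int:
    """Destroy the document of every entry that is same origin with `origin`."""
    evicted = 0
    for entry in entries:
        state = entry.document_state
        if not same_origin(state.origin, origin):
            continue  # no match: leave this entry untouched
        if state.document is not None:
            state.document = None  # models destroying the cached document
            evicted += 1
    return evicted
```

With entries for `https://example.com/a` and `http://example.com/b`, evicting for the origin of `https://example.com/` clears only the first one, even though both share the host `example.com`.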
2. Let |cache list| be the set of entries from the <a>network cache</a> whose `target URI`
[=url/host=] is identical to |host|.
I was copying the terms from line 583 above, which is not accurate for this BFCache clearing case. I have updated the content following the suggestions.
Do we need to explicitly [=continue=] the iteration in the spec if we don't have any steps to skip over below?
No need to do so.
index.src.html
Outdated
1. Let |state| be |entry|'s <a>document state</a> whose `origin` attribute is identical
to |host|.

2. Let |document| be |state|'s `document` attribute.
There is no such thing as a "document attribute". I think you want the document state's [=document state/document=].
I'm a bit confused by the autolink; it seems [=document state/document=] couldn't find the right dfn. I added a dfn entry for it.
Yeah, it's not exported; thus my comment about reaching into the internals in this way.
Thanks @domenic for the review.
@fergal suggested just leaving a high-level description of BFCache clearing in this spec, and I thought about putting "BFCache" together with the other types of cache that are only mentioned (like prerender pages and script caches). Do you think that's good enough?
I'm still wondering if we could just have a section in the HTML spec. Even though there is no consensus on the BFCache clearing behavior that this PR adds spec text for, the flushing algorithm itself is standalone, right?
I'm not sure I fully understand the suggestion, but I don't think that's good enough. What you've done here, with a fully specified algorithm, is great. I'm just making a comment about the location of the algorithm.
I don't think we could get consensus to add a privacy-violating algorithm like this one to the HTML spec.
Are you referring to the storage partition side of this? Specifically this method of attack?
Yes.
Can that attack work with BFCache eviction? Unlike CSD for data, BFCache eviction is not persistent. Also, as I understand it, if we have a tree of frames like
and the That does not seem like it can be an effective attack. It's a bit unclear to me that only evicting if it's top-level is actually the right approach (what does Safari do?) but even if we changed to evicting all BFCached pages that had a
Even a single bit seems problematic though.
Can you explain how that 1 bit can be communicated? I can't see it. Using the terminology of privacycg/storage-partitioning#11, when the user goes to site.example, what should it do so that next time the user visits news.example it can receive even 1 bit of info from site.example? If the user goes to news.example, navigates away and navigates back and the page is restored from BFCache, it cannot tell if
I think this makes communication impossible
In the previous attack, the communication was from site.example to news.example. The attack is to pass site.example's global ID for the user to news.example (or any other site that embeds site.example's JS). I think your example is trying to communicate in the other direction. So let's assume that news.example has a boolea
To be specific, let's say it navigates to site.example/nav
Do you mean a link in site.example/nav? If so, these now have an opener relationship and they have no need for CSD as a way of communicating. The interesting case is where the user arrives at news.example entirely independently of having been on site.example. That is what makes the original attack interesting. So from now on, I assume that the arrival on news.example occurs at some random time in the future after being on site.example
The goal is to communicate a bit of information. You need to say what it should do if the
This cannot happen anymore when the navigation is unconnected.
How can site.example determine the value of
It is impossible to reliably determine the value of
Since there is no agreement on the subframe case, can we just spec that if the header is delivered on a top-level frame, it evicts any other top-level frame in BFCache from that origin?
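As a rough illustration of that top-level-only proposal, here is a small Python sketch. It is only a model under stated assumptions, not proposed spec text: `BFCachedPage` and `handle_cache_clearing` are hypothetical names, origins are plain strings, and "evicting" just flips a flag. The point is that a header delivered in a subframe evicts nothing, and a page is evicted only when its own top-level origin matches, not when it merely embeds that origin.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BFCachedPage:
    """Hypothetical model of one top-level page held in BFCache."""
    top_level_origin: str
    subframe_origins: Tuple[str, ...] = ()
    evicted: bool = False


def handle_cache_clearing(pages: List[BFCachedPage],
                          header_origin: str,
                          delivered_on_top_level: bool) -> List[BFCachedPage]:
    """Evict BFCached pages whose top-level origin matches the header's origin.

    Assumption: a header delivered in a subframe is ignored for BFCache
    purposes, since the subframe case has no agreement in this thread.
    """
    if not delivered_on_top_level:
        return []
    evicted = []
    for page in pages:
        # Only the top-level origin is consulted; embedded frames are ignored.
        if page.top_level_origin == header_origin and not page.evicted:
            page.evicted = True
            evicted.append(page)
    return evicted
```

For example, with one cached page whose top level is `https://site.example` and another whose top level is `https://news.example` embedding `https://site.example`, a top-level header from `https://site.example` evicts only the first page.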
@annevk so I'm a bit confused given #73 (comment). What has WebKit shipped? We discussed this with @petervanderbeken, and our initial reaction is that supporting this for top level only (and only in case same storage is used) seems reasonable. (But privacycg/storage-partitioning#11 is still open and that was the reason for https://bugzilla.mozilla.org/show_bug.cgi?id=1671182 )
What do you mean by "and only in case same storage is used"? No storage partitioning? If so, please explain how this could be used to communicate across partitions. As far as I can tell, the argument in privacycg/storage-partitioning#11 requires persistent storage. You cannot simply replace persistent storage with "detect whether BFCaching occurred" as explained above.
I think for my scenario that does not matter. It could have been noopener through a policy. This is why I later on don't use the opener connection to communicate data back as that would indeed make a lot of it unneeded. It does indeed matter that site.example knows the user navigated to news.example. @smaug---- I'll check.
In WebKit only a first-party origin can clear bfcache entries. So for #73 (comment) A would not be cleared. (This makes sense to me as session history is tied to top-level navigable for now as all traversable navigables are top-level navigables.)
@annevk does that mean you support top-level CSD header on example.com causing BFCache eviction of entries with top-level of example.com?
Correct.
As discussed in #73 (comment) , we should add a section in #clear-cache to spec the steps of back/forward cache removal.
@domenic @fergald could you take a look at this? Thanks.