Add an API to opt-out of back-forward cache? #5744

Open
rakina opened this issue Jul 20, 2020 · 39 comments
Labels
addition/proposal New features or enhancements topic: history

Comments

@rakina
Member

rakina commented Jul 20, 2020

Currently, we only have "implicit" ways to opt-out of bfcache that might have other side effects. A few examples from Firefox's bfcache page and problems associated with them:

  • The page uses an unload or beforeunload handler
    • This is not an opt-out signal on Safari and Chrome's back-forward cache. It is very easy to "accidentally" opt out of bfcache with this, as ~66% of pages have an unload handler registered, causing these pages to unintentionally lose potential back-navigation improvements. Registering these handlers only for the sake of opting out of bfcache is also bad because it might slow down navigations (as the browser needs to fire these events), etc.
  • The page sets "cache-control: no-store"
    • This is an opt-out signal on Safari and Chrome's back-forward cache. However, it does far more than opt out of bfcache: it opts the site out of being stored in any cache, which might not be ideal if the site only wants to opt out of bfcache. As with unload and beforeunload, a lot of websites also use this for non-bfcache reasons, and might unknowingly opt out of bfcache.

I think having an explicit API that specifically opts out of bfcache (and does nothing more than that) might be good, since there might be legitimate reasons for not wanting to be bfcached (stale data/state of the previous page, logging in/out, etc.). On top of the above points, web developers have been having a hard time finding a reliable way to do just that. See [1] [2] [3] [4]. Also, as people start adopting the explicit API, we could slowly move away from the implicit opt-outs entirely, so that pages aren't unintentionally opting out of bfcache's potential performance improvements.

Maybe we can have something like history.disableBFCache()? (thanks @annevk for suggesting this!)

This might also be a good starting point for standardizing back-forward cache behavior across browsers. I'm not quite sure how much of the behavior is specced currently, but there certainly are some noticeable behavior differences between Firefox, Safari, and Chrome's implementations of bfcache (even in the opt-out signals, as mentioned above). I hope we can make the behavior more predictable and interoperable in the future :) Update: for other bfcache-related issues, see #5880. This thread will focus specifically on opt-out.

cc @annevk @smaug---- @mystor @cdumez @beidson @hober @altimin @xharaken @fergald

@annevk annevk added addition/proposal New features or enhancements topic: history labels Jul 20, 2020
@smaug----

How well is BFCache term known? I think it has had at least two meanings so far, back-forward-cache and blazingly-fast-cache.
Just wondering about the API name, and whether it should have 'bfcache' in it or should it be something more descriptive.

(It is still rather surprising that one can enable bfcache for pages with *unload event listeners.)

@mystor
Contributor

mystor commented Jul 20, 2020

Given that you're trying to cache pages aggressively, does this mean that the unload event will likely never be fired on any navigation in Chrome anymore? I believe we disable BFCache for registered unload listeners because we're trying to be maximally compatible, and sometimes not firing the event would be surprising behaviour.

If we decide to add an API for disabling the BFCache and ignore unload listeners, it might be worth specifying that we never fire the unload event for a toplevel document unless it has explicitly opted out of the BFCache, so that things which might disable caching today (active webrtc? pop-up?) don't end up causing surprising website breakage in the future.

WRT interoperability, I would be interested in a write-up of the behaviour and edge-cases in Chrome's new BFCache design from a spec & web-compat pov, which we could use as a springboard along with the existing Firefox & Safari BFCache designs to potentially find a more-compatible system.

@rakina
Member Author

rakina commented Jul 21, 2020

How well is BFCache term known? I think it has had at least two meanings so far, back-forward-cache and blazingly-fast-cache.
Just wondering about the API name, and whether it should have 'bfcache' in it or should it be something more descriptive

Oh, I didn't know about other uses of the name before, thanks! I think having a more descriptive name makes sense then, maybe something like history.disableBackForwardCache()?

Also, we might want to think on what "disable back-forward cache" means:

  • Are we opting-out the current page only?
  • Should we allow the next page to also say "if we navigate back, don't try to get the page from back-forward cache"?
  • What about having a way to say "remove all bfcache entries for this site (or maybe origin?)"? @cdumez mentioned that this is desirable in case of log out, etc. This might also be useful for cases where a site knows that old pages will have stale data.

Given that you're trying to cache pages aggressively, does this mean that the unload event will likely never be fired on any navigation in Chrome anymore? I believe we disable BFCache for registered unload listeners because we're trying to be maximally compatible, and sometimes not firing the event would be surprising behaviour.

If we decide to add an API for disabling the BFCache and ignore unload listeners, it might be worth specifying that we never fire the unload event for a toplevel document unless it has explicitly opted out of the BFCache, so that things which might disable caching today (active webrtc? pop-up?) don't end up causing surprising website breakage in the future.

There are some conditions for pages to be eligible to bfcache, but yes, a lot of unload handlers won't get fired. We are providing mitigations to hopefully help with this (including the opt-out), and will reconsider if this proves to be a big problem. This is an interesting problem indeed. I've made another thread to discuss bfcache's interaction with unload & others.

WRT interoperability, I would be interested in a write-up of the behaviour and edge-cases in Chrome's new BFCache design from a spec & web-compat pov, which we could use as a springboard along with the existing Firefox & Safari BFCache designs to potentially find a more-compatible system.

Yes, this will be a great start! @altimin is actually currently working on a document that lays out these behavior differences between Chrome, Firefox, and Safari's bfcache. I think it should be ready sometime this week. We'll update this thread/the other thread with that then.

@annevk
Member

annevk commented Jul 21, 2020

Maybe the API name should be disablePagePersistence() (or equivalent) to reflect the PageTransitionEvent class and its persisted member as that's established precedent for this feature.

@mystor
Contributor

mystor commented Jul 21, 2020

I like the history.disablePagePersistence() name more than disableBackForwardCache or disableBFCache.

Also, we might want to think on what "disable back-forward cache" means:

  • Are we opting-out the current page only?

This was my intuition, and it lines up with Firefox's behaviour with the unload event, but perhaps there's a better option.

  • Should we allow the next page to also say "if we navigate back, don't try to get the page from back-forward cache"?

I don't immediately see how that would be useful, so I'm inclined to say no. Do you have a motivating use-case for that feature?

  • What about having a way to say "remove all bfcache entries for this site (or maybe origin?)"? @cdumez mentioned that this is desirable in case of log out, etc. This might also be useful for cases where a site knows that old pages will have stale data.

This does seem like a potentially useful API to have. Perhaps we could have an options argument like disablePagePersistence({scope: "origin"}); with values like "page", "origin", and "site-origin"?

@cdumez

cdumez commented Jul 21, 2020

If we add an API such as history.disablePagePersistence(), I worry about it getting too widely adopted (abused), thus severely reducing the hit rate of our bfcache. User experience when swiping back and the page is not in the bfcache is bad and we want to avoid that as much as possible. I am also not sure what the true use-case is for history.disablePagePersistence(). It seems like a very heavy hammer and I'd rather we address the use cases we know in a way that does not carry as much risk of crippling our bfcache efficiency.

Safari / WebKit shipped with all pages going into the bfcache no matter what (including cache-control: no-store). The only push back we received was the fact that after you log out of a site, you could still go back and see a page you should no longer be able to see. We agreed that this feedback was valid and our short-term fix was to bypass the bfcache when the page uses cache-control: no-store. Sadly, many sites use this and their intention is likely not to prevent the bfcache. This is not something we like for the long term.

To address the use-case we know about, which is the log-out case, I would prefer something that would clear bfcache entries for the origin. I think I should be able to have a fast back/forward experience on my banking site so disabling the bfcache entirely on such sites would be suboptimal. However, I don't think I should be able to go back (using bfcache) after I log out of my banking site. Sadly, while something like history.clearPagePersistence() would take care of this use-case and would be a bit more targeted than history.disablePagePersistence(), it could still be deployed too aggressively and negatively impact our bfcache. Discussing this internally, we think that a way to further limit this API to reduce the risk of abuse would be to provide an API that clears previous history items (not the current one) for the origin, something like history.clear(). The site could call this after logging out, the user would be on the post-logout page and the previous history items for this site would be gone. As a result, the bfcache entries would be gone too (and in the Safari case, so would the swipe gesture snapshots).

Thoughts?
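
A minimal sketch of the logout flow described above, as it might look on the post-logout page. It assumes the proposed history.clear() exists; no browser implements it, so the call is feature-detected. The helper name clearHistoryAfterLogout is made up for illustration.

```javascript
// Sketch only: history.clear() is the proposal from this thread and is not a
// shipped API, so it has to be feature-detected before use.
// `historyLike` would be `window.history` in a browser.
function clearHistoryAfterLogout(historyLike) {
  if (typeof historyLike.clear === "function") {
    // Proposed behaviour: drop the earlier entries for this origin, which
    // would also drop their bfcache entries. The current entry stays.
    historyLike.clear();
    return "cleared";
  }
  // Without the proposed API the page has to rely on today's mechanisms.
  return "unsupported";
}

// Hypothetical browser usage on the post-logout page:
// clearHistoryAfterLogout(window.history);
```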

@altimin

altimin commented Jul 21, 2020

+1 to Chris's concerns about a potential overuse of any opt-out API that we'll add. There are some tensions here between developer ergonomics and user experience and my main concern here is that by introducing this explicit API we open a way to long-term user experience losses which will increase with time as the adoption of this API increases and will be rather hard to undo.

I really like the history.clear() API proposal — it's very elegant and it seems to solve a few additional problems along the way, like leaking potentially sensitive full URLs from session history in the logout case.

@geoffreygaren

I really like the history.clear() API proposal — it's very elegant and it seems to solve a few additional problems along the way, like leaking potentially sensitive full URLs from session history in the logout case.

Yes, that's exactly our thinking: URL, website title, website snapshot, and any other state that a browser may keep in its presentation of back / forward now or in the future -- those are all things a sensitive logout operation will want to clear, and they aren't obviously just about page persistence and caching per se.

@domenic
Member

domenic commented Jul 21, 2020

Would history.clear() affect browser UI, or no? E.g. would it prevent using the back button, or would it change what is shown in the back-button dropdown? (And I assume it wouldn't affect things like the dedicated history UI or location bar autocomplete...)

To some extent those considerations are outside of the spec, as is all browser UI, but I think it'd be worth making sure people are envisioning the same thing.

@cdumez

cdumez commented Jul 21, 2020

Would history.clear() affect browser UI, or no? E.g. would it prevent using the back button, or would it change what is shown in the back-button dropdown? (And I assume it wouldn't affect things like the dedicated history UI or location bar autocomplete...)

To some extent those considerations are outside of the spec, as is all browser UI, but I think it'd be worth making sure people are envisioning the same thing.

Yes, the idea was that the user would not be able to go back after history.clear() has been called. Therefore, it would make sense to remove the corresponding same-origin (or same site?) entries in the browser UI.

@domenic
Member

domenic commented Jul 21, 2020

Interesting. In that case, I was initially worried about this being used for abuse to trap people on a spam site or similar, but I guess that would not be possible since it is origin-restricted. Cool 👍

@beidson

beidson commented Jul 21, 2020

I was a little confused about the details of the proposal reading it piecemeal in an email thread.
So I opened this page, and would like to put the proposal as I understand it in one place.
I am also adding a new behavior I think is important to prevent the abuse we're concerned about - limiting it to contiguous entries.

New API - history.clear()

  • Clears all contiguous session history entries from the same origin.
  • Does not clear the current entry
  • Does not affect any other part of browser UI with no web standard manifestation, such as global history entries. By changing the back/forward list, it does inherently change the browser's UI manifestation of that list.
  • Does not clear any entries from the same origin that aren't contiguous to the current entry.
    e.g. If the session history is:
    "example.com"
    "webkit.org"
    "chromium.org"
    "example.com/1.html"
    "example.com/2.html" <---- Current
    And script on the current entry calls history.clear(), then only the previous "example.com/1.html" entry is cleared, not the earlier "example.com" entry.

That additional restriction further prevents abuse with regards to a malicious site trying to cover its own tracks from browser activity unrelated to the current interaction with the page.

Have I gotten this right?
(And what do people think about the contiguous restriction?)
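
The contiguity rule in the example above can be modelled as a pure function over a list of URLs. This is only a sketch of the proposal as written in this comment, not an implementation of anything that exists. The https:// schemes are added so that URL parsing works, and the model only looks at entries before the current one, which is an assumption since forward entries were not discussed.

```javascript
// Model of the proposed history.clear(): `entries` is the session history as
// URL strings, `currentIndex` is the position of the current entry.
function clearContiguousSameOrigin(entries, currentIndex) {
  const origin = new URL(entries[currentIndex]).origin;

  // Walk backwards while the preceding entry has the same origin.
  let start = currentIndex;
  while (start > 0 && new URL(entries[start - 1]).origin === origin) {
    start -= 1;
  }

  // Keep everything before the contiguous run, then the current entry and
  // whatever follows it (assumption: forward entries are left alone).
  const kept = [...entries.slice(0, start), ...entries.slice(currentIndex)];
  return { entries: kept, currentIndex: start };
}

const before = [
  "https://example.com/",
  "https://webkit.org/",
  "https://chromium.org/",
  "https://example.com/1.html",
  "https://example.com/2.html", // current
];
const after = clearContiguousSameOrigin(before, 4);
console.log(after.entries);
// Only "https://example.com/1.html" is gone; the first example.com entry
// survives because webkit.org and chromium.org sit in between.
```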

@beidson

beidson commented Jul 21, 2020

Note: Domenic had explicitly asked "(And I assume it wouldn't affect things like the dedicated history UI or location bar autocomplete...)" and then Chris agreed, but also disagreed. So this is clearly still a point of contention.

I'm 100% on the side of "This should not affect global history"

@cdumez

cdumez commented Jul 21, 2020

Sorry if I wasn't clear in my reply to Domenic: I think it should impact the back button in the UI (simple press or long press to see previous history entries). However, the entries should likely stay in the "History" menu of the browser (i.e. Previously visited sites).

Additionally, I think the contiguous restriction that Brady mentioned is good. That is what I had in mind although I had not written it down.

@smaug----

If history.clear() affected session history in other tabs too, that would be worrisome. Unrelated pages from the same origin could end up accidentally breaking each other. And if it doesn't affect other tabs, then how would the log-out case work if some other tab has bfcached pages from the site?

@beidson

beidson commented Jul 21, 2020

If history.clear() affected session history in other tabs too, that would be worrisome.

It would not.

The history object is already well defined to be the session history for the current browsing context. E.g. the current tab.

This operation would and should only clear the contiguous entries from the current browsing context's session history

And if it doesn't affect other tabs, then how would the log-out case work if some other tab has bfcached pages from the site?

The same way they are affected in logout today - not at all!

If I'm logged in to a bank in multiple tabs today... and I log out in one tab, while other tabs are still ACTIVELY VIEWING MY ACCOUNT INFORMATION... I'm not protected, and it has nothing to do with the back-forward cache.

Of course, if a site was security conscious and concerned with this (e.g. a bank), it can coordinate with the other tabs through a few different mechanisms to instruct them to log out and history.clear() themselves.

Or even a server side push to the other tabs, telling them to do the same thing.
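
One possible mechanism for the client-side coordination mentioned above is BroadcastChannel, which lets same-origin tabs message each other. The sketch below is an assumption about how a site might wire that up; the channel name "auth", the message shape and the helper name createLogoutSync are invented for illustration.

```javascript
// `channel` is expected to behave like a BroadcastChannel: it has
// postMessage(data) and addEventListener("message", listener).
// It is passed in so the logic does not depend on a browser being present.
function createLogoutSync(channel, onRemoteLogout) {
  channel.addEventListener("message", (event) => {
    // Another tab of the same origin announced a logout.
    if (event.data && event.data.type === "logout") {
      onRemoteLogout();
    }
  });
  return {
    // Call this in the tab where the user actually logged out.
    announceLogout() {
      channel.postMessage({ type: "logout" });
    },
  };
}

// Hypothetical browser usage:
// const sync = createLogoutSync(new BroadcastChannel("auth"), () => {
//   // Navigate this tab to the logged-out page, then clear its history
//   // if the proposed history.clear() is available.
//   location.replace("/logged-out");
// });
// sync.announceLogout();
```

A BroadcastChannel does not deliver a message back to the object that posted it, so the tab that announces the logout still has to run its own cleanup.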

@smaug----

smaug---- commented Jul 21, 2020

OK, if we don't care about other tabs, but only the current one, it is still very easy to break the user experience by clearing all the session history entries for one site. One site may have totally unrelated pages. Silly case, but it applies to me: I have various tests on my own site, and they have basically nothing to do with each other. But some set of pages might be related. So I'd rather have some kind of history scope on a site, and then history.clear({scope: "my scope ID"}); or some such.
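
To make the scope idea concrete, here is a rough model. How a scope would be defined or attached to an entry is an open question in this thread, so the per-entry scope tag, the function name clearScopedEntries and the rule "remove every earlier entry with the same scope" are all assumptions.

```javascript
// Each entry carries a hypothetical `scope` tag chosen by the site.
// Earlier entries whose scope matches are removed; the current entry and
// anything after it are kept.
function clearScopedEntries(entries, currentIndex, scope) {
  const earlier = entries
    .slice(0, currentIndex)
    .filter((entry) => entry.scope !== scope);
  const rest = entries.slice(currentIndex);
  return { entries: [...earlier, ...rest], currentIndex: earlier.length };
}

const session = [
  { url: "/tests/a.html", scope: "tests" },
  { url: "/account/overview", scope: "account" },
  { url: "/tests/b.html", scope: "tests" },
  { url: "/account/settings", scope: "account" },
  { url: "/account/logged-out", scope: "account" }, // current
];
const result = clearScopedEntries(session, 4, "account");
console.log(result.entries.map((entry) => entry.url));
// The two "tests" pages survive; only earlier "account" pages are dropped.
```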

@beidson

beidson commented Jul 21, 2020

That sounds like a nice idea. How to define it is an interesting problem.

Wondering if we can leverage the already existing state object to fit in well.

@geoffreygaren

I think we should treat specifying a scope in this api as a “nice to have” and not a requirement. Our motivating examples — sensitive websites that need a secure log out feature — do not care to specify a scope.

Requiring that a site communicate a scope across all navigations would also increase the cost of adoption and testing considerably. And there again, our motivating examples need technologies that are easy to adopt and test.

That said, I don’t object to specifying a limiting scope, if we can make it optional and easy.

@geoffreygaren

geoffreygaren commented Jul 22, 2020

I'm 100% on the side of "This should not affect global history"

Can you elaborate on why?

Some reasons that a user agent may want to remove an entry from global history if the entry originated exclusively from the current back forward list and is now gone from the current back forward list:

  1. The entry contains a url and a title, which might leak sensitive information, partially defeating the primary motivation for this feature.

  2. The user agent’s global history may store other sensitive information like a snapshot or text index, also partially defeating the primary motivation for this feature. (At the limit, a user agent might maintain a page cache for all items in global history.)

  3. Consistency.

  4. Navigating to that global history item from a menu or an autocomplete list will probably fail and redirect to a login page.

@beidson

beidson commented Jul 22, 2020

I'm 100% on the side of "This should not affect global history"
Can you elaborate on why?

The motivation for us relaxing our nearly 100% bfcache was "banks don't like that information leaks to a live page after logout"

  1. The entry contains a url and a title, which might leak sensitive information, partially defeating the primary motivation for this feature.

We've never had a bank say "Don't leak our URLs or titles to global history"

  2. The user agent’s global history may store other sensitive information like a snapshot or text index, also partially defeating the primary motivation for this feature. (At the limit, a user agent might maintain a page cache for all items in global history.)

We've never had a bank say "don't do all your nice UA features like snapshots and text", probably because the browsers have already been considerate in what they do or do not do here.

  4. Navigating to that global history item from a menu or an autocomplete list will probably fail and redirect to a login page.

This is a problem much larger than the one we are trying to solve here, and making the global history change will not come remotely close to fixing it.

Back to 3...

  3. Consistency.

I see it as very inconsistent to give a web page control over something that is objectively a web technology - their session history entries that are already mutable in JavaScript via the history object - and also have it control something that is 100% a browser UA feature.

Whereas bfcache is something all the engines have converged on and are working to make more and more effective - and therefore this conversation has come up...

...each UA having their own global history implementation and behavior is a place where they can experiment and differentiate from other UAs, all without the risk of causing web compatibility issues.

Global history already has differences in behavior from session history here that makes implementing what you describe harder than you might imagine.

e.g.
In one tab, visit example.com, webkit.org/1.html, then chromium.org, then webkit.org/1.html, then webkit.org/2.html.
Your history menu will show:
webkit.org/2.html <---most recent
webkit.org/1.html
chromium.org
example.com

Because the webkit.org/1.html entry got promoted in recency, not an additional one added.
If webkit.org/2.html does a history.clear(), what's the correct global history to have?

I would argue that the answer is both "not obvious" and "implementing it might require a non-trivial rethink of how global history works"
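
The promotion behaviour in the example above can be sketched as follows. This is a toy model that assumes global history keeps one entry per URL ordered by most recent visit; real browsers differ in the details, and the function name globalHistory is made up.

```javascript
// Toy model: global history keeps a single entry per URL, most recent first.
function globalHistory(visits) {
  const ordered = [];
  for (const url of visits) {
    const existing = ordered.indexOf(url);
    if (existing !== -1) {
      ordered.splice(existing, 1); // promote the entry instead of duplicating it
    }
    ordered.unshift(url);
  }
  return ordered;
}

const visits = [
  "example.com",
  "webkit.org/1.html",
  "chromium.org",
  "webkit.org/1.html",
  "webkit.org/2.html",
];
console.log(globalHistory(visits));
// → ["webkit.org/2.html", "webkit.org/1.html", "chromium.org", "example.com"]
```

The session history for the same tab has five entries while this list has four, so there is no one-to-one mapping from a cleared session entry to a global history entry.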

Now repeat the above experiment intermixing with different browser tabs. Does one session in Tab A get to clear the history entries of another session in Tab B, even if Tab B doesn't want to clear global history?

I would say no.

What about "Top sites" type features?

If I frequently visit "awesomegame.com/startpage.html" and then "awesomegame.com/endgame.html" clears all previous history... As a user I really want awesomegame.com/startpage.html in my top sites. Should the web site be allowed to prevent that?

These are just a couple scenarios I conjured without much deep thought.

TLDR; Session history is a web technology feature. Global history is a UA feature.

@rakina
Member Author

rakina commented Jul 22, 2020

A few questions on history.clear():

  • This only covers same-origin navigations. Does this mean for cross-origin cases, we think there are no compelling opt-out use cases? I guess most logouts will be same-origin, and shopping carts that get stale etc. only make sense in cross-origin cases too..
  • This will clear the actual session history entries so that the API won't be used lightly, hoping that pages that don't really need to opt out will instead fix their pagehide/pageshow handlers to take care of bfcache-related stuff. I'm worried that this might be too strong to the point of deterring valid cases from using it, though maybe if we do implement scopes it might be ok. (I actually wonder if the concern of people using it too often is strong enough - is a good navigation performance not a good enough incentive for people to migrate?)

On history.clear({scope: "my scope ID"});, maybe we can just have history.setSessionClearStartPoint() and history.clearSessionEntriesFromLastMarkedStartPoint() or something. I guess this might be confusing with multiple history navigations...

@geoffreygaren

  2. The user agent’s global history may store other sensitive information like a snapshot or text index, also partially defeating the primary motivation for this feature. (At the limit, a user agent might maintain a page cache for all items in global history.)

We've never had a bank say "don't do all your nice UA features like snapshots and text", probably because the browsers have already been considerate in what they do or do not do here.

Actually, we have: rdar://problem/64891907.

...each UA having their own global history implementation and behavior is a place where they can experiment and differentiate from other UAs, all without the risk of causing web compatibility issues.

How would clearing global history entries in response to logout cause web compatibility issues? (Seems like the change wouldn't be visible to the website at all.)

e.g.
In one tab, visit example.com, webkit.org/1.html, then chromium.org, then webkit.org/1.html, then webkit.org/2.html.
Your history menu will show:
webkit.org/2.html <---most recent
webkit.org/1.html
chromium.org
example.com

Because the webkit.org/1.html entry got promoted in recency, not an additional one added.
If webkit.org/2.html does a history.clear(), what's the correct global history to have?

My proposal was to remove webkit.org/1.html.

I would argue that the answer is both "not obvious" and "implementing it might require a non-trivial rethink of how global history works"

I tend to agree that history-related user-facing features may differ across browsers and OS's in non-trivial ways that are hard to specify or predict. (Another example of both the need to act on other forms of history and the difficulty of specifying it is macOS's behavior of taking periodic snapshots of whole app windows.)

But difficulty notwithstanding, the fact of the matter is that a UA that keeps a non-trivial record of a sensitive page after logout has created a privacy and security issue for the logged out website, and there will always be some need to resolve that issue.

Perhaps the best compromise here is to specify a "must" behavior for the back-forward list, and a "should" or "should consider" behavior for other kinds of navigation history, to highlight the intention of the spec without getting too into the weeds on specific browser UIs.

TLDR; Session history is a web technology feature. Global history is a UA feature.

Not sure I agree with the distinction being made here.

What matters most is whether a website can implement a secure logout feature -- not how the technology looks from the inside.

@domenic
Member

domenic commented Jul 22, 2020

Perhaps I got things sidetracked with the discussion of browser UI. To be clear, browser UI is out of scope for the spec, and we won't write normative text (even of a "should" variety) for features like the back button or global history UI.

I thought it would be an interesting discussion to help clarify the intention of the API, but in the end browser UI features that are not observable through JavaScript are not in scope for what we specify.

@beidson

beidson commented Jul 22, 2020

I agree with Domenic that we can shelve the discussion about global history and related features and take it internal, as the spec has nothing to say about them.

@beidson

beidson commented Jul 22, 2020

This only covers same-origin navigations. Does this mean for cross-origin cases, we think there are no compelling opt-out use cases? I guess most logouts will be same-origin, and shopping carts that get stale etc. only make sense in cross-origin cases too..

Session history is a sensitive thing, with the potential for pretty powerful abuse. Any modification allowed to it today (e.g. push and replaceState) is only allowed for same origin.
I would be against 3rd parties being allowed to remove session history entries that they weren't allowed to mutate before.

On the shopping cart note: I've never met a site that wanted to clear a shopping cart... ;)

This will clear the actual session history entries so that the API won't be used lightly, hoping that pages that don't really need to opt out will instead fix their pagehide/pageshow handlers to take care of bfcache-related stuff. I'm worried that this might be too strong to the point of deterring valid cases from using it, though maybe if we do implement scopes it might be ok.

Personal take:
The bfcache is important to performance of stateful browsing, and there are ways to interact with it properly, so... We don't want people to use this.

I believe it solves the use case for logging out fully.
If other folks would really like to opt out of the bfcache for some reason, but this is the wrong hammer for them so they refuse to do it... that is perfectly fine with me. I can personally refer them to documentation on pagehide/show if it will help.
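
For reference, a minimal sketch of the pagehide/pageshow approach. The persisted flag on these events is the existing signal for bfcache transitions. The helper name installBfcacheHandlers and the callbacks are illustrative only.

```javascript
// `target` is `window` in a browser. It is a parameter so the logic can be
// exercised anywhere an EventTarget exists.
function installBfcacheHandlers(target, { onFreeze, onRestore }) {
  target.addEventListener("pagehide", (event) => {
    // persisted is true when the page may be kept in the bfcache.
    if (event.persisted) onFreeze();
  });
  target.addEventListener("pageshow", (event) => {
    // persisted is true when the page was restored from the bfcache.
    if (event.persisted) onRestore();
  });
}

// Hypothetical browser usage:
// installBfcacheHandlers(window, {
//   onFreeze: () => { /* close connections, stop timers */ },
//   onRestore: () => { /* refetch anything that may be stale */ },
// });
```

A page that refreshes stale state in onRestore keeps the fast back navigation and does not need to opt out at all.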

@rakina
Member Author

rakina commented Jul 22, 2020

Session history is a sensitive thing, with the potential for pretty powerful abuse. Any modification allowed to it today (e.g. push and replaceState) is only allowed for same origin.
I would be against 3rd parties being allowed to remove session history entries that they weren't allowed to mutate before.

On the shopping cart note: I've never met a site that wanted to clear a shopping cart... ;)

Right, definitely not suggesting cross-origin session history modification - maybe more of a way for a page to say before leaving that it does not want to be cached (or maybe even say "you can't go back to me at all"). But thinking about this again, maybe there are really no big use cases for this, hmm...

@jakub-g

jakub-g commented Nov 18, 2020

To provide some counterpoint to the earlier discussion:

The discussion before focuses on history.clear() JS API; it sounds interesting, but I'm wondering if the logout scenario couldn't be handled by extending Clear-Site-Data header with a new value like "bfcache"? Would it make sense to extend that header for this case?

(edit: or maybe Clear-Site-Data: "cache" already wipes the bfcache?)

Yes, history.clear() is more versatile on one hand and can accommodate for more use cases; but on the other hand, having two ways (one via JS, one via a header) to clear different-but-similar kinds of data would feel weird to me.

@geoffreygaren

To provide some counterpoint to the earlier discussion:

The discussion before focuses on history.clear() JS API; it sounds interesting, but I'm wondering if the logout scenario couldn't be handled by extending Clear-Site-Data header with a new value like "bfcache"? Would it make sense to extend that header for this case?

(edit: or maybe Clear-Site-Data: "cache" already wipes the bfcache?)

Yes, history.clear() is more versatile on one hand and can accommodate for more use cases; but on the other hand, having two ways (one via JS, one via a header) to clear different-but-similar kinds of data would feel weird to me.

Clear-site-data could work here.

One detail: this proposal is about clearing the back forward cache and the back-forward list (which might hold data like a title or a snapshot). So, you’d want a value like ‘bflist’ or ‘history’. The cache comes along for the ride with the list.
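
A sketch of what a logout response might send under this idea. "cache", "cookies" and "storage" are existing Clear-Site-Data directives. "history" is not: it is a placeholder for the value floated here, and the function and option names are made up for illustration.

```javascript
// Builds response headers for a logout endpoint.
function logoutResponseHeaders({ includeProposedHistory = false } = {}) {
  // Existing directives; each one is sent as a quoted string.
  const directives = ['"cache"', '"cookies"', '"storage"'];
  if (includeProposedHistory) {
    // Hypothetical directive from this thread, not recognized by browsers.
    directives.push('"history"');
  }
  return { "Clear-Site-Data": directives.join(", ") };
}

console.log(logoutResponseHeaders());
// → { 'Clear-Site-Data': '"cache", "cookies", "storage"' }
```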

@rakina
Member Author

rakina commented Nov 19, 2020

Modifying Clear-Site-Data sounds good. I think I agree that it's better to add a separate history value or something, which also helps ensure that existing uses don't trigger this :)

One thing to keep in mind is that making this a header instead of a JS API makes it harder to use (per this poll at least), which might be a good thing, since it discourages use unless necessary. Also, it can only be done by the page itself (instead of an imported script, etc.). The use case we're interested in (clearing on logouts) should be fine with this though.

@fergald

fergald commented Apr 1, 2021

history.clear() will change (negatively) users' session history UX on sites that are concerned about BFCache privacy. I don't see a good justification for that.

Clear-Site-Data sounds better to me as it targets only the cache. If header vs JS API is a real concern then that is also a concern for the existing modes of Clear-Site-Data and if we wanted to provide a JS API for that, we could.

I have one concern with both of these though: they both require action, so if you somehow don't call the API or supply the header, your page stays in BFCache. E.g. if

  • the site crashes
  • I just type a new URL into the address bar
  • I go back in my session history to before my bank's site

then nothing is cleared from BFCache. Is that OK?

@smaug----

smaug---- commented Apr 1, 2021

What does "site crashes" mean? What would be stored in the bfcache?

Typing new url in to address bar should by default put the current page to bfcache - that is what I'd expect.
And what is the problem with the bank case? I'd expect bank may want to ensure its page doesn't enter bfcache, or any other cache, by using no-store (and other) header(s) or something and it is clearly a bug in the banking site if they don't do it, but what is wrong with keeping the previous page in bfcache?

@fergald

fergald commented Apr 1, 2021

What does "site crashes" mean? What would be stored in the bfcache?

In Chrome at least we can have a session history that looks like
1 a.com/foo - process 1, in BFCache
2 a.com/bar - process 2, the currently visible page of the tab

If process 2 crashes, it has no impact on process 1. It often happens that they will be in the same process but there's no guarantee.

Typing new url in to address bar should by default put the current page to bfcache - that is what I'd expect.
And what is the problem with the bank case? I'd expect bank may want to ensure its page doesn't enter bfcache, or any other cache, by using no-store (and other) header(s) or something and it is clearly a bug in the banking site if they don't do it, but what is wrong with keeping the previous page in bfcache?

"bank" here was just shorthand for "a site that would use the API we are discussing".

The point is that we are supposing some site that is OK with having pages in BFCache as long as it has a way to kick them out. The 3 scenarios I described lead to a situation where the site still has pages in BFCache but no longer has any active page that can kick them out.

Bank example - "what is wrong with keeping the previous page in bfcache?"
I'm not sure what is wrong with keeping the previous page in bfcache but here's my bank's current behaviour (no BFCache). My bank makes me log in again if I just go to some other site (using that tab) but it doesn't make me log in again if I navigate within the contiguous block of logged-into-my-bank pages in the session history. Given that, I assume that if I leave the contiguous block, they would not want one of their pages being in BFCache either.

Do we actually have any feedback from or contact with a site that is currently disabling BFCache and would use an API like the ones that have been proposed?

@annevk
Member

annevk commented May 6, 2021

See w3c/webappsec-clear-site-data#68 though.

@jakearchibald
Contributor

jakearchibald commented May 6, 2021

Places logged-in data could exist after log-out:

  • HTTP cache, unless no-store was used. no-cache entries will be used without revalidation on history traversal. fetch() can also access no-cache entries without revalidation.
  • Back-forward page cache, unless page was not salvageable.
  • Cookies.
  • Origin storage.
  • URLs & titles of history entries.
  • history.state (and the app-history equivalent).
  • window.name, but that seems unlikely.

Clear-Site-Data seems like the right place for a 'full' solution.
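
As a rough illustration of what a logout response using Clear-Site-Data might look like, here is a minimal sketch. The `"history"` value is hypothetical: it is only the name floated earlier in this thread, and no browser is assumed to accept it. `buildClearSiteData` and `logoutHeaders` are made-up helper names for illustration, not part of any library.

```typescript
// Some directive names defined by the Clear-Site-Data specification.
type StandardDirective = "cache" | "cookies" | "storage" | "executionContexts" | "*";

// HYPOTHETICAL: a value proposed in this thread for clearing the
// back-forward list and cache. Not assumed to be supported anywhere.
type ProposedDirective = "history";

type Directive = StandardDirective | ProposedDirective;

// Serialize directives the way the header expects them:
// each one a quoted string, separated by commas.
function buildClearSiteData(directives: ReadonlyArray<Directive>): string {
  return directives.map((d) => `"${d}"`).join(", ");
}

// Headers for a hypothetical logout response. The proposed "history"
// value is opt-in so the sketch can also show today's header.
function logoutHeaders(includeProposedHistory: boolean): Record<string, string> {
  const directives: Directive[] = ["cache", "cookies", "storage"];
  if (includeProposedHistory) {
    directives.push("history"); // assumption, see note above
  }
  return { "Clear-Site-Data": buildClearSiteData(directives) };
}
```

A server would attach the result of `logoutHeaders(...)` to the response for its logout URL. Note that this only takes effect when the site actually gets to serve that response.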

@fergald

fergald commented May 10, 2021

@beidson

If other folks would really like to opt out of the bfcache for some reason, but this is the wrong hammer for them so they refuse to do it... that is perfectly fine with me. I can personally refer them to documentation on pagehide/show if it will help.

I had been thinking about pageshow as the last resort for those who do not want to be bfcached: just delete everything from the doc and reload or show an error. In Chromium, the pageshow handler runs synchronously as part of restore and I think 4.6.4 of the spec implies that it should. Does Safari cache the pixels of the bfcached page? That would mean that pages that update on pageshow would get a flash of old content first.
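
A minimal sketch of that last-resort pageshow approach, assuming a hypothetical `isSessionValid` check supplied by the page. `handlePageShow` is an illustrative name, and the browser wiring is shown only in comments so the logic can be exercised outside a browser.

```typescript
// Minimal shape of the event we need. In browsers, the "pageshow" event
// is a PageTransitionEvent whose boolean `persisted` is true when the
// page was restored from the back-forward cache.
interface PageShowLike {
  persisted: boolean;
}

// Returns true if a reload was triggered. `isSessionValid` is a
// HYPOTHETICAL page-supplied check (e.g. "is the user still logged in?").
function handlePageShow(
  event: PageShowLike,
  isSessionValid: () => boolean,
  reload: () => void
): boolean {
  // Only act on bfcache restores; a normal load already has fresh content.
  if (event.persisted && !isSessionValid()) {
    reload();
    return true;
  }
  return false;
}

// Browser wiring, assuming the page defines isSessionValid():
//
//   window.addEventListener("pageshow", (event) => {
//     handlePageShow(event, isSessionValid, () => window.location.reload());
//   });
```

Whether the user sees a flash of the old content before the reload depends on the browser, which is the open question above.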

@fergald

fergald commented Jun 2, 2021

Please take a look at #5879 which wants to clarify what the overall behaviour with respect to no-store and BFCache is.

At this point, I think we've established that we don't want an opt out but there is a legitimate case to allow sites to remove pages from BFCache. The only legit case for that seems to be dropping them after a logout. Is there any other? Is everything else "fix it in pageload"?
