Bucket hooks #18
Comments
On IRC Jake suggested that we could just have "bucket has an associated X" where X could be service worker registrations and such. This assumes that clearing a bucket replaces it with a new one (effectively allowing X to be GC'd, as there are no more references to it). Is that the model we want? Currently we just say a bucket is cleared. One problem is that we'd have to copy some state over from the old bucket, such as persistence and potentially more in the future once we start expanding the concept. At least, I think if you clear, you don't necessarily expect to have to invoke Thoughts? An alternative is that a bucket has something like a specification-level GetStorageHandler(Identifier, optional ClearCallback) operation that returns a StorageHandler in which you can store stuff. |
"A bucket has storages", and it's the storages that become detached. Adding callback steps for cleanup is fine unless the order becomes observable. |
I think the order will be observable given the combination of https://w3c.github.io/webappsec-clear-site-data/#abstract-opdef-clear-dom-accessible-storage-for-origin deals with this through enumeration (though doesn't list the Cache API). My idea with the identifier was that we'd first sort lexicographically and then invoke the ClearCallback, but perhaps it's better to just list everything in the Storage Standard and require it to be updated as new things are added. |
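To make the "sort lexicographically, then invoke the ClearCallback" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Bucket` class, the `getStorageHandler` method name, and the use of a plain `Map` as the handler are hypothetical and not defined by any specification. Sorting uses JavaScript's default string ordering (UTF-16 code units).

```typescript
// Hypothetical sketch; none of these names come from the Storage Standard.
type ClearCallback = () => void;

interface HandlerEntry {
  data: Map<string, unknown>;
  onClear: ClearCallback | undefined;
}

class Bucket {
  private handlers = new Map<string, HandlerEntry>();

  // Rough analogue of GetStorageHandler(Identifier, optional ClearCallback).
  getStorageHandler(identifier: string, onClear?: ClearCallback): Map<string, unknown> {
    let entry = this.handlers.get(identifier);
    if (entry === undefined) {
      entry = { data: new Map<string, unknown>(), onClear: onClear };
      this.handlers.set(identifier, entry);
    }
    return entry.data;
  }

  // Clear all handlers in sorted identifier order, so that the order in which
  // callbacks run does not depend on registration order.
  clear(): void {
    const identifiers = Array.from(this.handlers.keys()).sort();
    for (const identifier of identifiers) {
      const entry = this.handlers.get(identifier);
      if (entry === undefined) continue;
      entry.data.clear();
      if (entry.onClear !== undefined) entry.onClear();
    }
  }
}
```

The alternative discussed here, listing every storage type explicitly in the Storage Standard, would replace the sort with a fixed, hand-maintained order.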
That seems fine. Doesn't hurt to have all origin storage referenced from one place. |
I'd agree that this is the right approach. |
FWIW, I have the feeling I'm missing a simpler solution here and as you can tell this is very much a sketch. Would love to hear your thoughts. The idea here is to define existing storage APIs, such as service workers and localStorage, on top of these primitives so we get a well-defined Clear-Site-Data and hopefully some other benefits too. I suspect this architecture might also work for the Storage Access API in due course, though it depends a bit on how all that will pan out. Storage APIs (e.g., localStorage) need to define:
The Storage Standard needs to define: A registry of all storage identifiers and an easy way to get from one to its corresponding replace algorithm.
A storage bucket holds a map of storage identifiers to storage areas.
A storage area is a struct consisting of a map and a proxy map pointer set. (The idea is that the storage area's map holds the actual storage. It's in a map because those are easy to work with. How the map is persisted is implementation-defined. How to make it available across process boundaries is implementation-defined.)
A proxy map has identical operations to a map and performs those on its underlying map. (We hand out a proxy map to a storage API so we can replace the actual map behind the scenes.)
New algorithms:
To obtain a storage bucket area map, given a storage identifier identifier and an environment environment, run these steps:
(The above algorithm is intended for storage APIs. They would invoke this upon initialization to get a map to store things in.) To replace a storage bucket old with a storage bucket new, run these steps:
(There are a couple of things that still need to be filled in here, including what kind of details the replace algorithm might need to clean up the relevant APIs.) |
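A rough sketch of the bucket, storage area, and proxy map structures may help when reading the above. The algorithm steps are not spelled out in the comment, so the bodies of `obtainAreaMap` and `replaceBucket` below are guesses, and every class, method, and field name is hypothetical. The environment parameter is left out to keep the sketch short.

```typescript
// Hypothetical sketch; names and algorithm steps are assumptions.
class ProxyMap<K, V> {
  private target: Map<K, V>;
  constructor(target: Map<K, V>) {
    this.target = target;
  }
  // Same operations as a map, forwarded to the underlying map.
  get(key: K): V | undefined { return this.target.get(key); }
  set(key: K, value: V): void { this.target.set(key, value); }
  delete(key: K): boolean { return this.target.delete(key); }
  get size(): number { return this.target.size; }
  // Used only by the bucket machinery to swap the map behind the scenes.
  retarget(next: Map<K, V>): void { this.target = next; }
}

interface StorageArea {
  map: Map<string, unknown>;                 // holds the actual storage
  proxies: Set<ProxyMap<string, unknown>>;   // the "proxy map pointer set"
}

class StorageBucket {
  readonly areas = new Map<string, StorageArea>();

  // Guess at "obtain a storage bucket area map": find or create the area for
  // the identifier, then hand out a fresh proxy onto its map.
  obtainAreaMap(identifier: string): ProxyMap<string, unknown> {
    let area = this.areas.get(identifier);
    if (area === undefined) {
      area = { map: new Map<string, unknown>(), proxies: new Set<ProxyMap<string, unknown>>() };
      this.areas.set(identifier, area);
    }
    const proxy = new ProxyMap<string, unknown>(area.map);
    area.proxies.add(proxy);
    return proxy;
  }
}

// Guess at "replace a storage bucket": give each area an empty map in the new
// bucket and repoint existing proxies, so the old maps become unreachable.
function replaceBucket(oldBucket: StorageBucket, newBucket: StorageBucket): void {
  oldBucket.areas.forEach((oldArea, identifier) => {
    const fresh: StorageArea = { map: new Map<string, unknown>(), proxies: oldArea.proxies };
    oldArea.proxies.forEach((proxy) => proxy.retarget(fresh.map));
    newBucket.areas.set(identifier, fresh);
  });
  oldBucket.areas.clear();
}
```

A storage API that holds a `ProxyMap` keeps working after replacement without being told about it; a per-identifier replace algorithm from the registry would run at the point where the proxies are repointed.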
Just to be crystal clear (still waking up ☕ vs. multiple levels of indirection), the usage of the storage area's map is up to the particular storage API, i.e. for localStorage the map's keys/values are literally the (local) storage area's keys/values; for Indexed DB the keys/values would be database names/database constructs; for Cache Storage the keys/values would be cache names/caches, etc. Or a storage API could have a single entry in its storage area, and put all of its structure inside the single value. The need for this map is just because it's a common pattern across all storage APIs. "Storage area" as a term seems to conflict with HTML's use for localStorage, but maybe they can coalesce? Or HTML can get a new term as part of refactoring to align with this. (I don't think it's formally defined in HTML?) I think this proposal works for Indexed DB. (From a spec level; haven't thought about implementation impact, especially the replacement part.) |
Overall, @annevk, your sketch looks really good to me. One really basic question:
I imagine the obtain a storage key from an environment algorithm could return a (registrable domain, registrable domain) tuple for partitioned storage, and a registrable domain otherwise? |
Relaying some discussion from IRC:
I assume most of the time the storage key will be an origin. But not always. In particular this step will allow us to define both double-keying and blocking of storage. Note that currently storage is blocked in opaque origins on a per-API basis (e.g. localStorage, idb.open()). Those mechanisms should probably be subsumed here, so that if environment is an opaque origin, key is failure, and the rest of the algorithm fails. This also allows other scenarios to block storage by intervening at the "obtain a key" stage. |
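As a sketch of how the "obtain a key" stage could cover both blocking and double-keying: the function below returns failure for an opaque origin, a tuple when storage is partitioned, and a single origin otherwise. The `Environment` shape, the `topLevelSite` field, the use of `null` for failure, and the choice of tuple members are all assumptions made for illustration.

```typescript
// Hypothetical sketch; types and field names are assumptions.
type Origin =
  | { kind: "opaque" }
  | { kind: "tuple"; serialized: string };

interface Environment {
  origin: Origin;
  // Set only when storage is partitioned (double-keyed); null otherwise.
  topLevelSite: string | null;
}

// A storage key is usually an origin, or a pair when double-keying applies.
type StorageKey = string | [string, string];

// Returns null to represent "failure".
function obtainStorageKey(environment: Environment): StorageKey | null {
  const origin = environment.origin;
  if (origin.kind === "opaque") {
    // Blocking lives here rather than in each storage API.
    return null;
  }
  if (environment.topLevelSite !== null) {
    return [environment.topLevelSite, origin.serialized];
  }
  return origin.serialized;
}
```

Callers such as an "obtain a storage bucket area map" algorithm would bail out as soon as the key is failure, which is also where other storage-blocking scenarios could intervene.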
Generally I think this all looks good. |
I think I uncovered how #86 is my WIP PR to define all this. Probably best to keep high-level discussion here for now until I've made it somewhat more concrete, but feedback welcome on what is there now. (Note that there isn't much there yet compared to my comment above, but there is a bit. I hope to get to the remainder tomorrow/next week (tomorrow is a holiday I just realized).) |
@jakearchibald @jungkees hey! I was wondering what kind of hooks you need to make it clear that, e.g., service worker registrations and the Cache API are stored in a box.
In #4 we are discussing the cleanup steps for when a box gets closed, but maybe we should also have formal language for actually storing something inside?