StructuredClone serialize / deserialize #935
Comments
We discussed this a bit more in #jslang, from a slightly different perspective. If we want to be able to structured clone promises and streams between workers and the main thread, we need some distinction which doesn't allow those to be serialized to IndexedDB, since that doesn't work or make sense. My proposed framework is then:
An example of Entangle for promises is creating a new promise p in targetRealm, then doing basically
An example of Entangle for streams is basically doing what we are doing in fetch/service workers today to pass things over the service worker boundary to the main thread. If this work is done carefully, we could probably replace all that spec text. |
Two complications: 1) Serialize does not mean persist to IDB, e.g., SharedArrayBuffer. 2) Serialize and Deserialize happen in different tasks and can each fail. We currently do not handle failure for the latter. |
Can you expand on
? @inexorabletash was talking about something similar but my understanding is that this hasn't been decided exactly how SAB and IDB should interact. My thought for something like SharedArrayBuffer or Blob is that the result of serialization would be a record like { [[Type]]: "SharedArrayBuffer", [[BackingData]]: the [[ArrayBufferData]] of the source } or { [[Type]]: "Blob", [[BackingData]]: a pointer to the data }. The actual process of serializing these to disk (i.e. translating them into bytes) would be implementation-specific. |
@lars-t-hansen told me that was desired. |
Right. But why does that mean "Serialize does not mean persist to IDB"? |
It means "storage/persist" and Serialize are distinct. Supporting the latter does not imply the former. |
OK. I don't understand your point then or how it relates to SAB. SharedArray would support Serialize and would be storable/persistable. |
Briefly, for me the question is, which operation does postMessage use? It can't be Serialize if that makes a copy of shared memory as it would have to for IDB. |
Serialize does not make a copy of shared memory, neither for postMessage nor for IDB. It just serializes to the record of the form { [[Type]]: "SharedArrayBuffer", [[BackingData]]: the [[ArrayBufferData]] of the source }. When this is Deserialized by postMessage, it just creates a new SAB with the pointer to [[BackingData]]. When IDB performs its implementation-specific records-to-bytes translation, it makes a copy of the data. |
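A minimal sketch of the model described in the comment above, as a toy and not spec text. `ToySharedBuffer`, `serialize`, `deserialize`, and `recordToStoredBytes` are hypothetical names, and the `dataBlock` property stands in for the [[ArrayBufferData]] internal slot. It only illustrates the claim that the record carries a pointer, that a postMessage-style deserialize reuses that pointer, and that a storage-style byte translation copies the data. Whether storage should accept such records at all is still under discussion in this thread.

```javascript
// Toy model only: ToySharedBuffer stands in for SharedArrayBuffer, and its
// dataBlock property stands in for the [[ArrayBufferData]] internal slot.
class ToySharedBuffer {
  constructor(dataBlock) {
    this.dataBlock = dataBlock;
  }
}

// Serialize: produce a realm-agnostic record. No bytes are copied here.
function serialize(value) {
  if (value instanceof ToySharedBuffer) {
    return { type: "SharedArrayBuffer", backingData: value.dataBlock };
  }
  throw new TypeError("toy serialize: unsupported value");
}

// Deserialize (postMessage-style): a new object pointing at the same block.
function deserialize(record) {
  if (record.type === "SharedArrayBuffer") {
    return new ToySharedBuffer(record.backingData);
  }
  throw new TypeError("toy deserialize: unknown record type");
}

// Hypothetical storage step: the records-to-bytes translation copies the data.
function recordToStoredBytes(record) {
  return { type: record.type, bytes: Uint8Array.from(record.backingData) };
}

const source = new ToySharedBuffer(new Uint8Array([1, 2, 3]));
const record = serialize(source);
const received = deserialize(record);
const stored = recordToStoredBytes(record);

source.dataBlock[0] = 9; // a later write through the original object
console.log(received.dataBlock[0]); // the write is visible through the deserialized object
console.log(stored.bytes[0]); // the stored snapshot keeps the old value
```

In this toy, `received` is a distinct object from `source`, but both see the same block, while `stored` holds an independent snapshot taken when the bytes were produced.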
@domenic all I'm saying is that we wouldn't allow SharedArrayBuffer to be stored. So the theoretical Store/Persist abstract operation that takes the result of the Serialize operation would fail on SharedArrayBuffer records. (I suspect this is because the underlying data is much more mutable than with other kinds of objects. Only File objects have somewhat similar characteristics, but those would require the user to go in and modify the underlying resource.) |
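A rough sketch of the alternative in this comment, where Serialize succeeds but a separate Store/Persist step rejects SharedArrayBuffer records. The `persist` function, the `NON_STORABLE_TYPES` set, and the `{ type, backingData }` record shape are all assumptions made for illustration, and JSON stands in for the implementation-specific byte format.

```javascript
// Hypothetical Store/Persist step operating on already-serialized records.
// The record shape ({ type, backingData }) is an assumption for illustration.
const NON_STORABLE_TYPES = new Set(["SharedArrayBuffer"]);

function persist(record) {
  if (NON_STORABLE_TYPES.has(record.type)) {
    throw new Error(`toy DataCloneError: ${record.type} records cannot be stored`);
  }
  // The implementation-specific byte translation would happen here;
  // JSON is only a stand-in.
  return JSON.stringify(record);
}

const arrayBufferRecord = { type: "ArrayBuffer", backingData: [1, 2, 3] };
const sharedRecord = { type: "SharedArrayBuffer", backingData: [1, 2, 3] };

const storedForm = persist(arrayBufferRecord); // succeeds

let storeError = null;
try {
  persist(sharedRecord);
} catch (error) {
  storeError = error; // serialization succeeded earlier; only storing fails
}
```

The point of the split is that postMessage-style consumers never call `persist`, so they are unaffected by the restriction.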
And IDB doesn't make a copy of the data necessarily. For |
Why wouldn't we allow SAB to be stored?? |
What would storing it mean? Would it mean copying the underlying data? It's not really a primitive that's defined nor necessarily desired for v1. |
Yes, copying the underlying data, exactly the same as for ABs. It is indeed not defined how that happens, since it's implementation specific. |
Clearly the SAB's data can be stored (though of course there's no guarantee that the data are stable while the storing is going on :) The SAB itself can't be stored in the same way that a closure can't usefully be stored, though. I probably don't care very much how this is resolved, so long as we're clear on what the misc operations do. I think it's Weird that IDB has an implementation-specific mechanism for persisting data that could accept a type in one embedding and not in another (IIUC), but I don't think it matters for my purposes. |
To be clear what I mean by implementation-specific: I mean that the actual byte pattern on disk/in memory used is implementation specific. The "records-and-lists" structure is meant to be an implementation-agnostic (and, more importantly, realm-agnostic) representation of the data, but how that gets translated into a byte pattern is necessarily implementation specific. This implementation-specific serialization is not just for IDB; it's also used in IPC for postMessage and friends. |
It's not the same as regular AB since the underlying data isn't shared for regular AB. |
That doesn't impact anything relevant here. |
It does, because the SAB you get back may or may not point to a different buffer depending on how storage is defined. |
Tangent, but IDB implementations do copy the File contents - basically the same work as for Blobs. Even outside IDB the behavior of a File when the file on disk is modified is inconsistent between browsers. (At some point the spec effectively required doing a full data copy on file selection, but no browser does that.) |
We should get that defined. |
w3c/FileAPI#47 and possibly others. |
The messageerror event is used when deserialization fails. E.g., when an ArrayBuffer object cannot be allocated. This also removes StructuredCloneWithTransfer as deserializing errors now need to be handled on their own. Tests: web-platform-tests/wpt#5567. Service workers follow-up: w3c/ServiceWorker#1116. Fixes part of #2260 and fixes #935.
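As a hedged illustration of the behavior this commit message describes, the sketch below models the receiving side only. `deliverMessage` is a hypothetical helper, `JSON.parse` stands in for the deserialization step, and a plain `Event` with an attached `data` property stands in for a real MessageEvent.

```javascript
// Toy receiving side. JSON.parse stands in for StructuredDeserialize, and
// deliverMessage is a hypothetical helper, not a spec algorithm.
function deliverMessage(target, serialized) {
  let data;
  try {
    data = JSON.parse(serialized);
  } catch {
    // Deserialization failed in the receiver's task: report it there.
    target.dispatchEvent(new Event("messageerror"));
    return;
  }
  const event = new Event("message");
  event.data = data; // a real MessageEvent exposes data as a readonly attribute
  target.dispatchEvent(event);
}

const receiver = new EventTarget();
const log = [];
receiver.addEventListener("message", (event) => log.push(["message", event.data]));
receiver.addEventListener("messageerror", () => log.push(["messageerror"]));

deliverMessage(receiver, JSON.stringify({ ok: true })); // deserializes fine
deliverMessage(receiver, "{ truncated"); // fails to deserialize
```

The sender is not involved in the failure path here, which matches the observation earlier in the thread that serialization and deserialization happen in different tasks and can each fail.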
This rewrites most of the cloneable and transferable object infrastructure to better reflect the reality that structured cloning requires separate serialization and deserialization steps, instead of a single operation that creates a new object in the target Realm. This is most evident in the case of MessagePorts, as noted in whatwg#2277. It also allows us to avoid awkward double-cloning with an intermediate "user-agent defined Realm", as seen in e.g. history.state or IndexedDB; instead we can simply store the serialized form and later deserialize.

Concretely, this:

* Replaces the concept of cloneable objects with serializable objects. For platform objects, instead of defining a [[Clone]]() internal method, serializable platform objects are annotated with the new [Serializable] IDL attribute, and include serialization and deserialization steps in their definition.
* Updates the concept of transferable objects. For platform objects, instead of defining a [[Transfer]]() internal method, transferable platform objects are annotated with the new [Transferable] IDL attribute, and include transfer and transfer-receiving steps. Additionally, the [[Detached]] internal slot for such objects is now managed more automatically.
* Removes the StructuredClone() abstract operation in favor of separate StructuredSerialize() and StructuredDeserialize() abstract operations. In practice we found that performing a structured clone alone is never necessary in specs. It is always either coupled with a transfer list, for which StructuredCloneWithTransfer() can be used, or it is best expressed as separate serialization and deserialization steps.
* Removes IsTransferable() and Transfer() abstract operations. When defined more properly, these became less useful by themselves, so they were inlined into the rest of the machinery.
* Introduces StructuredSerializeWithTransfer() and StructuredDeserializeWithTransfer(), which can be used by other specifications which need to define their own postMessage()-style algorithm but for which StructuredCloneWithTransfer() is not sufficient.

Closes whatwg#785. Closes whatwg#935. Closes whatwg#2277. Closes whatwg#1162. Sets the stage for whatwg#936 and whatwg#2260/whatwg#2361.
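To make the serializable-object idea above a little more concrete, here is a toy sketch in plain JavaScript. It is not the spec's algorithm: `ToyPoint`, `serializableInterfaces`, `toySerialize`, and `toyDeserialize` are hypothetical names, and the registry only loosely mirrors the pattern of an interface supplying its own serialization steps and deserialization steps.

```javascript
// Toy registry of "serializable interfaces". Each entry supplies serialization
// steps and deserialization steps, loosely mirroring the [Serializable] idea.
// All names here (ToyPoint, serializableInterfaces, ...) are hypothetical.
class ToyPoint {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
}

const serializableInterfaces = new Map([
  ["ToyPoint", {
    ctor: ToyPoint,
    serializationSteps(value, serialized) {
      serialized.x = value.x;
      serialized.y = value.y;
    },
    deserializationSteps(serialized, value) {
      value.x = serialized.x;
      value.y = serialized.y;
    },
  }],
]);

function toySerialize(value) {
  for (const [name, entry] of serializableInterfaces) {
    if (value instanceof entry.ctor) {
      const serialized = { type: name };
      entry.serializationSteps(value, serialized);
      return serialized;
    }
  }
  throw new TypeError("toy DataCloneError: value is not serializable");
}

function toyDeserialize(serialized) {
  const entry = serializableInterfaces.get(serialized.type);
  if (!entry) throw new TypeError("toy DataCloneError: unknown type");
  // Create a fresh object, standing in for one made in the target realm.
  const value = Object.create(entry.ctor.prototype);
  entry.deserializationSteps(serialized, value);
  return value;
}

// Store the serialized form now and deserialize later (as with history.state).
const savedState = toySerialize(new ToyPoint(3, 4));
const restored = toyDeserialize(savedState);
```

Because `savedState` is a plain record, it can be held for any length of time before `toyDeserialize` runs, which is the property that makes the intermediate "user-agent defined Realm" unnecessary.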
See tc39/proposal-ecmascript-sharedmem#39 (comment). The way we define structured cloning right now is not correct.
What we need is some kind of "Agent Message Record" which holds a serialization of an object graph (probably also defined as a record). Those records are then passed around between agents/realms (using postMessage()) and can sit still for a while in a MessagePort's queue until it's started for a particular agent/realm. (Although maybe it needs to be bigger if we actually want to carefully define how those tasks get transferred between event loops too. Or we pretend tasks are magic.)
tc39/proposal-ecmascript-sharedmem#39 (comment) has thoughts on how to approach this.
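A small sketch of the queueing behavior described above, under loud assumptions: `ToyPort` and its methods are hypothetical, a JSON string stands in for the "Agent Message Record", and delivery is synchronous rather than task-based.

```javascript
// Toy port: serialized records wait in a queue until the port is started.
// ToyPort and its methods are hypothetical; JSON stands in for the record format.
class ToyPort {
  constructor() {
    this.queue = [];
    this.started = false;
    this.onmessage = null;
  }

  // Called on behalf of the sender: only the serialized record is enqueued.
  enqueue(value) {
    this.queue.push(JSON.stringify(value));
    this.flush();
  }

  start() {
    this.started = true;
    this.flush();
  }

  // Deserialization happens only here, on the receiving side.
  flush() {
    while (this.started && this.onmessage && this.queue.length > 0) {
      this.onmessage(JSON.parse(this.queue.shift()));
    }
  }
}

const port = new ToyPort();
const delivered = [];
port.onmessage = (data) => delivered.push(data);

const original = { n: 1 };
port.enqueue(original); // sits in the queue; the port is not started yet
original.n = 2; // a later mutation does not affect the queued record
const deliveredBeforeStart = delivered.length;
port.start(); // the queued record is deserialized and delivered now
```

The record in the queue is independent of any realm, so it does not matter which agent eventually starts the port and deserializes it.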
@lars-t-hansen, let me know when this becomes higher priority.
@jungkees @jakearchibald this is probably also important for service workers for when we'll start specifying the messages going to and from clients in more detail.