Discussion of garbage collection of resolved kernel promises #1049
Chip and I talked this through today. Some notes: The fundamental difficulty is that we're trying to make two claims at the same time:
We think the problem he found arose from liveslots trying to serialize a cycle without being allowed to have identities that remained stable for the entire duration of the serialization. It was asked to resolve
Serializing any cyclic data structure requires some notion of pointers or identities that can be compared to break the cycles. These identities must be stable until we've finished traversing the graph. In liveslots terms, this means we can't retire the VPIDs until we've finished all the resolves we're doing in a given crank.
Liveslots doesn't currently get to know when the crank is done. We deny it access to
As a result, liveslots doesn't know when it becomes safe to retire the VPIDs. Our tentative plan is:
We might need some new terminology. Each "crank" is now followed by a second phase, maybe the "post-crank cleanup" or something. This second phase has the same kind of timing as the normal crank (it consists of one or more turns, and ends when the promise queue is empty), but it has a second dispatch call. The first dispatch call is always a
Some alternatives/variants we discussed:
We'll talk this through with @dtribble tomorrow. I suspect his intuitions on easier ways to solve this might depend upon properties of Midori or E in which resolved promises are effectively identical to the thing they resolve to, which (for various reasons) is not what we're currently doing in swingset. |
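The point above about stable identities can be illustrated with a minimal sketch. This is not liveslots code: `serializeGraph` and its record format are hypothetical, made up only to show why each node's id has to stay fixed until the whole traversal has finished.

```javascript
// Hypothetical sketch, not liveslots code: serialize an object graph that
// may contain cycles by giving each object an id on first visit.
function serializeGraph(root) {
  const ids = new Map(); // object identity -> id, stable for the whole traversal
  const records = [];

  function visit(value) {
    if (value === null || typeof value !== 'object') {
      return { prim: value }; // plain data is copied as-is
    }
    if (ids.has(value)) {
      // Already seen: emit a back-reference instead of recursing again.
      // This only works because the id assigned earlier is still valid.
      return { ref: ids.get(value) };
    }
    const id = ids.size;
    ids.set(value, id);
    const record = { id, fields: {} };
    records.push(record);
    for (const [key, child] of Object.entries(value)) {
      record.fields[key] = visit(child);
    }
    return { ref: id };
  }

  visit(root);
  return records;
}

// Two objects that point at each other.
const a = { name: 'a' };
const b = { name: 'b', peer: a };
a.peer = b;
const out = serializeGraph(a);
```

If entries were deleted from `ids` partway through (the analog of retiring a VPID before the crank's resolves are all done), the second visit to `a` would not find its id and the traversal would recurse without end.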
A couple of additional points on this:
|
Three options to generally resolve this
|
Dean's note was recording three phases of our planned solution:
phase 1: drop references that have no embedded promises
In both directions (liveslots resolves the promise and tells the kernel via
We think of this as a "90% solution" (ok maybe 80%). It doesn't require a lot of deep thinking and should still get us a useful performance win (in the form of reduced RAM and state-vector usage). It retires any result promise that's resolved to a single Presence, or to records with plain data and/or Presences. A lot of simple functions do this. We defer implementing the new
phase 2: drop references that have no already exported promises
This is the "95% solution". The general idea is that resolutions which only reference brand new promises cannot introduce cycles that involve previously-existing promises, and wouldn't cause the recursion problem. Dean said a lot of ERTP/Zoe functions will return a record full of (new) Promises, and this would let their result promises be retired. For the sending side (vat does
On the receiving side (kernel does
phase 3: something more clever
This would be the part where we add a new syscall and/or dispatch method, and have liveslots accumulate the slots to drop at the end of the crank. This would be maybe a 98% case: it covers more situations, but we still need improved GC to release everything that's no longer reachable.
I (warner) am keen on making the drops be explicit. My thought is that which slots to drop is a policy decision, made based upon various heuristics about utility/performance-savings and soundness, and that explicit syscalls would make the invariants easier to maintain (if the kernel only drops entries in response to a
I think Dean felt it was better to use implicit/inferred drops for the easy cases of resolved promises, and explicit calls for the more complicated cases we figure out later. He was concerned about the performance costs of having more syscalls, as well as what I think of as "debuggor ergonomics": how hard is it for the human (the "debuggor", you know, like "operator") performing the task of debugging (using a "debugger", the assistant software) to focus on the data they need. More syscalls means more noise in the logs, more things to ignore, more single-step iterations to get through to the important parts, etc. |
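A rough sketch of the phase 1 and phase 2 checks described above. The function names are hypothetical, and the assumption that a slot is a string whose first character marks its type (`p` for a promise, `o` for an object) is made only for illustration; the real decision would live wherever liveslots and the kernel process a resolution.

```javascript
// Assumption for illustration: a slot is a string whose first character
// marks its type ('p' = promise, 'o' = object/Presence).
const isPromiseSlot = slot => slot.startsWith('p');

// Phase 1: the resolved promise may be retired only if the resolution
// mentions no promises at all (plain data and/or Presences).
function canRetirePhase1(slots) {
  return !slots.some(isPromiseSlot);
}

// Phase 2: also allow resolutions whose promises are all brand new,
// i.e. none of them appears in the set of previously exported promises.
function canRetirePhase2(slots, previouslyExported) {
  return slots
    .filter(isPromiseSlot)
    .every(slot => !previouslyExported.has(slot));
}
```

Anything that fails both checks would be left for the phase 3 mechanism, or for later GC work.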
I retitled this issue from the bug that it originated with to reflect the more general discussion of the kernel promise GC problem that it morphed into. We can use this issue to track work on that. |
Superseded by #1124. Closing. |
Vat Bob:
Vat Alice:
Fourth crank, where the usePromises message gets delivered, yields this in the console log:
... repeats until interrupted from the outside, with the promise IDs incrementing the whole time.
This appears to be a liveslots issue. Aside from whatever the underlying bug is, this points out the need for the liveslots code to be subjected to metering the same as everything else that runs in a vat.
Note that the single-promise analog of this, where Bob is:
and Alice is:
Terminates successfully after 5 cranks with a single kernel promise whose value refers to itself in an array:
This is the kind of unreferenced loop that we don't expect reference counts to catch.
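A toy model of why reference counting misses such a loop. The table, the helper names, and the ids `kp40` and `kp41` are all made up for this sketch and are not the kernel's actual data structures.

```javascript
// Toy promise table with reference counts (hypothetical, not kernel code).
const table = new Map(); // id -> { refCount, slots }

function addPromise(id) {
  table.set(id, { refCount: 0, slots: [] });
}
function incref(id) {
  table.get(id).refCount += 1;
}
function decref(id) {
  const entry = table.get(id);
  entry.refCount -= 1;
  if (entry.refCount === 0) {
    table.delete(id);
    // Dropping an entry releases whatever its resolution data referenced.
    for (const slot of entry.slots) {
      if (table.has(slot)) decref(slot);
    }
  }
}
function resolve(id, slots) {
  table.get(id).slots = slots;
  for (const slot of slots) incref(slot); // resolution data holds references
}

// Acyclic case: the entry goes away once the last holder lets go.
addPromise('kp41');
incref('kp41');
resolve('kp41', []);
decref('kp41');

// Self-referential case: the resolution is an array containing the promise.
addPromise('kp40');
incref('kp40');            // a vat holds it
resolve('kp40', ['kp40']); // the value refers back to the promise itself
decref('kp40');            // the vat lets go, but the count stops at 1
```

After the last line nothing outside the table refers to `kp40`, yet its own resolution data keeps its count above zero, so a refcount-only scheme never frees it.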