dynamic vats aren't replayed when kernel is restarted #1480

Closed
warner opened this issue Aug 15, 2020 · 4 comments · Fixed by #1481

@warner
Member

warner commented Aug 15, 2020

Describe the bug

@FUDCo discovered that our dynamic vat support is incomplete. We're saving the transcripts and c-lists of dynamic vats into the kernel state DB, but not their code. At startup, the kernel does not regenerate the dynamic vats, so other vats which interact with them will fail.

This wouldn't be noticeable if a contract vat completes its business before the host is shut down (e.g. when a validator node reboots), but any contract that is still operating at that time would be non-functional when the node starts back up again.

To fix this, we'll need to save the dynamic vat's code (and/or the bundleName used to create it) in the DB. And we'll need to change the kernel startup code to look for dynamic vats and regenerate them, including replaying their transcripts.

@warner warner added bug Something isn't working SwingSet package: SwingSet labels Aug 15, 2020
@warner warner self-assigned this Aug 15, 2020
@FUDCo
Contributor

FUDCo commented Aug 15, 2020

From my perspective it's less an issue of capturing the code than of capturing the fact that the dynamic vat was created at all (and what parameters it was created with). The code for static vats is taken from the file system (which has its issues, but we've managed to live with it), so I don't see why dynamic vats are materially different in that regard (well, if their bundles came in over the network from outside the swingset I guess it's an issue, but in that case the bundle is in the transcript already).

@warner
Member Author

warner commented Aug 15, 2020

Yeah, contract bundles will come in over the network, and the vat which spawned the dynamic vat may have that code in its transcript, but that doesn't help us, because that vat (like all others) is disconnected from its syscalls while being replayed. Since each vat is replayed independently, we really have to save the source (or the bundleName, if any) in the DB. (Of course, this is a fine candidate for a better DB approach, like adding the code to a hash-indexed blob store and then only stashing the hash in the DB.)
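
The hash-indexed blob store idea mentioned above could look roughly like the following. This is only a sketch under assumptions: `makeBlobStore`, the `bundleHash` field, the `v7` vatID, and the in-memory `Map` standing in for the kernel state DB are all hypothetical and not part of SwingSet.

```javascript
// Hypothetical sketch: bundles live in a content-addressed store, and the
// kernel DB only records the (much smaller) hash.
const { createHash } = require('crypto');

function makeBlobStore() {
  const blobs = new Map(); // sha256 hex digest -> serialized bundle
  return {
    put(bundle) {
      const data = JSON.stringify(bundle);
      const hash = createHash('sha256').update(data).digest('hex');
      blobs.set(hash, data); // identical bundles collapse into one entry
      return hash;
    },
    get(hash) {
      if (!blobs.has(hash)) {
        throw new Error(`unknown bundle hash ${hash}`);
      }
      return JSON.parse(blobs.get(hash));
    },
    size() {
      return blobs.size;
    },
  };
}

const blobStore = makeBlobStore();
const kvStore = new Map(); // stand-in for the kernel state DB
const bundle = { moduleFormat: 'getExport', source: 'exports.x = 1;' };

// Only the hash is stashed under the vat's key (hypothetical `bundleHash` field).
kvStore.set('v7.source', JSON.stringify({ bundleHash: blobStore.put(bundle) }));
```

With this shape, several dynamic vats created from the same contract bundle would share one stored copy of the code.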

I've got a branch started, with unit tests that fail.

warner added a commit that referenced this issue Aug 15, 2020
Dynamic vats aren't getting reloaded properly: we aren't saving the source
bundle used to create them, nor are we loading them at startup time.

This adds tests to exercise the problem, which fail ("unknown vatID") when we
start a new swingset from the old state vector. We'll fix the problem in the
next commit.

refs #1480
@FUDCo
Contributor

FUDCo commented Aug 15, 2020

The bundle is literally in the callNow syscall that creates the vat. Although we don't currently do anything with the logged syscalls except verify that the replay generates the same ones, in the process of doing that verification we are in a position to observe the act of vat creation. Would it make sense for replay to be able to do something with device calls?

@warner
Member Author

warner commented Aug 15, 2020

I don't think so. We're going to want to be able to reload vats independently and lazily, as messages arrive for them, and we'd have no idea which of the vatAdmin's syscalls might have created the vat we're asked to reload. So needing to replay the entire vatAdmin history first, just to glean the bundles for the dynamic vats, would be a lot of coupling between things that ought to be independent. Most of those vats (hopefully) will have been retired by the time the kernel restarts, so we wouldn't want to recreate every one that the vatAdmin ever knew about: we want to rebuild only the ones that are still active. We may also want to migrate a vat from one swingset to another: we'll need to be able to get both its source code and its current state from the DB, without actually loading it right then.

The direction I'm heading is:

//
// vat.names = JSON([names..])
// vat.dynamicIDs = JSON([vatIDs..])
// vat.name.$NAME = $vatID = v$NN
// vat.nextID = $NN
// device.names = JSON([names..])
// device.name.$NAME = $deviceID = d$NN
// device.nextID = $NN

// dynamic vats have these too:
// v$NN.source = JSON({ bundle }) or JSON({ bundleName })
// v$NN.options = JSON

I've got the kernel.js-side part in place. The biggest piece is splitting dynamicVat.js up to separate the "allocate things the first time" part from the "build a dynamic vat once we know the allocations" part, so I can add a third piece to use on the replay pass.
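
As an illustration of how the key layout above might be written and read back, here is a minimal sketch. The helper names `recordDynamicVat` and `listDynamicVats`, the example vatIDs, and the option contents are hypothetical; a plain `Map` stands in for the kernel state DB, and only the key names (`vat.dynamicIDs`, `v$NN.source`, `v$NN.options`) come from the listing.

```javascript
// Hypothetical helpers over a Map that stands in for the kernel state DB.
function recordDynamicVat(kv, vatID, source, options) {
  // vat.dynamicIDs holds a single JSON-encoded array of vatIDs
  const ids = JSON.parse(kv.get('vat.dynamicIDs') || '[]');
  ids.push(vatID);
  kv.set('vat.dynamicIDs', JSON.stringify(ids));
  // per-vat keys: v$NN.source and v$NN.options
  kv.set(`${vatID}.source`, JSON.stringify(source));
  kv.set(`${vatID}.options`, JSON.stringify(options));
}

function listDynamicVats(kv) {
  const ids = JSON.parse(kv.get('vat.dynamicIDs') || '[]');
  return ids.map(vatID => ({
    vatID,
    source: JSON.parse(kv.get(`${vatID}.source`)),
    options: JSON.parse(kv.get(`${vatID}.options`)),
  }));
}

const kv = new Map();
// Option contents below are purely illustrative.
recordDynamicVat(kv, 'v7', { bundleName: 'contract' }, { illustrative: true });
recordDynamicVat(kv, 'v8', { bundle: { source: 'inline code' } }, {});

// At startup, a kernel could walk this list to decide which vats to rebuild.
const restored = listDynamicVats(kv);
```

Because each vat's source and options sit under its own keys, a single vat can be looked up without touching the vatAdmin's history, which matches the independence goal described above.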

warner added a commit that referenced this issue Aug 15, 2020
This saves the dynamic vat data (source bundle or bundleName, dynamicOptions)
to the kernel state DB when the vat is first created. Then, during replay, we
reload the dynamic vat from that data.

This refactors dynamic vat creation slightly, to merge create-by-bundle and
create-by-bundle-name into a single function that takes a `source` argument
which contains either `{ bundle }` or `{ bundleName }`. By passing `source`
to `createVatDynamically`, rather than first converting it into a full
bundle, we have enough information to store the much smaller bundleName in
the DB. Userspace code still sees the same API (`vatAdminSvc~.createVat(bundle)`
or `vatAdminSvc~.createVatByName(bundleName)`), but the kernel internals
and the vat-admin device endowments have changed.
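
A rough sketch of that single-`source` shape, under assumptions: `resolveBundle`, `createFromSource`, and the `namedBundles` table are hypothetical stand-ins, not the actual kernel internals. The point is only that both entry points funnel into one function, which can store the original `source` as-is.

```javascript
// Hypothetical table of bundles the host application configured by name.
const namedBundles = new Map([['contract', { source: 'named bundle code' }]]);

// Turn a source record into a full bundle, whichever form it takes.
function resolveBundle(source) {
  if (source.bundle !== undefined) {
    return source.bundle;
  }
  if (source.bundleName !== undefined) {
    const bundle = namedBundles.get(source.bundleName);
    if (!bundle) {
      throw new Error(`unknown bundleName ${source.bundleName}`);
    }
    return bundle;
  }
  throw new Error('source must contain bundle or bundleName');
}

// Single creation path: what gets stored is the source, not the expanded bundle.
function createFromSource(source) {
  const bundle = resolveBundle(source);
  return { stored: JSON.stringify(source), bundle };
}

// Both user-facing entry points wrap their argument into a source record.
const createVat = bundle => createFromSource({ bundle });
const createVatByName = bundleName => createFromSource({ bundleName });

const byName = createVatByName('contract');
const byBundle = createVat({ source: 'inline bundle code' });
```

Here `byName.stored` contains only the short bundleName, while `byBundle.stored` has to carry the whole bundle.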

We add a new `vat.dynamicIDs` entry to the kernelKeeper, which contains a
single JSON-encoded Array of dynamic vatIDs. This isn't great, but it's
expedient, and our DB doesn't really support enumerating anything more clever
anyways. We also add a pair of keys to the vatKeeper for each dynamic vat:
`v$NN.source` and `v$NN.options`, to record the bundle/bundleName and
dynamicOptions that were used the first time around, so we can use them on
subsequent calls too.

`createVatDynamically` normally sends a message (to the vatAdminVat) when the
new vat is ready, so it can inform the original caller and give them the new
root object. We suppress this message when reloading the kernel, because it
was already delivered the first time around. One side-effect is that failures
to reload the dynamic vat must panic the kernel, because there is nobody
ready to report the error to. This could happen if the new host application
failed to configure the same source bundles as the previous one did. At some
point we'll want to store all these bundles in the DB, which should remove
this failure mode.

closes #1480
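
The reload behaviour described in this commit message (no "vat is ready" message on reload, and a kernel panic if a bundle is missing) can be sketched as below. This is an assumption-laden illustration: `makeKernelSketch`, its `create`/`reloadAll` methods, and the `notifications` array are hypothetical, and transcript replay is omitted entirely.

```javascript
// Hypothetical toy kernel: kv stands in for the state DB, namedBundles for
// the bundles configured by the host application.
function makeKernelSketch(kv, namedBundles) {
  const liveVats = new Map();
  const notifications = []; // stand-in for messages sent to the vatAdminVat

  function build(vatID, source, notifyNewVat) {
    const bundle = source.bundle || namedBundles.get(source.bundleName);
    if (!bundle) {
      throw new Error(`no bundle for ${vatID}`);
    }
    liveVats.set(vatID, { bundle });
    if (notifyNewVat) {
      notifications.push(vatID); // only on first creation
    }
  }

  return {
    liveVats,
    notifications,
    create(vatID, source) {
      build(vatID, source, true);
      const ids = JSON.parse(kv.get('vat.dynamicIDs') || '[]');
      kv.set('vat.dynamicIDs', JSON.stringify([...ids, vatID]));
      kv.set(`${vatID}.source`, JSON.stringify(source));
    },
    reloadAll() {
      for (const vatID of JSON.parse(kv.get('vat.dynamicIDs') || '[]')) {
        try {
          // The message was already delivered the first time, so suppress it.
          build(vatID, JSON.parse(kv.get(`${vatID}.source`)), false);
        } catch (err) {
          // Nobody is waiting for an error report, so treat it as fatal.
          throw new Error(`kernel panic: cannot reload ${vatID}: ${err.message}`);
        }
      }
    },
  };
}

const db = new Map();
const bundles = new Map([['contract', { source: 'code' }]]);
const first = makeKernelSketch(db, bundles);
first.create('v7', { bundleName: 'contract' });

// Simulate a restart: a fresh kernel over the same DB.
const second = makeKernelSketch(db, bundles);
second.reloadAll();
```

A third kernel built over the same DB but without the `contract` bundle configured would throw from `reloadAll`, which is the failure mode the message attributes to a misconfigured host application.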
@warner warner linked a pull request Aug 15, 2020 that will close this issue