Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent validator from using mismatched agoric-sdk version #8471

Closed
mhofman opened this issue Oct 24, 2023 · 2 comments
Closed

Prevent validator from using mismatched agoric-sdk version #8471

mhofman opened this issue Oct 24, 2023 · 2 comments
Labels
cosmic-swingset package: cosmic-swingset enhancement New feature or request liveslots requires vat-upgrade to deploy changes SwingSet package: SwingSet xsnap the XS execution tool

Comments

@mhofman
Copy link
Member

mhofman commented Oct 24, 2023

What is the Problem Being Solved?

Some agoric-sdk versions may be mostly compatible between each other so that most operations remain in consensus, but some others cause a divergence. For example, we may fix an xsnap bug which has divergent execution depending on the environment (like #7829), or introduce a change that causes the supervisor or lockdown bundle to have a different hash (e.g. when we bumped from 11 to 11wf), which would only be picked up when starting a new vat.

#7012 describes requirements for enforcing the xsnap-worker version, however this is not sufficient as we've observed a validator on devnet falling off consensus when a core eval executed and started a new vat. The xsnap version checks implemented in #7831 require that the version included in the binary be explicitly bumped to reflect an expected software upgrade version, and thus didn't cover this case.

Description of the Design

The enforcement of version likely needs to remain piecemeal. For example it may not make sense to enforce that the cosmos or cosmic-swingset code matches exactly between all validators as we've found benefits to be able to locally patch that part of the software as long as the behavior remains in consensus.

For the supervisor and lockdown bundles, we know those are consensus affecting on start vat. These should only change during a planned software upgrade, and as such we should be able to fail starting the node if these bundles are not yet known, unless we're re-starting because of an upgrade. This would remove the ticking time bomb effect that these incompatible bundles currently have.

The xsnap version likely falls in the same bucket as the chain software upgrade, and having the ability to locally patch the worker may be valuable. Embedding the expected chain software upgrade version is considered a hack, and is a weird inversion of dependency. We likely want to instead report the Moddable SDK version the worker is based on, and include an "agoric suffix" to capture behavioral patches we want to enforce.

Security Considerations

While we avoid it as much as possible, retaining the ability to distribute a hotpatch to validators is valuable. While we can always disable / coerce versions checks in a hotpatch, it's easier if the implemented version checks allow this kind of patch as much as possible.

Scaling Considerations

None

Test Plan

TBD

Upgrade Considerations

This only affects chain software upgrades and ensures that all validators use the expected version.

@mhofman mhofman added enhancement New feature or request SwingSet package: SwingSet cosmic-swingset package: cosmic-swingset xsnap the XS execution tool liveslots requires vat-upgrade to deploy changes labels Oct 24, 2023
@mhofman
Copy link
Member Author

mhofman commented Jul 2, 2024

With #9618, we now enforce that the xsnap binary matches the one expected by the xsnap package at start of the process.

However I believe the supervisor and lockdown bundles are still only sampled during vat start. Unlike xsnap, a dependency change which results in a different bundle may not affect the versioning of the package exposing these bundles. Furthermore if we implement a check that the bundle matches some expected value, it should happen before any state is modified (and preferably without any state being available since it's hard to know that a state is meant to be upgraded or not).

Finally I believe the golang side does not itself verify that the state it's working against (either genesis or existing application state) is for the current version of the software, but that might be out of scope if the cosmos-sdk doesn't have such a protection mechanism. At best we should guarantee that our chain software is internally consistent.

@mhofman
Copy link
Member Author

mhofman commented Jul 2, 2024

Closing in favor of #9644

@mhofman mhofman closed this as not planned Won't fix, can't repro, duplicate, stale Jul 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cosmic-swingset package: cosmic-swingset enhancement New feature or request liveslots requires vat-upgrade to deploy changes SwingSet package: SwingSet xsnap the XS execution tool
Projects
None yet
Development

No branches or pull requests

1 participant