What is the Problem Being Solved?
During the PismoD network soft-patch, some validators diverged and fell out of consensus because they did not upgrade their node software at the same time as the 2/3+ validators did.
Comparing SwingSet transcripts between in- and out-of-consensus nodes, @mhofman found the divergence: a failed request from a SwingSet vat to a Golang module resulted in error messages that were sensitive to the soft-patch:
unpatched: insufficient funds [agoric-labs/cosmos-sdk@v0.45.11-alpha.agoric.1/x/bank/keeper/send.go:186] (note the version: alpha.agoric.1)
patched: insufficient funds [agoric-labs/cosmos-sdk@v0.45.11-alpha.agoric.1.1/x/bank/keeper/send.go:186] (note the extra "dot one": alpha.agoric.1.1)
Description of the Design
A suspect idiom in our Golang code is using fmt.Errorf("... %w", ..., err) to create a fresh error based on err with some descriptive text. It turns out that %w renders cosmos-sdk’s attached error stack frame, which includes the module version and therefore changes whenever the dependency version changes. Using %s instead propagates only the error message.
However, we can still use %w for parts of the agd implementation that are not within Tendermint consensus (such as client code), where the stack information is useful for debugging.
Security Considerations
This issue can cause validators to diverge when a soft-patch that was not expected to be consensus-breaking turns out to be. This is a risk to availability, since such a soft-patch could halt the chain. A mitigating factor is that clear communication with the validator community can help coordinate the necessary upgrades.
Scaling Considerations
n/a
Test Plan
Write unit tests demonstrating the presence of stack strings with the %w format specifier and their absence with %s.