Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Dynamic Protocol State] Refactor Epoch Commitment Deadline Enforcement #5108

Conversation

jordanschalm
Copy link
Member

@jordanschalm jordanschalm commented Dec 5, 2023

This PR addresses #5104, using this suggestion:

Alternatively, we could implement the check entirely in the constructor or the mutator. We would essentially just add another case, where we instantiate an epochFallbackStateMachine here:

Changes

  • EFM caused by exceeding the epoch commitment deadline is observed strictly in the protocol state stateMutator module
  • Adds tests for the stateMutator constructor

jordanschalm and others added 21 commits November 29, 2023 13:12
1. Update bootstrapping to use DynamicProtocolState rather than
   protocol.Epoch API, resolving a TODO and simplifying that method
2. Removes the committed_epoch_final_view metric. We originally added
   this metric with the idea that it would provide a way to see how much
   time you have minimally, until the next time when EFM could possibly
   be triggered. However:
        - it is not used
        - it is somewhat misleading, as EFM can actually be triggered at
          other points than the final view of the committed epoch in
          some circumstances
        - if we do want to alert on something like this in the future,
          it can be created from other metrics:
            - cur_epoch_final_view, cur_view, cur_epoch_phase
- remove expectation of removed metric
- fix call to indexFirstHeight in bootstrapping
- fix boolean conditional in EpochPhase method
Co-authored-by: Yurii Oleksyshyn <yuraolex@gmail.com>
@jordanschalm jordanschalm changed the base branch from master to jordan/4649-epoch-status December 5, 2023 16:54
@jordanschalm jordanschalm marked this pull request as ready for review December 6, 2023 22:34
Base automatically changed from jordan/4649-epoch-status to feature/dynamic-protocol-state December 11, 2023 19:22
@codecov-commenter
Copy link

codecov-commenter commented Dec 11, 2023

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (1668c30) 56.51% compared to head (6ba3c78) 56.50%.

Files Patch % Lines
...sensus/hotstuff/cruisectl/block_time_controller.go 0.00% 2 Missing ⚠️
utils/unittest/fixtures.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@                        Coverage Diff                         @@
##           feature/dynamic-protocol-state    #5108      +/-   ##
==================================================================
- Coverage                           56.51%   56.50%   -0.01%     
==================================================================
  Files                                 987      987              
  Lines                               92539    92531       -8     
==================================================================
- Hits                                52298    52289       -9     
- Misses                              36396    36400       +4     
+ Partials                             3845     3842       -3     
Flag Coverage Δ
unittests 56.50% <87.50%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@durkmurder durkmurder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice PR, great tests. 🏅

Co-authored-by: Yurii Oleksyshyn <yuraolex@gmail.com>
Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very clean and amazing tests. Thank you 🙇

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There a lots of mentions of "EECC" in this file. Not sure, did we want to replace those with "EFM" 😅 (?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a project-wide replace in 8e69496

@@ -269,3 +273,16 @@ func (m *stateMutator) transitionToEpochFallbackMode(results []*flow.ExecutionRe
}
return dbUpdates, nil
}

// epochFallbackTriggeredByIncorporatingCandidate checks whether incorporating the input block
// would trigger epoch fallback mode (EFM) along the current fork. In particular, we trigger epoch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in scientific literature, it is common practise to set abbreviations su ch as EFM in square brackets. Would suggest to adopt this convention also in our code:

// would trigger epoch fallback mode [EFM] along the current fork.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// epochFallbackTriggeredByIncorporatingCandidate checks whether incorporating the input block
// would trigger epoch fallback mode (EFM) along the current fork. In particular, we trigger epoch
// fallback mode when:
// 1. The next epoch has not been committed as of B (EpochPhase < flow.EpochPhaseCommitted) AND
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be inclined to use the same representation as in code:

if phase == flow.EpochPhaseCommitted { // Requirement 1

Suggested change
// 1. The next epoch has not been committed as of B (EpochPhase < flow.EpochPhaseCommitted) AND
// 1. The next epoch has not been committed as of B (EpochPhase flow.EpochPhaseCommitted) AND

This holds independently of the internal order of EpochPhase

Comment on lines 277 to 288
// epochFallbackTriggeredByIncorporatingCandidate checks whether incorporating the input block
// would trigger epoch fallback mode (EFM) along the current fork. In particular, we trigger epoch
// fallback mode when:
// 1. The next epoch has not been committed as of B (EpochPhase < flow.EpochPhaseCommitted) AND
// 2. B is the first incorporated block with view greater than or equal to the epoch commitment
// deadline for the current epoch
func epochFallbackTriggeredByIncorporatingCandidate(candidateView, safetyThreshold, finalView uint64, phase flow.EpochPhase) bool {
if phase == flow.EpochPhaseCommitted { // Requirement 1
return false
}
return candidateView+safetyThreshold >= finalView // Requirement 2
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Algorithmically it is correct I think, but I got stuck on it because there were a few details that I wasn't aware of. I think we can increase code clarity and documentation coverage notably if we changed the signature of the method just a bit.

Suggested change
// epochFallbackTriggeredByIncorporatingCandidate checks whether incorporating the input block
// would trigger epoch fallback mode (EFM) along the current fork. In particular, we trigger epoch
// fallback mode when:
// 1. The next epoch has not been committed as of B (EpochPhase < flow.EpochPhaseCommitted) AND
// 2. B is the first incorporated block with view greater than or equal to the epoch commitment
// deadline for the current epoch
func epochFallbackTriggeredByIncorporatingCandidate(candidateView, safetyThreshold, finalView uint64, phase flow.EpochPhase) bool {
if phase == flow.EpochPhaseCommitted { // Requirement 1
return false
}
return candidateView+safetyThreshold >= finalView // Requirement 2
}
// epochFallbackTriggeredByIncorporatingCandidate checks whether incorporating the input block B
// would trigger epoch fallback mode [EFM] along the current fork. We trigger epoch fallback mode
// when:
// 1. The next epoch has not been committed as of B (EpochPhase ≠ flow.EpochPhaseCommitted) AND
// 2. B is the first incorporated block with view greater than or equal to the epoch commitment
// deadline for the current epoch
//
// In protocol terms, condition 1 means that an EpochCommit service event for the upcoming epoch has
// not yet been sealed as of block B. Formally, a service event S is considered sealed as of block B if:
// - S was emitted during execution of some block A, s.t. A is an ancestor of B.
// - The seal for block A was included in some block C, s.t C is an ancestor of B.
//
// For further details see `params.EpochCommitSafetyThreshold()`.
func epochFallbackTriggeredByIncorporatingCandidate(candidateView uint64, params protocol.GlobalParams, parentState *flow.RichProtocolStateEntry) bool {
if parentState.EpochPhase() == flow.EpochPhaseCommitted { // Requirement 1
return false
}
return candidateView+params.EpochCommitSafetyThreshold() >= parentState.CurrentEpochSetup.FinalView // Requirement 2
}

Here, I have used params and parentState instead of the derived values, because then I can easily refer to them in the documentation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 6ba3c78

// epoch service event or an invalid state transition previously in this fork.
// Case 2: Incorporating the candidate block is itself an invalid epoch transition.
//
// In either case, Epoch Fallback Mode (EFM) has been tentatively triggered on this fork,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// In either case, Epoch Fallback Mode (EFM) has been tentatively triggered on this fork,
// In either case, Epoch Fallback Mode [EFM] has been tentatively triggered on this fork,

Comment on lines +270 to +272
// if a state machine constructor returns an error, the stateMutator constructor should fail
// and propagate the error to the caller
s.Run("state machine constructor returns error", func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for the also testing the error path 🤩

@jordanschalm jordanschalm merged commit 292b541 into feature/dynamic-protocol-state Dec 15, 2023
53 checks passed
@jordanschalm jordanschalm deleted the jordan/5104-refactor-epoch-commit-deadline branch December 15, 2023 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants