Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use headstate for recent checkpoints #13746

Merged
merged 3 commits into from
Mar 17, 2024
Merged

Use headstate for recent checkpoints #13746

merged 3 commits into from
Mar 17, 2024

Conversation

potuz
Copy link
Contributor

@potuz potuz commented Mar 14, 2024

In the event a peer hasn't imported slot 0 and attests for slot 31 as target during slot 0. All prysm nodes that have seen the block in slot 0 and are receiving these attestations, go through the following in gettAttPrestate

	// If the attestation is recent and canonical we can use the head state to compute the shuffling.
	headEpoch := slots.ToEpoch(s.HeadSlot())
	if c.Epoch == headEpoch {

This code path is missed since c.Epoch > headEpoch.

Later in the code we hit:

	// Do not process attestations for old non viable checkpoints otherwise
	ok, err := s.cfg.ForkChoiceStore.IsViableForCheckpoint(&forkchoicetypes.Checkpoint{Root: [32]byte(c.Root), Epoch: c.Epoch})
	if err != nil {
		return nil, errors.Wrap(err, "could not check checkpoint condition in forkchoice")
	}
	if !ok {
		return nil, errors.Wrap(ErrNotCheckpoint, fmt.Sprintf("epoch %d root %#x", c.Epoch, c.Root))
	}

Forkchoice does not take c as viable for checkpoint because the only child (slot 0) does not satisfy the condition in it:

	for _, child := range node.children {
		if child.slot > epochStart {
			return true, nil
		}
	}

Thus we are flooded with the message

	ErrNotCheckpoint = errors.New("not a checkpoint in forkchoice")

This should result in poor attestation performance in slot 0 and inclusion in slot 1.

This PR implements that if the incoming checkpoint has epoch higher than the current head root and the block is canonical (this can only happen if the checkpoint is the actual headroot) then we return as state the head state advanced to the current slot.

Reviewer be advised: check about possible performance problems because of the slot processing.

@potuz potuz added the Bug Something isn't working label Mar 14, 2024
@potuz potuz requested a review from a team as a code owner March 14, 2024 19:19
@potuz potuz requested review from kasey, rauljordan and saolyn March 14, 2024 19:19
Comment on lines 43 to 54
slot, err := slots.EpochStart(c.Epoch)
if err != nil {
return nil
}
st, err := s.HeadState(ctx)
if err != nil {
return nil
}
st, err = transition.ProcessSlotsUsingNextSlotCache(ctx, st, c.Root, slot)
if err != nil {
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't we want to cache the result of this in the check point attestation cache? Seems to me if there are N attestations trigger this condition then we will need to do this N times which could become a dos vector. Are state copy and process slots here that cheap?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the NSC cache is a hit wouldn't this be already in the checkpoint cache? hmm probably better to follow on Slack

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, we're getting these messages on attestations up to slot 5, so they are not in the NSC, I updated the checkpoint cache here.

@potuz potuz force-pushed the compute_prestate_recent branch from e6196f0 to 0ddbdfe Compare March 16, 2024 11:33
if err != nil {
return nil
}
st, err = transition.ProcessSlotsUsingNextSlotCache(ctx, st, c.Root, slot)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am worried about advancing slots this way, this becomes expensive. why can't we just return the read only head state here ? If slot 0 is missed, the head state should still be able to verify an attestation from slot 31

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the attestation is not from slot 31, it's from slot 0, you need to check that the attester is in the right committee

@potuz potuz force-pushed the compute_prestate_recent branch from 8dbad79 to b96a52b Compare March 16, 2024 12:27
return nil
}
// Try if we have already set the checkpoint cache
epochKey := strconv.FormatUint(uint64(c.Epoch), 10 /* base 10 */)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have to move it up before you fetch the head state.

@potuz potuz force-pushed the compute_prestate_recent branch from b96a52b to 7d1d045 Compare March 16, 2024 12:34
Copy link
Member

@terencechain terencechain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The conditions make sense. We should clean up getAttPreState with getRecentPreState to remove the overlapping code later on

@potuz potuz added this pull request to the merge queue Mar 17, 2024
Merged via the queue into develop with commit ec8b67c Mar 17, 2024
16 of 17 checks passed
@potuz potuz deleted the compute_prestate_recent branch March 17, 2024 19:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants