-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix committee test race #12338
Fix committee test race #12338
Conversation
beacon-chain/cache/sync_committee.go
Outdated
@@ -162,6 +181,9 @@ func (s *SyncCommitteeCache) UpdatePositionsInCommittee(syncCommitteeBoundaryRoo | |||
|
|||
s.lock.Lock() | |||
defer s.lock.Unlock() | |||
if clearCount != s.cleared.Load() { | |||
return errors.New("cache rotated during async committee update operation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any thoughts on logging the error and returning nil vs. returning the error?
Question is whether this error should fail block processing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bummer that we even need to consider this, but I agree that I don't want to create an issue in block processing, because strict ordering of cache updates so far seems to only be an issue in tests. But I don't like how the current design requires so much care to avoid races, so I'm inclined to go ahead and log for now, then return to improving this design when we audit cache implementations.
What type of PR is this?
Bug fix
What does this PR do? Why is it needed?
Helps eliminate a class of unit tests races caused by asyncronous updating of the sync committee cache. This is accomplished by more carefully managing access to the cache value, canceling the async update if the underlying cache has been re-initialized, which only happens in the test code path.