Fix yet another epoch slot issue #123

Closed
wants to merge 3 commits into from

@mratsim mratsim commented Feb 19, 2019

This fixes #101 and adds more tracing. The gist is that we were applying our updates to a copy of our state.

It does not seem like the copy is needed anymore (pending CI).
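
To illustrate the kind of bug described above, here is a minimal sketch of Nim's value semantics for objects. `ToyState` and `advanceSlot` are hypothetical names made up for this example, not the actual beacon chain types or procs; the point is only that assigning an object to a new variable copies it, so mutations of the copy never reach the original.

```nim
# Hypothetical, simplified stand-in for the real beacon state.
type
  ToyState = object
    slot: uint64

proc advanceSlot(state: var ToyState) =
  ## Mutates the state it is given.
  state.slot += 1

var nodeState = ToyState(slot: 64)

# Assigning an object to a new variable copies it in Nim.
var workingCopy = nodeState
advanceSlot(workingCopy)
# At this point only workingCopy has moved on; nodeState.slot is still 64.
let slotAfterCopyUpdate = nodeState.slot

# Passing the original as a `var` parameter updates it in place.
advanceSlot(nodeState)
```

If the copy is never written back, every later reader of `nodeState` sees the stale slot, which is the failure mode this sketch assumes.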

In summary I think that this goes one step beyond the branch https://github.com/status-im/nim-beacon-chain/tree/yglukhov-somewhat-stable-simulation

I definitely felt the tangle while trying to debug it (cf: #117 (comment)). I think we need a refactoring as soon as phase 0 stabilizes.

Next steps

Note that we still have block slot issues, but fixing the new one seems to require moving or reordering a core part of the processing, because we:

  • propose a block
  • receive our own proposal a couple of slots later
  • crash with an unexpected block slot.

[Screenshot: deepinscreenshot_select-area_20190219163745]

The assert is at the soon-to-be-infamous :) "Could this fail somehow" comment:

https://github.com/status-im/nim-beacon-chain/blob/e1efdd4349a35903861a5384df0a96f07005fa56/beacon_chain/beacon_node.nim#L220-L229

@arnetheduck arnetheduck left a comment

this seems a bit wrong: if you update node.beaconState directly, the updated slot number will still be there when processBlocks is called, causing us to skip the actual update, no?
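
A minimal sketch of the concern raised in this comment, under the assumption that the per-slot work is guarded by a comparison against the state's current slot. `ToyState`, `processSlots` and `runWith` are hypothetical names for this example and are not taken from the actual processBlocks code.

```nim
# Hypothetical toy state; slotsProcessed counts how often the
# per-slot work actually ran.
type
  ToyState = object
    slot: uint64
    slotsProcessed: int

proc processSlots(state: var ToyState, targetSlot: uint64) =
  ## Runs the per-slot work for every slot up to targetSlot.
  while state.slot < targetSlot:
    state.slot += 1
    state.slotsProcessed += 1

proc runWith(prebumped: bool): ToyState =
  result = ToyState(slot: 10)
  if prebumped:
    # The slot number is advanced directly, outside of processSlots.
    result.slot = 11
  processSlots(result, 11)

let normal = runWith(false)   # per-slot work runs once
let skipped = runWith(true)   # loop guard is already false, work is skipped
```

Both runs end at slot 11, but only the first one has done the per-slot work, so the two states can differ in everything that work is supposed to update.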

notice "Unexpected block slot number - stackTrace incoming",
  blockSlot = humaneSlotNum blck.slot,
  stateSlot = humaneSlotNum state.slot
writeStackTrace()

remove debugging cruft?

@@ -397,23 +399,29 @@ proc processBlock(
    return false

  if not processRandao(state, blck, flags):
    notice "Randao failure"

these should already log at the error site, with more pertinent information (where the initial error is detected)

debug "TRACE - updateState",
  oldStateSlot = humaneSlotNum state.slot,
  proposedBlockSlot = if new_block.isSome(): new_block.get.slot.humaneSlotNum
                      else: 1010101010

what?

addTimer(at) do (p: pointer):
  # Chronicles is not GC-safe
  debugEcho "TRACE - in closure: currentSlot ", humaneSlotNum node.beaconState.slot

..

@arnetheduck

#125 probably fixes this issue

@mratsim mratsim closed this Feb 19, 2019
mratsim commented Feb 19, 2019

Actually, I wanted more input from @zah to confirm whether the copy is still needed, which is why I left the debug output in: so that he can reproduce the problem and check.

@mratsim mratsim deleted the fix-yet-another-epoch-slot-issue branch March 14, 2019 10:15
etan-status pushed a commit that referenced this pull request May 12, 2022
Deprecated asyncDiscard() procedure.
Bump version to 2.5.2.
Successfully merging this pull request may close these issues.

AssertionError in get_crosslink_committees_at_slot after slot 64