forked from erigontech/erigon
Alpha4 #525
Merged
Conversation
…tech#11887)

Previously I tried to re-use `fixCanonicalChain` in the astrid stage for handling fork choice updates; however, after more testing I realised that was wrong, since it causes the issue below upon unwinds:

```
INFO[09-05|08:27:38.289] [4/6 Execution] Unwind Execution from=11588734 to=11588733
EROR[09-05|08:27:38.289] Staged Sync err="[4/6 Execution] domains.GetDiffset(11588734, 0x04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb): not found"
```

That is due to the fact that `fixCanonicalChain` updates the canonical hash as it traverses the header chain backwards. In the context of updating the fork choice, this should be done only after the Unwind has succeeded; otherwise we get the above error when unwinding execution. This also causes the second error below, which is a result of the Execution Unwind failing and rolling back the RwTx of the stage loop (the inserted block changes get lost). This situation will be further stabilised when working on erigontech#11533.

This PR fixes the first problem by creating a new function `connectTip`, which traverses the header chain backwards and collects new nodes and bad nodes, but does not update the canonical hash while doing so. Instead, the canonical hashes get updated in `updateForkChoiceForward` after the unwind has been successfully executed. A sketch of its shape follows after the logs below.

Full Logs
```
INFO[09-05|08:27:36.210] [2/6 PolygonSync] forward progress=11588734
DBUG[09-05|08:27:38.222] [bridge] processing new blocks from=11588729 to=11588729 lastProcessedBlockNum=11588720 lastProcessedBlockTime=1725521200 lastProcessedEventID=2702
DBUG[09-05|08:27:38.222] [sync] inserted blocks len=1 duration=1.882458ms
DBUG[09-05|08:27:38.286] [bridge] processing new blocks from=11588734 to=11588734 lastProcessedBlockNum=11588720 lastProcessedBlockTime=1725521200 lastProcessedEventID=2702
DBUG[09-05|08:27:38.287] [sync] inserted blocks len=1 duration=945.75µs
DBUG[09-05|08:27:38.287] [bor.heimdall] synchronizing spans... blockNum=11588734
DBUG[09-05|08:27:38.287] [bridge] synchronizing events... blockNum=11588734 lastProcessedBlockNum=11588720
INFO[09-05|08:27:38.287] [2/6 PolygonSync] update fork choice block=11588734 age=1s hash=0x04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb
INFO[09-05|08:27:38.287] [2/6 PolygonSync] new fork - unwinding and caching fork choice unwindNumber=11588733 badHash=0x3e1f67072996aec05806d298de3bb281bdcf23566da0dc254c4670ac385768d4 cachedTipNumber=11588734 cachedTipHash=0x04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb cachedNewNodes=1
DBUG[09-05|08:27:38.289] UnwindTo block=11588733 block_hash=0x3e1f67072996aec05806d298de3bb281bdcf23566da0dc254c4670ac385768d4 err=nil stack="[sync.go:171 stage_polygon_sync.go:1425 stage_polygon_sync.go:1379 stage_polygon_sync.go:1538 stage_polygon_sync.go:494 stage_polygon_sync.go:175 default_stages.go:479 sync.go:531 sync.go:410 stageloop.go:249 stageloop.go:101 asm_arm64.s:1222]"
DBUG[09-05|08:27:38.289] [2/6 PolygonSync] DONE in=2.078894042s
INFO[09-05|08:27:38.289] [4/6 Execution] Unwind Execution from=11588734 to=11588733
EROR[09-05|08:27:38.289] Staged Sync err="[4/6 Execution] domains.GetDiffset(11588734, 0x04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb): not found"
INFO[09-05|08:27:38.792] [4/6 Execution] Unwind Execution from=11588734 to=11588733
INFO[09-05|08:27:38.792] aggregator unwind step=24 txUnwindTo=38128876 stepsRangeInDB="accounts:0.4, storage:0.4, code:0.4, commitment:0.0, logaddrs: 0.4, logtopics: 0.4, tracesfrom: 0.4, tracesto: 0.4"
DBUG[09-05|08:27:38.818] [1/6 OtterSync] DONE in=2.958µs
INFO[09-05|08:27:38.818] [2/6 PolygonSync] forward progress=11588733
INFO[09-05|08:27:38.818] [2/6 PolygonSync] new fork - processing cached fork choice after unwind cachedTipNumber=11588734 cachedTipHash=0x04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb cachedNewNodes=1
DBUG[09-05|08:27:39.083] [bor.heimdall] block producers tracker observing span id=1812
DBUG[09-05|08:27:43.532] Error while executing stage err="[2/6 PolygonSync] stopped: parent's total difficulty not found with hash 04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb and height 11588734: <nil>"
EROR[09-05|08:27:43.532] [2/6 PolygonSync] stopping node err="parent's total difficulty not found with hash 04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb and height 11588734: <nil>"
DBUG[09-05|08:27:43.534] rpcdaemon: the subscription to pending blocks channel was closed
INFO[09-05|08:27:43.534] Exiting...
INFO[09-05|08:27:43.535] HTTP endpoint closed url=127.0.0.1:8545
INFO[09-05|08:27:43.535] RPC server shutting down
```
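For context, a minimal sketch of the shape of `connectTip` as described above; the `Header` type and the helper callbacks are hypothetical stand-ins, not erigon's actual types:

```go
package main

// Hypothetical minimal header, standing in for erigon's types.Header.
type Header struct {
	Hash       [32]byte
	ParentHash [32]byte
	Number     uint64
}

// connectTip walks backwards from the tip until it reaches a canonical
// header, collecting the headers to connect plus the headers being forked
// out. Unlike fixCanonicalChain, it performs no canonical-hash writes;
// the caller marks the chain canonical only after the unwind has succeeded.
func connectTip(
	tip *Header,
	isCanonical func(hash [32]byte) bool,
	readHeader func(hash [32]byte) *Header,
	canonicalAt func(num uint64) *Header,
) (newNodes, badNodes []*Header) {
	for h := tip; h != nil && !isCanonical(h.Hash); h = readHeader(h.ParentHash) {
		newNodes = append(newNodes, h) // collected, not yet marked canonical
		if old := canonicalAt(h.Number); old != nil {
			badNodes = append(badNodes, old) // header being forked out at this height
		}
	}
	return newNodes, badNodes
}
```

The key difference from `fixCanonicalChain` is that nothing here touches the canonical-hash table; `updateForkChoiceForward` does that once the unwind has gone through.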
I used a simple "semaphore" to limit the number of goroutines to 4.

Co-authored-by: shota.silagadze <shota.silagadze@taal.com>
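For reference, a minimal version of that pattern in Go, using a buffered channel as the counting semaphore (the loop body here is just a placeholder):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	sem := make(chan struct{}, 4) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup
	for i := 0; i < 16; i++ {
		wg.Add(1)
		sem <- struct{}{} // acquire: blocks while 4 workers are already running
		go func(i int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			fmt.Println("working on item", i)
		}(i)
	}
	wg.Wait()
}
```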
…1906) We don't need to process attestations from gossip if the committee index associated with the attestation is not subscribed or doesn't require aggregation; a sketch of the guard follows below.
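A minimal sketch of that early-exit guard; the types and field names are hypothetical stand-ins for the service's actual code:

```go
package main

// Hypothetical stand-ins for the service's types.
type Attestation struct{ CommitteeIndex uint64 }

type subscription struct{ NeedsAggregation bool }

type service struct {
	subscriptions map[uint64]*subscription
}

// processGossipAttestation drops an attestation early when its committee
// index is not subscribed or needs no aggregation, skipping signature
// verification entirely.
func (s *service) processGossipAttestation(att Attestation) error {
	sub, ok := s.subscriptions[att.CommitteeIndex]
	if !ok || !sub.NeedsAggregation {
		return nil // nothing to do for this committee index
	}
	return s.verifyAndAggregate(att)
}

func (s *service) verifyAndAggregate(att Attestation) error { return nil } // placeholder
```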
We are trying to optimise `AggregateAndProofService`. After profiling the service, I see that most of the CPU time is spent on signature verifications. From the graph, the function took 6.6% (74 seconds) of all time overall (not just execution time; the percentage would be much higher if we counted only CPU time). See the screenshot:

![Screenshot 2024-09-01 at 10 28 38](https://github.com/user-attachments/assets/929ce103-2bf3-43d9-a0fa-ca504e4b58bb)

Now we aggregate all the signatures and verify them together with the `bls.VerifyMultipleSignatures` function in an async way, and run the final functions only if verification succeeds. I basically removed all the code where we verified those three signatures individually and instead gathered them for later verification. Profiling again gives the following output:

![Screenshot 2024-09-01 at 10 44 31](https://github.com/user-attachments/assets/abb842a3-0b4f-4640-8a88-791a2d0af62b)

Now most of the time, as I see it, is spent on public key aggregation when verifying validator aggregated signatures, but I suspect there is no way to optimise that. As we spend most of the time in `NewPublicKeyFromBytes`, maybe we could cache constructed keys, but I think bls already does that. So, that's as good as it gets. A sketch of the gather-then-batch-verify pattern follows below.

---------

Co-authored-by: shota.silagadze <shota.silagadze@taal.com>
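A rough sketch of that gather-then-batch-verify pattern; the job struct and the `batchVerifier` signature are assumptions standing in for the service's actual plumbing around `bls.VerifyMultipleSignatures`:

```go
package main

import "errors"

// sigJob is a hypothetical record of one deferred signature check.
type sigJob struct {
	sig, msg, pubKey []byte
}

// batchVerifier stands in for a batch entry point such as
// bls.VerifyMultipleSignatures: one pass over all collected checks.
type batchVerifier func(sigs, msgs, pubKeys [][]byte) (bool, error)

// runBatched gathers the per-message signature checks that were previously
// verified inline, verifies them together, and runs the success callback
// only if the whole batch is valid.
func runBatched(jobs []sigJob, verify batchVerifier, onSuccess func()) error {
	sigs := make([][]byte, 0, len(jobs))
	msgs := make([][]byte, 0, len(jobs))
	pks := make([][]byte, 0, len(jobs))
	for _, j := range jobs {
		sigs = append(sigs, j.sig)
		msgs = append(msgs, j.msg)
		pks = append(pks, j.pubKey)
	}
	ok, err := verify(sigs, msgs, pks)
	if err != nil {
		return err
	}
	if !ok {
		return errors.New("at least one signature in the batch is invalid")
	}
	onSuccess()
	return nil
}
```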
Extended the pprof read API to include: goroutine, threadcreate, heap, allocs, block, mutex.
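For reference, those names are Go's built-in runtime profiles, so a standalone program can dump the same data via the standard library:

```go
package main

import (
	"os"
	"runtime/pprof"
)

func main() {
	// The profile names added to the read API match Go's built-in runtime
	// profiles, retrievable via pprof.Lookup.
	for _, name := range []string{"goroutine", "threadcreate", "heap", "allocs", "block", "mutex"} {
		if p := pprof.Lookup(name); p != nil {
			p.WriteTo(os.Stdout, 1) // debug=1: human-readable text form
		}
	}
}
```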
…r amoy network (erigontech#11902)

[Polygon] Bor: Added Ahmedabad HF related configs and the block number for the amoy network. This is the Ahmedabad block number: [11865856](https://amoy.polygonscan.com/block/countdown/11865856). PR in bor: [bor#1324](maticnetwork/bor#1324)
…rigontech#11914)

This reverts commit d7d9ded, based on advice from @bretep: https://discord.com/channels/687972960811745322/983710221308416010/1281647424569344141

Relates to erigontech#11890.
Hi there, I found that the old test case has since been deleted from the repo. So here is my advice: 1. use the old link pinned to a commit hash, or 2. delete the link. I'm using the first solution here.
- move logic to `state/sqeeze.go`
- enable code.kv compression (values only)
- increase MaxLimit of compress pattern, because it shows a better compression ratio for code.kv with a smaller dictionary
- Create `heimdall.Reader` for future use in the `bor_*` API
- Make `AssembleReader` and `NewReader` leaner by not requiring the full `BorConfig`
Support `bor_*` RPCs when using `polygon.sync` and running rpcdaemon with a datadir.
…rigontech#11929)

We ran into an issue since we started pruning total difficulty:

```
EROR[09-09|10:58:03.057] [2/6 PolygonSync] stopping node err="parent's total difficulty not found with hash 9334099de5d77c0d56afefde9985d44f8b4416db99dfe926908d5501fa8dbd9e and height 11736178: <nil>
```

It happened for checkpoint [9703](https://heimdall-api-amoy.polygon.technology/checkpoints/9703). Our start block was in the middle of the checkpoint range, which meant we had to fetch all 8k blocks in this checkpoint in order to verify the checkpoint root hash when receiving blocks from the peer. The current logic attempts to insert all of these 8k blocks, and it fails with a missing parent td error because we only keep the last 1000 parent td records.

This PR fixes that by enhancing the block downloader to not re-insert blocks behind the `start` block. This solves the parent td error and also saves some unnecessary inserts during the first waypoint processing on startup; a sketch of the idea follows below.
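A sketch of the downloader-side idea; the `Block` type and the function name are hypothetical:

```go
package main

// Hypothetical minimal block type.
type Block struct{ Number uint64 }

// dropBlocksBehindStart removes fetched waypoint blocks that sit behind the
// downloader's start block, so already-inserted blocks are never re-inserted
// and their pruned parent TD records are never needed.
func dropBlocksBehindStart(fetched []*Block, start uint64) []*Block {
	out := fetched[:0]
	for _, b := range fetched {
		if b.Number >= start { // keep only the blocks we actually need to insert
			out = append(out, b)
		}
	}
	return out
}
```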
On an empty request, we see the error `can't find blockNumber by txnID=1235`.
…ontech#11873) Comment out the docker-build-check job as mentioned in issue [11872](erigontech#11872); it will save us 5-6 minutes of waiting on the routine check for each workflow run (faster PR checks, etc.). Also get rid of "skip-build-cache", which has been removed since v5.
…ch#11938) Switch to Go builder 1.23.1 and introduce Docker provenance attestation and SBOM generation, which should be expected to raise the Docker image score to A.
More issues surfaced on chain tip when testing astrid:

1. Context deadline exceeded when requesting a new block event at tip from a peer. This can happen, and it is safe to ignore the event and continue instead of crashing:
```
[EROR] [09-06|03:41:00.183] [2/6 PolygonSync] stopping node err="await *eth.BlockHeadersPacket66 response interrupted: context deadline exceeded"
```
2. Noticed we do not penalise peers for penalisable errors when calling `FetchBlocks` - added that in.
3. We got another error that crashed the process, `ErrNonSequentialHeaderNumbers` - it is safe to ignore the new block event if this happens and continue:
```
EROR[09-05|20:26:35.141] [2/6 PolygonSync] stopping node err="non sequential header numbers in fetch headers response: current=11608859, expected=11608860"
```
4. Added all other p2p errors that may happen and are safe to ignore at tip event processing (see the sketch below).
5. Added debug logging for better visibility into chain tip events.
6. Fixed a missing check for whether we have already processed a new block event (i.e. whether its hash is already contained in the canonical chain builder).
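A sketch of the benign-error check from items 1, 3 and 4; the sentinel error value here is a hypothetical stand-in for the p2p package's actual error:

```go
package main

import (
	"context"
	"errors"
)

// Hypothetical sentinel standing in for the p2p package's error value.
var errNonSequentialHeaderNumbers = errors.New("non sequential header numbers in fetch headers response")

// isBenignTipError reports whether a new-block-event fetch error is safe to
// ignore at the chain tip; if so, the event is dropped and processing
// continues instead of stopping the node.
func isBenignTipError(err error) bool {
	return errors.Is(err, context.DeadlineExceeded) ||
		errors.Is(err, errNonSequentialHeaderNumbers)
}
```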
Move the Astrid bridge functions into their own gRPC server and client, so as not to rely on the existing block reader infrastructure.
This PR for erigontech#11417 includes:
1. splitting segments into dirtySegments and visibleSegments
2. dirtySegments are updated in the background and are not accessible to the app
3. dirtySegments are added to visibleSegments only when:
   - there is no gap/overlap/garbage
   - all types of segments are created and indexed at that height
4. a unit test: `TestCalculateVisibleSegments`

A simplified sketch of the promotion rule follows below.

---------

Co-authored-by: lupin012 <58134934+lupin012@users.noreply.github.com>
Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com>
Co-authored-by: Ilya Mikheev <54912776+JkLondon@users.noreply.github.com>
Co-authored-by: JkLondon <ilya@mikheev.fun>
Co-authored-by: shashiy <shaashiiy@gmail.com>
Co-authored-by: Elias Rad <146735585+nnsW3@users.noreply.github.com>
Co-authored-by: awskii <awskii@users.noreply.github.com>
Co-authored-by: blxdyx <125243069+blxdyx@users.noreply.github.com>
Co-authored-by: Giulio rebuffo <giulio.rebuffo@gmail.com>
Co-authored-by: Shota <silagadzeshota@gmail.com>
Co-authored-by: shota.silagadze <shota.silagadze@taal.com>
Co-authored-by: Dmytro Vovk <vovk.dimon@gmail.com>
Co-authored-by: Massa <massarinoaa@gmail.com>
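A simplified sketch of the promotion rule from point 3; the `segment` type is a hypothetical stand-in:

```go
package main

// Hypothetical minimal segment descriptor.
type segment struct {
	from, to uint64 // block range [from, to)
	indexed  bool   // all indices for this segment are built
}

// promote appends dirty segments to the visible list while they form a
// contiguous, fully indexed extension of the visible range; background
// updates to the remaining dirty segments stay invisible to the app.
func promote(visible, dirty []segment) []segment {
	next := uint64(0)
	if n := len(visible); n > 0 {
		next = visible[n-1].to
	}
	for _, s := range dirty {
		if s.from != next || !s.indexed {
			break // gap, overlap, or unindexed segment: stop promoting here
		}
		visible = append(visible, s)
		next = s.to
	}
	return visible
}
```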
(erigontech#12066)

Align with go-ethereum's detailed out-of-gas reason; ref: https://github.com/ethereum/go-ethereum/blob/b018da9d02513ab13de50d63688c465798bd0e14/core/vm/interpreter.go#L273-L275

```go
dynamicCost, err = operation.dynamicGas(in.evm, contract, stack, mem, memorySize)
cost += dynamicCost // for tracing
if err != nil {
	return nil, fmt.Errorf("%w: %v", ErrOutOfGas, err)
}
if !contract.UseGas(dynamicCost, in.evm.Config.Tracer, tracing.GasChangeIgnored) {
	return nil, ErrOutOfGas
}
```

---------

Signed-off-by: jsvisa <delweng@gmail.com>
closes erigontech#11707

---------

Co-authored-by: JkLondon <ilya@mikheev.fun>
closes erigontech#11974

---------

Co-authored-by: JkLondon <ilya@mikheev.fun>
Co-authored-by: Dmytro Vovk <vovk.dimon@gmail.com>
Co-authored-by: JkLondon <ilya@mikheev.fun>
Made a format change in the interpreter so that the `make docker` task runs (it doesn't run when the changes are only in the .github/ dir).
…age push workflow (erigontech#12115)
- revert changes to ci-cd-main-branch-docker-images.yml
- make those changes instead in ci-cd-main-branch-docker-images2.yml (a temp copy) for quicker testing off PRs
Aces42020 approved these changes (Oct 5, 2024)
# Conflicts:
#	turbo/rpchelper/helper.go
MatusKysel approved these changes (Oct 15, 2024)
setunapo approved these changes (Oct 15, 2024)