Alpha4 #525

Merged

merged 139 commits into from Oct 15, 2024

Conversation

@blxdyx (Collaborator) commented Sep 29, 2024

No description provided.

taratorio and others added 30 commits September 6, 2024 12:05
…tech#11887)

Previously I tried to re-use `fixCanonicalChain` in the astrid stage for
handling fork choice updates; however, after more testing I realised that
was wrong, since it causes the below issue upon unwinds:
```
INFO[09-05|08:27:38.289] [4/6 Execution] Unwind Execution         from=11588734 to=11588733
EROR[09-05|08:27:38.289] Staged Sync                              err="[4/6 Execution] domains.GetDiffset(11588734, 0x04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb): not found"
```

That is due to the fact that `fixCanonicalChain` updates the canonical hash
as it traverses the header chain backwards. In the context of updating the
fork choice, this is something that should be done only after the Unwind
has succeeded, otherwise we get the above error when unwinding
execution.

This also causes the second error below, which is a result of the Execution
Unwind failing and rolling back the RwTx of the stage loop (the inserted
block changes get lost). This situation will be further stabilised when
working on erigontech#11533

This PR fixes the first problem by creating a new function, `connectTip`,
which traverses the header chain backwards and collects new nodes and bad
nodes, but does not update the canonical hash while doing so. Instead, the
canonical hashes get updated in `updateForkChoiceForward` after the
unwind has been successfully executed.
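
For illustration, here is a minimal, self-contained sketch of the `connectTip` idea; the types and helpers below are hypothetical stand-ins, not Erigon's actual API:

```go
package sketch

import "fmt"

// Hypothetical, simplified stand-ins for the header/DB machinery.
type Header struct {
	Number     uint64
	Hash       [32]byte
	ParentHash [32]byte
}

type chainDB interface {
	CanonicalHash(number uint64) ([32]byte, bool)        // current canonical hash at a height, if any
	Header(hash [32]byte, number uint64) (*Header, bool) // header lookup by hash+number
}

// connectTip walks backwards from the new tip, collecting headers that are not
// yet canonical and the canonical hashes they would replace, WITHOUT updating
// the canonical hash table. The caller updates canonical hashes only after the
// unwind has succeeded.
func connectTip(db chainDB, tip *Header) (newNodes []*Header, badHashes [][32]byte, err error) {
	for h := tip; ; {
		canonical, ok := db.CanonicalHash(h.Number)
		if ok && canonical == h.Hash {
			return newNodes, badHashes, nil // reached the fork point
		}
		if ok {
			badHashes = append(badHashes, canonical) // old canonical block displaced by the fork
		}
		newNodes = append(newNodes, h)
		parent, found := db.Header(h.ParentHash, h.Number-1)
		if !found {
			return nil, nil, fmt.Errorf("missing parent %x at height %d", h.ParentHash, h.Number-1)
		}
		h = parent
	}
}
```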

Full Logs
```
INFO[09-05|08:27:36.210] [2/6 PolygonSync] forward                progress=11588734
DBUG[09-05|08:27:38.222] [bridge] processing new blocks           from=11588729 to=11588729 lastProcessedBlockNum=11588720 lastProcessedBlockTime=1725521200 lastProcessedEventID=2702
DBUG[09-05|08:27:38.222] [sync] inserted blocks                   len=1 duration=1.882458ms
DBUG[09-05|08:27:38.286] [bridge] processing new blocks           from=11588734 to=11588734 lastProcessedBlockNum=11588720 lastProcessedBlockTime=1725521200 lastProcessedEventID=2702
DBUG[09-05|08:27:38.287] [sync] inserted blocks                   len=1 duration=945.75µs
DBUG[09-05|08:27:38.287] [bor.heimdall] synchronizing spans...    blockNum=11588734
DBUG[09-05|08:27:38.287] [bridge] synchronizing events...         blockNum=11588734 lastProcessedBlockNum=11588720
INFO[09-05|08:27:38.287] [2/6 PolygonSync] update fork choice     block=11588734 age=1s hash=0x04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb
INFO[09-05|08:27:38.287] [2/6 PolygonSync] new fork - unwinding and caching fork choice unwindNumber=11588733 badHash=0x3e1f67072996aec05806d298de3bb281bdcf23566da0dc254c4670ac385768d4 cachedTipNumber=11588734 cachedTipHash=0x04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb cachedNewNodes=1
DBUG[09-05|08:27:38.289] UnwindTo                                 block=11588733 block_hash=0x3e1f67072996aec05806d298de3bb281bdcf23566da0dc254c4670ac385768d4 err=nil stack="[sync.go:171 stage_polygon_sync.go:1425 stage_polygon_sync.go:1379 stage_polygon_sync.go:1538 stage_polygon_sync.go:494 stage_polygon_sync.go:175 default_stages.go:479 sync.go:531 sync.go:410 stageloop.go:249 stageloop.go:101 asm_arm64.s:1222]"
DBUG[09-05|08:27:38.289] [2/6 PolygonSync] DONE                   in=2.078894042s
INFO[09-05|08:27:38.289] [4/6 Execution] Unwind Execution         from=11588734 to=11588733
EROR[09-05|08:27:38.289] Staged Sync                              err="[4/6 Execution] domains.GetDiffset(11588734, 0x04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb): not found"
INFO[09-05|08:27:38.792] [4/6 Execution] Unwind Execution         from=11588734 to=11588733
INFO[09-05|08:27:38.792] aggregator unwind                        step=24 txUnwindTo=38128876 stepsRangeInDB="accounts:0.4, storage:0.4, code:0.4, commitment:0.0, logaddrs: 0.4, logtopics: 0.4, tracesfrom: 0.4, tracesto: 0.4"
DBUG[09-05|08:27:38.818] [1/6 OtterSync] DONE                     in=2.958µs
INFO[09-05|08:27:38.818] [2/6 PolygonSync] forward                progress=11588733
INFO[09-05|08:27:38.818] [2/6 PolygonSync] new fork - processing cached fork choice after unwind cachedTipNumber=11588734 cachedTipHash=0x04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb cachedNewNodes=1
DBUG[09-05|08:27:39.083] [bor.heimdall] block producers tracker observing span id=1812
DBUG[09-05|08:27:43.532] Error while executing stage              err="[2/6 PolygonSync] stopped: parent's total difficulty not found with hash 04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb and height 11588734: <nil>"
EROR[09-05|08:27:43.532] [2/6 PolygonSync] stopping node          err="parent's total difficulty not found with hash 04f1528479c5efae05ac05e38e1402c4e59155049ff44ce6bf5302acb2c25fdb and height 11588734: <nil>"
DBUG[09-05|08:27:43.534] rpcdaemon: the subscription to pending blocks channel was closed 
INFO[09-05|08:27:43.534] Exiting... 
INFO[09-05|08:27:43.535] HTTP endpoint closed                     url=127.0.0.1:8545
INFO[09-05|08:27:43.535] RPC server shutting down 
```
I used a simple "semaphore" to limit the number of goroutines to 4.
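
For context, the "semaphore" mentioned above is just the usual buffered-channel pattern; a minimal sketch (not the actual code from the PR):

```go
package sketch

import "sync"

// runLimited runs work on every item with at most 4 goroutines in flight,
// using a buffered channel as a counting semaphore.
func runLimited(items []int, work func(int)) {
	sem := make(chan struct{}, 4) // capacity 4 == max concurrent workers
	var wg sync.WaitGroup
	for _, item := range items {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot; blocks while 4 workers are running
		go func(it int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			work(it)
		}(item)
	}
	wg.Wait()
}
```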

Co-authored-by: shota.silagadze <shota.silagadze@taal.com>
…1906)

We don't need to process attestations from gossip if the committee index
associated with the attestation is not subscribed or doesn't require
aggregation.
We are trying to optimise `AggregateAndProofService`. After profiling
the service, I see that most of the CPU time is spent on signature
verification. From the graph, the function took 6.6% (74 seconds) of
all the time overall (not just execution time; the percentage would be
much higher if we counted only CPU time), see the screenshot:

<img width="1470" alt="Screenshot 2024-09-01 at 10 28 38"
src="https://github.com/user-attachments/assets/929ce103-2bf3-43d9-a0fa-ca504e4b58bb">


Now we aggregate all the signatures and verify them together with the
`bls.VerifyMultipleSignatures` function in an async way, running the
final callbacks only if verification succeeds. I basically removed all
the code where we verified those three signatures individually and
instead gathered them for later verification. Profiling that gives the
following output:

<img width="1468" alt="Screenshot 2024-09-01 at 10 44 31"
src="https://github.com/user-attachments/assets/abb842a3-0b4f-4640-8a88-791a2d0af62b">

Now most of the time is spent on public key aggregation when verifying
validator aggregated signatures, and I don't think there is a way to
optimise that. Since most of the time goes to `NewPublicKeyFromBytes`,
we could perhaps cache constructed keys, but I think bls already does
that. So, that's as good as it gets.
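
For reference, the shape of the batching change is roughly the sketch below; the types and the verification function signature are assumptions for illustration, not the real `bls` package API:

```go
package sketch

import "errors"

// sigJob is one deferred signature check plus the work to run if the whole
// batch verifies.
type sigJob struct {
	signature []byte
	publicKey []byte
	message   []byte
	onSuccess func()
}

// verifyBatch gathers all deferred checks and verifies them in one
// multi-signature call; only if the whole batch passes do the callbacks run.
func verifyBatch(jobs []sigJob, verifyMultiple func(sigs, pks, msgs [][]byte) (bool, error)) error {
	sigs := make([][]byte, 0, len(jobs))
	pks := make([][]byte, 0, len(jobs))
	msgs := make([][]byte, 0, len(jobs))
	for _, j := range jobs {
		sigs = append(sigs, j.signature)
		pks = append(pks, j.publicKey)
		msgs = append(msgs, j.message)
	}
	ok, err := verifyMultiple(sigs, pks, msgs)
	if err != nil {
		return err
	}
	if !ok {
		return errors.New("batch signature verification failed")
	}
	for _, j := range jobs {
		j.onSuccess()
	}
	return nil
}
```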

---------

Co-authored-by: shota.silagadze <shota.silagadze@taal.com>
Extended pprof read API to include: goroutine, threadcreate, heap,
allocs, block, mutex
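
These are the standard runtime profiles; a minimal sketch of serving them over HTTP with the standard library (Erigon's actual diagnostics wiring differs, this just shows which profiles are involved):

```go
package main

import (
	"log"
	"net/http"
	"net/http/pprof"
)

func main() {
	mux := http.NewServeMux()
	// Expose the same read-only profiles listed above.
	for _, name := range []string{"goroutine", "threadcreate", "heap", "allocs", "block", "mutex"} {
		mux.Handle("/debug/pprof/"+name, pprof.Handler(name))
	}
	log.Fatal(http.ListenAndServe("localhost:6060", mux))
}
```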
…r amoy network (erigontech#11902)

[Polygon] Bor: Added Ahmedabad HF related configs and block number for
amoy network
This is the Ahmedabad block number -
[11865856](https://amoy.polygonscan.com/block/countdown/11865856)

PR in bor - [bor#1324](maticnetwork/bor#1324)
`go1.23.1` has been released, which means it's time to drop `go1.21` support.
Hi there, I found that the old test case has been deleted from the repo. So
here is my advice:

1. Use the old link with a commit hash
2. Delete the link

I'm using the first solution now.
- move logic to `state/sqeeze.go`
- enable code.kv compression - values only
- increase MaxLimit of compress patterns - because it gives a better
compression ratio for code.kv with a smaller dictionary
- Create `heimdall.Reader` for future use in the `bor_*` API
- Make `AssembleReader` and `NewReader` leaner by not requiring the full
`BorConfig`
Support `bor_*` RPCs when using `polygon.sync` and running rpcdaemon with a
datadir.
…rigontech#11929)

Ran into an issue since we started pruning total difficulty.
```
EROR[09-09|10:58:03.057] [2/6 PolygonSync] stopping node          err="parent's total difficulty not found with hash 9334099de5d77c0d56afefde9985d44f8b4416db99dfe926908d5501fa8dbd9e and height 11736178: <nil>
```

It happened for checkpoint
[9703](https://heimdall-api-amoy.polygon.technology/checkpoints/9703).
Our start block was in the middle of the checkpoint range, which meant we
had to fetch all 8k blocks in this checkpoint to verify the checkpoint
root hash when receiving blocks from the peer.

The current logic will attempt to insert all these 8k blocks and will
fail with a missing parent td error, because we only keep the last 1000
parent td records.

This PR fixes this by enhancing the block downloader to not re-insert
blocks behind the `start` block. This solves the parent td error and
also saves some unnecessary inserts during the first waypoint processing
on startup.
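
Conceptually, the downloader change amounts to filtering out already-persisted blocks before inserting; a hypothetical sketch (names are illustrative, not the real downloader code):

```go
package sketch

// Block is a minimal stand-in for a downloaded block.
type Block struct {
	Number uint64
}

// filterForInsert drops fetched blocks behind the start block so they are not
// re-inserted; their parent td records may already have been pruned.
func filterForInsert(fetched []*Block, start uint64) []*Block {
	var toInsert []*Block
	for _, b := range fetched {
		if b.Number < start {
			continue // behind start: already persisted, skip re-insert
		}
		toInsert = append(toInsert, b)
	}
	return toInsert
}
```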
On an empty request, we see the error `can't find blockNumber by txnID=1235`.
…ontech#11873)

Comment out the docker-build-check job as mentioned in issue
[11872](erigontech#11872) -- it will
save us 5-6 minutes of waiting on the routine check for each workflow
run (faster PR checks, etc).

Get rid of "skip-build-cache", which has been removed since v5.
…ch#11938)

Switch to Go builder 1.23.1 and
introduce Docker provenance attestation and SBOM,
which should increase the Docker image score to A.
More issues surfaced on chain tip when testing astrid:
1. context deadline exceeded when requesting new block event at tip from
peer - can happen, safe to ignore event and continue instead of
crashing:
```
[EROR] [09-06|03:41:00.183] [2/6 PolygonSync] stopping node          err="await *eth.BlockHeadersPacket66 response interrupted: context deadline exceeded"
```
2. Noticed we do not penalise peers for penalisable errors when
calling `FetchBlocks` - added that in
3. We got another error that crashed the process,
`ErrNonSequentialHeaderNumbers` - it is safe to ignore the new block event
and continue if this happens:
```
EROR[09-05|20:26:35.141] [2/6 PolygonSync] stopping node          err="non sequential header numbers in fetch headers response: current=11608859, expected=11608860"
```
4. Added all other p2p errors that may happen and are safe to ignore
during tip event processing (see the sketch after this list)
5. Added debug logging for better visibility into chain tip events
6. Fixed a missing check for whether we have already processed a new block
event (i.e. if its hash is already contained in the canonical chain
builder)
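
A sketch of the handling described in points 1-4: transient p2p errors cause the current new-block event to be dropped instead of stopping the node. The error variables below are illustrative placeholders, not Erigon's exact error values:

```go
package sketch

import (
	"context"
	"errors"
	"log/slog"
)

// Illustrative placeholders for the errors that are safe to ignore at tip.
var (
	errNonSequentialHeaders = errors.New("non sequential header numbers in fetch headers response")
	errResponseInterrupted  = errors.New("await response interrupted")
)

// processTipEvent runs the handler for one new-block event. Transient p2p
// errors are logged and swallowed so the sync loop keeps running; anything
// else is propagated and stops the stage.
func processTipEvent(ctx context.Context, handle func(context.Context) error) error {
	err := handle(ctx)
	switch {
	case err == nil:
		return nil
	case errors.Is(err, context.DeadlineExceeded),
		errors.Is(err, errNonSequentialHeaders),
		errors.Is(err, errResponseInterrupted):
		slog.Debug("ignoring transient p2p error at tip", "err", err)
		return nil
	default:
		return err
	}
}
```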
Move Astrid bridge functions to their own gRPC server and client so they do
not rely on the existing block reader infrastructure.
This PR for erigontech#11417 includes:
1. splitting segments into dirtySegments and visibleSegments
2. dirtySegments are updated in the background and are not accessible to the app
3. dirtySegments are added to visibleSegments when:
    - there's no gap/overlap/garbage
    - all types of segments are created and indexed at that height
4. add unit test: `TestCalculateVisibleSegments` (a simplified sketch of the
visibility rule follows this list)
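
A simplified sketch of the visibility rule (types and the function below are illustrative, not the actual snapshot code):

```go
package sketch

import "sort"

// segment is a minimal stand-in for a snapshot file covering [from, to).
type segment struct {
	from, to uint64
	indexed  bool // all indices for this segment have been built
}

// calculateVisibleSegments returns the longest contiguous, fully indexed
// prefix of the dirty segments; only this prefix is exposed to the app, while
// the rest stays "dirty" and is updated in the background.
func calculateVisibleSegments(dirty []segment) []segment {
	if len(dirty) == 0 {
		return nil
	}
	sort.Slice(dirty, func(i, j int) bool { return dirty[i].from < dirty[j].from })
	visible := make([]segment, 0, len(dirty))
	next := dirty[0].from
	for _, s := range dirty {
		if s.from != next || !s.indexed {
			break // gap, overlap, or missing index: stop extending the visible range
		}
		visible = append(visible, s)
		next = s.to
	}
	return visible
}
```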

---------

Co-authored-by: lupin012 <58134934+lupin012@users.noreply.github.com>
Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com>
Co-authored-by: Ilya Mikheev <54912776+JkLondon@users.noreply.github.com>
Co-authored-by: JkLondon <ilya@mikheev.fun>
Co-authored-by: shashiy <shaashiiy@gmail.com>
Co-authored-by: Elias Rad <146735585+nnsW3@users.noreply.github.com>
Co-authored-by: awskii <awskii@users.noreply.github.com>
Co-authored-by: blxdyx <125243069+blxdyx@users.noreply.github.com>
Co-authored-by: Giulio rebuffo <giulio.rebuffo@gmail.com>
Co-authored-by: Shota <silagadzeshota@gmail.com>
Co-authored-by: shota.silagadze <shota.silagadze@taal.com>
Co-authored-by: Dmytro Vovk <vovk.dimon@gmail.com>
Co-authored-by: Massa <massarinoaa@gmail.com>
jsvisa and others added 21 commits September 25, 2024 13:44
 (erigontech#12066)

Align with go-ethereum's detailed out-of-gas (OOG) reason, ref:
https://github.com/ethereum/go-ethereum/blob/b018da9d02513ab13de50d63688c465798bd0e14/core/vm/interpreter.go#L273-L275

```go
dynamicCost, err = operation.dynamicGas(in.evm, contract, stack, mem, memorySize)
cost += dynamicCost // for tracing
if err != nil {
    return nil, fmt.Errorf("%w: %v", ErrOutOfGas, err)
}
if !contract.UseGas(dynamicCost, in.evm.Config.Tracer, tracing.GasChangeIgnored) {
    return nil, ErrOutOfGas
}
```

---------

Signed-off-by: jsvisa <delweng@gmail.com>
closes erigontech#11707

---------

Co-authored-by: JkLondon <ilya@mikheev.fun>
closes erigontech#11974

---------

Co-authored-by: JkLondon <ilya@mikheev.fun>
Co-authored-by: Dmytro Vovk <vovk.dimon@gmail.com>
Co-authored-by: JkLondon <ilya@mikheev.fun>
Made a formatting change in the interpreter so that the `make docker` task
runs (it doesn't run if the changes are only in the .github/ dir).
…age push workflow (erigontech#12115)

- revert changes to ci-cd-main-branch-docker-images.yml
- make those changes instead in ci-cd-main-branch-docker-images2.yml (a temp
copy) for quicker testing off PRs
@setunapo setunapo merged commit aba2217 into node-real:main Oct 15, 2024
7 of 8 checks passed