Implement delayed chunks application for Stateless Validation #9982

Open
Tracked by #9292 ...
pugachAG opened this issue Oct 20, 2023 · 3 comments
Labels: A-chain (Area: Chain, client & related), A-stateless-validation (Area: stateless validation), Near Core

Comments

@pugachAG
Contributor

See the doc for more context.

This includes delaying the application of a block's chunks until the next block is processed or a new chunk needs to be produced.
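To make the idea concrete, here is a minimal sketch of delayed application on a toy single-shard chain. Everything in it (`Block`, `Chain`, `process_block`, `state_root_for_new_chunk`, the additive "state root") is hypothetical and is not nearcore's actual API; it only illustrates that processing a block applies the parent's chunk, and that producing a new chunk forces the pending application.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified block: one shard, one chunk per block.
#[derive(Clone, Debug)]
pub struct Block {
    pub height: u64,
    pub prev_height: Option<u64>,
    /// Stand-in for the chunk's transactions and receipts.
    pub chunk_payload: u64,
}

/// Toy chain that records which chunks have been applied.
#[derive(Default)]
pub struct Chain {
    blocks: HashMap<u64, Block>,
    /// Toy "state root" after applying the chunk of the block at each height.
    state_roots: HashMap<u64, u64>,
}

impl Chain {
    /// Stores the block and applies the chunk of its parent, not its own.
    pub fn process_block(&mut self, block: Block) {
        let prev_height = block.prev_height;
        self.blocks.insert(block.height, block);
        if let Some(prev) = prev_height {
            let _ = self.ensure_applied(prev);
        }
    }

    /// Producing a chunk on top of `prev_height` needs that block's
    /// post-state, so the delayed application is forced here.
    pub fn state_root_for_new_chunk(&mut self, prev_height: u64) -> Option<u64> {
        self.ensure_applied(prev_height)
    }

    pub fn is_applied(&self, height: u64) -> bool {
        self.state_roots.contains_key(&height)
    }

    /// Applies the chunk at `height` if needed, applying ancestors first.
    /// Returns `None` if the block (or one of its ancestors) is unknown.
    fn ensure_applied(&mut self, height: u64) -> Option<u64> {
        if let Some(root) = self.state_roots.get(&height) {
            return Some(*root);
        }
        let block = self.blocks.get(&height)?.clone();
        let prev_root = match block.prev_height {
            Some(prev) => self.ensure_applied(prev)?,
            None => 0,
        };
        // Toy state transition: the new root is the old root plus the payload.
        let root = prev_root.wrapping_add(block.chunk_payload);
        self.state_roots.insert(height, root);
        Some(root)
    }
}
```

In this sketch, processing the block at height 2 is what applies the chunk from height 1, while the chunk from height 2 stays unapplied until either a block at height 3 is processed or `state_root_for_new_chunk(2)` is called.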

@Longarithm
Member

It seemed to me that I could achieve all three sub-goals from the Miro board for #9679 by applying the right modifications to `get_apply_chunk_job_new_chunk`. Unfortunately, all my attempts ended with a couple of integration tests failing for different reasons, including:

  • `ChunkExtra` having to be stored for the block whose chunk was executed
  • misalignment with the processing of staking txs in the epoch manager

So I decided to go back to the previous non-async approach and iterate from it more slowly. A pair of fixes has already brought the number of nayduck failures down to "just" 60. I plan to reduce it further and then apply chunks asynchronously again.

github-merge-queue bot pushed a commit that referenced this issue Nov 6, 2023
For stateless validation, chunk execution will be delayed until the next
block is processed: #9982. This impacts several tests that assume
block processing includes chunk processing as well. For these tests, we
need to produce and process one more block to get execution results like
ChunkExtra and ExecutionOutcome.

To allow production of longer forks, I want to extend the client API with
`produce_block_on`, which can produce a block not just on top of the head
but on top of any existing block. As this block isn't immediately saved
or processed, it doesn't even break any guarantees.
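
A rough illustration of the idea, using a toy client rather than the real one: the `Client` and `Block` types, the `(height, prev_hash)` signature, and the integer "hashes" below are all assumptions made for this sketch and are not taken from the PR.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified block; `u64` values stand in for real hashes.
#[derive(Clone, Debug, PartialEq)]
pub struct Block {
    pub hash: u64,
    pub prev_hash: u64,
    pub height: u64,
}

/// Toy client that tracks known blocks and the current head.
pub struct Client {
    blocks: HashMap<u64, Block>,
    head: u64,
}

impl Client {
    /// Starts from a genesis block with hash 0 at height 0.
    pub fn new() -> Self {
        let genesis = Block { hash: 0, prev_hash: 0, height: 0 };
        let mut blocks = HashMap::new();
        blocks.insert(genesis.hash, genesis);
        Client { blocks, head: 0 }
    }

    pub fn head(&self) -> u64 {
        self.head
    }

    /// Builds a block on top of any known block. Takes `&self`: nothing is
    /// saved or processed here, so the client's state cannot change.
    pub fn produce_block_on(&self, height: u64, prev_hash: u64) -> Option<Block> {
        let prev = self.blocks.get(&prev_hash)?;
        if height <= prev.height {
            return None;
        }
        // Toy hash derived from the parent hash and the height.
        let hash = prev_hash.wrapping_mul(1_000_003).wrapping_add(height);
        Some(Block { hash, prev_hash, height })
    }

    /// The usual case: build on top of the current head.
    pub fn produce_block(&self, height: u64) -> Option<Block> {
        self.produce_block_on(height, self.head)
    }

    /// Saves the block and moves the head if the block is higher.
    pub fn process_block(&mut self, block: Block) {
        let head_height = self.blocks[&self.head].height;
        if block.height > head_height {
            self.head = block.hash;
        }
        self.blocks.insert(block.hash, block);
    }
}
```

In this sketch a test can build a fork by calling `produce_block_on` with a non-head parent, then decide separately whether and when to pass the result to `process_block`.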

## Testing

Impacted tests should still pass. Later stateless validation PRs will
rely on it.

---------

Co-authored-by: Longarithm <the.aleksandr.logunov@gmail.com>
github-merge-queue bot pushed a commit that referenced this issue Dec 12, 2023
This is the next step for #9982.
Here I introduce jobs which will perform stateless validation of a newly
received chunk by executing txs and receipts.
Later they should be executed against the state witness, but for now I just
lay a foundation by running these jobs against state data in the DB. All
passing tests verify that old and new jobs generate the same result.
The final switch will happen when stateful jobs are replaced with
stateless ones.

### Details

This doesn't introduce any load on the stable version. On the nightly version
there will be `num_shards` extra jobs which will check that stateless
validation results are consistent with stateful execution. But as we use
nightly only for testing, it shouldn't add much overhead.

I add more fields to the `ShardContext` structure to simplify the code. Some of
them are needed to break early if there is resharding, and the logic is
the same for both kinds of jobs.

`StorageDataSource::DbTrieOnly` is introduced to read data only from
the trie in stateless jobs. This is annoying but still needed if there are a
lot of missing chunks and the flat storage head has moved above the block at
which the previous chunk was created. Once the state witness is
implemented, `Recorded` will be used instead.

## Testing

* Failure to update `current_chunk_extra` along the way leads to >20 tests
failing in `process_blocks`, with errors like `assertion `left == right`
failed: For stateless validation, chunk extras for block
CMV88CBcnKoxa7eTnkG64psLoJzpW9JeAhFrZBVv6zDc and shard s3.v2 do not
match...`
* If I update `current_chunk_extra` only once,
`tests::client::resharding::test_latest_protocol_missing_chunks_high_missing_prob`
fails, which was specifically introduced for that case. Actually, this helped
me realize that `validate_chunk_with_chunk_extra` is still needed, but I
will introduce it later.
* Nayduck: ~https://nayduck.near.org/#/run/3293 - +10 nightly tests
failing, will take a look~ https://nayduck.near.org/#/run/3300

---------

Co-authored-by: Longarithm <the.aleksandr.logunov@gmail.com>
@Longarithm
Member

Longarithm commented Feb 2, 2024

Quick update: we are no longer fully doing this within the scope of the SV mainnet release.
The main reason is that applying old chunks is more problematic than we thought.
We still want to do it, but the fix must include all of the following together:

  • replacing the "apply old chunks" concept with "update validator accounts for empty chunk range"
  • as a new chunk appears in a block, it should fix the set of incoming receipts to be executed, which will run from the previous chunk (inclusive) to the new chunk (exclusive). Currently we take the "previous" range
  • to be able to apply chunks eagerly, it should be possible to reapply an old chunk every time a new chunk is missing, because the set of receipts to execute is changing.
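
The receipt range from the second point can be sketched as follows. This is only an illustration under assumed names: `incoming_receipts_range` and the `has_new_chunk` slice (one flag per block height for a single shard) are hypothetical and are not part of nearcore.

```rust
use std::ops::Range;

/// For a new chunk at `height`, returns the block heights whose incoming
/// receipts it should execute: from the previous new chunk (inclusive) up to
/// the new chunk (exclusive). `has_new_chunk[h]` says whether the block at
/// height `h` contains a new chunk for the shard.
/// Returns `None` if there is no new chunk at `height`, if `height` is out of
/// bounds, or if there is no earlier chunk to start the range from.
pub fn incoming_receipts_range(has_new_chunk: &[bool], height: usize) -> Option<Range<usize>> {
    if !*has_new_chunk.get(height)? {
        return None;
    }
    // Search backwards for the closest earlier height with a new chunk.
    let prev = has_new_chunk[..height].iter().rposition(|&present| present)?;
    Some(prev..height)
}
```

For example, with chunks present at heights 0, 3, and 4 and missing at heights 1 and 2, the chunk at height 3 would cover heights `0..3` and the chunk at height 4 would cover `3..4`. While heights 1 and 2 have no new chunk, nothing is fixed yet, which is why eager application would require reapplying the old chunk as the range grows.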

IIRC this is valuable because we won't have to apply receipts just to do a state sync, and the protocol logic becomes MUCH more consistent.
