Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: archiver identifies prune #8666

Merged
merged 6 commits into from
Sep 26, 2024
Merged

refactor: archiver identifies prune #8666

merged 6 commits into from
Sep 26, 2024

Conversation

LHerskind
Copy link
Contributor

@LHerskind LHerskind commented Sep 20, 2024

Fixes #8620, by using the status function to figure out when a prune have happened and unwind state. It do NOT tell the world state synchronizer about the unwinding and that should be addressed as part of #8665.

Adds tests to e2e_synching making sure that an archiver will correctly catch the prune happening and delete the blocks and accompanying data. Also adds e2e_synching to earthfile.

Removes the proven_store since we learn this from the status and it is not more closely linked to the fetching of blocks.

Note that we are not handling L1 re-orgs specifically in here, but they will for some cases be caught as well. But not all, so there is still #8621 as well.

Fixes an issue where asking for block 0 would return block 1, as there is no block 0. Explicitly return undefined if block 0 is requested now, and gives an error if using getBlocks from a block that is before the initial block.

Copy link
Contributor Author

LHerskind commented Sep 20, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @LHerskind and the rest of your teammates on Graphite Graphite

Copy link
Member

@Maddiaa0 Maddiaa0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, missing unit test + semantics question

this.log.verbose(`L2 prune have occurred, unwind state`);

let tipAfterUnwind = localPendingBlockNumber;
while (true) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition can be hit if the Pending chain reorgs from a prune

In which case the entire epoch will be dropped, correct me if im wrong but if this is the median case then we would make less requests if we start from the proven chain then move forward?

Copy link
Contributor Author

@LHerskind LHerskind Sep 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially yes. I think I had a weird edge case where I needed to figure out whether to use the local proven or the destination proven and were like, I will just take the simple thing that works and then deal with optimisations later if they are really needed. As the expected max depth should be 2*epoch I seems ok to potentially do that when a prune happens, as it would not happen frequently in time, e.g., you cannot do 2 prunes right in a row because there is nothing to prune etc, so doing up to 60 reads at most every 30 minutes seems fine 🤷.

If running the updateProvenBlock in the case of localPendingBlockInChain that should probably be covered though.

Separately, I think when I was doing it, part of having it like this was the case where the l2 prune was caused by an L1 reorg such that the proven local chain was invalid and would also end up being unwound here, but that would not actually update our local proven block number in the same round which is kinda odd 🤷. Actually the updateProvenBlock should probably not only update if provenBlock have moved ahead since that is going to kinda nuke itself with a reorg caused by L1.

* @param blocksToUnwind - The number of blocks we are to unwind
* @returns True if the operation is successful
*/
unwindBlocks(from: number, blocksToUnwind: number): Promise<boolean>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not tested in the archiver_store test

@@ -50,6 +50,15 @@ export interface ArchiverDataStore {
*/
addBlocks(blocks: L1Published<L2Block>[]): Promise<boolean>;

/**
* Unwinds blocks from the database
* @param from - The tip of the chain, passed for verification purposes,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "tip of the chain" in the implementation is the most recently synced L2 block, but the two might be different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify a bit. I'm not following. Both of the values are based on the synched L2 block, e.g., the from and the value we are checking against, so we should not be updating it in the middle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it- my question/concern was around how does the caller know what to pass as "from", and how would it know that it is the tip.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The caller should only really be the archiver, and it will be able to just fetch the information from the store directly. It so mainly to avoid someone doing a call to it directly.

* @returns The requested L2 block.
*/
public async getBlock(number: number): Promise<L2Block | undefined> {
// If the number provided is -ve, then return the latest block.
if (number < 0) {
number = await this.store.getSynchedL2BlockNumber();
}
if (number == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this genesis block? Should it be undefined?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a "genesis" state, but there is not actually a block that you have.

});
}
}

private async handleL2blocks(blockUntilSynced: boolean, blocksSynchedTo: bigint, currentL1BlockNumber: bigint) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really hard to follow all the logic branches in this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will look at clearing it up.

@@ -321,7 +352,7 @@ export class Archiver implements ArchiveSource {
);

if (retrievedBlocks.length === 0) {
await this.store.setBlockSynchedL1BlockNumber(currentL1BlockNumber);
// We are not calling `setBlockSynchedL1BlockNumber` because it may cause sync issues if based off infura.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this comment?

Also, shouldn't we expect that we retrieve blocks if my local pending block num is less than what the rollup says?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look higher up in the archiver there is a larger description around infura and consistency. So yes, while we SHOULD receive something. Infura might not return it to us because of the eventual consistency.

if (block === undefined) {
throw new Error(`Cannot remove block ${blockNumber} from the store, we don't have it`);
}
void this.#blocks.delete(block.data.number);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we delete all the logs/contracts etc associated with the unwound blocks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I forgot, different store.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all of these are added per block, seems like the apis could be made more convenient if we just have an addBlock and unwindBlock on the archiver_store and handle the logic in there. The archiver don't really need to know about these, as long as the store does.

@LHerskind
Copy link
Contributor Author

Going to create a pr that will be "before" this in the stack looking at refactoring the archiver and its stores with api updates.

@LHerskind LHerskind changed the base branch from master to lh/archiver-refactor September 24, 2024 11:11
@LHerskind LHerskind force-pushed the lh/archiver-refactor branch 2 times, most recently from 0c7b804 to 6822f3f Compare September 24, 2024 22:27
@LHerskind LHerskind force-pushed the lh/8620-handle-prune branch 2 times, most recently from 46715c6 to c79a869 Compare September 25, 2024 12:03
Base automatically changed from lh/archiver-refactor to master September 25, 2024 17:07
await rollup.write.prune();
};
sequencer?.updateSequencerConfig({ minTxsPerBlock: variant.txCount, maxTxsPerBlock: variant.txCount });
const txs = await variant.createAndSendTxs();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm kind of surprised this works.

The node has had its archive pruned, but the world state is still unpruned. So we would expect this to fail to prove/verify at the moment right? And the roots it is publishing are garbage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look at the start of that test.

 it.skip('node following prunes and can extend chain', async () => {
      // @todo This test is to be activated when we can unwind the world state
      // It will currently stall forever as the state will never match.

@LHerskind LHerskind merged commit 4cf0f70 into master Sep 26, 2024
93 checks passed
@LHerskind LHerskind deleted the lh/8620-handle-prune branch September 26, 2024 17:51
Rumata888 pushed a commit that referenced this pull request Sep 27, 2024
Fixes #8620, by using the status function to figure out when a prune
have happened and unwind state. It do NOT tell the world state
synchronizer about the unwinding and that should be addressed as part of
#8665.

Adds tests to `e2e_synching` making sure that an archiver will correctly
catch the prune happening and delete the blocks and accompanying data.
Also adds e2e_synching to earthfile.

Removes the `proven_store` since we learn this from the status and it is
not more closely linked to the fetching of blocks.

Note that we are not handling L1 re-orgs specifically in here, but they
will for some cases be caught as well. But not all, so there is still
#8621 as well.

Fixes an issue where asking for block 0 would return block 1, as there
is no block 0. Explicitly return undefined if block 0 is requested now,
and gives an error if using `getBlocks` from a block that is before the
initial block.
Rumata888 pushed a commit that referenced this pull request Sep 27, 2024
Fixes #8620, by using the status function to figure out when a prune
have happened and unwind state. It do NOT tell the world state
synchronizer about the unwinding and that should be addressed as part of
#8665.

Adds tests to `e2e_synching` making sure that an archiver will correctly
catch the prune happening and delete the blocks and accompanying data.
Also adds e2e_synching to earthfile.

Removes the `proven_store` since we learn this from the status and it is
not more closely linked to the fetching of blocks.

Note that we are not handling L1 re-orgs specifically in here, but they
will for some cases be caught as well. But not all, so there is still
#8621 as well.

Fixes an issue where asking for block 0 would return block 1, as there
is no block 0. Explicitly return undefined if block 0 is requested now,
and gives an error if using `getBlocks` from a block that is before the
initial block.
Rumata888 pushed a commit that referenced this pull request Sep 27, 2024
Fixes #8620, by using the status function to figure out when a prune
have happened and unwind state. It do NOT tell the world state
synchronizer about the unwinding and that should be addressed as part of
#8665.

Adds tests to `e2e_synching` making sure that an archiver will correctly
catch the prune happening and delete the blocks and accompanying data.
Also adds e2e_synching to earthfile.

Removes the `proven_store` since we learn this from the status and it is
not more closely linked to the fetching of blocks.

Note that we are not handling L1 re-orgs specifically in here, but they
will for some cases be caught as well. But not all, so there is still
#8621 as well.

Fixes an issue where asking for block 0 would return block 1, as there
is no block 0. Explicitly return undefined if block 0 is requested now,
and gives an error if using `getBlocks` from a block that is before the
initial block.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
e2e-all CI: Enables this CI job.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: Add status to contract and use it to catch prune
4 participants