-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] feat(consensus): committees data extractor (BFT-443) #2518
[WIP] feat(consensus): committees data extractor (BFT-443) #2518
Conversation
core/lib/dal/migrations/20240725194916_l1_batches_consensus_committees.up.sql
Outdated
Show resolved
Hide resolved
number: BatchNumber, | ||
validator_committee: ValidatorCommittee, | ||
attester_committee: AttesterCommittee, | ||
) -> anyhow::Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we are going to query the last batch with committees/certificate, then keeping them in the same table makes little sense, because we will need extra indices for efficient lookups. Do we really want to go this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Querying the last batch with committees is a one-time read on server start, so I didn’t thought it’s a big deal. And I don’t see such an index for the query of the last batch with certificate (unless it’s applied internally via the toolchain).
In the overall, I’m not against separating the tables, and originally suggested that. IIRC @RomanBrodetski thought it’s an overkill, and that having all info in respect to a given batch in one place is useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no index, because the the table serves as index for itself when it has just 1 non null column. With 2 columns, it won't be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because we will need extra indices for efficient lookups.
why exactly? select ... order by ID desc where certificate is not null
will only scan the rows for which there is committee but not certificate. Given that it happens on startup, it shouldn't be a problem - am I missing smt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no index, because the the table serves as index for itself when it has just 1 non null column. With 2 columns, it won't be the case.
Right, I overlooked that in that case there's no need for WHERE
clause.
why exactly?
select ... order by ID desc where certificate is not null
will only scan the rows for which there is committee but not certificate. Given that it happens on startup, it shouldn't be a problem - am I missing smt?
If column X is NOT NULL, there's no need for WHERE X IS NOT NULL
, and SELECT MAX(PK)
alone is sufficient.
Since the NULL column is the certificate, and not the committee, this won't happen just on startup. However, I'm not sure about the frequency of this query and whether it justifies an index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I would keep the committee in a separate table. As I understand we are doing this without concern for performance, but there are many ways to store the committee, and some don't involve repeating the entire committee in every batch, only when it changes. To keep our options open for future changes, I would keep them separate from the beginning, rather than start messing around with existing constraints to make room for this in the same table, having to revisit the (luckily few) existing queries where we now have to add filtering, turning inserts into updates.
Ok(Some(zksync_protobuf::serde::deserialize(certificate)?)) | ||
} | ||
|
||
pub async fn batch_committee( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to state it in docstrings that this returns the committee which should attest to the batch_number
, and not the one which is the result of executing the batch, ie. the ones attesting for batch_number+1
.
.execute(self.storage) | ||
.await?; | ||
if res.rows_affected().is_zero() { | ||
tracing::debug!(l1_batch_number = ?number, "duplicate batch certificate"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tracing::debug!(l1_batch_number = ?number, "duplicate batch certificate"); | |
tracing::debug!(l1_batch_number = ?number, "duplicate batch committee"); |
@@ -491,6 +552,45 @@ impl ConsensusDal<'_, '_> { | |||
// and doesn't have prunning enabled. | |||
Ok(attester::BatchNumber(0)) | |||
} | |||
|
|||
pub async fn get_last_batch_committee_number( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some docstrings to explain what this method is supposed to do? It sounds like "the committee number of the last batch" but there is no such thing. Is it the batch number of the last executed batch, for which a committee is known? Isn't that available somewhere else?
certificate = $2, | ||
updated_at = NOW() | ||
WHERE | ||
l1_batch_number = $1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The insertion of committees happens asynchronously. In theory the certificate could arrive before the polling loop tries to extract the committee from the VM, in which case this update would not do anything, and the certificate would be lost.
Also the logging that happens as "duplicate certificate" when 0 rows are affected would be untrue, it would apply for a case described above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought this shouldn't happen: before the certificate for batch N can be verified the node needs the committee from the execution of batch N-1, so I assume there is plenty of time between when the committee is extracted from N-1 and inserted for N, especially because we need the commitment for batch N too.
Still it feels like having two tables would avoid any confusion.
let mut next_batch = self.next_batch_to_extract_committee(ctx).await?; | ||
loop { | ||
self.wait_batch(ctx, next_batch).await?; | ||
self.extract_batch_committee(ctx, next_batch).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So wait_batch
calls get_sealed_l1_batch_number which is the last batch in the table, but it's not necessarily executed yet, or is it? What would happen if the contract was called with a number that it doesn't have data for, would it fail, return an empty result, or return something based on results up to the previous block?
tracing::info!("Running zksync_node_consensus::storage::CommitteeExtractor"); | ||
let res = scope::run!(ctx, |ctx, s| async { | ||
let mut next_batch = self.next_batch_to_extract_committee(ctx).await?; | ||
loop { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible at all to hook into the batch execution itself, so instead of polling and eventually loading the data from here to there, the derived data would be inserted within the same transaction?
self.pool | ||
.connection(ctx) | ||
.await? | ||
.insert_batch_committee(ctx, batch_number, db_attester_committee) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checking my understanding of which batch number we mean here.
I thought the committee
in l1_batches_consensus
for batch number N means which is the committee that is expected to sign batch N. This is the result of executing batch N-1.
So we call extract_batch_committee
with the next_batch
that we haven't extracted yet, and then figure out what is the last L2 block included in this batch. We execute the eth_call
there to get the committee after the execution of the last L2 block. Then we insert the committee with the same batch number? Shouldn't it be batch_number.inc()
?
closing this pr, I'll continue this work in #2684 |
Offshoot of #2416
What ❔
Adding state extractor for committees data, reading from the consensus registry contract to be inserted into
l1_batches_consensus
DB table.Why ❔
Reading directly from VM storage (with
zksync_node_consensus::storage::registry_contract::VMReader
) is inefficient, and is using a (slightly) different data format. A background-running denormalization process would ease ongoing per-batch reads.Changes
zksync_node_consensus::storage::registry_contract::CommitteeExtractor
l1_batches_consensus
tablebatch_committee()
&insert_batch_committee()
,get_last_batch_committee_number
inconsensus_dal
sqlx::query
ofinsert_batch_certificate()
,get_last_batch_certificate_number
inconsensus_dal
zksync_node_consensus::storage::testonly::VMWriter
to support writes across multiple batchestest_committee_extractor
TODOs
zksync_dal/migrations
: Update the CHECK constraint oncertificate
column to support nullable column (the constraint name is automatically generated)Checklist
zk fmt
andzk lint
.