[indexer] index protocol configs and feature flags #18450

emmazzz · 2024-06-28T05:59:14Z

Description

This PR adds two tables protocol_configs and feature_flags that get populated at indexer startup time and every epoch change time if protocol version has changed.

Test plan

Tested locally against devnet.

Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

Protocol:
Nodes (Validators and Full nodes):
Indexer: adds two new indexer tables that stores protocol configs and features flags of different versions.
JSON-RPC:
GraphQL: uses the stored data to query for protocol configs instead of native configs stored in the binary.
CLI:
Rust SDK:

vercel · 2024-06-28T05:59:18Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
sui-docs	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Aug 21, 2024 6:04pm

3 Skipped Deployments

Name	Status	Preview	Updated (UTC)
multisig-toolkit	⬜️ Ignored (Inspect)	Visit Preview	Aug 21, 2024 6:04pm
sui-kiosk	⬜️ Ignored (Inspect)	Visit Preview	Aug 21, 2024 6:04pm
sui-typescript-docs	⬜️ Ignored (Inspect)	Visit Preview	Aug 21, 2024 6:04pm

amnn

Looks good! My main pending question is about the relationship between the task that's responsible for filling the protocol config table and the checkpoint watermark -- it seemed to me like they were decoupled (which we don't want, if that's the case), but I am by no means the expert there.

amnn · 2024-06-28T10:20:45Z

crates/sui-indexer/src/store/pg_indexer_store.rs

+        while {
+            chain_id_opt = self.get_chain_identifier()?;
+            chain_id_opt.is_none()
+        } {}


The busy wait here is a little heavyweight, can we add some small sleep duration to avoid the service hammering the DB?

amnn · 2024-06-28T10:28:21Z

crates/sui-indexer/src/store/pg_indexer_store.rs

+
+        // We only store protocol configs until the version for the latest epoch, because config
+        // values for future epochs, even when they already exist, are subject to change.
+        let end_version = self.get_latest_epoch_protocol_version()?.unwrap_or(1);


nit: The two reads from the DB could be coalesced into one, that returns both latest protocol versions (that may also address the potential confusion between the two that you mention in the doc comments).

amnn · 2024-06-28T12:57:35Z

crates/sui-indexer/src/indexer.rs

@@ -108,6 +108,10 @@ impl Indexer {
            )
            .await?;

+        // Index protocol configs for protocol versions not yet in the db
+        let store_clone = store.clone();
+        spawn_monitored_task!(store_clone.persist_protocol_configs_and_feature_flags());


Does this mean that protocol configs are filled by a parallel process (i.e. not adhering/contributing to the same watermark as the "main" indexing task? That may pose an issue for consistency (where someone issues a query for an epoch and its protocol config and gets one but not the other).

I understand your concern. This issue primarily arises during the startup of an indexer instance. Specifically, the persist_protocol_configs_and_feature_flags call at epoch change time is not a parallel process. However, if a user makes a query right after the indexer restarts after being down for more than an epoch, they could encounter this problem.

I agree that the process should be blocking and completed before we start indexing everything else. I have attempted to implement this, but ran into an issue. In my setup, we read the chain identifier from the database in persist_protocol_configs_and_feature_flags. This identifier is the digest of the first checkpoint, which requires the indexing of the first checkpoint to be completed in the database. That's why we wait in that while loop for the other process to finish indexing checkpoint 0.

Ideally, the process should follow this sequential order:

Index the first checkpoint, if it has not been indexed yet.

Call persist_protocol_configs_and_feature_flags.

Index the rest of the checkpoints.

One potential solution to alleviate this issue is to perform the process in parallel only if the watermark is 0. This way, we only encounter the consistency issue at the beginning with an empty database, which no user will ever encounter. What do you think?

Is it possible to pass the chain identifier to the catch-up process in memory rather than having the job fetch it from the DB? That's one way I can think of to keep the reads consistent always while maintaining this read order.

An alternative is to add a configuration parameter to the indexer that sets which chain it is supposed to be indexing for, which is then used to initialise the protocol config table. That is simpler to implement but then we'll need to update all the deployment configs, and it adds a layer of complexity for starting up the indexer as well, I suppose.

if a user makes a query right after the indexer restarts after being down for more than an epoch, they could encounter this problem

is the problem about protocol config being outdated on indexer, isn't that expected as all other tables are outdated too?

The issue is not so much about it being out-of-date, it's about it being out-of-sync: They could query for an epoch and its protocol config and the protocol config not exist in the DB yet because it's being filled in parallel.

I've changed the code to pass in the chain identifier, and to index protocol configs at the same time we index checkpoint 0, so no busy waiting or separate process anymore.

crates/sui-indexer/src/store/pg_indexer_store.rs

gegaowp · 2024-07-03T15:53:52Z

crates/sui-indexer/src/indexer.rs

@@ -108,6 +108,10 @@ impl Indexer {
            )
            .await?;

+        // Index protocol configs for protocol versions not yet in the db
+        let store_clone = store.clone();
+        spawn_monitored_task!(store_clone.persist_protocol_configs_and_feature_flags());


if a user makes a query right after the indexer restarts after being down for more than an epoch, they could encounter this problem

is the problem about protocol config being outdated on indexer, isn't that expected as all other tables are outdated too?

gegaowp · 2024-07-03T16:01:32Z

crates/sui-indexer/src/store/pg_indexer_store.rs

+        read_only_blocking!(&self.blocking_cp, |conn| {
+            checkpoints::dsl::checkpoints
+                .select(checkpoints::checkpoint_digest)
+                .filter(checkpoints::sequence_number.eq(0))


this always reads the sequence_number.eq(0), what if we have pruned this or restore only the latest 30D data?

Yes this is indeed an issue. We read from checkpoint 0 for chain identifier in other queries too. I think the right thing to do here is to have a separate chain_identifier singleton table that stores just the chain identifier and never gets pruned away. Wdyt? @gegaowp

fixed in #18825

gegaowp · 2024-07-03T16:07:44Z

crates/sui-indexer/src/store/pg_indexer_store.rs

+
+        // We only store protocol configs until the version for the latest epoch, because config
+        // values for future epochs, even when they already exist, are subject to change.
+        let end_version = self.get_latest_epoch_protocol_version()?.unwrap_or(1);


the end_version can be derived from epoch_data and passed to this function in committer, so that we can avoid counting on the DB read here.

In addition to epoch change time, we also call persist_protocol_configs_and_feature_flags at indexer startup time, where epoch_data is not available.

if we can avoid this call at indexer startup time then it looks like we can pass the value in?

gegaowp · 2024-07-03T16:10:11Z

crates/sui-indexer/src/store/pg_indexer_store.rs

+        );
+
+        // Gather all protocol configs and feature flags for all versions between start and end.
+        for version in start_version..=end_version {


is this for for backfill, in other words, if indexer starts along with genesis, here each time we will always run only once?

if indexer starts along with genesis, here each time we will always run only once?

Yes. If indexer is in sync, this line will be run once per epoch so each time the loop will iterate at most once.

ok i see, i suppose if we don't do this at indexer startup, then the first time we hit this loop we may be doing this for a while

wlmyng · 2024-08-09T15:35:45Z

crates/sui-indexer/src/indexer.rs

+        // If we already have chain identifier indexed (i.e. the first checkpoint has been indexed),
+        // then we persist protocol configs for protocol versions not yet in the db.
+        // Otherwise, we would do the persisting in `commit_checkpoint` while the first cp is
+        // being indexed.
+        if let Some(chain_id) = store.get_chain_identifier().await? {
+            store.persist_protocol_configs_and_feature_flags(chain_id)?;
+        }
+


is this really necessary since it'll get picked up on the committer side?

The target use case for this one is for when an up-to-date db (like our backfill dbs) wants to add protocol configs tables for the first time. If we don't do it here, we'll need to wait until epoch change.

wlmyng · 2024-08-09T15:37:30Z

crates/sui-indexer/src/store/pg_indexer_store.rs

+
+        // We only store protocol configs until the version for the latest epoch, because config
+        // values for future epochs, even when they already exist, are subject to change.
+        let end_version = self.get_latest_epoch_protocol_version()?.unwrap_or(1);


if we can avoid this call at indexer startup time then it looks like we can pass the value in?

wlmyng · 2024-08-09T15:37:58Z

crates/sui-indexer/src/store/pg_indexer_store.rs

+
+        // We only store protocol configs until the version for the latest epoch, because config
+        // values for future epochs, even when they already exist, are subject to change.
+        let end_version = self.get_latest_epoch_protocol_version()?.unwrap_or(1);


maybe a comment explaining the diff between this and start_version?

wlmyng · 2024-08-09T15:40:29Z

crates/sui-indexer/src/store/pg_indexer_store.rs

+        );
+
+        // Gather all protocol configs and feature flags for all versions between start and end.
+        for version in start_version..=end_version {


ok i see, i suppose if we don't do this at indexer startup, then the first time we hit this loop we may be doing this for a while

wlmyng · 2024-08-09T15:46:15Z

crates/sui-indexer/src/handlers/committer.rs

+    if is_epoch_end {
+        // The epoch has advanced so we update the configs for the new protocol version, if it has changed.
+        let chain_id = state
+            .get_chain_identifier()
+            .await
+            .expect("Failed to get chain identifier")
+            .expect("Chain identifier should have been indexed at this point");
+        let _ = state.persist_protocol_configs_and_feature_flags(chain_id);
+    }


should we do this before we persist_checkpoints, maybe right after state.advance_epoch under if let Some(epoch_data) = epoch?

We actually have to do this after persist_checkpoints, in the event where the first epoch is committed before the first checkpoints. In that scenario, chain identifier doesn't exist yet in the db when we advance epoch.

amnn

Some nits, but looks good!

amnn · 2024-08-12T14:04:16Z

crates/sui-graphql-rpc/src/types/protocol_config.rs

@@ -30,7 +30,9 @@ pub(crate) struct ProtocolConfigFeatureFlag {

 #[derive(Clone, Debug)]
 pub(crate) struct ProtocolConfigs {
-    native: NativeProtocolConfig,
+    version: u64,
+    configs: BTreeMap<String, Option<String>>,


Do we need the Option here? (Maybe the rest of the PR will answer this question for me).

(Attempting to answer this for myself -- our GraphQL schema differentiates between a config that doesn't exist at any version, and a config that exists but hasn't been set, so that's why we need this Option).

Close. But it's not due to GraphQL differentiating that - protocol config values themselves are Options, likely due to the reason you mentioned.

amnn · 2024-08-12T14:08:45Z

crates/sui-graphql-rpc/src/types/protocol_config.rs

+                ))
+            })?;
+
+        let version = protocol_version.unwrap_or(latest_version as u64);


This code always issues the read to get the latest protocol version, even if you don't use it, you can avoid that with:

let version = if let Some(version) = protocol_version { version } else { let latest: i64 = db .execute(move |conn| { /* ... */ }) .await .map_err(|e| { /* ... */ })?; latest as u64 }

amnn · 2024-08-12T14:10:41Z

crates/sui-graphql-rpc/src/types/protocol_config.rs

+
+        let version = protocol_version.unwrap_or(latest_version as u64);
+
+        let configs: BTreeMap<String, Option<String>> = db


Is there a neat way to get all this data in a single DB query? We should at least re-use the connection.

We could implement a DataLoader for this, but that's not a blocker for this PR.

## Description This PR adds two tables `protocol_configs` and `feature_flags` that get populated at indexer startup time and every epoch change time if protocol version has changed. ## Test plan Tested locally against devnet. --- ## Release notes Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required. For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates. - [ ] Protocol: - [ ] Nodes (Validators and Full nodes): - [x] Indexer: adds two new indexer tables that stores protocol configs and features flags of different versions. - [ ] JSON-RPC: - [x] GraphQL: uses the stored data to query for protocol configs instead of native configs stored in the binary. - [ ] CLI: - [ ] Rust SDK:

vercel bot deployed to Preview – sui-docs June 28, 2024 06:01 View deployment

emmazzz force-pushed the emma/add-protocol-configs branch from 45218c2 to 61ddce9 Compare June 28, 2024 06:37

emmazzz marked this pull request as ready for review June 28, 2024 06:37

emmazzz requested review from gegaowp, amnn and wlmyng June 28, 2024 06:37

vercel bot deployed to Preview – sui-docs June 28, 2024 06:38 View deployment

amnn reviewed Jun 28, 2024

View reviewed changes

wlmyng reviewed Jun 28, 2024

View reviewed changes

crates/sui-indexer/src/store/pg_indexer_store.rs Outdated Show resolved Hide resolved

crates/sui-indexer/src/store/pg_indexer_store.rs Outdated Show resolved Hide resolved

gegaowp reviewed Jul 3, 2024

View reviewed changes

emmazzz force-pushed the idx-breaking-change-park branch from ef8a189 to aa76e71 Compare August 3, 2024 00:54

emmazzz requested review from stefan-mysten and suiwombat as code owners August 3, 2024 00:54

emmazzz force-pushed the emma/add-protocol-configs branch from 61ddce9 to 53a0188 Compare August 7, 2024 13:13

emmazzz requested review from a team, ronny-mysten and mystenmark as code owners August 7, 2024 13:13

emmazzz changed the base branch from idx-breaking-change-park to main August 7, 2024 13:13

emmazzz removed request for a team, mystenmark and ronny-mysten August 7, 2024 13:13

vercel bot deployed to Preview – sui-docs August 7, 2024 13:23 View deployment

wlmyng approved these changes Aug 9, 2024

View reviewed changes

vercel bot deployed to Preview – sui-docs August 11, 2024 23:17 View deployment

amnn approved these changes Aug 12, 2024

View reviewed changes

emmazzz force-pushed the emma/add-protocol-configs branch from 97ee85e to 838381a Compare August 20, 2024 18:29

vercel bot deployed to Preview – sui-docs August 20, 2024 18:34 View deployment

vercel bot deployed to Preview – sui-docs August 20, 2024 19:18 View deployment

vercel bot deployed to Preview – sui-docs August 21, 2024 05:48 View deployment

emmazzz force-pushed the emma/add-protocol-configs branch from 0e02187 to b39c45c Compare August 21, 2024 06:08

vercel bot deployed to Preview – sui-docs August 21, 2024 06:12 View deployment

emmazzz added 6 commits August 21, 2024 07:23

[indexer] index protocol configs and feature flags

1fbbd27

[graphql] use indexed protocol configs

89c0323

use new chain identifier table

4a4245d

address comments

741fc9c

address more comments

0b130ee

fix mysql clippy

d4b6fee

emmazzz force-pushed the emma/add-protocol-configs branch from b39c45c to d4b6fee Compare August 21, 2024 14:23

vercel bot deployed to Preview – sui-docs August 21, 2024 14:27 View deployment

increase data ingestion test running time

0535b78

vercel bot deployed to Preview – sui-docs August 21, 2024 18:04 View deployment

emmazzz merged commit b1976e6 into main Aug 21, 2024
48 checks passed

emmazzz deleted the emma/add-protocol-configs branch August 21, 2024 20:34


		let version = protocol_version.unwrap_or(latest_version as u64);

		let configs: BTreeMap<String, Option<String>> = db

[indexer] index protocol configs and feature flags #18450

[indexer] index protocol configs and feature flags #18450

Conversation

emmazzz commented Jun 28, 2024 • edited Loading

Description

Test plan

Release notes

vercel bot commented Jun 28, 2024 • edited Loading

amnn left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

amnn left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

emmazzz commented Jun 28, 2024 •

edited

Loading

vercel bot commented Jun 28, 2024 •

edited

Loading