[MTG-12] merging the asset data into a single column with change to flatbuffer for serialization #298
base: main
Conversation
* feat: add more metrics to the synchronizer
* feat: more metrics
* feat: move fung dump to different thread
* feat: add more metrics
* feat: save fungible updates in regular sync
* feat: change migration file and add indexes drop/create during dump load
* fix: fungible tokens
* chore: code style
Now all the asset-related data is persisted in a single column
…ment over the reference, or over 50% over the initial merge
Current benchmarking results for 1M assets:
* Dumping Group/2k batch size: time [4.8857 s 4.9288 s 4.9785 s]
* Dumping Group/get_assets: time [5.0407 ms 5.0694 ms 5.0968 ms]
* Dumping Group/get_assets_individually (taking 1000 assets in a loop, one after another): time [31.137 ms 33.730 ms 38.050 ms]
…MTG-12-flatbuffer-merge
# Conflicts:
# nft_ingester/src/api/dapi/change_logs.rs
# rocks-db/src/batch_client.rs
# rocks-db/src/dump_client.rs
# rocks-db/src/storage_traits.rs
rocks-db/Cargo.toml
@@ -41,6 +41,7 @@ usecase = { path = "../usecase" }
 tempfile = { workspace = true }
 bubblegum-batch-sdk = { git = "https://github.com/metaplex-foundation/bubblegum-batch-sdk.git", rev = "0d529f5" }
 num-traits = { workspace = true }
+flatbuffers = { version = "24.3.25", features = ["serialize"] }
This version is used because the generated code appears to be compatible with it, but not with the 23.x version used elsewhere in the project. Either the code could be regenerated against 23, or the existing usages updated to 24; I'm not sure how easy or even possible that would be.
@@ -377,6 +377,18 @@ where
        .collect::<Vec<_>>()
}

pub fn pairs_iterator<'a>(
A handy method that is unused by this PR; it was used for iteration in a previous refactor.
@@ -74,6 +94,48 @@ impl Storage {
        .await?;
    Ok(())
}

pub async fn apply_migration_merge(&self) -> Result<()> {
This migration is not fully integrated and is not called by default.
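Since the migration is not wired in by default, here is a minimal, std-only sketch of what a merge migration conceptually does (all names and types are illustrative stand-ins, not the PR's actual API): drain every record from the old per-aspect columns, combine records that share a key, and write the result into the new single column.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the old per-aspect columns and the new merged column.
// A real migration would iterate RocksDB column families instead of HashMaps.
fn apply_migration_merge(
    old_static: &mut HashMap<u64, String>,
    old_owner: &mut HashMap<u64, String>,
) -> HashMap<u64, (String, Option<String>)> {
    let mut merged = HashMap::new();
    for (key, static_details) in old_static.drain() {
        // Pull the matching owner record out of its old column, if present.
        let owner = old_owner.remove(&key);
        merged.insert(key, (static_details, owner));
    }
    // Anything left here has an owner but no static record; a real migration
    // would log or handle these explicitly. The sketch simply drops them.
    old_owner.clear();
    merged
}

fn main() {
    let mut statics = HashMap::from([(1u64, "asset-1".to_string())]);
    let mut owners = HashMap::from([(1u64, "owner-1".to_string())]);
    let merged = apply_migration_merge(&mut statics, &mut owners);
    assert!(statics.is_empty() && owners.is_empty());
    assert_eq!(merged[&1].1.as_deref(), Some("owner-1"));
    println!("migrated {} assets", merged.len());
}
```

After the merge, the old columns are empty and every lookup hits a single merged record.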
* Added slot to data generation in tests
* Added valid on-chain data in test generation
* Fixed a no-longer-valid test
…etadata. fixed another test setup
…MTG-12-flatbuffer-merge + updated the data read to get dynamic details
# Conflicts:
# rocks-db/src/bin/leaf_checker/main.rs
…MTG-12-flatbuffer-merge
# Conflicts:
# rocks-db/Cargo.toml
…MTG-12-flatbuffer-merge
# Conflicts:
# nft_ingester/tests/process_accounts.rs
This PR restructures the asset data to store it within a single column, transitioning from the previous serialization format to Flatbuffers. The rework aims for more efficient data handling and improved asset synchronization across large datasets.
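A minimal, std-only sketch of the single-column idea (type and field names are illustrative, not the PR's actual schema; the real code serializes the combined record with Flatbuffers): per-aspect records that share the same asset key are consolidated into one value, so one lookup returns everything about the asset.

```rust
use std::collections::HashMap;

// Hypothetical per-aspect records previously stored in separate columns.
#[derive(Clone, Debug, PartialEq)]
struct AssetStatic { pubkey: [u8; 32], created_slot: u64 }

#[derive(Clone, Debug, PartialEq)]
struct AssetOwner { pubkey: [u8; 32], owner: [u8; 32], updated_slot: u64 }

// The consolidated record that now lives in a single column.
#[derive(Clone, Debug, PartialEq)]
struct AssetComplete { static_details: AssetStatic, owner: Option<AssetOwner> }

// HashMaps stand in for RocksDB columns: one key maps to one value.
fn consolidate(
    statics: &HashMap<[u8; 32], AssetStatic>,
    owners: &HashMap<[u8; 32], AssetOwner>,
) -> HashMap<[u8; 32], AssetComplete> {
    statics
        .iter()
        .map(|(key, s)| {
            (*key, AssetComplete { static_details: s.clone(), owner: owners.get(key).cloned() })
        })
        .collect()
}

fn main() {
    let key = [1u8; 32];
    let mut statics = HashMap::new();
    statics.insert(key, AssetStatic { pubkey: key, created_slot: 10 });
    let mut owners = HashMap::new();
    owners.insert(key, AssetOwner { pubkey: key, owner: [2u8; 32], updated_slot: 12 });

    let merged = consolidate(&statics, &owners);
    // A single lookup now returns all aspects of the asset.
    assert!(merged[&key].owner.is_some());
    println!("consolidated {} assets", merged.len());
}
```

The benchmark gains below follow from this layout: a batch read touches one column instead of several.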
This PR introduces performance improvements to asset retrieval and database operations, as well as new merging capabilities. Here is a breakdown of key benchmarks comparing the old code (pre-optimization) and the updated code (post-optimization):
New Merging Functionality and Performance
The PR also adds a merging function:
Note on the downgrade: merge performance regresses slightly when the merge operates on the Flatbuffer object, because the simpler merger has extra handling to do for that payload. In exchange, this adds merge functionality that was not previously available.
In summary, this PR yields substantial performance gains in batch and individual asset retrieval while introducing new merging options, particularly useful for scenarios requiring optimized data processing and merging flexibility within large datasets.
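A rough, std-only sketch of a "newest wins" merge of the kind described above (illustrative only: the PR's merger operates on Flatbuffer-encoded bytes, and these field names are assumptions): each field carries the slot it was last updated at, and merging two records keeps the newer value per field.

```rust
// Each field stores (value, slot-it-was-updated-at); merge keeps the newer slot.
#[derive(Clone, Debug, PartialEq)]
struct DynamicDetails {
    supply: Option<(u64, u64)>,
    is_frozen: Option<(bool, u64)>,
}

// Pick whichever side has the higher slot; a present value beats an absent one.
fn newer<T: Clone>(a: &Option<(T, u64)>, b: &Option<(T, u64)>) -> Option<(T, u64)> {
    match (a, b) {
        (Some(x), Some(y)) => Some(if y.1 >= x.1 { y.clone() } else { x.clone() }),
        (Some(x), None) => Some(x.clone()),
        (None, y) => y.clone(),
    }
}

fn merge(existing: &DynamicDetails, update: &DynamicDetails) -> DynamicDetails {
    DynamicDetails {
        supply: newer(&existing.supply, &update.supply),
        is_frozen: newer(&existing.is_frozen, &update.is_frozen),
    }
}

fn main() {
    let existing = DynamicDetails { supply: Some((100, 5)), is_frozen: Some((false, 5)) };
    let update = DynamicDetails { supply: Some((90, 4)), is_frozen: Some((true, 6)) };
    let merged = merge(&existing, &update);
    // The stale supply update (slot 4) is ignored; the newer freeze flag (slot 6) wins.
    assert_eq!(merged.supply, Some((100, 5)));
    assert_eq!(merged.is_frozen, Some((true, 6)));
    println!("merged: {:?}", merged);
}
```

With a Flatbuffer payload, each such merge additionally pays to decode and re-encode the record, which is consistent with the bincode-vs-flatbuffer merge numbers reported below.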
Some benchmarks to include:
Old code, before the refactoring:
Dumping Group/2k batch size (db with 1M assets)
time: [10.282 s 10.352 s 10.430 s]
get_assets (1000 assets batch get)
time: [9.0335 ms 9.4210 ms 9.7307 ms]
get_assets_individually (1000 gets of asset)
time: [82.344 ms 85.425 ms 90.889 ms]
get_asset_indexes
time: [5.6599 ms 5.7022 ms 5.7292 ms]
After the refactoring:
Dumping Group/2k batch size
time: [5.2559 s 5.3231 s 5.3912 s]
get_assets
time: [5.1584 ms 5.2776 ms 5.4180 ms]
get_assets_individually
time: [29.979 ms 33.406 ms 36.333 ms]
get_asset_indexes
time: [4.4278 ms 4.4707 ms 4.5293 ms]
There is a downgrade in merge performance (1000 merges of the same object):
merge with a bincode object (comparable to before, as the old code had no merge for the same objects)
time: [1.6429 ms 1.6450 ms 1.6471 ms]
merge with a flatbuffer object via a simpler merger - the new code
time: [4.1065 ms 4.1116 ms 4.1171 ms]
Note: percent values on screenshots should be ignored, only the time measurements are relevant
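The numbers above come from a benchmark harness; as a rough illustration of how such per-operation timings are obtained, here is a minimal std-only timing sketch (the actual benchmarks use a proper harness with statistical analysis, which these few lines do not replicate):

```rust
use std::time::{Duration, Instant};

// Run the closure `iters` times and return the average duration per iteration.
fn time_it<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed() / iters
}

fn main() {
    let data: Vec<u64> = (0..10_000).collect();
    let avg = time_it(100, || {
        // Stand-in workload in place of a real get_assets batch read.
        let sum: u64 = data.iter().sum();
        std::hint::black_box(sum); // keep the work from being optimized away
    });
    println!("avg per-iteration time: {:?}", avg);
}
```

Reported intervals like `[5.1584 ms 5.2776 ms 5.4180 ms]` are the harness's lower bound, estimate, and upper bound for that per-iteration time.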