Using warp sync on parachain triggers OOM #5053

Dinonard · 2024-07-17T15:28:05Z

Is there an existing issue?

I have searched the existing issues

Experiencing problems? Have you tried our Stack Exchange first?

This is not a support question.

Description of bug

As the title suggest, using warp sync on parachain causes OOM, crashing the client.

We've had this problem on Astar for a few months, and have recently uplifted to polkadot-sdk version v1.9.0 but are still seeing the problem. There are no outstanding traces in the log, it just explodes at some point.

There's an issue opened in our repo, AstarNetwork/Astar#1110, with steps to reproduce as well as images of resource consumption before the crash.

We haven't been able to find similar issues or discussion related to the topic.

Steps to reproduce

Run latest Astar node as described in the linked issue.

The text was updated successfully, but these errors were encountered:

bkchr · 2024-07-22T08:45:25Z

Hey,
This is basically a duplicate of: #4

The problem is that right now we keep the entire state in memory while downloading. For chains that have a big state (not sure what the state size of Astar is), this can lead to OOM on machines with not enough main memory.

Dinonard · 2024-07-22T08:55:46Z

Hey @bkchr,
I'm not sure this is the same as the issue you linked - the memory consumption suddenly spikes, it doesn't grow slowly over the time.

Please check the image here.
There are huge spikes that happen abruptly.

bkchr · 2024-07-22T09:06:24Z

Hmm yeah. I still think it is related to having everything in memory.

Did you try to run the node with gdb? To get some stacktrace when it OOMs?

Dinonard · 2024-07-22T09:08:12Z

We haven't but I'll ask our devops team to do that and I'll post the traces here.

bkchr · 2024-07-22T09:11:16Z

Ty.

Dinonard · 2024-07-24T14:52:29Z

Does this help?
threads.log

We've used this command to generate it:
sudo gdb --batch --quiet -ex "thread apply all bt full" -p <...>

bkchr · 2024-07-24T18:28:43Z

@Dinonard but this is not from the point when OOMs. You need to run the node with gdb all the time attached.

Dinonard · 2024-07-25T06:22:32Z

TBH I checked the logs myself and couldn't see anything wrong, but AFAIK this was run immediately when the node started. Let me get back to you.

Dinonard · 2024-07-26T09:41:52Z

I've run it on the same server as before with gdb now properly encapsulating the entire service.
And it's weird:
Thread 12 "tokio-runtime-w" received signal SIGPIPE, Broken pipe.

gdb_dump.log

Sorry about the missing symbols but I haven't used GDB with any Rust app before.
I figured that building with debug flag (-g) would include the debug info (and symbols?) into the binary itself but it doesn't seem to have helped.

liuchengxu · 2024-08-19T08:59:47Z

Just to share my experiences on the state sync with a chain having a huge state. The subcoin node crashed the first time when importing the downloaded state at a certain height. And then I reran it with the same command syncing the state at the same height, with gdb attached. Unfortunately (or fortunately :P), it didn't crash with gdb. I observed that this successful state importing almost absorbed my entire memory (my machine has 128GiB of memory) at the peak.

2024-08-19 16:35:01.747  INFO tokio-runtime-worker sync: State sync is complete (5438 MiB), restarting block sync.

In light of recent developments, it has become evident that fully syncing to the tip of the Bitcoin network and enabling new nodes to perform fast sync to the latest Bitcoin state is more challenging than initially anticipated, caused by the huge state of UTXO set (over 12GiB). As a result, I propose adjusting the delivery goal for this milestone. The most significant known blocker is paritytech/polkadot-sdk#4. Other underlying issues may also contribute to the difficulty. Recent experiments have shown that fast sync from around block height 580,000 is currently infeasible, succeeding only on machines with 128GiB of memory (paritytech/polkadot-sdk#5053 (comment)), which is impractical for most users. Nevertheless, we have successfully demonstrated that decentralized fast sync is possible within a prototype implementation. While syncing to the Bitcoin network's tip remains a future target, addressing the existing technical challenges will require substantial R&D efforts. We remain committed to exploring potential solutions, including architectural changes and contributing to resolving issue paritytech/polkadot-sdk#4,

liuchengxu · 2024-08-29T03:15:23Z

diff --git a/substrate/primitives/trie/src/lib.rs b/substrate/primitives/trie/src/lib.rs
index ef6b6a5743..e0a2cf3b30 100644
--- a/substrate/primitives/trie/src/lib.rs
+++ b/substrate/primitives/trie/src/lib.rs
@@ -296,23 +296,30 @@ where
        V: Borrow<[u8]>,
        DB: hash_db::HashDB<L::Hash, trie_db::DBValue>,
 {
-       {
+       // {
                let mut trie = TrieDBMutBuilder::<L>::from_existing(db, &mut root)
                        .with_optional_cache(cache)
                        .with_optional_recorder(recorder)
                        .build();

+               tracing::info!("====================== Collecting delta");
                let mut delta = delta.into_iter().collect::<Vec<_>>();
+               tracing::info!("====================== Finished Collecting delta: {}", delta.len());
                delta.sort_by(|l, r| l.0.borrow().cmp(r.0.borrow()));
+               tracing::info!("====================== Sorted delta");

-               for (key, change) in delta {
+               tracing::info!("====================== Starting to write trie, mem usage: {:.2?}GiB", memory_stats::memory_stats().map(|usage| usage.physical_mem as f64 / 1024.0 / 1024.0 / 1024.0));
+               for (index, (key, change)) in delta.into_iter().enumerate() {
                        match change.borrow() {
                                Some(val) => trie.insert(key.borrow(), val.borrow())?,
                                None => trie.remove(key.borrow())?,
                        };
                }
-       }
+               tracing::info!("====================== Finished writing delta to trie, mem usage: {:.2?}GiB", memory_stats::memory_stats().map(|usage| usage.physical_mem as f64 / 1024.0 / 1024.0 / 1024.0));
+               drop(trie);
+       // }

+       tracing::info!("====================== End of delta_trie_root, mem usage: {:.2?}GiB", memory_stats::memory_stats().map(|usage| usage.physical_mem as f64 / 1024.0 / 1024.0 / 1024.0));
        Ok(root)
 }

2024-08-29 10:52:53 ====================== Collecting delta
2024-08-29 10:52:55 ====================== Finished Collecting delta: 104674943
2024-08-29 10:52:56 ====================== Sorted delta
2024-08-29 10:52:57 ====================== Starting to write trie, mem usage: Some(26.23)GiB
2024-08-29 10:54:09 ====================== Finished writing delta to trie, mem usage: Some(76.36)GiB
2024-08-29 11:00:28 ====================== End of delta_trie_root, mem usage: Some(90.53)GiB

I added some logging for the memory usage in the block import pipeline. It turned out that https://github.com/subcoin-project/polkadot-sdk/blob/13ca1b64692b05b699f49e729d0522ed4be730b9/substrate/primitives/trie/src/lib.rs#L285 is the culprit. The memory usage surged from 26 GiB to 76 GiB after the trie was built. Importing the same state does not always succeed, if it crashes due to OOM, Finished writing delta to trie... would be printed and then it crashes, End of delta_trie_root won't be printed. Any clue for the further investigation and potential fix? @bkchr

liuchengxu · 2024-09-03T01:35:26Z

Constructing the entire trie from the state at a specific height in memory seems to be the primary cause of the OOM issue. This is a critical design flaw, in my opinion, especially since the chain state will continue to grow over time. While Polkadot may be fine for now, it will inevitably face the same problem in the long run if we don't address this. Please prioritize pushing this issue forward. @bkchr

bkchr · 2024-09-03T06:55:50Z

Constructing the entire trie from the state at a specific height in memory seems to be the primary cause of the OOM issue.

Yeah sounds reasonable. I need to think about on how to improve this, but yeah this should not happen ;)

liuchengxu · 2024-09-24T06:10:26Z

Hey @bkchr, I understand this is a non-trivial issue, but I wanted to highlight that it’s a critical blocker for the Subcoin fast sync feature. I'm eager to collaborate closely with the Parity team to help push this forward. Let me know how I can contribute!

bkchr · 2024-09-24T07:08:49Z

Yeah that would be nice!

I looked briefly into it. I still want to solve this with #4 together. I have the following rough plan:

Support updating the state of a block. This means instead of importing the entire state with reset_storage in db/lib.rs we are supporting to add new keys to the same state and recalculate the state root. Currently the state root etc is build here. Instead of doing it there, we should forward the key/value to the db layer and then we insert them there to the trie. Updating the trie on the "fly".
Instead of collecting the intermediate key/value pairs in the state sync handler in memory, we send them to the db directly. This way we should solve both problems.
When state sync is finished, we import the header and ensure that the state root matches.

@liuchengxu Do you think that you could start looking into this? I think starting with the db part should be doable in parallel and can be its own pr.

liuchengxu · 2024-09-24T13:39:25Z

@bkchr This makes sense to me, I'll look into the part of updating the state directly using the new keys.

Dinonard added I10-unconfirmed Issue might be valid, but it's not yet known. I2-bug The node fails to follow expected behavior. labels Jul 17, 2024

bkchr closed this as completed Jul 22, 2024

bkchr reopened this Jul 22, 2024

liuchengxu mentioned this issue Aug 22, 2024

Amendment for Subcoin milestone3 w3f/Grants-Program#2376

Merged

bkchr added this to SDK Node Sep 3, 2024

github-project-automation bot moved this to backlog in SDK Node Sep 3, 2024

liuchengxu mentioned this issue Sep 19, 2024

Tracking Issue: Full Fast Sync Support subcoin-project/subcoin#56

Open

5 tasks

liuchengxu mentioned this issue Sep 29, 2024

Support updating the trie changes directly into the database #5862

Open

liuchengxu mentioned this issue Oct 26, 2024

Persistent State Storage for Improved State Syncing #4

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Using warp sync on parachain triggers OOM #5053

Using warp sync on parachain triggers OOM #5053

Dinonard commented Jul 17, 2024 •

edited

Loading

bkchr commented Jul 22, 2024

Dinonard commented Jul 22, 2024

bkchr commented Jul 22, 2024

Dinonard commented Jul 22, 2024

bkchr commented Jul 22, 2024

Dinonard commented Jul 24, 2024

bkchr commented Jul 24, 2024

Dinonard commented Jul 25, 2024

Dinonard commented Jul 26, 2024

liuchengxu commented Aug 19, 2024

liuchengxu commented Aug 29, 2024

liuchengxu commented Sep 3, 2024

bkchr commented Sep 3, 2024

liuchengxu commented Sep 24, 2024

bkchr commented Sep 24, 2024

liuchengxu commented Sep 24, 2024

Using warp sync on parachain triggers OOM #5053

Using warp sync on parachain triggers OOM #5053

Comments

Dinonard commented Jul 17, 2024 • edited Loading

Is there an existing issue?

Experiencing problems? Have you tried our Stack Exchange first?

Description of bug

Steps to reproduce

bkchr commented Jul 22, 2024

Dinonard commented Jul 22, 2024

bkchr commented Jul 22, 2024

Dinonard commented Jul 22, 2024

bkchr commented Jul 22, 2024

Dinonard commented Jul 24, 2024

bkchr commented Jul 24, 2024

Dinonard commented Jul 25, 2024

Dinonard commented Jul 26, 2024

liuchengxu commented Aug 19, 2024

liuchengxu commented Aug 29, 2024

liuchengxu commented Sep 3, 2024

bkchr commented Sep 3, 2024

liuchengxu commented Sep 24, 2024

bkchr commented Sep 24, 2024

liuchengxu commented Sep 24, 2024

Dinonard commented Jul 17, 2024 •

edited

Loading