Feature request: ability to prune the old ancient blockchain data #26596

Closed
jsvisa opened this issue Feb 3, 2023 · 1 comment
jsvisa commented Feb 3, 2023

Rationale

I'm running a new snap-sync node. After syncing finished, I found that the local chaindata consumes 800+ GB, and about half of that is ancient data:

$ du --max-depth=1 -h data/geth/chaindata
422G    data/geth/chaindata/ancient
830G    data/geth/chaindata

The old ancient data is useless in most cases, so if we support ancient data pruning, we can use less disk space.
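As a quick check of the "about half" figure, the arithmetic on the two `du` numbers above can be sketched as follows. The variable names are illustrative only, and the sizes are simply copied from the output above.

```shell
# Sizes in GB, copied from the du output above.
ancient_gb=422
total_gb=830

# Everything in chaindata that is not ancient data.
other_gb=$(( total_gb - ancient_gb ))

# Integer percentage (shell arithmetic truncates the fraction).
share=$(( ancient_gb * 100 / total_gb ))

echo "ancient: ${ancient_gb}G (${share}% of chaindata), rest: ${other_gb}G"
```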

Implementation

It seems Binance Smart Chain already supports this feature (merged in #543); maybe we can backport it into go-ethereum.

$ ./bin/bsc snapshot prune-block --help
prune-block [command options]

geth offline prune-block for block data in ancientdb.
The amount of blocks expected for remaining after prune can be specified via block-amount-reserved in this command,
will prune and only remain the specified amount of old block data in ancientdb.
the brief workflow is to backup the the number of this specified amount blocks backward in original ancientdb
into new ancient_backup, then delete the original ancientdb dir and rename the ancient_backup to original one for replacement,
finally assemble the statedb and new ancientDb together.
The purpose of doing it is because the block data will be moved into the ancient store when it
becomes old enough(exceed the Threshold 90000), the disk usage will be very large over time, and is occupied mainly by ancientDb,
so it's very necessary to do block data prune, this feature will handle it.


ETHEREUM OPTIONS:
                                      --datadir value                       Data directory for the databases and keystore (default: "/home/amber/.ethereum")
                                      --datadir.ancient value               Data directory for ancient chain segments (default = inside chaindata, '${datadir}/geth/chaindata/ancient/')
                                      --block-amount-reserved value         Sets the expected remained amount of blocks for offline block prune (default: 0)
                                      --triesInMemory value                 The layer of tries trees that keep in memory (default: 128)
                                      --check-snapshot-with-mpt             Enable checking between snapshot and MPT
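
The backup-and-rename workflow described in the help text above can be illustrated with a toy sketch. This is not how bsc implements it: the one-file-per-block layout, the temporary directory, and the variable names are assumptions made purely for illustration, and the real ancient store uses its own file format.

```shell
set -e

# Toy stand-in for the ancient store: one file per block number.
work=$(mktemp -d)
mkdir "$work/ancient"
for n in 0 1 2 3 4 5 6 7 8 9; do
  echo "block $n" > "$work/ancient/$n"
done

reserved=3      # plays the role of --block-amount-reserved
head_block=9    # newest block in the toy store
first_kept=$(( head_block - reserved + 1 ))

# Step 1: copy the most recent blocks into a new ancient_backup directory.
mkdir "$work/ancient_backup"
n=$first_kept
while [ "$n" -le "$head_block" ]; do
  cp "$work/ancient/$n" "$work/ancient_backup/$n"
  n=$(( n + 1 ))
done

# Step 2: delete the original directory and rename the backup into its place.
rm -rf "$work/ancient"
mv "$work/ancient_backup" "$work/ancient"

remaining=$(ls "$work/ancient" | sort -n | xargs)
echo "remaining blocks: $remaining"

# Remove the scratch directory.
rm -rf "$work"
```

Copying into a fresh directory before deleting the original means the old data stays intact until the new store is fully written, at the cost of temporarily needing extra disk space for the reserved blocks.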

rjl493456442 commented Feb 3, 2023

There is an EIP for it: https://eips.ethereum.org/EIPS/eip-4444. The challenge is how to have a strong guarantee that the dropped historical chain data remains retrievable. I believe this challenge has not been resolved yet.

As a short-term solution, you can point the ancient directory at an HDD-based location. It's still performant enough (our freezer design has O(1) read/write complexity), and HDD storage is cheaper.
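
A minimal sketch of that workaround, assuming the `--datadir.ancient` flag (listed in the help output quoted earlier in this thread) and two hypothetical mount points. The paths are placeholders, and the move procedure in the comments is an assumption to verify against your own setup before relying on it.

```shell
# Hypothetical paths: adjust to your own mount points.
DATADIR="/mnt/ssd/ethereum"          # fast disk: state and recent blocks
ANCIENT_DIR="/mnt/hdd/geth-ancient"  # cheaper disk: ancient (freezer) data

# Build the command as a string first so it can be reviewed before running.
GETH_CMD="geth --datadir $DATADIR --datadir.ancient $ANCIENT_DIR"
echo "$GETH_CMD"

# Assumed procedure for an existing node (not run here):
#   1. stop geth
#   2. mv "$DATADIR/geth/chaindata/ancient" "$ANCIENT_DIR"
#   3. start geth with the command printed above
```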

karalabe closed this as completed Feb 3, 2023