Full chain archive sync at protocol level #3092

johndavies24 · 2019-10-08T13:36:49Z

The lack of a trustless, permissionless method of full blockchain archive sync creates an unfair disparity in access to valuable data. At the very least, this creates a barrier to entry for building blockchain tools. At worst, this data has value that can be used to the benefit or detriment of users.

Describe the solution you'd like
Archive nodes should have to opt-out of sharing rather than opt-in. That is, only archive nodes that do not participate in outgoing communications should be able to avoid sharing their blocks. Maybe refusal to share archive blocks should result in protocol level blacklisting of the node and all archive syncs should be reconciled to the largest dataset so all archive nodes have the same data.

Describe alternatives you've considered
Until there is time/bandwidth to code this, a full archive snapshot could be hosted by Grin or Grin community members.

antiochp · 2019-10-09T12:07:31Z

Thanks for opening this issue.

> Archive nodes should have to opt-out of sharing rather than opt-in.

Agreed. 👍 When we were discussing opt-in/opt-out I was thinking in terms of archive/non-archive. I was assuming that archive nodes would opt-in by default to sharing historical blocks. But you would need to explicitly opt-in to being an archive node in the first place.

> all archive syncs should be reconciled to the largest dataset so all archive nodes have the same data.

Agreed. I think this will naturally happen if we make the sync robust enough. i.e. You keep asking other archive nodes for missing blocks and eventually they propagate through the network. If I receive a block that I was originally missing I can now make this available to others etc.
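A minimal sketch of that convergence, not Grin's actual sync code: the node names, the `peers` topology, and both function names are hypothetical, and blocks are reduced to bare heights. Each round, every node asks its direct peers for the heights it is missing, so a block held by only one node travels one hop per round until everyone has it.

```python
from typing import Dict, List, Set


def reconcile_round(
    blocks: Dict[str, Set[int]],
    peers: Dict[str, List[str]],
    chain_height: int,
) -> int:
    """Run one sync round and return the number of blocks transferred.

    Requests are answered from a snapshot taken at the start of the round,
    so a block moves at most one hop per round.
    """
    snapshot = {name: set(held) for name, held in blocks.items()}
    transferred = 0
    for name, held in blocks.items():
        # Heights 0..chain_height that this node does not hold yet.
        missing = set(range(chain_height + 1)) - held
        for peer in peers.get(name, []):
            found = missing & snapshot[peer]
            held |= found      # the node can now serve these blocks to others
            missing -= found
            transferred += len(found)
    return transferred


def reconcile(
    blocks: Dict[str, Set[int]],
    peers: Dict[str, List[str]],
    chain_height: int,
    max_rounds: int = 100,
) -> int:
    """Repeat rounds until nothing moves; return the number of productive rounds."""
    rounds = 0
    while rounds < max_rounds and reconcile_round(blocks, peers, chain_height) > 0:
        rounds += 1
    return rounds


if __name__ == "__main__":
    # Hypothetical line topology a - b - c; only "a" holds the full history.
    blocks = {"a": set(range(6)), "b": {3, 4, 5}, "c": {4, 5}}
    peers = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(reconcile(blocks, peers, chain_height=5), blocks)
```

Note that a height no node holds stays missing everywhere: the network can only converge on the union of what the archive nodes already have.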

> Maybe refusal to share archive blocks should result in protocol level blacklisting of the node

We may want to do this but it may be hard to do reliably until a majority of archive nodes have reconciled to the largest data set. Otherwise we cannot reliably differentiate between refusal to provide these blocks and inability to provide them as they are missing.

We don't ban regular nodes for refusing to provide blocks as this may be for a variety of reasons. We only ban currently if peers provide "bad" data, i.e. invalid blocks.
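To make the distinction concrete, here is a small sketch of that policy. It is an illustration only, with hypothetical names (`Response`, `PeerTracker`), and is not how the Grin peer code is structured. The point is that a requester sees the same thing whether a peer refuses a block or simply does not have it, so only provably bad data leads to a ban.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Set


class Response(Enum):
    VALID_BLOCK = "valid_block"
    INVALID_BLOCK = "invalid_block"
    # Refusal and "I do not have this block" are indistinguishable to the requester.
    NOT_PROVIDED = "not_provided"


@dataclass
class PeerTracker:
    banned: Set[str] = field(default_factory=set)
    misses: Dict[str, int] = field(default_factory=dict)

    def record(self, peer: str, response: Response) -> None:
        if response is Response.INVALID_BLOCK:
            # Bad data is provable misbehaviour, so banning is safe.
            self.banned.add(peer)
        elif response is Response.NOT_PROVIDED:
            # Only count it; the caller should retry with another peer.
            self.misses[peer] = self.misses.get(peer, 0) + 1

    def is_banned(self, peer: str) -> bool:
        return peer in self.banned


if __name__ == "__main__":
    tracker = PeerTracker()
    tracker.record("peer-a", Response.NOT_PROVIDED)
    tracker.record("peer-b", Response.INVALID_BLOCK)
    print(tracker.is_banned("peer-a"), tracker.is_banned("peer-b"))
```

The `misses` counter is kept so that, once most archive nodes hold the full data set, a stricter rule could be layered on top without changing the ban path for invalid blocks.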

johndavies24 · 2019-10-09T16:23:11Z

I'm not exactly sure how nodes store data during any of the Dandelion phases. But one of the things I meant by largest dataset is that a node which captured data prior to any tx aggregation would have to share that information as well. But I really don't understand how it works or whether this idea is valid at all.

antiochp · 2019-10-10T08:17:56Z

You want to consider including tx data in scope for "archive"?
Interesting - definitely needs some more thought around this.

I think it would be technically possible to do this for "fluffed" (post Dandelion) broadcast transactions.

I suspect we would not want to consider Dandelion stem phase txs "in scope" here - it would defeat one of the aims of Dandelion. Only a limited subset of nodes (on the stem path) see a particular unaggregated tx and that is by design.

If archive nodes were to start archiving these and sharing this information we would quickly find that non-archive nodes would simply refuse to relay to archive nodes when stemming transactions.
Archive nodes would be excluded from participating in Dandelion and would never see these txs.
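As a rough sketch of what that exclusion could look like on the relaying side (purely hypothetical: the `Peer` type, the `shares_stem_txs` flag, and `choose_stem_relay` are made up for illustration and do not correspond to anything in the Grin codebase), a node would simply filter such peers out when picking its stem relay:

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Peer:
    addr: str
    # Hypothetical flag: the peer is known to archive and share stem-phase txs.
    shares_stem_txs: bool


def choose_stem_relay(peers: List[Peer], rng: random.Random) -> Optional[Peer]:
    """Pick a stem relay, never choosing a peer that would archive stem txs.

    Returns None when no eligible peer exists; in this sketch the caller is
    assumed to fall back to fluffing (broadcasting) the transaction.
    """
    eligible = [p for p in peers if not p.shares_stem_txs]
    if not eligible:
        return None
    return rng.choice(eligible)


if __name__ == "__main__":
    peers = [Peer("peer-a", True), Peer("peer-b", False), Peer("peer-c", False)]
    print(choose_stem_relay(peers, random.Random()))
```

With a filter like this in place, peers flagged this way never receive unaggregated transactions at all, which is the outcome described above.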

DavidBurkett · 2019-10-10T08:41:11Z

I doubt anyone other than chainalysis and the NSA would even still have the original tx boundaries. Likely more hassle than it's worth. I like the idea of providing archive sync because it allows anyone to run a block explorer, validate the history, etc. But I don't think we should go out of our way to leak more privacy than a typical block explorer would.

johndavies24 · 2019-10-10T11:56:03Z

> Only a limited subset of nodes (on the stem path) see a particular unaggregated tx and that is by design.

Do these nodes store this data? Obviously they can with custom code, but is it possible with the software offered on the Grin GitHub? If they don't, then I don't want this information to start being stored in future releases.

I guess this phase comes before a tx makes it into a block, so it shouldn't create a situation where one archive node has different data than another archive node.

> If archive nodes were to start archiving these and sharing this information we would quickly find that non-archive nodes would simply refuse to relay to archive nodes when stemming transactions.
> Archive nodes would be excluded from participating in Dandelion and would never see these txs.

This would be a good idea if any of this data is stored without custom code/scripts. We can't really stop people from finding unsupported solutions for acquiring more data, and it's pointless to try because "non-archive" nodes could just build custom archive solutions from the data their nodes see.

My issue proposal is only within the scope of "officially" supported data storage for archive nodes.

antiochp · 2019-10-10T15:45:27Z

> Do these nodes store this data?

Ok no - official nodes only maintain the txpool (and stempool) in memory. So if you only want to keep things in scope for "archive" that are stored (presumably on disk) then we only really care about block data.

MCM-Mike · 2020-03-30T17:11:54Z

I am also in favor of having the possibility to run an archive node. As @DavidBurkett said: "I like the idea of providing archive sync because it allows anyone to run a block explorer, validate the history, etc."

We had the idea of posting monthly snapshots, but it would make more sense to have it built in, with an opt-in for a full archive node.

JustAResearcher · 2020-03-30T17:50:20Z

> I am also in favor of having the possibility to run an archive node. As @DavidBurkett said: "I like the idea of providing archive sync because it allows anyone to run a block explorer, validate the history, etc."
>
> We had the idea of posting monthly snapshots, but it would make more sense to have it built in, with an opt-in for a full archive node.

Agreed.

MCM-Mike · 2020-03-31T16:28:09Z

A general decision has been made by the dev meeting today: mimblewimble/grin-pm#248

Out of scope

9	P?	Node	Full chain archive sync at protocol level	#3092	No dev taking task for 4.0.0. Not consensus breaking, can be done in future release.

Let's try it again at the next release.

General question:
As it's not implemented in Grin v4.0.0, will it even be technically possible to sync older blocks once it is implemented?

lehnberg · 2020-03-31T18:03:43Z

> Let's try it again at the next release.

In the meantime, a strong way to improve the chances of getting this implemented is to begin an RFC writing process, covering motivation, alternatives, high-level requirements, and pros and cons, and to start building consensus around a particular approach.

antiochp added enhancement help wanted research labels Oct 9, 2019

MCM-Mike mentioned this issue Mar 31, 2020

Release Planning: Grin v4.0.0 mimblewimble/grin-pm#248

Closed

antiochp mentioned this issue Feb 2, 2021

Tracking Issue - Full (Block) Archive Support #3552

Closed
