
eds: Blockstore provided by the Store should support Put #2424

Closed
Tracked by #1099
Wondertan opened this issue Jul 3, 2023 · 2 comments · Fixed by #2532
Wondertan commented Jul 3, 2023

Context

During the design and implementation of #1099, we missed that inner and leaf nodes must be cached during reconstruction via the Blockservice, so Bitswap has no access to them: every Put is currently discarded by the Blockstore.

Problems

  • FNs do not share proofs and data with each other during reconstruction.
  • FNs cannot generate BEFPs for withheld blocks.

Solution

The proposed solution is to change the Blockstore to support Put operations:

  • Blockstore over an on-disk Datastore

    • Dagstore's Blockstore would write to disk and, on reads, check the index first and then the Datastore.
    • The default badger Datastore scales well for write-heavy loads and, given the sparse read patterns of reconstruction and BEFP generation, should also meet the read requirements.
    • The only complication is clean-up, which can be solved naively by iterating over each stored piece with the DeleteBlock method on the Blockstore at the end of the reconstruction session (or even only after we put an indexed EDS, to avoid interrupting the serving of data to other nodes over bitswap).
  • Blockstore over an in-mem Datastore

    • Similar cleanup complication.
    • Similar implementation time estimate as for on-disk.
    • Less robust: does not tolerate restarts.

Testing

We have a (flaky) Swamp reconstruction test asserting that an FN can reconstruct a block from light nodes alone, but we also need a test, similar to the one below, validating that FNs can collectively rebuild the block.

func TestShareAvailable_DisconnectedFullNodes(t *testing.T) {

Once we know that this works fine in Swamp, we should also replicate the same scenario on TG.


musalbas commented Jul 3, 2023

I assume that Put would only be called if shrexeds fails to get the full ODS, thus falling back to sampling anyway, right?

@Wondertan
Member Author

@musalbas, right. Sampling/reconstruction starts only once we have failed to get the ODS for a while.

@musalbas musalbas mentioned this issue Jul 25, 2023
distractedm1nd added a commit that referenced this issue Aug 28, 2023
Closes #2424 

Alternative Design Discussions:
- Using in-memory instead of on-disk: during an unexpected shutdown,
or in the case of AsyncGetter, cleanup would not occur and blocks would
be stuck in the store indefinitely. An in-memory blockstore would fix
this issue, since it is cleared upon restart.
- Using a local Blockgetter to pass to NewErrByzantine: Instead of
putting the blocks into the EDS blockstore, the retrieval session could
make an in-memory blockstore on the fly to hand to NewErrByzantine when
needed. This would look cleaner in the code/make sense architecturally,
but it would mean full nodes cannot share these shares with each other
during reconstruction.
walldiss pushed a commit to walldiss/celestia-node that referenced this issue Sep 22, 2023 (cherry picked from commit 196e849)
walldiss pushed a commit that referenced this issue Sep 22, 2023 (cherry picked from commit 196e849)
walldiss pushed a commit that referenced this issue Sep 25, 2023 (cherry picked from commit 196e849)