Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thoughts on RFCBBL1201 #26

Open
hannahhoward opened this issue Nov 5, 2020 · 0 comments
Open

Thoughts on RFCBBL1201 #26

hannahhoward opened this issue Nov 5, 2020 · 0 comments

Comments

@hannahhoward
Copy link
Contributor

Codifying some thoughts I conveyed to @adlrocha in our conversation about RFCBBL1201.

  1. At a base level, I feel this is the right approach. Having a discovery layer, independent of Bitswap and Graphsync is key to mix/matching the protocols effectively. To effectively split a request between the two protocols, you have to know something about who has what.

  2. Some things to know about selectors, especially as it relates to WANTS:
    a. Absent cached information, determining if I can serve a CID + a selector carries a non-trivial CPU + disk cost. For example, to determine if I have CID QMxxx + a selector for the whole DAG beneath it, we must perform the traversal of the whole dag locally. Depending on the size of the DAG, this might carry a huge cost. As a baseline, we would probably never respond to an unbounded recursive selector in a WANT list purely for DOS protection.
    b. This then raise the question: how do we efficiently determine if we can serve a selector in a WANT? One obvious solution is to cache what selectors we've already served successfully. That probably isn't enough, and there are some addition questions we might look at:
    i. trying to develop logic, absent of data, about what selectors are contained in other selectors. For example, if we have already served a selector for a whole dag, we can safely say we can serve a selector for a subset of the dag. That example sounds obvious, but trying to determine the exact hueristic for this is something that may actually be quite complicated and involve some set theory :)
    ii. Perhaps we allow some kind of budgetting system for asking for selectors in WANTS + possibly a tit-for-tat system

  3. Once we get back information about a WANT for a selector, a big topic for exploration is how to split graphsync queries among multiple peers if multiiple peers have the selector. We may be able to do this by "splitting the selector" -- i.e. traversing different branchs of the DAG tree. However, it may also be more efficient to split at the block level. At the moment, selector traversal is deterministic. So if we have 3 peers, we might say to the one peer, that, given a selector traversal that results in blocks indexed 0...n, send blocks where index modulo 3 = 0, and then to another send blocks where index modulo = 1, and to the third send blocks where index modulo 3 = 2. Etc. This may be more efficient for a broader set of dag structures (as opposed to tree splitting which works best on a balanced, wide tree). There are lots of areas to explore here -- just flagging this as a non-trivial implementation detail that does not yet exist.

@daviddias daviddias transferred this issue from protocol/ResNetLab Jan 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant