Proposal: Selector resource budgets #27
I suspect this would be valuable outside Filecoin context as well.
Thanks for putting this together @warpfork. Interesting! Per comments, before engaging deeper on this one, I think it would be helpful to have some more specific use cases that customers are asking for. Let me know if help is needed to gather any of this.
(They're sorta like regexps for DAGs, if that's a useful comparison for you.)
We want to expose these

The problem is: if a service wants to accept Selectors which are user-specified,
Do we have more use-case examples? Basically:
- I assume anyone with go-ipfs binary installed can do anything they want with selectors and they can hose their machine and we can't fully stop them (but we do have to ensure the network is safe).
- What kind of needs does Filecoin have?
- What are some example service uses?
For go-ipfs: right, I couldn't care less if someone uses a local API to hose themselves. But people seem to want to expose these things publicly. For example, #1 seems to imply we're going to have APIs oriented around Selector queries, and does not say much about not letting these be exposed remotely. This is representative of most conversations I've ever overheard about Selectors and what people want to do with them. (So, if nothing else, this proposal needed to be made to track the situation!)
For filecoin: I taaaag.... @magik6k ? (I have repeatedly heard this is wanted, that's the depth limit of my knowing.)
In general: it seems like it's almost a law of human nature that people want to ask arbitrarily complex questions without concern for the costs on the answerer 😆 / 😢 The Selectors system seems to be no exception, heh.
Big use-case I can't believe I forgot in the earlier comment: graphsync.
We seem to talk about the intention to use graphsync between untrusted peers who might be exchanging data without a fee mechanism. If that's true, then it will be important for such peers to have at least some cost estimation mechanism and cutoff options.
#### Impact
_How directly important is the outcome to web3 dev stack product-market fit?_

However important Selectors are to web3 dev stack PMF, this is that times about 0.95.
Do we have any insight as to how important Selectors are to web3 dev stack PMF? Should we solicit anecdotes/data from the PM team?
Two: It's possible to work around this in some cases by building APIs around selectors,
but then only accepting a known, pre-specified set of selectors.
(If I understand correctly, this is how several pieces of Filecoin currently get around this issue.)
This is not a general workaround, though, and ruins most of the point of Selectors -- they're *supposed* to be user-specifiable.
I don't know the specific use case, but I assume a gimped selector syntax could still be useful for some users, depending on their needs.
Then we would need to invent such a gimped syntax.
That's probably harder than planning and implementing budgets within the current syntax.
My general experience with this topic is: you cannot, no matter how big of a gavel you wave and how energetically you wave it, convince people to stop asking for features that would make a system accidentally Turing-complete (and thus an unbounded DoS vector). (This is doubly true when it comes to tree or graph processing, which, in a sudden flash of hindsight that only occurs to me fully now, probably ought not be a surprise.) Therefore: a monotonically decreasing budget, often aka "gas", is the only real way to unambiguously communicate the problem, and thus the only real practical way to solve it.
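To make the "gas" idea concrete, here is a minimal sketch of a traversal that charges a shared, monotonically decreasing budget for every node it visits. Everything in it is an assumption for illustration: `Budget`, `BudgetExceeded`, `walk`, the flat cost of one unit per visit, and the dict-based `dag` are hypothetical names and choices, not part of any existing Selectors implementation.

```python
class BudgetExceeded(Exception):
    """Raised when a traversal asks for more budget than is left."""


class Budget:
    """A counter that only ever goes down, shared by a whole traversal."""

    def __init__(self, units):
        self.remaining = units

    def spend(self, cost=1):
        # Refuse the charge instead of letting the counter go negative.
        if cost > self.remaining:
            raise BudgetExceeded(f"needed {cost}, only {self.remaining} left")
        self.remaining -= cost


def walk(dag, root, budget):
    """Visit everything reachable from root, charging one unit per visit.

    `dag` is a hypothetical stand-in for a block store: a dict mapping a
    node ID to the list of node IDs it links to. A node that is reachable
    by two paths is visited (and charged) twice, as in a tree walk.
    """
    visited = []
    stack = [root]
    while stack:
        node = stack.pop()
        budget.spend(1)  # charge before doing any work for this node
        visited.append(node)
        # Push children in reverse so they are visited left to right.
        stack.extend(reversed(dag.get(node, [])))
    return visited
```

Because the same `Budget` object is passed through the whole walk, the cutoff applies to the query as a whole and not to any single step of it; a service could catch `BudgetExceeded` and return an error or a partial result.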
The other route is "invent a gimped syntax and a compiler that verifies its non-TC-ness" -- and that's possible; that's what eBPF is, if I understand correctly -- but it's a huge amount of engineering work...
... and I sorta wouldn't bet very favorably on whether that approach would work for tree/DAG processing scenarios anyway. I'd rather bet money that it would end up with people wanting to apply the eBPF-like thing repeatedly on every block they visit.
Which would get us back to approximately the same problem with Selectors right now: since those things would "restart" their budget on every block visited, we'd need some... bigger, holistic, monotonically decreasing budget.
#### Alternatives
_How might this project’s intent be realized in other ways (other than this project proposal)? What other potential solutions can address the same need?_
Is there any concept of pagination/partial results that can be applied to selectors?
It has been discussed, but nothing of the sort is implemented or shipped at present.
Resuming that discussion would probably be a part of the work that would go on while engaging on this project.
In practice, it's my understanding that at present, systems using Selectors get in the habit of launching small queries (depth-limited, or constructed to favor left-leaning trees, for example) to get started with exploring the data, and then use more queries subsequently.
This "works" but obviously leaves some load to the brain of the human crafting the Selector, which isn't really the most desired outcome. (It's maybe fine if you're a human, spelunking interactively -- but it's not so great as a basis for APIs if we want programs to be built which generate Selectors automatically in response to some higher-level user actions.)
Some prior discussion involved ideas like "what if we could ask the selector to return every (N % 3 == 1) blocks?" and similar ideas. The aim there was to end up with something that you could imagine a system generating automatically in order to fan out queries for data to multiple peers and start getting different fractions of the data back from them in parallel.
This only got to the discussion phase. There may be neat ideas here, but they trend towards getting complicated, so we pushed them out of the first round of Selector work.
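For readers trying to picture the "every (N % 3 == 1) blocks" idea, here is a small sketch under heavy assumptions: blocks are numbered by a deterministic traversal order that every peer agrees on, each simulated peer returns only the blocks whose number matches its remainder, and the requester merges the fractions. `traversal_order`, `select_fraction`, and `fan_out` are hypothetical names; nothing like this is implemented, per the discussion above.

```python
def traversal_order(dag, root):
    """Return nodes in a deterministic depth-first, left-to-right order.

    `dag` is a hypothetical dict of node ID -> list of linked node IDs.
    """
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(dag.get(node, [])))
    return order


def select_fraction(dag, root, modulus, remainder):
    """Return (index, node) pairs whose index % modulus == remainder."""
    return [
        (index, node)
        for index, node in enumerate(traversal_order(dag, root))
        if index % modulus == remainder
    ]


def fan_out(dag, root, peers):
    """Ask each of `peers` simulated peers for one fraction, then merge."""
    parts = [select_fraction(dag, root, peers, r) for r in range(peers)]
    # Indexes are unique across parts, so sorting restores traversal order.
    merged = sorted(pair for part in parts for pair in part)
    return [node for _, node in merged]
```

In this sketch every peer still walks the whole graph to count positions, so it spreads out the data transfer but not the traversal cost; that is one of the ways the idea gets complicated.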
tl;dr Selectors currently don't have a resource budgeting system, and this means accepting user-defined selectors is a DoS vector. Work is needed to fix this in order to make Selectors usable in more scenarios. These scenarios include things wanted in Filecoin implementations, to my understanding.