-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Efficient Querier Pagination #6191
Comments
@aaronc even type A will change the result type, since we need a way to return the after []byte type. Am I wrong? The only issue I see maybe is the way pagination will work in CLI command line, maybe is a little bit annoying to provide the last element ID as a param. |
Well ideally that is included as the key of the last item in the result set. But the cleaner way would probably be to include it directly in the result set as |
Ok! What we do then about this, I can provide an example on how to do case B in one PR for a specific case, and if we like implement it for all, or we wait to gRPC? |
Gonna read it. This is the PR your are using right? #5953 |
Yes, that's the init PR |
Here's what I'm thinking, just take Ex: message BalancesRequest {
bytes address = 1;
bytes page = 2;
uint32 limit = 3;
}
message BalancesResponse {
repeated cosmos_sdk.v1.Coin balances = 1;
bytes next_page = 2;
} |
I see, and I think is a good approach, I wonder how user friendly would be this from the point of view of the command line. Is not the same --page 1 that --page 0x1234567890ABCD but I can't come with other idea that does not need to do a traversal of the full set. |
One way would be to support both -- efficient and lazy (current) pagination: message BalancesRequest {
bytes address = 1;
uint32 page = 2;
bytes page_key = 3;
uint32 limit = 4;
} The idea is that a client can either provide |
Hmm... I'd rather not complicate things by supporting both options on the querier level. Most usage I imagine will actually be programmatic. I can't see any other use case beside the CLI, and if we really want to support the simple |
Yeah, maybe we can just accept this tradeoff. |
Sorry, I am just thinking, but I see with the approach suggested other problem with the REST. This approach only allows pagination in a webapp in the style of: <- prev next -> Since we always need at least the first call to get the next page key. So exact paginations on the style of: 1 2 3 4 5 6 ... etc wont be possible without many tricks. This is just some thoughts, I am not advocating against, just to bring them to the table and have them in mind. |
Why is this a tradeoff or complication? It's pretty trivial (if/else check). Clients that don't want to pass around a byte/hex key should not have to. |
I think what @aaronc says is that from the point of view of the API having two types of pagination is not optimal, not because of the coding complexity but from API design. |
Mhhh, I respectfully disagree. Can you please explain how it's non-optimal? The client can choose to use which ever method it wants. One value is omitted over the other. |
I understand you perfectly, with non optimal I wanted to say that is less simple than having only one way of pagination, which from a client perspective it just makes it a little bit harder but just looking at uint32 page = 2; to understand the difference of both. Don't get me wrong @alexanderbez! I think is not a bad idea to have both, I did not think about it and I think it helps to bring the --page 1 use case. Whatever we decide is ok for me. :D |
Maybe it's the right thing to do, I just think it would be simpler if we could avoid it. It's added complexity for every request type and every querier method. I'll see if I can find a simpler way... but maybe there isn't. |
ACK whatever you guys wish to go with... |
Did a quick spike on this with for #5975 I think this approach will address both concerns: https://github.com/cosmos/cosmos-sdk/pull/5953/files#diff-bc1bef9ecc3076635102b2d152b5449c https://github.com/cosmos/cosmos-sdk/pull/5953/files#diff-8ca04691fa5aa1d8ede405ec840d9f21 Using page numbers can be made more efficient if the iteration only unmarshals the values that are actually going to be returned. The |
IMHO we should focus on one way to query and keep the api simple and self explaining. Either cursor or pages. The cursor is more efficient on the server side as we would not have to read and skip data. It is also more correct with data being inserted while iterating. An insert before the cursor would not have any effect on the result set and after the cursor would just be appended. Maybe we can learn more about the common client use cases. |
So I'd like us to come to a decision on this either way. My current proposal as represented in #6353 (specifically https://github.com/cosmos/cosmos-sdk/pull/6353/files#diff-bc1bef9ecc3076635102b2d152b5449c) is to handle both keys as @alpe has suggested or standard limit/offset. I don't think it's that complex to handle both and I think it would account for both:
I understand @alpe's concerns about just wanting to support one approach, but at least on the server side we can abstract over the complexity with helper functions. In #6353, I have a helper that works over If we had to choose between one of those approaches, I would advocate for @alpe's more efficient key approach. But if we don't want to have the opinionated either/or battle, I present #6353 as a middle ground. A few details on the approach in #6353:
Whether we choose this approach or settle on supporting just one of offset/pagenum or key, I'd prefer to settle on something rather quickly because other than this, gRPC module queriers are unblocked. If we decide on something, I think #6353 can be completed rather quickly and then each gRPC querier will get pagination from the start. |
+1 #6353 seems to be a better approach considering all these points. Querier is blocked on this and we should move quickly. I would advocate to support both for now and handle the complexity with helper function on server side. We are planning to resume #6353 and start working on querier stuff. |
Any other opinions here? @alexanderbez @jgimeno ? It we're fine with the dual-approach #6353, it would be great to green-light it and get it moving. |
My opinion still stands from my previous comment -- as long as we can support both methods 👍 |
Okay great, seems like we're on the same page. Let's move forward with the dual approach in #6353 and we can discuss any details in that PR. |
@aaronc we might need to use One workaround can be to use |
I don’t understand what the problem is. If offset and key are both unset the behavior should be identical. If key is set and offset is zero we use key. There is no ambiguity. If both offset and key are set there should be a validation error. |
Okay, as discussed over chat, we'll be using Edit: |
@aaronc @alexanderbez @clevinson I suggest we enforce a We can have a hard coded value of Any thouthts? |
I'm in favor of a max limit, although idk if 25 is the ideal limit. Why not just keep it at 100? |
100 makes sense to me |
Sure, we'll keep the max limit to 100 |
@aaronc how do we apply this for custom filtered querying. Lets say for proposals we have I think we can fix this by returning But we'll have a problem with |
Summary
It has been brought to my attention that the current approach to pagination is inefficient because it still requires the full result set to be retrieved by the querier so it will slow down under really huge result sets.
Proposal
@alpe proposed two approaches here: regen-network/cosmos-modules#63 (review) which I am quoting below
Approach B (adding a type agnostic
Cursor
field) may be the better approach going forward and would be the approach I recommend as a starting point. Approach A, however, might be easier to adapt to existing query methods without changing the result types (i.e. for #6163).I would propose this be done from the start for gRPC queriers (#5894).
I would not recommend changing existing querier methods unless clients complain of UX problems in order to not break backwards compatibility of existing clients during the migration to protobuf.
The text was updated successfully, but these errors were encountered: