-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimising querying for validator delegations #15162
Comments
Another point to be optimised: querying validators delegations with pagination and counting them. If I'm correct, for example if a validator has 1000 delegations and I am querying for first 5 while asking to count all of them, it'll take all of them into the database then return only the first 5, but will use all the entries to calculate the total delegations count. Maybe it's worth it to store delegators count somewhere as a separate key in store and use it for calculation? |
Okay I've researched a bit more, so I was partially right. cosmos-sdk/x/staking/keeper/grpc_query.go Line 87 in 747c29e
It'll get the store and use GenericFilteredPaginate on that, which will query for all the records until it finds N records which are matching the predicate (N = limit). In my case I have 1 delegation on my validator (validator itself), so it'll have to iterate through all the records to check there are no more records to return, as 1 is not enough for a full batch. Also if there's a validator where there are more delegations than the limit in pagination, it'll have to iterate through all entries to count them. It may be fixed by 1) storing all delegations not in the same store key, but by validator, so when you need to get validator delegations, you just take the right key and receive only those with needed validator, and 2) maybe storing delegations count per validator somewhere as a separate key, so no need to iterate over all validator entries to get the count, which would make requests with count_total=true faster. |
thanks for raising this item. even otherwise, the delegations API has been gradually slowing down over the last two years. it has gone from near instant response times to high response times to the current stage where for Cosmos and Terra, it just dies and is impractical). The workarounds have somewhat helped but it is still an everyday operational work item for smart stake dashboard needs (even for syncing all delegations just once a day). Hoping to get some short term improvements as outlined on this ticket and more permanent improvements in longer term. |
I don't think this make the query faster, assume there is only one delegation from a delegator_address and he is querying the state with page_limit 10 and count_total = false (still this query can iterate over all the delegations in state in the worst case, if it is at the end of the state). The only solution that I can see here is add an extra index which can iterate over validator address (which is already discussed here). for ex the key can look like:
with the above key we can iterate over validator_address and decode the complete key to get the delegator address, now we have both delegator_address and validator_address, we can simply form the key to get delegation entry from delegation key prefix using these keys. and also can we close this issue and discuss on the older one, since there is already an issue present for this #10516. |
It may improve the speed of queries with count_total=true though, and I am pretty sure this would be useful, as it can be a use case to query for delegators amount and not the delegators themselves (for example I am using such queries for my validator's website).
Fine by me, but is it possible to put this older issue somewhere higher in the team backlog or resume discussion? The older issue seems quite stale, and the issue is quite present, moreover, with the current indexing it may be used as an attack vector on public nodes, as if you do a lot of such requests (asking for all delegations of a validator with a default limit, while this validator has 1 delegation, which will cause the fullnode to iterate over all delegations) to a public node, it'll apparently run out of resources and will stop being in touch with the rest of the network, keeping staying on one block (I had this behaviour on multiple nodes when doing a lot of such requests constantly). |
You may be right, but if we rely on the FilteredPagination the query won't be faster, since it is unmarshalling the delegations and checking delegation info. Assume the delegation entry is at the end of the state. |
agree for the case where it's at the end, but still it'll be faster if it's in the beginning or in the middle, as then we won't need to iterate over all entries for getting total count |
Hmm, but I feel that's may not be best way to do it. Let's wait to grab more eyes on this. Thanks for the thoughts @freak12techno |
Storing the count doesn't make sense IMO. However, I would agree that having a reverse index of validator address to delegator could make sense. The general idea that always comes up in these types of convos is that the gRPC query layer isn't meant to service all needs. It meant to serve basic representations of data from the state machine. It's the role of indexers and other 3rd party tools to do the heavy lifting. If your tool or service relies on this particular bit of info, you should ideally be using an service like Numia or something. We're trying to remove complexity from the gRPC layer not add more to it 👍 |
I'd love to work on that if possible (as it's something that I would love to be fixed asap, as it breaks my monitoring tools), but I'd need some guidance on that, as I never contributed to cosmos-sdk before so I may have some knowledge gaps. Can someone describe the steps to be done to get this working, if it's not difficult? |
agree with bez, i think we should strive to provide ways to do this off chain instead of on chain. Something like versionDB from crypto.com which will be adopted in the sdk would help with your issue and wouldnt need to use the underlying node. This is a good query to test with storage v2 once we have a working poc. I would guess that if youre not asking for proofs, the performance increase would magnitudes. |
What's being suggested is that you rely on an indexing service for your monitoring tool :) |
But having the index on validators for delegations isn't only affecting the api endpoints performance, but also the chain itself sometimes, doesn't it? I mean the fullnode needs to ask for validators delegations at least when slashing a validator. So this should be a good thing to add not only because the api is slow here. (If the delegators count part is better to be done with external services that's fine by me, but at least indexing delegations by validator would benefit a lot.) |
Nope! It's solely used for this one query. The key is used in the state machine yes, but no where in the state machine do we use this key. |
Okay it's still not clear for me if this would be fixed and if yes, when can it be fixed. My concern if that even if I'll switch my monitoring tools to use something like indexers, there are a lot of tools (namely, restake is the biggest one I guess) that do such queries and it can cause the node such request is done against to get out of sync with the rest of the chain (I had it on my own node until I switched domains on my public node so there's no requests from cosmos.directory: once in a while my node would consider itself to be caught up, while staying on a single block for more than an hour, then it somehow fixes itself, making a node kinda unusable during such periods, as you can't broadcast anything through that node and some different queries to that node fail. I highly doubt it's the hardware, as it doesn't happen with my validator node, which has lower specs, and the issue disappeared once there were no requests from cosmos.directory to my node). @alexanderbez can you please clarify my concerns? |
Since the state machine doesn't need it, it's solely a client UX concern. That being the case, in order to support it would require the state machine to track it and in addition, would require a migration. To me, it's simply not worth it IMO.
Those tools too should be using indexers. |
@tac0turtle I see this issue got closed due to merging PR, nice to hear that! |
yes!! @atheeshp killed it on finding a solution!! He is owed all the thanks!! It will be in the eden release, its unclear when this release will land on the hub. We are aiming for a end of june release in the sdk |
When querying for validator's delegations, if I am correct, it takes all the delegations from the database, and then only returns those where validator address matches the one queried for (x/staking/keeper/delegation.go). This might make it awfully slow if, for example, there's 10k delegations overall and validator I'm querying for has only 5 delegations.
It concerns me because I have my tools that are periodically querying my delegators count (by asking for 1st delegation and the count of all delegators), and such queries are basically killing the fullnode I'm querying it from.
I can likely make a PR on optimising it, but I'm not sure if we store delegations by validator somewhere (likely not, if not we need to index it separately to work. If yes, we just need to take the validator delegations and use it instead of asking for all delegations).
Do you think it's the right approach (the one I described) or do we have other reasons to do it the way it's done now?
The text was updated successfully, but these errors were encountered: