-
Notifications
You must be signed in to change notification settings - Fork 742
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSON-RPC: performance problem with chainHead_v1_storage
queries using descendantValues
#5589
Comments
Just to copy in my thought on approach too: Given that we have backpressure we could also just never emit the |
works for me!! |
This issue has been mentioned on Polkadot Forum. There might be relevant details there: https://forum.polkadot.network/t/polkadot-api-updates-thread/7685/15 |
Close #5589 This PR makes it possible for `rpc_v2::Storage::query_iter_paginated` to be "backpressured" which is achieved by having a channel where the result is sent back and when this channel is "full" we pause the iteration. The chainHead_follow has an internal channel which doesn't represent the actual connection and that is set to a very small number (16). Recall that the JSON-RPC server has a dedicate buffer for each connection by default of 64. #### Notes - Because `archive_storage` also depends on `rpc_v2::Storage::query_iter_paginated` I had to tweak the method to support limits as well. The reason is that archive_storage won't get backpressured properly because it's not an subscription. (it would much easier if it would be a subscription in rpc v2 spec because nothing against querying huge amount storage keys) - `query_iter_paginated` doesn't necessarily return the storage "in order" such as - `query_iter_paginated(vec![("key1", hash), ("key2", value)], ...)` could return them in arbitrary order because it's wrapped in FuturesUnordered but I could change that if we want to process it inorder (it's slower) - there is technically no limit on the number of storage queries in each `chainHead_v1_storage call` rather than the rpc max message limit which 10MB and only allowed to max 16 calls `chainHead_v1_x` concurrently (this should be fine) #### Benchmarks using subxt on localhost - Iterate over 10 accounts on westend-dev -> ~2-3x faster - Fetch 1024 storage values (i.e, not descedant values) -> ~50x faster - Fetch 1024 descendant values -> ~500x faster The reason for this is because as Josep explained in the issue is that one is only allowed query five storage items per call and clients has make lots of calls to drive it forward.. --------- Co-authored-by: command-bot <> Co-authored-by: James Wilson <james@jsdw.me>
Close #5589 This PR makes it possible for `rpc_v2::Storage::query_iter_paginated` to be "backpressured" which is achieved by having a channel where the result is sent back and when this channel is "full" we pause the iteration. The chainHead_follow has an internal channel which doesn't represent the actual connection and that is set to a very small number (16). Recall that the JSON-RPC server has a dedicate buffer for each connection by default of 64. - Because `archive_storage` also depends on `rpc_v2::Storage::query_iter_paginated` I had to tweak the method to support limits as well. The reason is that archive_storage won't get backpressured properly because it's not an subscription. (it would much easier if it would be a subscription in rpc v2 spec because nothing against querying huge amount storage keys) - `query_iter_paginated` doesn't necessarily return the storage "in order" such as - `query_iter_paginated(vec![("key1", hash), ("key2", value)], ...)` could return them in arbitrary order because it's wrapped in FuturesUnordered but I could change that if we want to process it inorder (it's slower) - there is technically no limit on the number of storage queries in each `chainHead_v1_storage call` rather than the rpc max message limit which 10MB and only allowed to max 16 calls `chainHead_v1_x` concurrently (this should be fine) - Iterate over 10 accounts on westend-dev -> ~2-3x faster - Fetch 1024 storage values (i.e, not descedant values) -> ~50x faster - Fetch 1024 descendant values -> ~500x faster The reason for this is because as Josep explained in the issue is that one is only allowed query five storage items per call and clients has make lots of calls to drive it forward.. --------- Co-authored-by: command-bot <> Co-authored-by: James Wilson <james@jsdw.me>
Close #5589 This PR makes it possible for `rpc_v2::Storage::query_iter_paginated` to be "backpressured" which is achieved by having a channel where the result is sent back and when this channel is "full" we pause the iteration. The chainHead_follow has an internal channel which doesn't represent the actual connection and that is set to a very small number (16). Recall that the JSON-RPC server has a dedicate buffer for each connection by default of 64. - Because `archive_storage` also depends on `rpc_v2::Storage::query_iter_paginated` I had to tweak the method to support limits as well. The reason is that archive_storage won't get backpressured properly because it's not an subscription. (it would much easier if it would be a subscription in rpc v2 spec because nothing against querying huge amount storage keys) - `query_iter_paginated` doesn't necessarily return the storage "in order" such as - `query_iter_paginated(vec![("key1", hash), ("key2", value)], ...)` could return them in arbitrary order because it's wrapped in FuturesUnordered but I could change that if we want to process it inorder (it's slower) - there is technically no limit on the number of storage queries in each `chainHead_v1_storage call` rather than the rpc max message limit which 10MB and only allowed to max 16 calls `chainHead_v1_x` concurrently (this should be fine) - Iterate over 10 accounts on westend-dev -> ~2-3x faster - Fetch 1024 storage values (i.e, not descedant values) -> ~50x faster - Fetch 1024 descendant values -> ~500x faster The reason for this is because as Josep explained in the issue is that one is only allowed query five storage items per call and clients has make lots of calls to drive it forward.. --------- Co-authored-by: command-bot <> Co-authored-by: James Wilson <james@jsdw.me>
We’ve encountered a performance issue when executing
chainHead_v1_storage
queries with thedescendantValues
option in the new JSON-RPC API.When performing such a query, the RPC node sends an
operationStorageItems
notification containing only 5 items. This is immediately followed by awaitingForContinue
notification. Upon receiving this, we immediately respond with achainHead_v1_continue
request, and this cycle repeats.This results in certain queries taking an excessively long time to resolve. For example, requesting the descendant values of
NominationPools.PoolMembers
on the Polkadot relay-chain can take 6 to 10 minutes to complete using the new JSON-RPC API, while the same request takes only a few seconds with the legacy RPC API.Expected Behavior
The node should efficiently return a larger number of items per
operationStorageItems
notification, especially when there is no sign of back-pressure (e.g., if thechainHead_v1_continue
response is received promptly). Ideally, the node could send hundreds of items at once, and dynamically adjust the number of items sent based on the system's responsiveness.Current Behavior
The node currently sends only 5 items per
operationStorageItems
notification (and always requesting awaitingForContinue
notification), significantly slowing down the resolution of large storage queries.Proposed Solution
operationStorageItems
notification, potentially sending several hundred at a time.chainHead_v1_continue
response is received quickly, send more items in the next notification).Logs
slowOperationStorageItems.log
The text was updated successfully, but these errors were encountered: