New query GetPoolDistr #3932
I'm Requesting Changes, mostly to ensure that we have a good answer to whether calculatePoolDistr is prohibitively expensive.
ouroboros-consensus-shelley/src/Ouroboros/Consensus/Shelley/Ledger/Query.hs
@@ -273,6 +278,11 @@ instance ShelleyCompatible proto era => QueryLedger (ShelleyBlock proto era) whe
 , SL._retiring = Map.restrictKeys (SL._retiring dpsPState) poolIds
 }
 Nothing -> dpsPState
+GetPoolDistr mPoolIds ->
+let poolDistr = SL.calculatePoolDistr . SL._pstakeSet . SL.esSnapshots $ getEpochState st in
Is this calculatePoolDistr perhaps too expensive?
It might be, but I don't know that we have any clear requirements on what is too expensive. Note that SL._pstakeSet is one of the data structures that we anticipate being moved to disk.
This is for query leadership-schedule, so I think we have a good amount of headroom. I wouldn't expect SPOs etc. to be spamming this command (though it's easy enough to do that accidentally within a script).
We could micro-benchmark the function and see where we are at, if we are worried.
For a general idea as to what calculatePoolDistr does:
- convert the very large stake distribution into an association list
  - un-compacting the coins in the process
  - looking up the stake creds in the delegation map in the process
- Map.fromListWith (+) on the big association list
- Map.intersectionWith on the condensed map to add in the VRF keys
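The steps listed above can be sketched roughly as follows. This is a simplified illustration, not the ledger's implementation: SnapShotSketch, CompactCoin, PoolStake and calculatePoolDistrSketch are hypothetical stand-ins for the real cardano-ledger types, and normalising by the total of all stake in the snapshot is an assumption made for the example.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)
import Data.Ratio ((%))
import Data.Word (Word64)

-- Hypothetical stand-ins for the ledger types (assumptions, not the real API).
type Credential = String
type PoolId     = String
type VrfKeyHash = String

-- Stand-in for the compact coin representation kept in the snapshot.
newtype CompactCoin = CompactCoin Word64

data SnapShotSketch = SnapShotSketch
  { ssStake       :: Map Credential CompactCoin -- the very large stake distribution
  , ssDelegations :: Map Credential PoolId      -- the delegation map
  , ssPoolVrfs    :: Map PoolId VrfKeyHash      -- VRF key hash per registered pool
  }

data PoolStake = PoolStake
  { psFraction :: Rational
  , psVrf      :: VrfKeyHash
  } deriving (Eq, Show)

calculatePoolDistrSketch :: SnapShotSketch -> Map PoolId PoolStake
calculatePoolDistrSketch (SnapShotSketch stake delegs vrfs) =
  -- Step 3: intersect the condensed per-pool map with the VRF keys.
  Map.intersectionWith PoolStake fractions vrfs
  where
    -- Step 1: build the big association list, un-compacting each coin and
    -- looking up each credential in the delegation map along the way.
    -- Credentials without a delegation are dropped.
    assocs :: [(PoolId, Integer)]
    assocs =
      [ (pool, toInteger c)
      | (cred, CompactCoin c) <- Map.toList stake
      , Just pool <- [Map.lookup cred delegs]
      ]

    -- Assumption: normalise by all stake in the snapshot; guard against zero.
    total :: Integer
    total = max 1 (sum [toInteger c | CompactCoin c <- Map.elems stake])

    -- Step 2: condense the association list into one entry per pool.
    perPool :: Map PoolId Integer
    perPool = Map.fromListWith (+) assocs

    fractions :: Map PoolId Rational
    fractions = Map.map (% total) perPool
```

In this sketch the cost is dominated by the single pass over the stake map plus one delegation lookup per credential, which is the part a micro-benchmark would need to exercise with a realistically sized snapshot.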
Is there a way to make this cheaper if we're only interested in a subset of pool ids?
Great question! The following might be faster:
Make a new version of calculatePoolDistr with signature
calculatePoolDistr' :: SnapShot crypto -> Set (KeyHash 'Staking crypto) -> PoolDistr crypto
which is identical to calculatePoolDistr except that, after looking up the credential in the delegation map, it also looks up the pool ID in the set of pool IDs provided by the user (and discards the result if not found).
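A minimal sketch of that filtering idea, using plain containers rather than the ledger types. The name stakePerPoolFor and the String-based Credential and PoolId aliases are hypothetical, and the function only sums absolute stake per pool; it does not reproduce the rest of calculatePoolDistr.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.Map.Strict (Map)
import Data.Set (Set)

-- Hypothetical stand-ins for the ledger's credential and pool ID types.
type Credential = String
type PoolId     = String

-- Sum the stake per pool, keeping only delegations to pools in 'wanted'.
stakePerPoolFor
  :: Set PoolId             -- pool IDs provided by the user
  -> Map Credential PoolId  -- delegation map
  -> Map Credential Integer -- stake per credential (already un-compacted)
  -> Map PoolId Integer
stakePerPoolFor wanted delegs stake =
  Map.fromListWith (+)
    [ (pool, coin)
    | (cred, coin) <- Map.toList stake
    , Just pool <- [Map.lookup cred delegs] -- look up the delegation
    , pool `Set.member` wanted              -- discard pools that were not requested
    ]
```

Note that this still walks the whole stake map once; what it avoids is accumulating entries for pools nobody asked about. If the result is expressed as a fraction of total stake, the total would presumably still have to be computed over every entry.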
I'd look at use cases before optimising too much. I think the two likely cases are:
- all pools
- a specific pool (and then repeat for multi-pool operators)
The command is likely to be used occasionally rather than frequently, I expect.
I reverted back to use |
@@ -35,7 +35,7 @@ data NodeToClientVersion
 | NodeToClientV_13
 -- ^ enabled @CardanoNodeToClientVersion9@, i.e., Babbage
 | NodeToClientV_14
--- ^ added @GetPoolState, @GetSnapshots
+-- ^ added @GetPoolDistr, @GetPoolState, @GetSnapshots
👍 NodeToClientV_14 hasn't been released yet. node-1.35.x comes with NodeToClientV_13 as the latest version.
Marcin commented that the NTC version was correct. So that's good.
The question of calculatePoolDistr's performance remains unresolved, but I don't think it's a blocker for this PR. We'll need to assess it, though. @newhoggy can you pursue that with some measurements etc. after merging this (or before, if that's convenient)?
Yep. I will evaluate after merging. There is a follow-up PR that improves the performance, but that's blocked on a refactoring PR.
bors r+
I didn't notice much difference in CPU/memory usage of the new version of |
Description
This introduces a new query, GetPoolDistr, which extracts the stake snapshot. This query is required to improve the CPU and memory efficiency of the query leadership-schedule command in the CLI.
Checklist
interface-CHANGELOG.md
cardano-api changes
There will be a new GetPoolDistr query, which uses a lot less CPU and memory.
This query takes a set of pool ids, so it is possible to query multiple pools at once.
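As a rough illustration of how an optional set of pool ids can narrow the result, here is a small sketch in the style of the Map.restrictKeys pattern visible in the diff. restrictPools and example are hypothetical names, and treating the argument as Maybe (Set k), with Nothing meaning "all pools", is an assumption based on the mPoolIds binding in the diff.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.Map.Strict (Map)
import Data.Set (Set)

-- Nothing keeps every pool; Just poolIds keeps only the requested ones.
restrictPools :: Ord k => Maybe (Set k) -> Map k v -> Map k v
restrictPools Nothing        distr = distr
restrictPools (Just poolIds) distr = Map.restrictKeys distr poolIds

-- Querying several pools at once is just a bigger set.
example :: Map String Int
example =
  restrictPools
    (Just (Set.fromList ["p1", "p3"]))
    (Map.fromList [("p1", 1), ("p2", 2), ("p3", 3)])
```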
cardano-cli changes
An existing command, query leadership-schedule, will be functionally unchanged, but will use much less CPU and memory.