WIP: Queriers for staking to increase Gaia-lite performance #2139
Conversation
Codecov Report

```diff
@@            Coverage Diff             @@
##           develop    #2139      +/-   ##
==========================================
- Coverage    63.91%    63.1%    -0.81%
==========================================
  Files          134      135        +1
  Lines         8194     8348      +154
==========================================
+ Hits          5237     5268       +31
- Misses        2604     2727      +123
  Partials       353      353
```
Codecov Report

```
@@           Coverage Diff            @@
##           develop    #2139   +/-   ##
=========================================
  Coverage         ?    64.1%
=========================================
  Files            ?      142
  Lines            ?     8808
  Branches         ?        0
=========================================
  Hits             ?     5646
  Misses           ?     2758
  Partials         ?      404
```
Left a few comments @fedekunze -- I think it'll be good after this.
In the future, let's please not combine refactors with actual functionality changes. I'm having a very hard time distilling the actual functionality change. (Aside from the delegation stuff, it seems it's some bech32 stuff, but there are no comments anywhere motivating this, nor discussion in the issue or a statement at the top of the PR. It probably makes sense; I'm just noting that it increases the time to grok.)

Reviewing by diff breaks down as we grow PR size. To quote Anton, the quality of PR review goes down as PR size grows. I.e., the bech32 validator stuff is in the same PR as something that changes the capitalization of some log messages, which is in the same PR as changes that rename a ton of function parameters. These all sound reasonable, but they don't need to be evaluated together with the delegation speedups.

I saw I was requested to review this. Is it important that I do so? If so, I think this should be split up into multiple PRs. Optimizing for quality of PR review is crucial, and something I think we need to start enforcing. I totally understand that the slowness of merging things and the annoyances of branching make including more refactors in the same PR something that naturally happens. I think as a team we just need to get better at pointing this out and avoiding it. (I'm guilty of it as well!) I'm not sure this is something that's important for me to review, though, as I don't do anything on this part of the codebase, and rige, sunny, and bez are reviewing this.
@ValarDragon Second, I'm fine with splitting PRs into several issues, and in general I'm completely in favour of it; that's why I opened #2182 and #2202. Nevertheless, when a PR has been sitting for more than a week waiting to be merged, with more than 3 reviews already, it's definitely a waste of time for the author of the PR. If you want to propose splitting a PR into several ones, please do so in the first days after the PR has been labeled as R4R.

I've already discussed with @faboweb that this slowness in merging PRs because of things like this is blocking our work in Voyager. This PR in particular fixes #2009, which is currently slowing GET requests on the validators page in Voyager. I'm literally working during weekends because we need to move forward with this PR.

@alexanderbez and @cwgoes are among the only ones who are actually continuously reviewing PRs and speeding things up. Thank you so much @alexanderbez for reviewing even when sometimes you haven't been requested for review!

Also, you don't need to review the PR if you don't want to. I requested a review from you because I think you are bright and smart, and because your input can definitely be valuable here, especially in what's related to this comment: #2139 (comment). Cheers
There are a lot of "clearly correct" parts of this PR that, if they were in a separate PR, would get merged extremely quickly. I'll link an example: https://github.com/cosmos/cosmos-sdk/pull/2139/files#diff-b67911656ef5d18c4ae36cb6741b7965R163. All of these are clearly correct, and are just refactoring / minor changes that are independent from the querier feature. If a PR were made of just those, it would likely get merged within 48 hours. But more importantly, it would reduce the diff here by something like 200-400 lines, which makes it substantially easier to review (and therefore gets this merged faster). Normally, including the minor things isn't a big deal; it only starts being a problem when the PR size grows to be very large. You can even base your second PR on the refactoring PR. Something that would have made this immensely easier to review would have been just doing a simple find/replace.

I totally understand you've put tons of work into this. I also see why requests to split it up are frustrating. There isn't any need to do it this time, as people have already reviewed it, but note that when that comment was made, the PR had only been R4R for 3 days. However, further splitting up PRs similar to this in the future would make them get reviewed / merged much more quickly (hence alleviating the slowness concern).

As a separate point of discussion, could we unblock Voyager by making a "voyager branch" of the SDK? Then have Voyager work off the voyager branch, and you could merge any PR you needed into that branch immediately. (This can be done via the command line, no GitHub PR required.) When the relevant aspects get merged to develop, delete the voyager branch and remake it off of develop again. (E.g. the proposed flow is: create the voyager branch; create a feature branch off develop; PR the feature branch into develop, but merge it into the voyager branch immediately; once the feature is in develop, reset the voyager branch to develop.)
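The proposed "voyager branch" flow above could be sketched roughly as follows. This is a hypothetical illustration (branch and file names are made up, and the scratch-repo setup at the top is only demo scaffolding), not an endorsed workflow:

```shell
set -e
# demo scaffolding: a throwaway repo with a `develop` branch
cd "$(mktemp -d)"
git init -q
git checkout -qb develop
git config user.email demo@example.com
git config user.name demo
echo base > f.txt && git add f.txt && git commit -qm base

# --- the proposed flow ---
# one-time: create the voyager branch off develop
git branch voyager develop

# per feature: branch off develop (the GitHub PR still targets develop)...
git checkout -qb my-feature develop
echo feature >> f.txt && git commit -qam feature

# ...but merge it into voyager immediately, no GitHub PR required
git checkout -q voyager
git merge -q --no-edit my-feature

# once the feature has been merged to develop through review,
# reset voyager back onto develop
git checkout -q develop
git merge -q --no-edit my-feature
git branch -f voyager develop
```

After the final `git branch -f`, `voyager` and `develop` point at the same commit, so the cycle can start again for the next feature.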
x/stake/keeper/validator.go (outdated)

```diff
 	}
 	iterator.Close()
-	return validators
+	return validators[:i] // trim
```
This only adds i items to the validators array, is there a reason to trim it?
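For context, the trim matters when the slice is preallocated to the requested maximum but the loop fills fewer entries. A minimal sketch of the pattern (illustrative names, not the actual keeper code):

```go
package main

import "fmt"

// getUpTo copies at most max items from src into a slice preallocated to
// max entries. Without the final re-slice, callers would see zero-valued
// tail entries whenever fewer than max items were found.
func getUpTo(src []string, max int) []string {
	out := make([]string, max) // preallocated to the maximum length
	i := 0
	for ; i < len(src) && i < max; i++ {
		out[i] = src[i]
	}
	return out[:i] // trim: keep only the entries actually written
}

func main() {
	fmt.Println(len(getUpTo([]string{"val1", "val2"}, 5))) // prints 2, not 5
}
```

So whether the trim is needed depends on whether the slice was preallocated to `maxRetrieve` or grown with `append`; in the former case the `[:i]` re-slice is what hides the zero-valued tail.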
Alllllllright! I went through this PR in reasonable detail for design (I didn't go through the new tests in detail, however, but they look clean, which is good).

There are a number of structural changes which I'd requested in my review. My biggest quarrel is that there is a lot of code bloat introduced into the keeper, which I am wildly against; a lot of the bloat can be moved into functional attributes of the objects (and kept in `x/stake/types/`). Maybe we will want to keep some things in the stake keeper, but we should add those to a `query_utils.go`.

Additionally, I request that all changes to `x/gov/` be moved to a new PR, just to keep this one clean.

Lastly, there is a bit of refactor stuff which I request be removed from this PR (I've made specific comments throughout).
x/stake/keeper/validator.go (outdated)

```diff
-	if !iterator.Valid() {
-		break
-	}
+	for ; iterator.Valid() && (!retrieve || (retrieve && i < int(maxRetrieve[0]))); iterator.Next() {
```
I'm quite sure we don't want to combine `GetValidators` and `GetAllValidators`. In reality, `GetAllValidators` should rarely be getting called; it is kind of an edge case for exporting the state of the entire blockchain. Everything else should retrieve a predetermined maximum number of validators.
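The suggestion could look roughly like the sketch below: a bounded `GetValidators` for normal use and a separate `GetAllValidators` for the export edge case, instead of one function with a variadic `maxRetrieve`. Types and receiver are illustrative stand-ins, not the real keeper API:

```go
package main

import "fmt"

// keeper is a hypothetical stand-in for the stake keeper's state access.
type keeper struct{ validators []string }

// GetValidators returns at most maxRetrieve validators; this is the
// bounded accessor intended for everyday queries.
func (k keeper) GetValidators(maxRetrieve int16) []string {
	n := int(maxRetrieve)
	if n > len(k.validators) {
		n = len(k.validators)
	}
	return k.validators[:n]
}

// GetAllValidators returns every validator; an edge case intended only
// for operations like exporting the state of the entire blockchain.
func (k keeper) GetAllValidators() []string {
	return k.validators
}

func main() {
	k := keeper{validators: []string{"a", "b", "c"}}
	fmt.Println(len(k.GetValidators(2)), len(k.GetAllValidators())) // 2 3
}
```

Keeping the two accessors separate also removes the `maxRetrieve ...int16` variadic and the `!retrieve || (retrieve && ...)` loop condition it forces.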
x/stake/keeper/validator.go (outdated)

```diff
-// Get the set of all validators, retrieve a maxRetrieve number of records
-func (k Keeper) GetValidators(ctx sdk.Context, maxRetrieve int16) (validators []types.Validator) {
+// Get the set of all validators. If maxRetrieve is supplied, the respective amount will be returned.
+func (k Keeper) GetBechValidators(ctx sdk.Context, maxRetrieve ...int16) (validators []types.BechValidator) {
```
As per my previous comment, we should rarely be using `GetAllValidators`, so it should be its own edge case. But also, let's remove use of this function altogether; it bloats the keeper. What we should really be doing is just converting the validators to bech once we've retrieved them from `GetValidators`.
With that implementation we'd have to iterate the validator array twice: once to retrieve the validators and again to convert each of them into a `BechValidator`...
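A middle ground is to fuse the conversion into the retrieval loop, so the raw validators are walked exactly once. The sketch below uses hypothetical stand-ins for `Validator`, `BechValidator`, and the bech32 encoding; only the loop shape is the point:

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative stand-ins for the real types.
type Validator struct{ Addr string }
type BechValidator struct{ Addr string }

// toBech is a placeholder for the real bech32 conversion.
func toBech(v Validator) BechValidator {
	return BechValidator{Addr: "cosmosval" + strings.ToLower(v.Addr)}
}

// getBechValidators converts while retrieving: a single pass, rather
// than materializing a []Validator and then converting it in a second loop.
func getBechValidators(raw []Validator, max int) []BechValidator {
	out := make([]BechValidator, 0, max)
	for _, v := range raw {
		if len(out) >= max {
			break
		}
		out = append(out, toBech(v)) // conversion happens during retrieval
	}
	return out
}

func main() {
	raw := []Validator{{Addr: "AAA"}, {Addr: "BBB"}, {Addr: "CCC"}}
	for _, bv := range getBechValidators(raw, 2) {
		fmt.Println(bv.Addr)
	}
}
```

That said, for the list sizes involved here a second O(n) pass is unlikely to be the bottleneck, so the trade-off is mostly about keeper surface area rather than speed.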
PENDING.md

```diff
-* A new bech32 prefix has been introduced for Tendermint signing keys and
-  addresses, `cosmosconspub` and `cosmoscons` respectively.
+* A new bech32 prefix has been introduced for Tendermint signing keys and addresses, `cosmosconspub` and `cosmoscons` respectively.
```
Unnecessary / undesirable.
Per https://github.com/cosmos/cosmos-sdk/pull/2103/files, if we do introduce new bech32 prefixes, they would be `cosmosval` and `cosmosvalpub`, not `cosmoscons*`. Just FYI.
I think maybe there's something wrong with the SDK impl right now... I don't think the validator operator address should be anything different than a normal address.
@jaekwon @rigelrozanski @ValarDragon @alexanderbez Currently `BechValidator` is not using a `bech32` prefix for `sdk.ValAddress`, just hex. Also, its `PubKey` value uses the `cosmosconspub` prefix. Is that ok? Because I thought it was supposed to be using `cosmosvalpub` instead.
x/gov (outdated)

```diff
-	err2 := keeper.cdc.UnmarshalJSON(req.Data, proposalID)
-	if err2 != nil {
-		return []byte{}, sdk.ErrUnknownRequest(fmt.Sprintf("incorrectly formatted request data - %s", err2.Error()))
+	errRes := keeper.cdc.UnmarshalJSON(req.Data, proposalID)
```
Please undo all these refactor changes to use `errRes` and `res` (for now); they're bloating this PR. Maybe make a refactor PR after this? Although I don't really agree with this modification either, we can discuss it in the next PR.

p.s. I just assigned myself to this issue; I'll gladly merge once comments are addressed.
Also note, @fedekunze @faboweb: Voyager should not be blocked on things like merging a PR. The Voyager team should work off this branch, or their own updated branch, which contains any new features they want to test out. Rushing to merge PRs like this is a big no-no; there is a LOT going on here and it requires thorough review. Previously, there were staking REST bugs which were introduced with an LCD refactor PR (#1880) which I'm pretty sure was "rushed" to merge; in that previous PR I had not actually finished reviewing it yet, but it got merged. Subsequently I had to fix those bugs, which will happen from time to time and IS OKAY and expected, but, you know, I would have rather caught them in review.
@rigelrozanski what do you think of my comment?
I don't think that's currently an option. We are behind schedule with staking in Voyager and we don't have the capacity to maintain another branch. Besides, if we want those changes to be on …
@rigelrozanski @ValarDragon @alexanderbez FYI, I'll update the PR here and split it once I've addressed Rigel's comments. Thanks for your reviews.
This PR, I think, is poking at some structural issues we have with the SDK/LCD system. Before, the LCD handlers were querying the store (a cryptographically verifiable operation using merkle proofs...), so it would have been possible for the LCD to verify everything along the way... but this refactor is trying to push all the logic into keepers. I'm not sure why this PR makes things faster -- as the comments say, there are a lot of concerns mixed in here, and it makes for a big PR that is hard to reason about. Why does this PR increase performance?

Anyways, I'm starting to think that the LCD should maybe be replaced with a full node w/ queriers, so that Voyager etc. can query any full node directly. There's no point to the LCD if it's just going to proxy the result of a full node's querier. If we need this for performance reasons, then we might as well bypass the LCD for now, until we figure out how to efficiently call the querier from an LCD (which I think we can do by moving logic into the keeper as this PR does, and then calling the keeper with a cliCtx which has Stores that are RemoteStores which handle merkle proofs). So I'm in favor of moving logic into keepers, and I think we want to have Voyager just query the full node to avoid unnecessary proxying logic that defeats the point of an LCD in the first place.

Also, @fedekunze, why is this PR an improvement in speed? Is it because we're avoiding multiple REST query calls inside of a loop or something? Could Voyager just query the full node instead of proxying through the LCD?

We have an SDK standup at 10am PDT M/W/F. Can you make that, or someone familiar with Voyager's requirements of the LCD?
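The full-node querier idea above can be illustrated with a rough sketch. This is a simplified, hypothetical shape -- the real `sdk.Querier` takes an `sdk.Context` and `abci.RequestQuery` -- but it shows the routing pattern: a node maps query paths to handlers that read state directly, so one query can return a whole result set instead of one REST round-trip per item:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Querier is a simplified stand-in for the SDK querier signature.
type Querier func(path []string, data []byte) ([]byte, error)

// newStakeQuerier returns a querier that serves staking queries from the
// given (illustrative) validator set.
func newStakeQuerier(validators []string) Querier {
	return func(path []string, data []byte) ([]byte, error) {
		switch path[0] {
		case "validators":
			// one query returns the whole set, avoiding a REST
			// call per validator
			return json.Marshal(validators)
		default:
			return nil, fmt.Errorf("unknown query path: %s", path[0])
		}
	}
}

func main() {
	q := newStakeQuerier([]string{"val1", "val2"})
	res, err := q([]string{"validators"}, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(res)) // prints ["val1","val2"]
}
```

Under this model, a client like Voyager would hit any full node's query endpoint directly; the LCD would only add value once it can verify the querier's results against merkle proofs.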
Ref: #2078
Fixes #2009
- Targeted PR against correct branch (see CONTRIBUTING.md)
- Linked to github-issue with discussion and accepted design OR link to spec that describes this work
- Wrote tests
- Updated relevant documentation (`docs/`)
- Added entries in `PENDING.md` with issue #
- Reviewed `Files changed` in the github PR explorer