Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconsider RC Process #7738

Closed
4 tasks
aaronc opened this issue Oct 29, 2020 · 44 comments
Closed
4 tasks

Reconsider RC Process #7738

aaronc opened this issue Oct 29, 2020 · 44 comments

Comments

@aaronc
Copy link
Member

aaronc commented Oct 29, 2020

Summary

Consider a lighter weight RC process where we release off master and reserve cherry-picks for maintenance mode releases.

Problem Definition

As I understand it we are attempting to follow a process known as “trunk based development”. I understand there are many different viewpoints on branching strategies (here’s a debate thread between two teams at Microsoft: Stop merging if you need to cherry-pick | The Old New Thing). But without trying to pick sides philosophically, I just want to document what I’m seeing in practice.

Basically, I’m seeing two issues:

  • the backport/cherry-picking process is quite laborious and error prone
  • it is hard for external testers to track what is happening between RC’s. All the PRs on on master but different cherry-picked PRs bundle a bunch of them together, possibly out of order, onto the release branch

Some of the problems this causes for users are:

  • It’s hard to predict what the next rc will include so they can’t test fixes as soon as they on master and update their code accordingly
  • Bug fixes that land on master may actually have a different effected when cherry-picked out of order

@jackzampolin @ethanfrey and @clevinson maybe you could comment a bit?

Frankly, I feel that we are cherry picking too much and have likely branched off of master too soon.

Proposal

I propose the following as an alternative:

  • RCs for major releases happen off of master
  • When we are in a major release RC cycle, we don’t accept merges into master that we don’t want in the release
  • Point releases can even happen off of master if they haven’t introduce any breaking changes
  • We start backporting using cherry picks on release branches when that release enters maintenance mode

Looking at the development happening on master, I can say that our team explicitly isn’t trying to merge stuff that we know isn’t ready for stargate. Other stuff we are doing in feature branches and leaving in draft mode. I actually can’t think of any PRs that have been merged that really shouldn’t be in stargate But with this process, many of those PRs will be excluded even if that doesn’t really make sense. If you look at the backport PRs, there are a lot of cherry picks and often it is hard for @clevinson to know what should and shouldn’t be cherry picked and in what order.

I’m not saying we shouldn’t use cherry picking for release maintenance, but I think we’re splitting off too early and it’s actually making this harder.

So I would like us to consider reserving backports for when master really has diverged from a past release. We could probably release all the RCs off of master and even a patch release within a week or two and things would be fine.

This just involves discipline around master when we are in an RC cycle and I think we’ve actually been doing this okay the past month. I’m not saying we should avoid breaking changes in an RC cycle - we’ve been doing it a bit and I think it’s okay if helps us get the release right. We just need to be disciplined and know when to include something and when not to. So this may mean that certain PRs stay in longer lived feature branches for a while until after the RC cycle, but I think that’s an okay trade off.

@odeke-em if you could comment on the golang release process that might be useful context for us.


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@jackzampolin
Copy link
Member

Big fan of the proposal. Minor change to current process which should save significant time for dev team and offer benefits to users testing.

@clevinson
Copy link
Contributor

i'm leaning in favor of this approach. IMO the reason we are dealing with these issues with our cherry picking process is more likely to do with cutting our first RC too early, and/or not being diligent enough about restricting the RC line to strictly bug fixes, docs, and tests. Even with that restriction though, there are a lot of bug fixes coming up, both on core SDK and IBC side... so it seems like this above approach may be better.

What would be the designated timeline for this feature freeze / release cycle? I guess this offers a good incentive for us to publish the release as soon as feasible :)

@clevinson
Copy link
Contributor

Btw- I had just finished up drafting backport PR for RC2 here: #7355

It once again is quite a behemoth... And was fraught with a handful of potentially missed PRs due to forgetting to label some PRs for "backport", or due to githubs inability to properly sort PRs by merge date. I ended up having to do a side byside comparison with RC1 commit logs to verify that i wasn't missing anything extra... In the end there were no big merge conflicts during the cherry-pick process, but it does feel a bit anxiety inducing to have our major release drifting this far from master, especially when we are in the middle of a release cycle.

@aaronc
Copy link
Member Author

aaronc commented Oct 29, 2020

IMO the reason we are dealing with these issues with our cherry picking process is more likely to do with cutting our first RC too early, and/or not being diligent enough about restricting the RC line to strictly bug fixes, docs, and tests.

Thanks @clevinson. I'm not sure I agree with this. We have to release at some point. If we had released rc0 later we would have gotten feedback later.

I'm also not sure we needed to be more diligent - I think we were diligent and only put stuff on master and backports that made it easier to support the maintenance process. I think the criteria should be getting to a release that includes only things we feel comfortable maintaining in their current state for some time. I just don't think a separate RC line makes sense for our type of development. Instead of saying we should only do X, Y, Z on the RC branch, I'd rather say we should only do stuff on master which makes stargate more maintainable until it's released.

What would be the designated timeline for this feature freeze / release cycle? I guess this offers a good incentive for us to publish the release as soon as feasible :)

I don't know. I think we will know when we get feedback that it's ready. Also I didn't say total feature freeze - just keep everything we're not ready to maintain in its current state off of master in a longer lived DNM PR.

@odeke-em
Copy link
Collaborator

Thanks for the tag @aaronc!

So for the Go release process, we make a release two times a year, and to accomplish this, we triage by a couple of guides:
a) We ask folks to roughly estimate or talk about what they plan on working on (this is important because it means what folks will try to commit to, and not what they wish for)
b) For filed issues, we perform a triaging process in which the core maintainers look at issues, evaluate and estimate complexity then triage them either for the upcoming release or unplanned
c) We perform development for 3 months per release (the tree is open for code submissions without restriction) and then 3 months to stabilize, fix bugs and cut release candidate releases
image

The process isn't perfect, but it massively helps reduce complexity, confusion, too many breaking changes, too. The freeze for 3 months massively helps with testing, bug fixing, documentation, catching regressions, reporting noteworthy changes, finding vulnerabilities.

However, governing all this is the Go1 compatibility promise which we take very seriously and promise that correct programs that ran in earlier versions won't be mostly broken in newer releases https://golang.org/doc/go1compat.

Usually about 2 to 3 release candidates are cut before the finally release, and for each we get a branch with cumulative updates e.g. 1.14beta1->1.14beta2->1.14rc1->1.14rc2, where the betas are released much more often and for those we get lots of feedback.

References:

@aaronc
Copy link
Member Author

aaronc commented Oct 30, 2020

Thanks for sharing @odeke-em !

@alexanderbez
Copy link
Contributor

alexanderbez commented Oct 30, 2020

Thanks for writing this up @aaronc. I just want to state that the current release process should, but does not necessarily currently reflect, have a long-living branch where PRs are continuously cherry-picked into.

The problem now is we cherry-pick in a batch type of way. What should really be happening is as a PR lands on master, the issue or PR should be labelled and immediately cherry-picked into the long-living branch...not at the very end of the RC process. This is the PR author's responsibility. Trust me, I know how tedious and laborious the process is as I was the only one doing it for quite some time.

The process of having a long-living release branch where PRs are continuously cherry-picked in is exactly the same process that HashiCorp follows, where all their cloud-based products are completely OSS and they have enterprise offerings as well.

When we are in a major release RC cycle, we don’t accept merges into master that we don’t want in the release

I think we should specifically specify features here.

@alessio
Copy link
Contributor

alessio commented Oct 30, 2020

The current release management workflow has been working pretty well for the 0.39 Release Series. It must be said that 0.39 is very stable and in fact is in maintenance mode only. I'll share some of my thoughts about the problems that seem be affecting the users:

It’s hard to predict what the next rc will include so they can’t test fixes as soon as they on master and update their code accordingly.

True that. This problem could be solved by managing users expectations better, for instance we should make clear that breaking changes are to be expected between RCs. Users would be warned at least.

different cherry-picked PRs bundle a bunch of them together, possibly out of order, onto the release branch

That's not unusual. If PRs bring consistent incremental changes, the order should not matter I suppose, e.g. if a feature branch affected by bug a is cherry-picked into the would-be 0.40.0 final release branch then testers and devs should still be able to use git bisect to find and fix the bug.

Bug fixes that land on master may actually have a different effected when cherry-picked out of order

You have a point. I have a proposal: so far we always fix bugs in master first and then we cherry-pick them into stable branches. This has worked very well for Launchpad so far. It seems it's not working as well for the 0.40 release series.
Thus considering that shipping 0.40 is top priority for us yet the branch is not enough stable to make a release, how about fixing bugs affecting 0.40 RC directly in the 0.40 branch first?

@ethanfrey
Copy link
Contributor

The current release management workflow has been working pretty well for the 0.39 Release Series. It must be said that 0.39 is very stable and in fact is in maintenance mode only. I'll share some of my thoughts about the problems that seem be affecting the users:

I agree, this "cherrypick and backport" approach works very well for stable release maintenance.

Thus considering that shipping 0.40 is top priority for us yet the branch is not enough stable to make a release, how about fixing bugs affecting 0.40 RC directly in the 0.40 branch first?

It makes most sense to me to actually fix the bugs on the branch they will be released from. The downstream users can immediately test the PR and approve it or request more changes, and then merge it. We must be careful to "forward port" these all into the development branch, but that is no more work than the backport process and you actually fix bugs where they were found.


My big question is what is the difference between releases/0.40.x and master? I don't think anyone knows if there is something in master that shouldn't be in 0.40.x and if so, what. (I have asked) Once 0.40.0 is out and stable, and work is started on 0.41/breaking changes, I think moving towards the launchpad release approach makes a lot of sense. If there is no difference between 2 branches and we just add lots of work for everyone involved, why have them?

I would propose:

  1. Just do 0.40.0-rcX on master until these are actually stable. 30 PRs in a week is far from stable. Tag on master, all dev focused on getting this stable.
  2. When it is stable (and has held up in a testnet) and the dev team wants to start on the new version, moving this to a release branch makes a lot of sense.
  3. If a bug is found when working on master and fixed, the "backport and cherrypick" technique we use for launchpad is quite good.
  4. However, if there is a bug found in production on say v0.40.1, I suggest "hot fixing" it there on the release branch and ensuring the regression is fixed. Then cherry-picking / merging this into master.

Especially with production bugs which are not clear unit tests, you need to fix the broken binary quickly and you cannot be sure you fixed the bug until you use the same version, so "trying to reproduce and fix a similar problem on master, get approvals, merge it, then backport it to the older release, then (at the end) test if it actually fixed the user's issue" seems very hard. These are the kinds of issues that are popping up for 0.40.0-rcX... inability to connect relayer, issues with genesis, etc. Much more agile would be "reproduce the issue on the same software version the user reported it on, prepare a fix for the issue, allow the user to test the fix, then get approvals on the code, merge it. And with this confidence, forward port it to master".

@ethanfrey
Copy link
Contributor

TL;DR:

Launchpad-style is great when we are stable. When all development is really focused on release, there is no need for 2 branches.

-rc0 was just "first time we suggest external people test this" and far from stable. I assume it will be 1-2 months til we all have very high confidence in 0.40.0 stability. Until we are confident enough to propose this for a hub upgrade, let us chose a workflow that maximizes velocity in getting stargate on the hub.

@ethanfrey
Copy link
Contributor

The problem now is we cherry-pick in a batch type of way. What should really be happening is as a PR lands on master, the issue or PR should be labelled and immediately cherry-picked into the long-living branch...not at the very end of the RC process. This is the PR author's responsibility. Trust me, I know how tedious and laborious the process is as I was the only one doing it for quite some time.

I agree with this as well. The release branches need to be up to date (on a daily basis). And no one is really doing it.

But I still hear no reason why the Cosmos SDK team needs 2 branches right now. Who decided that a release branch was made before anyone had actually tested the code on a testnet? I think it just kind of happened. There is a whole ecosystem of tools that need to be tested against 0.40.0 before it can be called stable (and many cannot even start testing as we cannot get a stable enough testnet up yet).

@clevinson
Copy link
Contributor

@ethanfrey the perhaps lazy reason that we have two branches right now, as I see it, is historical. I don't know of any particular PRs int master from the last week that were for a post-Stargate feature, but the flip side of that is- there are a more PRs into master than the Stargate release team can actively track, and I'm not aware if we currently have branch protection rules in place to make those assurances.

That being said, I just went through and looked at all the PRs into master since RC0, and indeed it looks like the ones that weren't included in backports were all quite minor improvements, and no big new breaking changes that explicitly don't belong in a release.

It sounds like most folks here are in favor with @aaronc's proposal here. Shall we move to decision to accept this proposal as is? Or are there concrete amendments that others would like to see?

Main question I have is- do we want to temporarily update branch protection rules on master in order to have better assurance during a "release cycle" that non-release related PRs aren't merged into master? I don't have access to see branch protection rules, so I can't really tell if waht we have at the moment ins sufficient.

In the meantime, I can go ahead create a tracking issue to replace my backport PR, which will serve as a checkbox list for RC2 since there are a few specific issues we are waiting on for that.

@alessio
Copy link
Contributor

alessio commented Oct 31, 2020

Yes, I agree with @aaronc's proposal. There is only one concern that I'd like to bring up:

When we are in a major release RC cycle, we don’t accept merges into master that we don’t want in the release

This risks to slow down the development of new features/ADRs. Teams that are not directly involved in the development/finalization process of a particular major release (I'm assuming here that with major release we are referring to vMAJ.MIN.0 releases) may get stuck and have their PRs blocked until the major release is out.

That is mainly the rationale behind my proposal of fixing bugs affecting the major release being finalised directly in its branch first. We could always forward port fixes to master as suggested by @ethanfrey.

This is a bit like how Gitflow release branches work, i.e. you branch out a release branch when you want to start finalizing it and you merge bugfixes in it. The main difference with Gitflow here is that out hotfix branches will target the release branch that is currently being finalised first - still hope it sounds like a good compromise?

@aaronc @clevinson wdyt?

@robert-zaremba
Copy link
Collaborator

robert-zaremba commented Nov 2, 2020

Cherry picking makes sense if we already released a new version. In that case we should only cherry-pick critical bug fixes (or as @ethanfrey suggested, do the hotfix in release branch and cherry pick to master).
Otherwise, like we mentioned many times, the current process is not efficient for anyone (the release manager, other teams who relay on the release code branch, ...).

@robert-zaremba
Copy link
Collaborator

Point releases can even happen off of master if they haven’t introduce any breaking changes

According to the semantic versioning, in our case point releases (the x in 0.a.x) shouldn't reflect any features. Should be limited as much as possible. Ideally only important bugs.

0.x is a normal release (new features).

Later, once we have 1.0, we basically shift. a.b.x is a bug fix, a.x is a "normal release". x.0 is a breaking release.

@robert-zaremba
Copy link
Collaborator

When we are in a major release RC cycle, we don’t accept merges into master that we don’t want in the release

This risks to slow down the development of new features/ADRs.

@alessio I don't see it as a risk. It's a benefit. During the RC cycle we should focus to tag the release as fast as possible.

@alessio
Copy link
Contributor

alessio commented Nov 2, 2020

During the RC cycle we should focus to tag the release as fast as possible.

Yes, if work is done on a separate release branch then tagging is still possible I suppose.

I don't see it as a risk. It's a benefit.

Well, I disagree. There are teams that are working on ADRs and feature that quite certainly won't make it in time for 0.40. Why should we block them?

@robert-zaremba
Copy link
Collaborator

robert-zaremba commented Nov 2, 2020

Well, I disagree. There are teams that are working on ADRs and feature that quite certainly won't make it in time for 0.40. Why should we block them?

No need to block them. They will be asked to target their changeset to a different branch, eg develop.

  • develop will be the default branch for new things during the RC release
  • master will be a branch for advancing the RC
  • we should also have a bot to merge master into develop whenever master gets updated
  • once a final release is tagged, we switch master to release-xyz, and develop to master.

My original idea, when we were discussing release process, was to be able to merge directly into release-xyz with PRs related to RC and then merge release-xyz into master automatically if possible.

@alessio
Copy link
Contributor

alessio commented Nov 2, 2020

I've never seen a branch develop being mentioned before in this issue. All in all, your proposal seems quite more complicated than what we've been discussing here so far? I may be wrong, but I'd like to hear from @aaronc @clevinson @alexanderbez

@alexanderbez
Copy link
Contributor

No, we cannot go back to using develop. We did in the past and it was a nightmare. We've tried many solutions in the past @robert-zaremba and using a single canonical branch is the best approach IMHO.

Can we close this issue @aaronc? I thought we agreed on a process last SDK call? The process is really quite simple and doesn't deviate much from the current process:

  • A single canonical branch (master/main)
  • We enter a "feature-freeze" (loosely defined)
  • RCs are created off of master/main
  • A final release is cut and a long living branch is created (e.g. release/0.40.x) in which all subsequent point/non-breaking releases are cherry-picked into (in addition to any applicable LTS versions) by the respective PR author/contributor.

It's that simple.

@robert-zaremba
Copy link
Collaborator

I'm fine with that 👍 . My understanding was that we need some place to merge all things from community which we are not considering as a part of RC.

@alexanderbez
Copy link
Contributor

I think we can accomplish that with have a rapid and iterative RC cadence + a feature freeze. This means RCs cannot take weeks upon weeks though.

@aaronc
Copy link
Member Author

aaronc commented Nov 2, 2020

I agree we shouldn't create a separate develop branch. And my understanding is pretty similar to what you're sharing @alexanderbez in that we agreed to a process on Friday and it's basically what you outlined, with maybe some small changes to branch protection rules which I'll outline in a minute.


@alessio I hear your concern about blocking other teams from merging into master. I would note that we are ourselves holding off on a bunch of stuff that we want to merge into master. While, I don't think we should create a separate develop branch, maybe a git flow like process like you propose could work where work on the release gets merged directly into release/X.X and then back into master. The challenge with that is that currently we only allow squash merges and so I'm not sure how we would cleanly merge release/X.X -> master.

My alternative proposal to this is simply that teams working on features that aren't quite ready for master merge those into long-lived feature PRs, and we can even pre-approve those PRs but put a do-not-merge label until after release. Would this be a viable alternative to the git flow approach?

@aaronc
Copy link
Member Author

aaronc commented Nov 2, 2020

Also if we get to a place where the only thing between an RC and release is say a Tendermint release, i.e. no big bugs, then we definitely shouldn't block master and cherry pick onto release/X.X. But currently 95% of the stuff on master is related to RCs.

@alessio
Copy link
Contributor

alessio commented Nov 2, 2020

I am not a big fan of Gitflow - just to be clear :) and I like @alexanderbez's proposal, I'd just add a proposal:

We enter a "feature-freeze" (loosely defined)

I like the feature freeze concept, but if it is only loosely defined then we need some sort of hard freeze too, i.e. a point in time starting from which we block PRs from being merged because they don't meet some release-critical criteria (these should be very well defined instead) - else we'll risk to keep merging features that would increase the risk of delaying the cut of a release. In other words, at some point we should enforce a hard rule like from now onwards, only bugfixes please, no exception shall be made.

Maybe at that stage we can fork out a branch dedicated to release stabilization so that ordinary development can go on on master?

@alexanderbez
Copy link
Contributor

To not have a too disruptive of an RC process, we can prohibit any "breaking" changes, instead of a full-fledged feature freeze. This way we don't have PRs standing idle for weeks.

@robert-zaremba
Copy link
Collaborator

RC cadence + a feature freeze

Good, this is still an improvement and helps all SDK users. I have a feeling that with future releases, it will be challenging to have RC very short. Big stakeholders (Gaia, ...) will like to run "big bang" test nets during a RC. Moreover, (I hope) we may have more contributors developing new things.
But we can handle it later once this will be important and iterate ✌️ .

@aaronc
Copy link
Member Author

aaronc commented Nov 2, 2020

To not have a too disruptive of an RC process, we can prohibit any "breaking" changes, instead of a full-fledged feature freeze. This way we don't have PRs standing idle for weeks.

Exactly, that's what I've been saying. IMHO we have a pretty thorough multi-stakeholder review process in the SDK. We don't just merge stuff randomly. So no half-baked features or unintended breaking changes (I say unintended because sometimes we need to "break" something because of RC feedback). But if somebody delivers a complete additive, non-breaking piece, I'm not sure how it's a problem to deploy that to users. It got reviewed, tested, etc. Why not "release early, release often"? A few non-breaking additive "features" that have gotten in were things users felt were missing.

@clevinson
Copy link
Contributor

clevinson commented Nov 2, 2020

Sounds like there's enough alignment here to proceed with tagging Stargate RC2 off of master (which will likely happen today).

Before closing this issue, and to align on any final details, I can summarize what I see as this convo's consensus as a PR into CONTRIBUTING.md's release procedure. Should we also update STABLE_RELEASES.md to reflect some of the pieces discussed here?

@alexanderbez
Copy link
Contributor

Yes please @clevinson 🙏. Thank you

@alessio
Copy link
Contributor

alessio commented Nov 4, 2020

Sounds like there's enough alignment here to proceed with tagging Stargate RC2 off of master (which will likely happen today).

What happens to PRs that implement ADRs (e.g. Rosetta API support)? Will they be put on hold until Stargate's final is out?

@aaronc
Copy link
Member Author

aaronc commented Nov 4, 2020

Sounds like there's enough alignment here to proceed with tagging Stargate RC2 off of master (which will likely happen today).

What happens to PRs that implement ADRs (e.g. Rosetta API support)? Will they be put on hold until Stargate's final is out?

This is what I personally think should be the litmus test:

  • does the PR break anything?
  • does the implementation feel complete enough that we feel comfortable handing it to users?
  • are there any potential negative security implications to releasing it "early"? i.e. has it been tested, reviewed thoroughly enough?

If all of those are yes then I personally see no problem with them getting merged.

Also, in protobuf we can version different packages separately some things can be alpha or experimental. I think it's pretty common in software for features to be marked as "experimental" if we're not quite sure but it's more or less ready to be tried by users.

Where it doesn't have negative security implications or break things, IMHO I think we and users benefit from "release early release often".

Does that help @alessio ?

@ethanfrey
Copy link
Contributor

I agree with merging in non breaking changes, especially if they add new functionality that has been planned. Not random new ideas (to keep some limit) and ideally not modifying core functionality (to keep that as stable as possible, converging on a release).

Both the cosmovisor and Rosetta api support are very valuable pieces that make 0.40.0 much more useful in practice. And they are built around the system more than tinkering with the internals. Both great things to merge in now in my opinion, even if they do include a fair bit of code.

@alessio
Copy link
Contributor

alessio commented Nov 4, 2020

Aaron, I'm fine with anything yet I disagree that this is the optimal process as I'm still failing to understand what's the problem with having master and a release branch.

We used to have a similar loosely defined process as the one you described (i.e. the three questions). It generally works until you discover that allowing developers to merge things at the 11th hour comes with increased 1. regression risk 2. risk of never cutting the release.

This process may appear simple because it's easy to handle it. Yet it comes with risks that delayed releases in the past. Again, we should have staging branch which should host a would-be new release being finalised. Ordinary development should go on master. Patches can be either backported or forward ported (why do we bother too much about this detail anyways? git helps a lot with branch management).

People will come begging to include things at the very last minute. And no one will be in the position to push back because the process is not enough strict. Effective release management needs rules written in stone and little to no room for exceptions.

I neither can nor want to block progress though - I'm outnumbered apparently. So please press ahead.

@ethanfrey
Copy link
Contributor

This was managed very poorly since rc0. And release and master diverged in unclear ways.

I would rather one canonical branch with a limit of random fixes than two branches with certainty that some key fixes are forgotten in release branch (the process one week ago). I agree there could be a better process, especially if there was a very clear release owner, which I think is essential in any case.

There were a slew of major issues with rc1. Once there is some vaguely stable rc that doesn't have major hiccups, I would love to have a slower, focused release process on it.

Alessio, can you point out some prs pouring into master that are not release related? I agree too many changes to the core will delay release indefinitely

@aaronc
Copy link
Member Author

aaronc commented Nov 4, 2020

@alessio I am not opposed to staging or develop. Although I imagine others are. I believe @alexanderbez has expressed that. Also, I would ask how we get around squash merges from staging -> master? Squashing seems suboptimal if we're doing something like that.

I want to note that we did not delay rc2 indefinitely - we quickly cut the release Monday once all must haves were merged

@alessio
Copy link
Contributor

alessio commented Nov 4, 2020

I would ask how we get around squash merges from staging -> master?

Once you've got a commit in staging (which I assume is the result of a PR squashed into it), you cherry-pick it to a branch based on master, fix conflicts and open the PR. Or maybe I'm missing some context here?

@aaronc
Copy link
Member Author

aaronc commented Nov 4, 2020

So your main tweak to the process @alessio would be to just develop on release/* and cherry pick to master when we're in a release phase? Instead of always master -> release/*?

@alessio
Copy link
Contributor

alessio commented Nov 4, 2020

So your main tweak to the process @alessio would be to just develop on release/* and cherry pick to master when we're in a release phase? Instead of always master -> release/*?

Yes that's correct.

But really hold on. I want to delay release as much as I want to block the work on this issue here: not in the slighest :)
Please go ahead. I understand these are special circumstances and as such they may require special handling.

I'm happy to help either way!

@ethanfrey
Copy link
Contributor

ethanfrey commented Nov 4, 2020

Once you've got a commit in staging (which I assume is the result of a PR squashed into it), you cherry-pick it to a branch based on master, fix conflicts and open the PR. Or maybe I'm missing some context here?

So you don't just wait for 2 weeks, then try to sort out what belongs where? (The previous process addressed by this issue). The master = 0.40 strategy works as it is simple, and everyone follows it. No one was actually backporting their PRs to the release branch, although all seemed to think that was how it should work.

I think the key here is a release owner who works at least half time focusing on this release grooming, so all other devs can keep up high dev speeds, while we keep regressions out of the release.

@clevinson was kind of fulfilling that role before, maybe he wants to, but I think it was not a very clear, active role. And I never had the feeling he actually had a say as to what belonged in the release and what not?

It would be good to have someone with a clear opinion there. Ideally they can just ensure all the proper PRs are in the release (by backporting them within 24 hours) and provide a clear person to talk to if you think your PR should be in the release. Ideally the two can resolve this alone. Of course, this could be escalated to a larger discussion if there is an unresolvable disagreement, but I think that should be less than 5% of the PRs and the rest can quickly be in "release" or "not release" buckets.

One questions, however. If we make the release/0.40.x branch again, what is master targetting? 0.40.1? 0.41? That is an essential question. I think we have plenty non-breaking improvements that we will want in 0.40.1 and it may make sense to stop pushing those into a release branch to just get bugfixes in there. However, if we are throwing both 0.40.1 and 0.41 code into master, it will never make it into 0.40.1

@alessio
Copy link
Contributor

alessio commented Nov 4, 2020

I think the key here is a release owner who works at least half time focusing on this release grooming, so all other devs can keep up high dev speeds, while we keep regressions out of the release.

Or a couple of release owners/managers, that's fine too. More importantly we need release critical criteria, by which release managers could make justified decisions, e.g. PR #nnnn meets the release critical criteria, thus it can go in or PR #nnnn does not meet release critical criteria - we don't merge it.

That makes the process predictable, testing effort's outcome reliable and debugging via git bisect super easy.

@aaronc
Copy link
Member Author

aaronc commented Nov 4, 2020

It would be good to have someone with a clear opinion there. Ideally they can just ensure all the proper PRs are in the release (by backporting them within 24 hours) and provide a clear person to talk to if you think your PR should be in the release. Ideally the two can resolve this alone. Of course, this could be escalated to a larger discussion if there is an unresolvable disagreement, but I think that should be less than 5% of the PRs and the rest can quickly be in "release" or "not release" buckets.

I think this is pointing to the need for greater coordination between teams. We do that in our Friday calls and in general that is where we are establishing release criteria as @alessio is mentioning. I'm not sure that there's a problem with that process. But there are still things that are blocking release. And I would note that from my perspective I do not think it has anything to do with scope creep which this discussion appears to be about. There are still core pieces that need to be fixed as I understand it. (@clevinson maybe you can comment more on what those are as I believe you have the context).

One questions, however. If we make the release/0.40.x branch again, what is master targetting? 0.40.1? 0.41? That is an essential question. I think we have plenty non-breaking improvements that we will want in 0.40.1 and it may make sense to stop pushing those into a release branch to just get bugfixes in there. However, if we are throwing both 0.40.1 and 0.41 code into master, it will never make it into 0.40.1

Our team is currently trying to prioritize 0.40 and 0.40.1. We are also working on stuff slated for 0.41, but that stuff is non-breaking (x/authz, x/fee_grants, and x/group). Either way we are currently holding off merging until we're done with Stargate.

@odeke-em
Copy link
Collaborator

odeke-em commented Nov 4, 2020 via email

@aaronc
Copy link
Member Author

aaronc commented Nov 4, 2020

I'm still trying to understand if that's actually a problem. @alessio is your team feeling blocked from merging into master at this moment? Is that a problem today?

(And as I said, I personally don't see a problem if Rosetta got into a Stargate release if it's ready... I don't see how a process of intentionally delaying Rosetta for instance will get to Stargate quicker. It seems orthogonal, nice to have, not blocking IMHO. Other things are blocking Stargate not feature creep IMHO)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants