Plans for v3.4.19 release #14105

Closed
lavacat opened this issue Jun 11, 2022 · 17 comments

@lavacat

lavacat commented Jun 11, 2022

As per discussion during community meeting, this issue is to estimate potential work and timeline for 3.4.19 release.

3.4.18 release was on Oct 15, 2021. Here is the table of all commits from release-3.5 branch (except 'tests: *', 'scripts: *' and 'Merge pull request *') since Oct 6, 2021. I'm assuming that fixes were backported before that.

To proceed, let's agree on:

  1. What version of Go to use for 3.4? I suggest 1.17, since 1.15 is EOL. We can keep go 1.12 in go.mod for compatibility. See [1] and [2]. @ahrtr @ptabor @lilic @hexfusion
  2. Priority of issues to be backported. Feel free to comment in the table above. @serathius @ahrtr

related discussions about go version:
[1] #12840
[2] #13912

@ahrtr
Member

ahrtr commented Jun 11, 2022

Thanks @lavacat. Please see my comments below.

Go version

1.17 is definitely not an option for release-3.4. One breaking change introduced in Golang 1.15 is the deprecation of the legacy behavior of treating the CommonName field on X.509 certificates as a host name when no Subject Alternative Names are present. The workaround is to add x509ignoreCN=0 to the GODEBUG environment variable to re-enable it; see go1.15#commonname. Note that the GODEBUG=x509ignoreCN=0 flag was removed in Golang 1.17. This means applications that still rely on the legacy CommonName field in their certificates will run into issues unless they update their certificates.
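
To make the CommonName concern concrete, here is a minimal sketch of the behavior described above. It assumes a Go toolchain of 1.15 or newer with no GODEBUG override; the helper `selfSigned` and the host name `etcd.example.com` are hypothetical and exist only for illustration. A certificate that carries the host name only in CommonName should fail hostname verification, while the same certificate with a DNS SAN should pass.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSigned is a hypothetical helper that builds a throwaway self-signed
// certificate with the given CommonName and optional DNS SANs.
func selfSigned(cn string, sans []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		DNSNames:     sans, // nil means "no Subject Alternative Names"
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	// The template is its own parent, so the certificate is self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	const host = "etcd.example.com" // illustrative host name

	cnOnly, err := selfSigned(host, nil)
	if err != nil {
		panic(err)
	}
	withSAN, err := selfSigned(host, []string{host})
	if err != nil {
		panic(err)
	}

	// With CN fallback disabled, only the SAN certificate matches the host.
	fmt.Println("CN only: ", cnOnly.VerifyHostname(host))
	fmt.Println("with SAN:", withSAN.VerifyHostname(host))
}
```

On Go 1.15 and 1.16, running the same program with `GODEBUG=x509ignoreCN=0` set should let the CN-only certificate pass, which is the escape hatch that no longer exists in 1.17.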

So Golang 1.17 isn't acceptable for 3.4. Instead, both Golang 1.15 and 1.16 are OK for 3.4. Since Golang supports N-2 versions (1.16, 1.17 and 1.18 for now), Golang 1.16 is better than 1.15. We still need to make sure there is no impact on the existing applications which are using or depending on 3.4.

We also need to evaluate the effort to support Golang 1.16 for 3.4.

Making the pipeline green is the priority

Just as I mentioned in issuecomment-1119280894, the top priority is to fix all the test failures and make the pipeline green. Let's get this done before talking about the release plan for 3.4.19.

Backport

Again, fixing the pipeline issue is the top priority for now. With regard to backporting, I'd suggest backporting only security and major bug fixes. Of course, this is open to discussion.

@lavacat
Author

lavacat commented Jun 18, 2022

@ahrtr here is what I have right now #14134

I'm running into issues with integration tests in the golang:1.16 docker container. I can't get a clean run even after removing the tests that time out. There is always another test that times out on the next run.

@ahrtr
Member

ahrtr commented Jun 23, 2022

Just as I mentioned in issues/14135, the 3.4 pipeline has never been green since the day it was created (Jun 24, 2021). That is a really serious problem to me, so I jumped in and spent about two whole days getting it resolved in pull/14136.

Although there are still some flaky test failures, the pipeline can be green after about 1 ~ 2 retries. So we have a good start for the 3.4 pipeline.

There is still a lot of work to do before releasing 3.4.19. The rough plan is as follows:

Milestone 1: Stabilize the pipeline

Milestone 2: cherry pick PRs from 3.5 to 3.4

I will try to figure out a list later; the table provided by @lavacat is a good reference. The high-level thought is that we should only cherry pick bug fixes and security changes. I think we might need to do milestone 2 and milestone 1 at the same time, because cherry picking some bug fixes may also stabilize the pipeline.

Issues/PRs not required for 3.4.19
We should only backport security fixes and major/critical bug fixes.

Milestone 3: release 3.4.19

Once we finish milestone 1 and milestone 2, then we can kick off the release of 3.4.19. It would be very helpful if other experienced maintainers can jump in here. cc @hexfusion who used to maintain the stable releases.


Please feel free to chime in if you think any PR/issues need to be investigated or included in 3.4.19. Please also feel free to let us know if anyone has any concerns or comments. cc @serathius @ptabor @spzala @hexfusion @lavacat @endocrimes.

I will cherry pick 14087 and 13932 to 3.4 sometime later. Anyone should feel free to work on any item; just drop a message. The task "Add some pipelines with RACE enabled" is a priority; if nobody works on it in the following 1~2 weeks, then I may jump in to do it.

@lavacat
Author

lavacat commented Jun 27, 2022

FYI, added #14168
I've run into this flakiness locally

@ahrtr
Member

ahrtr commented Jun 27, 2022

Thanks @lavacat , the test failure is fixed in 14151, which isn't merged yet.

@lavacat
Author

lavacat commented Jun 29, 2022

Item 2 of the list: #14179

@ahrtr
Member

ahrtr commented Jun 30, 2022

@lavacat would you have the bandwidth and interest to take a deep dive into 14158 and/or 14159? I have run into 14158 multiple times, but haven't had time to dig into it so far.

@lavacat
Author

lavacat commented Jun 30, 2022

@ahrtr yes, will take a look tomorrow.

@ahrtr
Member

ahrtr commented Jul 3, 2022

I talked to @serathius & @spzala, and on second thought I think we should only backport security fixes and major/critical bug fixes to 3.4.19, so I removed some items from the list. Please see the list in milestone 2. Reasons:

  1. It isn't good to stay on 3.4 longer because it's a 3-year-old release. Users are recommended to upgrade to 3.5.4+ if they need any new features or all known bug fixes;
  2. We still need to support 3.4.x, otherwise we will break release maintenance promises. So we should still release 3.4.19 although users are recommended to upgrade to 3.5.4+;
  3. To minimize the impact, we should only backport security fixes and major/critical bug fixes to 3.4.19. It seems that there is no major/critical bug on 3.4.18 so far, so we should only backport security fixes. For any bug fixes which have already been cherry picked to 3.4.19, let's keep them as they are. Anyone, please feel free to give feedback if you really need any PR to be cherry picked to 3.4.19.

@serathius @spzala @ptabor @hexfusion @mitake @dims and anyone please feel free to comment if you have concerns.

@spzala
Member

spzala commented Jul 5, 2022

@ahrtr @lavacat thanks for the discussion in this issue. I agree on both - 1) we should support 3.4.x and 2) going with needed fixes (e.g. security fixes or needed fixes requested by the Kubernetes project/other users as @ahrtr mentioned) in the 3.4.19. For new features/improvements, etcd users should try to move to the latest release.

@ahrtr
Member

ahrtr commented Jul 8, 2022

All items included in milestone 1 and milestone 2 are basically done.

There are two unresolved issues in milestone 1, and @lavacat is still investigating them. But both of them should only be test issues, and the pipeline can be green after about 1~2 retries when running into them. So I don't think they are blockers. If @lavacat can get them resolved soon, then I am OK to merge the PR(s). @lavacat could you give an update on the issues?

I moved #13895 out of milestone 2, because it may have a big impact, and etcd isn't subject to the CVE. Please see my comment in #14191 (comment). So it should be safe. All items in milestone 2 are done.

@endocrimes have you finished the Jepsen test on 3.4? I recall that you mentioned you only reproduced a couple of @aphyr 's bugs. Have you found any new issues?

I think we are ready for milestone 3. cc @serathius @hexfusion @spzala @ptabor

@serathius
Member

Looks great! Thanks for all the help. I think we can move forward with the release.

@spzala
Member

spzala commented Jul 11, 2022

+1 Great work here! Thank you!

@serathius
Member

Sorry for being late; I only now had time to look through production issues.
It struck me that some of them were fixed in etcd but never backported to v3.4.
Maybe it would make sense to look at them again to see whether they should be backported.
List:

As this is late, feel free to skip them for this release. However, I would recommend we consider them for the next one.

@endocrimes
Contributor

finished up my jepsen testing, I managed to replicate aphyr's findings, but didn't come across anything too different. Seems like no "new" issues, so 👍

@ahrtr
Member

ahrtr commented Jul 12, 2022

Thanks all for the feedback.

> As this is late, feel free to skip them for this release. However, I would recommend we consider them for the next one.

Agreed. Let's consider cherry picking them in 3.4.20 so as to minimize the impact, just as we discussed in #14105 (comment). The biggest change since the last release (3.4.18, released on Oct 15, 2021) is that we bumped Golang from 1.12 to 1.16, along with some system packages.

I will kick off releasing 3.4.19 once my last PR #14210 is approved & merged.

@ahrtr
Member

ahrtr commented Jul 12, 2022

v3.4.19 has just been released! Thanks everyone!

https://github.com/etcd-io/etcd/releases/tag/v3.4.19
