Propose project roadmap #15499
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
# etcd roadmap | ||
|
||
This document defines high level goals for project. | ||
|
||
## Milestones | ||
|
||
* [P0] Etcd releases are qualified by rigorous robustness testing | ||
* [P0] Etcd can reliably detect data corruption | ||
How does the above
How does the stalled write due to slow disk fit into the Milestone? @ahrtr What about lease redesign? #14094
I would want to avoid touching random parts of the apply code without better testing and a clear goal. Removal of the v2 API and the following cleanup should already improve the situation and take priority.
This should be covered by
Documentation is done, we just need to add a test. I think we should file an issue and mark it as important, but the milestone goal itself should be tracked as part of improvements to testing.
This is a somewhat new effort that is still not well defined. For me it comes under reliability, which is important, but as it relates to hardware failures it's not something etcd has tackled yet. However, with the recently reported #15498, I would want to propose "etcd is resilient to hardware failures" soon.
Correctness should be our top priority; however, leases have been broken for a long time and no one cared (K8s doesn't either). So I would treat it as second priority to the KV API.
Is the scope for this item clear? IIRC there was a discussion on corruption detection per key/value, then there were some discussions around merkle trees and partitioning the keyspace.
If you mean the corruption detection scope, then no. I didn't have time to define it. It's a pretty large issue to tackle and there are multiple ways to approach it. The main challenge is balancing breaking changes and short-term vs long-term improvements. I have a couple of ideas that I discussed with @ptabor, but didn't have time to write them down, as I want to focus on finishing robustness tests first (not too long). Happy to make the scope clearer if someone is interested in working on it. Still, I would like to encourage people to work on robustness tests first, as they also help the v3.4 and v3.5 releases.
Is there any tracking issue for changes to catch corruption per key/value? |
||
* [P1] Experimental features are graduated or removed | ||
* [P1] Etcd testing is high quality, easy to maintain and expand | ||
* [P1] Etcd v2 API and storage is removed and code cleaned up | ||
* [P1] Etcd supports zero-downtime downgrade | ||
* [P2] Etcd can automatically recover from data corruption | ||
|
||
Each listed milestone should have a corresponding | ||
[issue](https://github.com/etcd-io/etcd/issues) or | ||
[milestone](https://github.com/etcd-io/etcd/milestones) on GitHub. | ||
If it doesn't please [let us know](https://github.com/etcd-io/etcd#contact). | ||
|
||
### Priorities | ||
|
||
* P0 - Critical for reliability of the v3.5 and v3.4 releases. Should be prioritized this over all other work and back-ported. | ||
* P1 - Important for long term success of the project. Blocks v3.6 release. | ||
* P2 - Stretch goals that would be nice to have for v3.6, however should not be blocking. |
We should be careful about redefining the meaning of words.
So far in etcd we have been using milestones = future minor release: v3.5, v3.6.
Here we are naming as a milestone a focus area we want to invest in.
Let's use consistent terms. My proposal is to think about this as a 3-level hierarchy:
Milestones -> milestones as defined in https://github.com/etcd-io/etcd/milestones. Let's keep them as publicly visible releases (might be patch).
Efforts
Still, I would represent a project as an umbrella issue, as it seems that a project cannot be assigned to a milestone.
Issues - for individual work items.
Now the question remains:
If we have a tool to dynamically track the milestones with attached efforts / items, do we need to redundantly track them in a markdown doc?
And I would say we don't. I assume that the purpose of the doc is different. It's a statement of intent about what we want to focus on in the following releases. And thanks to being submitted by maintainers and reviewed, it forces them to be on the same page (as opposed to an individual maintainer assigning an issue to a milestone). But if that's the goal, let's call it out explicitly in the preamble to this doc.
Then let's have:
Milestones:
release-v3.6
The main focus of v3.6 is the reduction of technical debt. The explicit goal is to avoid new features.
The focus will be on:
release-v3.5.x
The same as release-v3.4.x.
release-v3.4.x
The release focuses on stability. Etcd maintainers are going to backport: