KEP-3352: Aggregated Discovery KEP to Alpha #3364
Conversation
Jefftree commented Jun 8, 2022:
- One-line PR description: Add KEP for publishing an aggregated discovery document. This will reduce the number of requests made by clients when fetching all API resources from the server.
- Issue link: Aggregated Discovery #3352
- Other comments:
(force-pushed from fd391c2 to 553bc5d)
@Jefftree forgot a push maybe? I see some comments resolved, but not an update on the PR.
(force-pushed from fbb455e to ffccd6a)
> ### Aggregation
>
> For the aggregation layer on the server, a new controller will be
This will likely need to have an input to readiness and you'll need to filter the discovery results by applicable group/version.
Yeah it's real bad if the first few clients get an empty doc because this controller hasn't run yet, but it's also bad to delay the cluster startup.
If the apiserver signals ready even when discovery aggregation has not completed, this will not only affect clients but also cause flakiness in e2e tests that depend on discovery?
> If the apiserver signals ready even when discovery aggregation has not completed, this will not only affect clients but also cause flakiness in e2e tests that depend on discovery?
More importantly, that causes bad behavior of the namespace lifecycle controller.
Ah yeah, that's another thing to consider: how does this affect that controller?
- will clients be able to know that e.g. a specific GV's backing apiserver hasn't been responsive for a while?
- if an apiserver restarts in that situation, how does it build the cache?
Added that this will be an input to readiness.
> will clients be able to know that e.g. a specific GV's backing apiserver hasn't been responsive for a while?

Yes, added a lastContacted time for all group versions.
Is this still WIP or do you want it?
All my comments are minor except for the one about what we do on apiserver startup.
(force-pushed from 9e947ba to 62b02c2)
> currently be found under `apis/<group>/<version>` and `/api/v1` for
> the legacy group version. This discovery endpoint will support
> publishing an ETag so clients who already have the latest version of
> the aggregated discovery can avoid redownloading the document.
Does that mean that the client will no longer cache documents as files?
Or caching just one file with a TTL?
Yes, we will skip the serverresources.json caching mechanism that is provided by client-go. Instead, we will rely on the httpcache library to automatically cache and use ETags. kubectl uses the disk cache provided by httpcache, so we should still be writing the cached document to a file.
> This is approximately what the new API will look like (conflicting names will be renamed)
Looks good in concept.
There will be some details to work out, either here or during implementation of the controller and APIs. @lavalamp is here next week, so I'll leave it to him to settle where that ends up. PRR lgtm too. I volunteer to review the implementation for this one. Please ping me on Slack as PRs get opened up. /approve
(force-pushed from 8d9b7a5 to 6f45e97)
```go
type APIGroupList struct {
	TypeMeta `json:",inline"`
	// groups is a list of APIGroup.
	Groups []APIGroup `json:"groups" protobuf:"bytes,1,rep,name=groups"`
}
```
is there a reason to have the group level? Why not just have all group versions?
We need the separation so we can indicate the priority of group-versions within a group.
That field is commented out below; is that intentional?
Yes, per David's comment, we'll remove the preferredVersion field and sort the Versions list based on priority. The preferredVersion should be at the top of the list.
Oh, if you mean that the GroupVersion field is commented out, this was based on #3364 (comment)
Well, you can still keep it sorted even if the list was flat, but we can work this out in the code.
(force-pushed from 6459070 to 669cdc2)
```go
// LastContacted is the last time that the apiserver has successfully reached the
// corresponding group version's discovery document. This will be nil if the group-version
// has not been aggregated yet. To maintain consistency across scenarios with multiple
```
If it is nil, does that imply that APIResources above is an empty list?
Yes, added a note.
(force-pushed from 669cdc2 to fcec4e1)
/lgtm thanks!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: deads2k, Jefftree, lavalamp.
> api servers may take longer to respond and we do not want to delay
> cluster startup, the health check will only block on the local api
> servers (built-ins and CRDs) to have their discovery ready. For api
> servers that have not been aggregated, their group-versions will be
Is this a typo, "have not been aggregated"? I expected to see a description of the client-visible behavior of the discovery doc on initialization:
- Built-ins - always visible?
- CRDs - only visible once CRDs are loaded, so might be visible?
- Aggregated APIs - could take materially longer to be visible?
I'd like to see some examples added to the KEP clarifying what clients see during startup and how they know discovery is incomplete. Will an incomplete discovery doc return a non-200 response? Will clients be expected to retry if the document is not complete?
Asked in the #octant channel, and poked the OpenShift UI folks for their feedback on watch (Sam said polling with an ETag would be enough). I looked at a couple of places that do discovery:
I suspect that the aggregated API endpoint would be very valuable for removing some code, but latency and not needing a complex cache would be the big wins. I'll poke at others to see if we have any examples of dynamic discovery clients in the ecosystem that would benefit from better discovery, and who would potentially be impacted if we hit unrepresentable things.
For OpenShift UI, we have the following wants:
The proposal definitely addresses (1). I think polling with ETags is sufficient for (2). Happy to see this 👍
kep.yaml needs to be updated to reflect 1.26.
> needs to be refreshed after 6 hours, even if it hasn’t expired.
>
> This not only impacts kubectl, but all clients of Kubernetes. We can
> do better.
Some of the nuances we discussed in comments deserve to be in the KEP:
- Which clients are we targeting primarily: clients that need to frequently request or maintain a discovery cache, like kubectl or web interfaces
- We want all clients to benefit from this, but our alpha is targeted at solving these problems for those key consumers
- We have some use cases that may require additional work to correctly support, we should identify those before entry to beta / reach consensus on their utility:
- The namespace controller needs to ensure that discovery documents are refreshed after a namespace goes into the terminating phase (it must guarantee that it sees the list of resources that could have been created within the namespace before the namespace is deleted). The controller needs to guarantee that the discovery doc list happens-after that point, which is best solved by letting the namespace controller "wait" for the next refresh. What we can't allow is for the namespace controller to fail to observe such a resource.
- The use of server-side caching may break some client-side batch loops done with kubectl that need happens-before semantics; we should consider that use case in kubectl and ensure we have a mitigation
- Polling being sufficient over watch for most users
- The API types in meta/v1 should be meta/v1alpha1 to start, not meta/v1